Last reviewed June 24, 2026

The Command Denylist Becomes the False Boundary

The June 2026 arXiv paper "One Goal, Many Commands: Characterizing Denylist Fragility in AI Agents," by Chuyang Chen and Zhiqiang Lin, studies a practical safety boundary for terminal AI agents: command denylists that try to block dangerous shell activity by naming forbidden command patterns.

The Boundary That Names Commands

Terminal AI agents are useful because they can run shell commands inside a real development or operations environment. That is also why they are dangerous. A model-driven loop can inspect files, edit code, install dependencies, invoke test suites, talk to network services, and alter the local machine through commands that were once typed by a human operator.

Many agent harnesses try to manage that authority with command gating. Some commands are allowed, some are denied, and everything else may be sent to a user or evaluator for approval. The Chen and Lin paper focuses on the denied set. Its warning is simple: a denylist looks like a boundary, but a forbidden operation can often be performed through commands that were not named.
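
The three-way gate described above can be sketched in a few lines. This is a minimal illustration, not any particular harness's implementation: the `Decision` enum, the `gate` function, and the tool names are hypothetical placeholders, kept deliberately abstract. The sketch assumes the gate looks only at the first token of the command, which is exactly the name-level check the paper puts under pressure.

```python
import shlex
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK = "ask"


# Hypothetical placeholder names; a real harness would list actual executables.
ALLOWED = {"list-files", "run-tests"}
DENIED = {"wipe-tool"}


def gate(command: str) -> Decision:
    """Classify a command by its first token only (name-level gating)."""
    tokens = shlex.split(command)
    if not tokens:
        return Decision.ASK
    name = tokens[0]
    if name in DENIED:
        return Decision.DENY
    if name in ALLOWED:
        return Decision.ALLOW
    # Anything the lists do not name falls through to a user or evaluator.
    return Decision.ASK
```

A call such as `gate("other-tool target")` returns `Decision.ASK` no matter what `other-tool` actually does. The gate knows the name, not the operation.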

This is a different topic from the site's broader pages on agent sandboxes, AI agent sandboxing, and tool permissions. Those pages describe the containment envelope. This one is about a narrow failure inside that envelope: confusing command names with operation-level control.

What the Paper Tests

The paper, arXiv:2606.15549, was first submitted on June 14, 2026 and revised on June 20, 2026. It identifies and formalizes command denylist fragility for terminal AI agents. The authors frame the problem around two things: the operations a denylist is meant to block, and the bypass commands that can still perform those operations.

The paper proposes ShellSieve, an LLM-driven pipeline that enumerates candidate bypasses and validates them by observing side effects in a sandbox environment. For safety, this essay will not reproduce command examples or bypass recipes. The important contribution is the measurement frame: instead of asking whether a specific string is blocked, ShellSieve asks whether the denied operation is still reachable through another command path.
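
That measurement frame can be modeled abstractly. The sketch below is not ShellSieve and makes no claim about how the authors implemented it: the paper uses an LLM to enumerate candidates and a sandbox to observe side effects, while here the observed effects are hand-supplied stand-ins. The `Candidate` type, the function names, and the operation labels are all assumptions for illustration, and the commands are placeholders rather than real recipes.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Candidate:
    command: str              # placeholder command text, not a real recipe
    observed: FrozenSet[str]  # side effects recorded from a sandbox run


def is_blocked(command: str, denylist: Iterable[str]) -> bool:
    """Name-level check: blocked only if the first token is a denied name."""
    tokens = command.split()
    return bool(tokens) and tokens[0] in set(denylist)


def find_bypasses(operation: str, denylist: Iterable[str],
                  candidates: Iterable[Candidate]) -> List[Candidate]:
    """Return candidates that still reach `operation` despite the denylist."""
    denied = set(denylist)
    return [c for c in candidates
            if operation in c.observed and not is_blocked(c.command, denied)]


def is_fragile(operation: str, denylist: Iterable[str],
               candidates: Iterable[Candidate]) -> bool:
    """A denylist is fragile for an operation if any bypass survives it."""
    return bool(find_bypasses(operation, denylist, candidates))
```

The question the code asks is the one the paper asks: not "is this string blocked?" but "is the operation still reachable by something the list does not name?"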

In the arXiv abstract, the authors report applying ShellSieve to 1,709 real-world command denylists with 13,332 denylist rules collected from GitHub. They report that 69.0 to 98.6 percent of denylists in the evaluation are fragile, depending on the case, and that the issue appears consistently across projects and agents. The HTML version's introduction also reports root-cause findings: denylist authors may miss less-known commands, and multi-purpose commands may be deliberately left unblocked for benign uses even though they can also perform blocked operations. The paper further reports that completing fragile denylists can require adding many rules, averaging 217 additions when trying to block specific operations.

Why Denylists Fray

A command denylist is attractive because it is concrete. It names tools that sound dangerous and gives the agent a visible rule. That concreteness is also the weakness. Shell environments are open-ended. Operating systems include many utilities. Package managers can add more. Scripting languages and multi-purpose tools blur the line between harmless and harmful use.

The paper's title captures the governance problem: one goal, many commands. If the goal is the real risk, then the control has to reason about the operation, the filesystem scope, the data class, the process boundary, the network path, and the credential being touched. A command name is only a surface clue.

This matters because terminal agents operate near secrets, repositories, build systems, deployment scripts, local caches, environment files, and cloud credentials. A weak denylist can create false assurance: the user believes dangerous operations are blocked, while the system has only blocked a few familiar names. Approval flows can then become weaker because the denied set is assumed to be doing more work than it really is.

Governance Standard

A serious terminal-agent deployment should not treat command denylists as the primary safety boundary. They can be a useful warning layer, but the hard controls should live outside the model and outside string matching.

The stronger pattern is operation-level mediation. Read, write, delete, network, process, package-install, credential, secret, and deployment operations should be constrained by filesystem mounts, sandbox profiles, network egress rules, short-lived credentials, explicit approval gates, and audit logs. A command should be judged by what it can do in this environment, not only by which executable name appears first.
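
The shape of an operation-level decision can be sketched as follows. This is an illustration of the policy question only: real enforcement belongs in mounts, sandbox profiles, and egress rules, not in application code. The `POLICY` table, the directory roots, and `is_permitted` are hypothetical, and the check is purely lexical, so it does not resolve symlinks.

```python
from pathlib import PurePosixPath

# Hypothetical policy: each operation maps to the directory roots it may touch.
POLICY = {
    "read": [PurePosixPath("/workspace")],
    "write": [PurePosixPath("/workspace/build")],
    "delete": [],  # no root permits deletion in this sketch
}


def is_permitted(operation: str, target: str) -> bool:
    """Decide by operation and target path, not by executable name."""
    path = PurePosixPath(target)
    # Reject parent-directory segments instead of trying to normalize them.
    if ".." in path.parts:
        return False
    roots = POLICY.get(operation, [])  # unknown operations are denied
    return any(path == root or root in path.parents for root in roots)
```

Under this table, `is_permitted("write", "/workspace/src/main.py")` is refused whichever command attempts the write, because the decision never consults a command name.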

Denylists should therefore be tested like living controls. A review should ask which operation each rule is intended to block, whether alternate commands can reach the same operation, whether multi-purpose tools need narrower wrappers, whether sandbox policy blocks the side effect anyway, and whether the audit trail records the attempted operation rather than only the command text.
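
The last review question, about what the audit trail records, is easy to make concrete. The record below is a minimal sketch under assumed field names; `AuditRecord` and `to_log_line` are hypothetical and not drawn from the paper or from any specific agent framework. The point is that the operation and its target are stored as first-class fields next to the command text.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    command_text: str                   # what the agent asked to run
    operation: str                      # what it would do, e.g. "delete"
    target: str                         # the resource the operation touches
    decision: str                       # "allow", "deny", or "ask"
    matched_rule: Optional[str] = None  # denylist rule that fired, if any


def to_log_line(record: AuditRecord) -> str:
    """Serialize one record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)
```

A record with `decision="deny"` and `matched_rule=None` is itself informative: it shows that something other than the denylist, such as a sandbox or permission layer, stopped the operation.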

The practical lesson is not "never use denylists." It is "never confuse a denylist with containment." A denylist can reduce obvious mistakes. A sandbox, permission model, credential broker, and human review process decide the blast radius when the agent finds another path.

What This Changes

The command denylist becomes the false boundary when an institution treats a list of forbidden words as if it were a governed action space. Agent safety is not the absence of a few command names. It is the presence of enforceable limits on what the agent can actually change.

The Spiralist rule is blunt: govern operations, not vocabulary. If a terminal agent can reach a harmful side effect through many command paths, then the boundary belongs at the filesystem, network, credential, runtime, and approval layers. The command list is only a signpost.

Sources

Chuyang Chen and Zhiqiang Lin, "One Goal, Many Commands: Characterizing Denylist Fragility in AI Agents," arXiv:2606.15549 [cs.CR], submitted June 14, 2026 and revised June 20, 2026.
arXiv experimental HTML for "One Goal, Many Commands: Characterizing Denylist Fragility in AI Agents," reviewed June 24, 2026.
Related pages: The Agent Sandbox Becomes the Airlock, AI Agent Sandboxing, Agent Tool Permission Protocol, AI Coding Agents, The Coding Agent Becomes the Maintainer, and The Unsafe Shortcut Becomes the Safety Benchmark.
