Blog · arXiv Analysis · Published: June 25, 2026

The Missing Stop Condition Becomes the Bill

Infinite agentic loops turn an ordinary user request into repeated model calls, tool calls, state growth, and external side effects when no effective bound covers the feedback path.

The Paper

The source is When Agents Do Not Stop: Uncovering Infinite Agentic Loops in LLM Agents, by Xinyi Hou, Shenao Wang, Yanjie Zhao, and Haoyu Wang (arXiv:2607.01641 [cs.SE]). The arXiv record lists version 1 as submitted on July 2, 2026. The authors are at Huazhong University of Science and Technology, and the PDF names Haoyu Wang as the corresponding author.

The paper is useful because it treats the agent loop as an engineering object. Agents are not only prompts wrapped around a model. They are runtime systems that call models, call tools, update state, hand work to other agents, and decide whether to continue. A missing stop rule in that runtime can become an operating cost, a service-availability risk, or a repeated side effect.

The Failure

The paper defines an Infinite Agentic Loop, or IAL, as a structural execution failure in which an agentic feedback path repeatedly triggers model, tool, agent, or workflow execution without an effective stopping bound. The distinction matters. A loop is not automatically a bug. Normal agents need iteration. The failure appears when continuation depends on model outputs, tool observations, exception paths, external state, or delegation decisions, while the actual repeated path is not covered by a turn limit, retry cap, timeout, token budget, state-size guard, or human approval gate.

This is not a mystical autonomy story. It is a mundane systems problem that appears in a new costume. The agent may keep asking the model for a valid plan after parse failures. It may keep calling tools because the model returns another tool request. It may keep routing between agents because a termination message never satisfies the framework. Each cycle can consume API budget, occupy workers, enlarge message history, and repeat external actions.
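To make the idea of an effective bound concrete, here is a minimal sketch of a loop whose feedback path is covered by a turn limit, a wall-clock timeout, and a token budget at the same time. It is an illustration, not code from the paper: LoopBudget, BoundExceeded, run_bounded, and the step callable are hypothetical names, and step stands in for one model or tool cycle.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


class BoundExceeded(RuntimeError):
    """Raised when the loop reaches a stop condition without a result."""


@dataclass
class LoopBudget:
    max_turns: int        # hard cap on iterations of the feedback path
    max_seconds: float    # wall-clock timeout for the whole loop
    max_tokens: int       # total token budget across all cycles


def run_bounded(step: Callable[[int], Tuple[Optional[str], int]],
                budget: LoopBudget,
                clock: Callable[[], float] = time.monotonic) -> str:
    """Call `step` until it returns a result or any bound is reached.

    `step(turn)` is a hypothetical stand-in for one model/tool cycle and
    returns (result_or_None, tokens_used).
    """
    started = clock()
    tokens_spent = 0
    for turn in range(budget.max_turns):
        # Check the timeout before paying for another cycle.
        if clock() - started > budget.max_seconds:
            raise BoundExceeded(f"timeout after {turn} turns")
        result, used = step(turn)
        tokens_spent += used
        if result is not None:
            return result
        if tokens_spent >= budget.max_tokens:
            raise BoundExceeded(f"token budget spent after {turn + 1} turns")
    raise BoundExceeded(f"turn limit {budget.max_turns} reached")
```

The design point is that the bound lives in the runtime that owns the loop, so a model that never produces a valid plan ends in an explicit BoundExceeded error instead of another call.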

The Scanner

The authors propose IAL-Scan, a Python static analyzer for downstream LLM agent projects. It normalizes code and framework behavior into a framework-independent Agent IR, builds an Agentic Loop Dependence Graph, and checks whether a reachable feedback path can repeatedly hit costly or state-growing operations without effective bound coverage.
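The core question, whether a reachable feedback path can hit a costly operation without crossing a bound, can be illustrated with a toy graph check. This is a deliberately simplified sketch and not the paper's Agent IR or Agentic Loop Dependence Graph: the edge triples, the bounded flag, and the function name unbounded_costly_nodes are assumptions made for illustration.

```python
from collections import defaultdict


def unbounded_costly_nodes(edges, costly):
    """Return costly nodes that lie on a cycle made only of unbounded edges.

    edges:  iterable of (src, dst, bounded) triples. `bounded` is True when
            a turn limit, retry cap, or similar guard covers that transition.
    costly: set of node names that call a model, call a tool, or grow state.
    """
    # Keep only transitions that no bound covers.
    graph = defaultdict(list)
    for src, dst, bounded in edges:
        if not bounded:
            graph[src].append(dst)

    def reaches_itself(start):
        # Depth-first search for a path from `start` back to `start`.
        stack, seen = list(graph[start]), set()
        while stack:
            node = stack.pop()
            if node == start:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(graph[node])
        return False

    return sorted(node for node in costly if reaches_itself(node))


# Hypothetical workflow: plan -> call_tool -> observe -> plan, no guard anywhere.
edges = [
    ("plan", "call_tool", False),
    ("call_tool", "observe", False),
    ("observe", "plan", False),
]
flagged = unbounded_costly_nodes(edges, costly={"plan", "call_tool"})
```

In this example both costly nodes are flagged. Marking the observe-to-plan edge as bounded removes the only cycle and the result becomes empty. A real analyzer would also need to recover these edges from framework APIs, which is the hard part the paper addresses.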

The implementation supports eight agent frameworks: LangChain, LangGraph, CrewAI, AutoGen, LlamaIndex, the OpenAI Agents SDK, Google ADK, and Semantic Kernel. That coverage is important because many loop edges are not visible as simple while statements. Framework APIs can encode graph transitions, tool dispatch, retries, agent reentry, state updates, and handoffs.

Findings

The evaluation uses 6,549 Python LLM agent repositories with at least one GitHub star. The corpus contains 246,748 Python files and 33.41 million lines of Python code. IAL-Scan reports 74 potential findings. Manual review confirms 68 IAL failures across 47 projects, yielding 91.9 percent precision. The first two authors agreed on 94.6 percent of the 74 potential findings before resolving the remaining cases by discussion.

LangGraph and AutoGen account for 45 of the 68 confirmed findings, or 66.2 percent, across 31 projects. The paper does not claim those frameworks are uniquely unsafe; the point is that graph and conversation frameworks often encode feedback through APIs rather than obvious loops. Across the confirmed failures, retry feedback without bounds, tool-call iteration without bounds, and multi-agent chat without turn bounds account for 47 findings, or 69.1 percent.
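The percentages in this section follow directly from the reported counts. A short check, using only the figures quoted above (the helper name share is ours):

```python
def share(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


precision = share(68, 74)          # confirmed failures / potential findings
framework_share = share(45, 68)    # LangGraph + AutoGen / confirmed failures
pattern_share = share(47, 68)      # three unbounded patterns / confirmed failures
false_positives = 74 - 68          # potential findings that were not confirmed

print(precision, framework_share, pattern_share, false_positives)
# prints: 91.9 66.2 69.1 6
```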

The scanner is also compared with LLM-based baselines on a 264-project evaluation subset. The pure LLM API baseline covers only 23 of the 68 confirmed failures and produces 183 alerts. A Codex-based coding-agent baseline covers 50 confirmed failures but reaches a timeout or error limit in 75 projects and averages 141.86K tokens and 116.0 seconds per project. IAL-Scan's full configuration covers all 68 true positives with 6 false positives, averaging 4.2K tokens and 31.2 seconds per project.

Stop Receipt

An agent deployment receipt should not merely say which model and tools were enabled. For every loop-capable workflow, it should record the feedback path, continuation controller, bound type, bound value, scope of the bound, timeout, retry cap, token or cost budget, state-growth limit, tool-call limit, human-approval rule, external side-effect class, and the failure behavior when the bound is reached.
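One way to make such a receipt checkable is to store it as a structured record and refuse deployment while fields are blank. The sketch below is an assumption about how that could look, not a format proposed by the paper. The class StopReceipt, its field names, and the helper missing_fields are hypothetical and simply mirror the list above.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class StopReceipt:
    """One record per loop-capable workflow (hypothetical schema)."""
    feedback_path: Optional[str] = None            # e.g. "planner -> tool -> planner"
    continuation_controller: Optional[str] = None  # what decides to continue
    bound_type: Optional[str] = None               # turn limit, retry cap, ...
    bound_value: Optional[float] = None
    bound_scope: Optional[str] = None              # per request, per session, ...
    timeout_seconds: Optional[float] = None
    retry_cap: Optional[int] = None
    token_or_cost_budget: Optional[float] = None
    state_growth_limit: Optional[int] = None
    tool_call_limit: Optional[int] = None
    human_approval_rule: Optional[str] = None
    side_effect_class: Optional[str] = None        # e.g. read-only, writes, payments
    on_bound_reached: Optional[str] = None         # failure behavior at the bound


def missing_fields(receipt: StopReceipt) -> List[str]:
    """Names of fields that are still None or an empty string."""
    return [f.name for f in fields(receipt)
            if getattr(receipt, f.name) in (None, "")]


draft = StopReceipt(
    feedback_path="planner -> tool -> planner",
    continuation_controller="model output requests another tool call",
    bound_type="turn limit",
    bound_value=8,
)
gaps = missing_fields(draft)  # fields that must be filled before sign-off
```

A reviewer could treat any non-empty result from missing_fields as an incomplete deployment record.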

The audit question is simple: can this agent reenter a model, tool, workflow transition, or another agent without crossing a documented stop condition? If the answer is unclear, the deployment record is incomplete. Even a useful agent remains an unbounded process until the stop rule is part of the architecture rather than a wish placed in the prompt.

Limits

The paper is careful about scope. IAL-Scan is a static analyzer, so it over-approximates possible dependencies and may report false positives. It focuses on Python applications built with eight supported frameworks, so other languages, unsupported frameworks, and incomplete framework models may produce false negatives. It also has limited support for highly customized semantics, project-specific schedulers, external-state stopping logic, and semantic checks over natural-language outputs. The LLM-assisted pruning stage is explicitly treated as an optional negative filter whose judgments can be incomplete or unstable.

Those limits do not weaken the governance point. They sharpen it. Agent loops need explicit bounds at the runtime layer where feedback is created. Otherwise the bill, the logs, the worker queue, and the affected external system become the first reliable notice that the agent did not stop.

Sources

Xinyi Hou, Shenao Wang, Yanjie Zhao, and Haoyu Wang, When Agents Do Not Stop: Uncovering Infinite Agentic Loops in LLM Agents, arXiv:2607.01641 [cs.SE].
arXiv HTML for When Agents Do Not Stop, checked for abstract, problem definition, IAL-Scan design, evaluation setup, results, implications, and limitations.
arXiv PDF for When Agents Do Not Stop, checked against the title page, author metadata, framework list, dataset scale, baseline comparison, and limitation statements.
