Blog · arXiv Analysis · Last reviewed June 25, 2026

The Goal Specification Becomes the Causal Step

The June 2026 arXiv paper "Direct Causation in International Humanitarian Law and the Challenge of AI-Mediated Civilian Cyber Operations," by Alice Saito, Harold Godsoe, and Phan Xuan Tan, turns a legal puzzle into an agent-governance measurement problem: who specified the operation, and at what level of detail?

Causation Moves Into Configuration

The arXiv record for arXiv:2606.29175 lists the paper as submitted on June 28, 2026, in Artificial Intelligence with Computers and Society as a secondary subject. Its abstract names a specific pressure point: international humanitarian law's direct-participation framework was built around human acts, while autonomous multi-agent cyber systems can generate operative decisions after a civilian has disengaged.

Agent governance often asks whether a tool call, plan, or final action was authorized. Saito, Godsoe, and Tan ask a narrower upstream question: how much of the operation was fixed by the human at configuration time? The answer can decide whether later harm looks like a direct human-linked act or an indirect contribution mediated by system-generated decisions.

The ICRC's 2009 Interpretive Guidance frames direct participation in hostilities through three cumulative criteria: threshold of harm, direct causation, and belligerent nexus. The ICRC casebook summary states that direct causation can be satisfied by a direct causal link between an act and likely harm, or by a coordinated military operation of which the act is an integral part.

The new arXiv paper concentrates on the second criterion. If a civilian writes and runs a specific exploit against a specific target, the causal path is legible. If a civilian tells an autonomous system to disrupt an adversary and the system chooses targets, methods, timing, and execution path itself, the causal story changes. The human did not encode the operative decisions that produced the harm.

The paper argues that the usual "one causal step" and integral-part vocabulary strain at that point because the operative choices are generated after deployment.

Three Scenarios

The paper's analysis turns on three stylized civilian cyber scenarios. In the first, a volunteer specifies the target, method, and execution code. In the second, a hacktivist specifies the target but delegates method selection to an agent, with human approval operating as a runtime control. In the third, the hacktivist specifies only the objective, and the system generates targets, methods, and execution.

The third scenario is the hard case. The paper argues that the direct-participation framework defaults toward treating it as indirect participation, even though the deployment may be intentionally belligerent and operationally meaningful. That mismatch matters because the legal category is supposed to distinguish protected civilians from civilians who personally take part in hostilities.

The point is not to make targeting easier. It is to make classification more honest. If the law turns on a causal relation, the evidence system has to preserve the level at which the human fixed the operation.

Granularity as Evidence

Saito, Godsoe, and Tan call the missing construct "goal-specification granularity." Their spectrum classifies operations by what the human specified before deployment. Level 1 fixes target, method, and execution code. Level 3 or 4 fixes some operational content while delegating other parts. Level 5 fixes only the objective, leaving target selection, method selection, and execution to the system.
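To make the spectrum concrete, here is a minimal Python sketch that sorts a deployment into three coarse buckets based on what the human fixed before deployment. It is an illustration only: the names HumanSpec, Bucket, and classify are hypothetical, and the sketch deliberately collapses the intermediate levels into one bucket rather than guessing at the paper's exact definitions for them.

```python
from dataclasses import dataclass
from enum import Enum


class Bucket(Enum):
    """Coarse buckets; only the two endpoints map to levels named in the text."""
    FULLY_SPECIFIED = "level-1"
    PARTIALLY_DELEGATED = "intermediate"
    OBJECTIVE_ONLY = "level-5"


@dataclass(frozen=True)
class HumanSpec:
    """Which operative elements the human fixed at configuration time."""
    target: bool
    method: bool
    execution_code: bool


def classify(spec: HumanSpec) -> Bucket:
    """Bucket a deployment by how much the human specified before launch."""
    fixed = (spec.target, spec.method, spec.execution_code)
    if all(fixed):
        return Bucket.FULLY_SPECIFIED
    if not any(fixed):
        return Bucket.OBJECTIVE_ONLY
    return Bucket.PARTIALLY_DELEGATED


# The three stylized scenarios, expressed as specifications.
volunteer = HumanSpec(target=True, method=True, execution_code=True)
target_only = HumanSpec(target=True, method=False, execution_code=False)
goal_only = HumanSpec(target=False, method=False, execution_code=False)
```

Under these assumptions, the volunteer scenario lands in the fully specified bucket, the target-only hacktivist in the intermediate bucket, and the goal-only deployment in the objective-only bucket. Note that runtime approval does not appear as an input: the classification looks only at what was fixed before deployment.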

This is a useful bridge between law and technical governance. A model card can name a model. A tool log can show a tool call. A capability benchmark can show task performance. None of those artifacts necessarily records which operational decisions were made by a person before deployment and which were made by the agent afterwards.

The Logging Gap

The paper proposes instrumentation as the first constructive move. Cloud-hosted agent platforms, API gateways, and multi-tenant orchestration layers could classify deployment granularity from the configuration input and operating parameters, then report that classification in audit logs with model, tenant, and capability metadata.
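As a rough sketch of what such a log entry might look like, the Python function below serializes a granularity classification together with model, tenant, and capability metadata as one JSON line. The field names, the event label, and the audit_record function are assumptions made for illustration; the paper does not prescribe a schema, and no real platform's logging API is implied.

```python
import json
from datetime import datetime, timezone
from typing import List, Optional


def audit_record(granularity: str, model: str, tenant: str,
                 capabilities: List[str],
                 when: Optional[datetime] = None) -> str:
    """Serialize one deployment's granularity classification as a JSON log line.

    All field names here are hypothetical, chosen only for this sketch.
    """
    timestamp = when or datetime.now(timezone.utc)
    record = {
        "event": "deployment_granularity",
        "granularity": granularity,           # e.g. the platform's level label
        "model": model,
        "tenant": tenant,
        "capabilities": sorted(capabilities),  # sorted for stable comparison
        "recorded_at": timestamp.isoformat(),
    }
    # sort_keys keeps the line byte-stable for the same inputs.
    return json.dumps(record, sort_keys=True)


line = audit_record(
    granularity="level-5",
    model="example-model",
    tenant="tenant-123",
    capabilities=["shell", "http"],
    when=datetime(2026, 1, 1, tzinfo=timezone.utc),
)
```

The design choice worth noting is that the classification is recorded at deployment time, from the configuration input, rather than reconstructed later from tool-call logs.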

The authors also name the limits. Local hacktivist operations may have no cooperative platform to log anything. Providers may face legal exposure, weak incentives, boundary ambiguity, and technical difficulty. A determined actor can also make a Level 5 operation look more tightly specified by splitting it across chained calls, unless the logging system traces the full call chain.
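One way to picture the call-chain point is the Python sketch below. It assumes, hypothetically, that every logged call carries a parent identifier and an apparent level from 1 to 5, and it reports the most delegated level found anywhere on the path back to the root. The Call type, the effective_level function, and the "take the maximum along the chain" rule are illustrative assumptions, not a method taken from the paper.

```python
from typing import Dict, NamedTuple, Optional


class Call(NamedTuple):
    """One logged call; apparent_level is how specified it looks in isolation."""
    call_id: str
    parent_id: Optional[str]   # None marks the root deployment
    apparent_level: int        # 1 (fully specified) to 5 (objective only)


def effective_level(call_id: str, calls: Dict[str, Call]) -> int:
    """Return the most delegated level on the path from a call to its root."""
    level = 0
    seen = set()
    current: Optional[str] = call_id
    while current is not None:
        if current in seen:
            raise ValueError("cycle in call chain")
        seen.add(current)
        call = calls[current]
        level = max(level, call.apparent_level)
        current = call.parent_id
    return level


# A goal-only root whose agent-generated sub-calls each look fully specified.
chain = {
    "root": Call("root", None, 5),
    "plan": Call("plan", "root", 1),
    "exec": Call("exec", "plan", 1),
}
```

Viewed in isolation, the "exec" call has an apparent level of 1, but tracing it to the root yields 5, because the target and method it carries were generated by the system rather than fixed by a person. This only works when the parent links are actually recorded, which is the limit the authors identify.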

Governance Standard

Every high-risk agent deployment should carry a configuration-granularity receipt. The receipt should record the human-specified objective, target class, forbidden targets, method constraints, tool set, approval thresholds, runtime stop controls, delegated subgoals, and the point at which the system began selecting operative details.

The receipt should distinguish configuration from runtime oversight. A human approval button after an agent proposes a step is not the same thing as a human fixing the operation at deployment.
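The receipt described above can be sketched as a small Python data structure that keeps configuration-time facts and runtime oversight in separate groups. Everything here is hypothetical: the class names, the field names, and the decision to file approval thresholds under runtime oversight are assumptions made for illustration, and the example values are placeholders.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class Configuration:
    """What the human fixed before deployment."""
    objective: str
    target_class: str
    forbidden_targets: List[str]
    method_constraints: List[str]
    tool_set: List[str]
    delegated_subgoals: List[str]
    delegation_point: str  # where the system began selecting operative details


@dataclass(frozen=True)
class RuntimeOversight:
    """Controls that act after the agent has started proposing steps."""
    approval_thresholds: List[str]
    stop_controls: List[str]


@dataclass(frozen=True)
class Receipt:
    configuration: Configuration
    runtime_oversight: RuntimeOversight

    def to_json(self) -> str:
        # asdict recurses into the nested dataclasses.
        return json.dumps(asdict(self), sort_keys=True, indent=2)


receipt = Receipt(
    configuration=Configuration(
        objective="placeholder objective",
        target_class="placeholder class",
        forbidden_targets=["placeholder exclusion"],
        method_constraints=["placeholder constraint"],
        tool_set=["placeholder tool"],
        delegated_subgoals=["target selection", "method selection"],
        delegation_point="after objective was submitted",
    ),
    runtime_oversight=RuntimeOversight(
        approval_thresholds=["human approval before each proposed step"],
        stop_controls=["manual halt"],
    ),
)
```

Keeping the two groups structurally separate means a reader of the receipt cannot mistake an approval button for a configuration-time decision: the delegated subgoals stay visible in the configuration group no matter how many runtime approvals follow.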

The Spiralist rule is simple: authority does not live only in the final action. It also lives in the level of specification that launched the action chain. If a system can choose the target, method, timing, and execution path after the human gives only a goal, that fact belongs in the audit trail.

Limits

This is a doctrinal and governance paper, not an empirical claim that a particular attack happened or that a particular civilian should be classified in a particular way. It analyzes a structural mismatch between an existing legal test and increasingly autonomous agent architectures.

Granularity logging would not by itself settle attribution, intent, proportionality, precautions, or accountability. It would also not solve the hardest adversarial cases where actors run systems locally and falsify records. Its narrower value is evidentiary: it names a property that legal and governance systems need but often fail to record.
