Blog · arXiv Analysis · Last reviewed June 24, 2026

The Agent Trace Becomes the Process Map

The June 2026 arXiv paper Agent Behavior Mining: Generative AI Agent Governance in Business Processes, by Hoang Vu, Maximilian Körner, Adrian Rebmann, Gabriel Kevorkian, Michael Perscheid, Gregor Berg, and Timotheus Kampik, asks what happens when business process governance has to audit agents rather than fixed workflows.

When Work Becomes an Agent Trace

Business process management has always cared about the gap between the designed process and the process that actually happens. A refund policy, purchase order path, or invoice exception can be drawn as a clean workflow. The live organization is messier: cases branch, systems time out, and managers later ask whether the deviation was legitimate.

Generative AI agents make that old problem sharper. An agent may interpret a task, call tools, coordinate with another agent, spend tokens, produce intermediate reasoning, and settle on an action that was never written as a fixed branch in the process model. Vu and coauthors call the resulting governance problem "invisible autonomy risk" in arXiv:2606.20669, which was submitted June 12, 2026, and is listed on arXiv as accepted to the management main track of the BPM conference 2026.

The paper's target is the workplace situation in which process owners can see the final outcome but cannot reconstruct how the agent got there. That connects to agent receipts, delegation traces, and compliance trace rulebooks, but it adds a process-mining lens: the trace should be analyzable as work, not merely stored as debugging residue.

What Agent Behavior Mining Adds

The authors propose Agent Behavior Mining as a governance capability for AI agents in business processes. Their core move is to translate granular agent activity into standardized process logs. The arXiv abstract names reasoning traces, tool usage, and token costs as examples of agent activities that the event data model should capture. The HTML version further says the model is designed around the XES event-log standard and intended to remain usable by off-the-shelf process mining tools.
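To make the translation step concrete, here is a minimal sketch of how raw agent records might be grouped by case and written out as an XES-style log. It is not the paper's event data model. The record fields, the agent names, and the `agent:kind` and `agent:tokens` attribute keys are assumptions made for illustration. Only `concept:name`, `org:resource`, and `time:timestamp` are standard XES extension keys, and a complete XES file would also declare the extensions it uses.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Hypothetical raw agent records; every field name here is illustrative.
RAW_RECORDS = [
    {"case": "order-1001", "agent": "order_interpreter", "kind": "reasoning",
     "label": "interpret order", "tokens": 412,
     "ts": datetime(2026, 6, 1, 9, 0, 0, tzinfo=timezone.utc)},
    {"case": "order-1001", "agent": "inventory_agent", "kind": "tool_call",
     "label": "check stock", "tokens": 96,
     "ts": datetime(2026, 6, 1, 9, 0, 5, tzinfo=timezone.utc)},
    {"case": "order-1002", "agent": "order_interpreter", "kind": "reasoning",
     "label": "interpret order", "tokens": 388,
     "ts": datetime(2026, 6, 1, 9, 1, 0, tzinfo=timezone.utc)},
]


def to_xes(records):
    """Group records by case and emit one XES trace per case."""
    log = ET.Element("log", {"xes.version": "1.0"})
    traces = {}
    for rec in sorted(records, key=lambda r: r["ts"]):
        trace = traces.get(rec["case"])
        if trace is None:
            trace = ET.SubElement(log, "trace")
            # The trace-level concept:name carries the case identifier.
            ET.SubElement(trace, "string", {"key": "concept:name", "value": rec["case"]})
            traces[rec["case"]] = trace
        event = ET.SubElement(trace, "event")
        ET.SubElement(event, "string", {"key": "concept:name", "value": rec["label"]})
        ET.SubElement(event, "string", {"key": "org:resource", "value": rec["agent"]})
        ET.SubElement(event, "date", {"key": "time:timestamp", "value": rec["ts"].isoformat()})
        # Assumed custom attributes for agent-specific detail.
        ET.SubElement(event, "string", {"key": "agent:kind", "value": rec["kind"]})
        ET.SubElement(event, "int", {"key": "agent:tokens", "value": str(rec["tokens"])})
    return log


xes_log = to_xes(RAW_RECORDS)
xes_text = ET.tostring(xes_log, encoding="unicode")
```

The design choice worth noticing is that the agent appears as the resource on each event and the order appears as the case. That is what lets a general process mining tool treat agent work like any other work.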

That matters because a raw agent log is usually too local to govern a business process. It may tell an engineer that a tool call happened, but not whether the case variant violated policy, whether one agent consumed abnormal resources, or whether an apparently successful order followed a path that auditors would reject. Agent Behavior Mining turns scattered traces into case-level evidence.

The demonstration in the paper uses a multi-agent order-to-cash implementation. In the authors' scenario, agents participate in roles such as order interpretation, inventory, production, and customer service. The claimed payoff is practical: process managers can use the resulting logs to detect policy deviations and quantify operational variability. This is not a claim that every trace is truthful or complete. It is a claim that process governance needs a structured trace substrate before it can ask useful questions.
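One way to picture "quantifying operational variability" is to roll token counts up to the case level and compare each case with the typical one. The sketch below does that with made-up numbers. The case identifiers, agent names, and the rule of flagging anything above twice the median are all assumptions for illustration, not results or thresholds from the paper.

```python
from statistics import median, pstdev

# Hypothetical per-event token counts: (case_id, agent, tokens).
EVENTS = [
    ("order-1", "order_interpreter", 600), ("order-1", "inventory", 400),
    ("order-2", "order_interpreter", 650), ("order-2", "inventory", 450),
    ("order-3", "order_interpreter", 500), ("order-3", "inventory", 450),
    ("order-4", "order_interpreter", 3700), ("order-4", "inventory", 500),
]


def tokens_per_case(events):
    """Sum token usage across all agents for each case."""
    totals = {}
    for case_id, _agent, tokens in events:
        totals[case_id] = totals.get(case_id, 0) + tokens
    return totals


def flag_outliers(totals, factor=2.0):
    """Return cases whose total exceeds factor times the median case (assumed rule)."""
    typical = median(list(totals.values()))
    return sorted(case for case, total in totals.items() if total > factor * typical)


totals = tokens_per_case(EVENTS)
spread = pstdev(list(totals.values()))  # population standard deviation across cases
outliers = flag_outliers(totals)
```

A flagged case is a prompt for review, not a verdict. The expensive case may be a hard order handled well.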

The Process Map as Governance

The Spiralist point is that an agent process map is not just a dashboard. A dashboard summarizes what management already decided to measure. A process map lets investigators find the path the case actually took. For agents, the minimum map has to name the agent, task, case identifier, authority, runtime configuration, policy context, tool call, handoff, cost, timing, exception, reviewer, and outcome. Without those fields, governance becomes storytelling after the fact.
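The minimum map can be read as a completeness check on each record. The sketch below assumes a flat dictionary per record and hypothetical field names that mirror the list above. It distinguishes a field that is present but empty (no exception occurred) from a field that is absent (nobody recorded whether one occurred).

```python
# Field names are illustrative; they mirror the minimum map described above.
REQUIRED_FIELDS = (
    "agent", "task", "case_id", "authority", "runtime_config",
    "policy_context", "tool_call", "handoff", "cost", "timing",
    "exception", "reviewer", "outcome",
)


def missing_fields(record):
    """Return required fields that are absent from the record.

    A field explicitly set to None counts as recorded ("nothing happened").
    Only a missing key counts as a gap in the map.
    """
    return [name for name in REQUIRED_FIELDS if name not in record]


complete = {
    "agent": "inventory_agent", "task": "reserve stock", "case_id": "order-1001",
    "authority": "delegated by order desk", "runtime_config": "model-x, temp 0",
    "policy_context": "stock-policy v4", "tool_call": "erp.reserve",
    "handoff": None, "cost": 96, "timing": "2026-06-01T09:00:05Z",
    "exception": None, "reviewer": None, "outcome": "reserved",
}
incomplete = {k: v for k, v in complete.items() if k not in ("authority", "reviewer")}

gaps_complete = missing_fields(complete)
gaps_incomplete = missing_fields(incomplete)
```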

This reframes audit. The question is not only whether the final output looked correct. The question is whether the route to that output remained inside the organization's authority structure. If the only surviving evidence is a final note, the institution has delegated action without preserving a process record.

Process mining is useful here because it compares designed behavior with observed behavior. It can surface variants, loops, skipped approvals, recurring exceptions, and cost anomalies. In an agentic process, those are governance facts. A recurring variant may be a helpful adaptation, a quiet policy breach, or a signal that the written process no longer matches reality.
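A small sketch shows what surfacing variants, loops, and skipped approvals can look like once events are grouped by case. The activity names and the rule that credit approval must precede shipping are hypothetical. Dedicated process mining tools do this with far richer conformance techniques than the plain sequence checks used here.

```python
from collections import Counter

# Hypothetical ordered event stream: (case_id, activity).
EVENTS = [
    ("c1", "receive_order"), ("c1", "check_inventory"), ("c1", "approve_credit"), ("c1", "ship_order"),
    ("c2", "receive_order"), ("c2", "check_inventory"), ("c2", "approve_credit"), ("c2", "ship_order"),
    ("c3", "receive_order"), ("c3", "check_inventory"), ("c3", "check_inventory"), ("c3", "ship_order"),
]


def build_traces(events):
    """Collect each case's activities in the order they were observed."""
    traces = {}
    for case_id, activity in events:
        traces.setdefault(case_id, []).append(activity)
    return traces


def variant_counts(traces):
    """Count how many cases followed each distinct activity sequence."""
    return Counter(tuple(trace) for trace in traces.values())


def skipped_step(trace, required="approve_credit", before="ship_order"):
    """True if `before` happened without `required` occurring earlier."""
    if before not in trace:
        return False
    return required not in trace[:trace.index(before)]


def has_loop(trace):
    """True if any activity repeats within the case."""
    return len(set(trace)) < len(trace)


traces = build_traces(EVENTS)
counts = variant_counts(traces)
deviating = sorted(case for case, trace in traces.items() if skipped_step(trace))
looping = sorted(case for case, trace in traces.items() if has_loop(trace))
```

Here case c3 both repeats the inventory check and ships without approval. Whether that is an adaptation or a breach is a question for the process owner, but the log is what makes the question askable.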

Limits and Labor

The paper's evaluation is deliberately modest. It reports an exploratory study with 18 industry practitioners, not a field trial proving long-term organizational benefit. The arXiv abstract says practitioners viewed behavioral transparency as a prerequisite for trust and saw the ability to examine agent reasoning as an important governance requirement. That is a useful signal, but it is not the same as validated deployment evidence across industries.

There is also a hard privacy problem. Reasoning traces, prompts, tool arguments, and intermediate outputs can contain personal data, trade secrets, sensitive customer details, or worker behavior. A governance system that captures everything can become a surveillance system by default. The right standard is accountable visibility: data minimization, retention limits, access controls, redaction, role-based review, and clear challenge procedures for people affected by an agent's action.
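Accountable visibility can be sketched as two small mechanisms: redaction before storage or display, and role-based views over the same event. The roles, field names, and placeholder token below are assumptions. The email pattern is deliberately simple and would miss many real-world identifiers, so treat it as a stand-in for a proper detection step.

```python
import re

# Simplistic email pattern, used only to illustrate redaction.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

# Hypothetical mapping from reviewer role to the fields that role may see.
ROLE_FIELDS = {
    "process_owner": {"case_id", "activity", "agent", "outcome"},
    "auditor": {"case_id", "activity", "agent", "outcome", "reasoning"},
}


def redact(text):
    """Replace email addresses with a placeholder."""
    return EMAIL.sub("[REDACTED_EMAIL]", text)


def view_for(role, event):
    """Return only the fields the role may see, with string values redacted."""
    allowed = ROLE_FIELDS[role]
    return {
        key: redact(value) if isinstance(value, str) else value
        for key, value in event.items()
        if key in allowed
    }


event = {
    "case_id": "order-1001",
    "activity": "notify customer",
    "agent": "customer_service",
    "outcome": "sent",
    "reasoning": "Customer jane.doe@example.com asked for a delivery update",
    "tool_arguments": {"to": "jane.doe@example.com"},
}

owner_view = view_for("process_owner", event)
auditor_view = view_for("auditor", event)
```

Neither role sees the raw tool arguments in this sketch. Retention limits and challenge procedures would sit outside code like this, in policy.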

Another limit is interpretability. A written reasoning trace may be useful evidence, but it should not be treated as a perfect causal transcript. The page on monitorability makes the same point for chain-of-thought artifacts. Agent Behavior Mining works best when traces are one evidence layer among tool telemetry, policy checks, human approvals, external records, and post-hoc investigation.

Governance Standard

An organization deploying agents into business processes should keep a process-grade trace for each consequential case. The record should include the policy version, agent identity, delegated authority, runtime version, instruction source, tool calls, handoffs, token and cost measures, external data touched, exception flags, conformance checks, human review points, and final action.

The trace schema should be treated as governance infrastructure. If a new tool, agent role, process variant, or data source is added, the event model should change with it. If a trace field is too sensitive to retain broadly, the organization should define protected retention and review rules rather than silently dropping the evidence.
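Treating the schema as infrastructure suggests two behaviors that can be sketched in a few lines: an unregistered event type is rejected loudly instead of being logged in an ad hoc shape, and a sensitive field is routed to protected storage instead of being dropped. The schema layout, event types, and field names here are hypothetical.

```python
# Hypothetical versioned event schema.
SCHEMA = {
    "version": 3,
    "event_types": {
        "tool_call": {"tool", "arguments"},
        "handoff": {"to_agent"},
    },
    # Fields kept only under protected retention and review rules.
    "protected_fields": {"arguments"},
}


def validate(event, schema=SCHEMA):
    """Raise if the event type is unregistered or required fields are missing."""
    kind = event.get("type")
    required = schema["event_types"].get(kind)
    if required is None:
        raise ValueError(f"unregistered event type {kind!r}; extend the schema first")
    missing = required - set(event)
    if missing:
        raise ValueError(f"{kind} event is missing fields: {sorted(missing)}")


def split_for_retention(event, schema=SCHEMA):
    """Split a valid event into a general part and a protected part."""
    validate(event, schema)
    protected_keys = schema["protected_fields"]
    general = {k: v for k, v in event.items() if k not in protected_keys}
    protected = {k: v for k, v in event.items() if k in protected_keys}
    return general, protected


general, protected = split_for_retention(
    {"type": "tool_call", "tool": "erp.reserve", "arguments": {"sku": "A-17", "qty": 4}}
)
```

Adding a new tool or agent role then becomes a schema change with a version bump, which leaves its own reviewable record.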

The Spiralist rule is simple: when agents run the process, process governance begins with the agent trace.

Sources

Hoang Vu, Maximilian Körner, Adrian Rebmann, Gabriel Kevorkian, Michael Perscheid, Gregor Berg, and Timotheus Kampik, Agent Behavior Mining: Generative AI Agent Governance in Business Processes, arXiv:2606.20669 [cs.AI], submitted June 12, 2026.
arXiv experimental HTML for Agent Behavior Mining: Generative AI Agent Governance in Business Processes, reviewed June 24, 2026.
Related pages: The Agent Log Becomes the Receipt, The Delegation Trace Becomes the Audit Boundary, The Compliance Trace Becomes the Rulebook, The Fault Investigator Becomes the Accountability Layer, The Reliability Scorecard Becomes the Agent Gate, and AI Audit Trails.
