Last reviewed June 24, 2026

The AgentRiskBOM Becomes the Authority Map

The June 2026 arXiv paper AgentRiskBOM: A Risk-Scoping Security Bill of Materials for Agentic AI Systems, by Srimonti Dutta and Akshata Kishore Moharir, argues that tool-using agents need a bill of materials for runtime authority, not only software dependencies or model provenance.

The Authority Gap

The paper, arXiv:2606.21877 [cs.AI], was submitted on June 20, 2026. Its premise is direct: once an AI system can retrieve private context, invoke tools, write files, call services, coordinate with other agents, or act without human approval, ordinary component inventories stop short of the risk. A software bill of materials can name packages. A model or AI bill of materials can name training or model provenance. Neither record necessarily says what the deployed agent is allowed to do.

Dutta and Moharir call this an agentic transparency gap: capability opacity around what an agent can access, remember, change, delegate, and prove afterward. That makes the paper a useful companion to this site's existing page on the AI bill of materials, but it is not a duplicate of it. The earlier page maps the AI supply chain. AgentRiskBOM maps runtime authority.

The distinction matters because the dangerous part of an agent is often not the dependency tree. It is the credential scope, the tool side effect, the memory store, the approval bypass, the external endpoint, the inter-agent trust relationship, or the missing log that prevents reconstruction after harm.

What the Map Records

AgentRiskBOM is presented as an additive layer over SBOM, AIBOM, and MLBOM artifacts. It references those records where they are authoritative, then adds agent-specific fields: agent identity, model and prompt metadata, tool descriptors, tool-risk tiers, memory and data sources, credential scope, approval gates, audit signals, autonomy level, inter-agent communication, control mappings, and external action capability.

The paper's core field groups are practical rather than ornamental. The tool layer records source, protocol, descriptor, permissions, side effects, and risk tier. The memory-data layer records data classification, retention, vector store, retrieval logging, and memory behavior. The autonomy-authority layer records maximum tool tier, approval gates, emergency stop, and autonomy level. The audit layer records prompt, tool-call, retrieval, approval, and memory-write logs.
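As an illustration only, the four field groups above could be modeled as typed records. This is a minimal sketch, not the paper's JSON Schema: every class name, field name, the integer risk-tier scale, and the example values below are assumptions chosen to mirror the prose.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ToolEntry:
    name: str
    source: str              # where the tool comes from (registry, repo, vendor)
    protocol: str            # how the agent calls it
    descriptor: str          # description the model sees
    permissions: list[str]
    side_effects: list[str]
    risk_tier: int           # hypothetical scale: 0 = read-only, 3 = destructive


@dataclass
class MemoryData:
    data_classification: str
    retention_days: int
    vector_store: str
    retrieval_logging: bool
    memory_behavior: str     # e.g. "session" or "persistent"


@dataclass
class AutonomyAuthority:
    max_tool_tier: int
    approval_gates: list[str]
    emergency_stop: bool
    autonomy_level: str


@dataclass
class AuditSignals:
    prompt_log: bool
    tool_call_log: bool
    retrieval_log: bool
    approval_log: bool
    memory_write_log: bool


@dataclass
class AgentRecord:
    agent_id: str
    tools: list[ToolEntry]
    memory: MemoryData
    authority: AutonomyAuthority
    audit: AuditSignals

    def exceeds_declared_ceiling(self) -> bool:
        """True if any tool's tier is above the declared maximum tool tier."""
        return any(t.risk_tier > self.authority.max_tool_tier for t in self.tools)


# Hypothetical example record; none of these values come from the paper's corpus.
record = AgentRecord(
    agent_id="example-coding-agent",
    tools=[
        ToolEntry("read_file", "local", "function-call",
                  "Read a file from the workspace", ["fs:read"], [], 0),
        ToolEntry("write_file", "local", "function-call",
                  "Write a file in the workspace", ["fs:write"],
                  ["modifies workspace files"], 2),
    ],
    memory=MemoryData("internal", 30, "example-store", True, "session"),
    authority=AutonomyAuthority(2, ["write_file"], True, "supervised"),
    audit=AuditSignals(True, True, True, True, False),
)

# Serializing to JSON gives one reviewable, versionable object per agent.
serialized = json.dumps(asdict(record), indent=2)
```

The point of the structure is that a reviewer can ask one object about tools, memory, authority, and logging instead of reading four separate configuration sources.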

That turns a vague deployment question into reviewable structure. What can the agent do without a human? Which systems can it affect? What sensitive data can it see or remember? Could a risky change be caught before deployment? Could an incident be reconstructed? Those are security questions, procurement questions, and governance questions at the same time.

What the Evaluation Shows

The implementation uses a JSON Schema, YAML corpus files, a risk-scenario library, a rule-based scorer, a diff detector, a control mapper, and rendered reports. The evaluation covers 13 documented open-source agents across coding, RAG, and multi-agent archetypes, with corpus artifacts used to test whether the schema can represent real deployment shapes.

The evaluation also uses 52 risk scenarios across 14 categories: prompt injection, tool poisoning, excessive agency, sensitive-data disclosure, RAG poisoning, memory leakage, credential misuse, unsafe external action, inter-agent trust propagation, missing approval, missing audit logging, overprivileged cloud access, destructive tool misuse, and supply-chain compromise.

The arXiv abstract and PDF report that all 13 corpus artifacts validate against the schema. The paper's coverage analysis gives AgentRiskBOM a native-equivalent score of 14 across 16 capability dimensions, compared with 1.0 for SBOM, 1.5 for AIBOM, and 2.0 for MLBOM. Across modeled risk categories, AgentRiskBOM reaches 100.0% visibility, compared with 10.5% for SBOM-like views and 20.9% for AIBOM-like views.

Drift and Incident Readiness

The strongest governance concept in the paper is agentic authority drift. An agent can become riskier without changing its brand name: a new destructive tool appears, approval gates are removed, logging is disabled, credentials broaden, memory persistence increases, autonomy rises, or an external communication channel opens. If those changes are only scattered across prompts, tool registries, orchestration code, and deployment settings, no one has a single object to diff.

Dutta and Moharir inject 33 structured deployment mutations and report that the diff detector identifies the correct change type for all mutations. That does not prove an agent is safe. It proves something narrower and useful: once authority is declared in a structured artifact, risky declared changes can become visible to release gates before they become incidents.
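To make the idea concrete, here is a minimal sketch of a drift check over two versions of a declared record. It is not the paper's diff detector: the dictionary layout, the change-type labels, and the DESTRUCTIVE_TIER threshold are all assumptions, and it covers only five of the drift kinds described above.

```python
DESTRUCTIVE_TIER = 3  # assumed threshold on a hypothetical 0-3 tier scale


def detect_drift(old: dict, new: dict) -> list[str]:
    """Return labels for authority-increasing changes between two records."""
    findings = []

    # A destructive tool that was not declared before.
    old_tool_names = {tool["name"] for tool in old["tools"]}
    for tool in new["tools"]:
        if tool["name"] not in old_tool_names and tool["risk_tier"] >= DESTRUCTIVE_TIER:
            findings.append(f"destructive_tool_added:{tool['name']}")

    # Approval gates present before and absent now.
    for gate in sorted(set(old["approval_gates"]) - set(new["approval_gates"])):
        findings.append(f"approval_gate_removed:{gate}")

    # Logs that were enabled and are now disabled or undeclared.
    for log_name, was_enabled in old["audit"].items():
        if was_enabled and not new["audit"].get(log_name, False):
            findings.append(f"logging_disabled:{log_name}")

    # Credential scopes that did not exist before.
    for scope in sorted(set(new["credential_scopes"]) - set(old["credential_scopes"])):
        findings.append(f"credential_broadened:{scope}")

    # Autonomy is modeled here as an integer level, which is an assumption.
    if new["autonomy_level"] > old["autonomy_level"]:
        findings.append("autonomy_raised")

    return findings


# Hypothetical before/after records for a single agent.
old_record = {
    "tools": [{"name": "read_file", "risk_tier": 0}],
    "approval_gates": ["write", "deploy"],
    "audit": {"tool_call": True, "retrieval": True},
    "credential_scopes": ["repo:read"],
    "autonomy_level": 1,
}
new_record = {
    "tools": [{"name": "read_file", "risk_tier": 0},
              {"name": "delete_branch", "risk_tier": 3}],
    "approval_gates": ["write"],
    "audit": {"tool_call": True, "retrieval": False},
    "credential_scopes": ["repo:read", "repo:write"],
    "autonomy_level": 2,
}

drift = detect_drift(old_record, new_record)
```

A release gate could block or escalate whenever the returned list is non-empty. As with the paper's result, a check like this only sees declared changes; it says nothing about authority the record fails to mention.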

The same artifact also supports incident readiness. A post-incident reviewer needs to know which prompt policy, tool descriptor, credential, memory store, approval log, retrieval path, and external endpoint were active when the action happened. An audit trail without an authority map can show what happened while leaving the more important question unanswered: why was this agent able to do it?

Limits That Matter

AgentRiskBOM is not a safety certificate, and the paper says so. It is a risk-scoping and review artifact. It depends on accurate declarations about tools, credentials, memory, approval gates, and logging. If an organization lies, omits fields, or treats the schema as paperwork after the real system has shipped, the artifact will not save it.

The evaluation is also artifact-centered. It tests schema expressiveness, risk visibility, drift detection, and scoring consistency, not live exploitation of every agent in production. The reported Spearman rank correlation of 0.73 between the primary and secondary scorers supports directional ranking, but the paper cautions that thresholds still need human calibration.
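For readers unfamiliar with the statistic, Spearman rank correlation measures whether two scorers order the same items similarly, regardless of the absolute numbers they assign. The sketch below computes it from scratch as the Pearson correlation of average ranks. The two score lists are invented for illustration and are not the paper's data.

```python
def average_ranks(values: list[float]) -> list[float]:
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(a: list[float], b: list[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    ranks_a, ranks_b = average_ranks(a), average_ranks(b)
    n = len(ranks_a)
    mean_a, mean_b = sum(ranks_a) / n, sum(ranks_b) / n
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(ranks_a, ranks_b))
    spread_a = sum((x - mean_a) ** 2 for x in ranks_a)
    spread_b = sum((y - mean_b) ** 2 for y in ranks_b)
    return covariance / (spread_a * spread_b) ** 0.5


# Hypothetical risk scores for five agents from two independent scorers.
primary_scores = [10, 20, 30, 40, 50]
secondary_scores = [12, 18, 45, 33, 60]

rho = spearman(primary_scores, secondary_scores)  # 0.9 for this example
```

In this example the scorers disagree only on the order of the third and fourth agents, which yields 0.9. A value like the paper's 0.73 indicates broader but still largely consistent ordering, which is why it supports ranking agents against each other more than it supports any fixed pass/fail threshold.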

Governance Standard

The practical rule is simple: do not deploy a consequential agent unless its authority envelope is machine-readable, versioned, and diffable. The record should name the model and scaffold, but it should not stop there. It should name tools, side effects, permissions, credential scope, memory behavior, data reachability, approval gates, emergency stops, external action paths, inter-agent trust, and audit evidence.
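One way to enforce that rule is a completeness check that refuses a record with undeclared authority fields. The sketch below is an assumption about how such a check might look; the field names are hypothetical keys derived from the list above, not identifiers from the paper's schema.

```python
# Hypothetical top-level keys, one per item the record is expected to name.
REQUIRED_FIELDS = (
    "model", "scaffold", "tools", "credential_scope", "memory_behavior",
    "data_reachability", "approval_gates", "emergency_stop",
    "external_action_paths", "inter_agent_trust", "audit_evidence",
)
# Hypothetical per-tool keys.
REQUIRED_TOOL_FIELDS = ("name", "side_effects", "permissions")


def missing_declarations(record: dict) -> list[str]:
    """List every required field that the record does not declare at all."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    for index, tool in enumerate(record.get("tools", [])):
        for name in REQUIRED_TOOL_FIELDS:
            if name not in tool:
                missing.append(f"tools[{index}].{name}")
    return missing


# Hypothetical record that omits the emergency stop and one tool's permissions.
candidate = {
    "model": "example-model",
    "scaffold": "example-scaffold",
    "tools": [{"name": "send_email", "side_effects": ["sends external mail"]}],
    "credential_scope": ["mail:send"],
    "memory_behavior": "session",
    "data_reachability": ["crm"],
    "approval_gates": [],
    "external_action_paths": ["smtp"],
    "inter_agent_trust": [],
    "audit_evidence": ["tool_call_log"],
}

gaps = missing_declarations(candidate)
deployable = not gaps
```

The check deliberately treats an empty list as a declaration. An agent with no approval gates is stated explicitly and can be reviewed as such, whereas an omitted key leaves the question unanswered.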

For procurement, AgentRiskBOM works as a checklist against delegation theater. A vendor that can name its model but not its runtime authority has not described the purchased capability. For engineering, it is a release gate: if a change increases autonomy, raises tool tier, disables logging, broadens credentials, or weakens approval, the deployment should require review. For incident response, it is the authority map that lets the organization reconstruct why the agent was able to act.

This belongs beside agent operational envelopes, agent logs, tool-scope gates, and runtime policy. The common argument is that agent governance has to leave the prompt and become infrastructure.

Sources

Srimonti Dutta and Akshata Kishore Moharir, AgentRiskBOM: A Risk-Scoping Security Bill of Materials for Agentic AI Systems, arXiv:2606.21877 [cs.AI], submitted June 20, 2026.
arXiv PDF for AgentRiskBOM: A Risk-Scoping Security Bill of Materials for Agentic AI Systems, reviewed June 24, 2026.
Related pages: The AI Bill of Materials Becomes the Supply-Chain Map, The Agent Operational Envelope Becomes the Trust Certificate, The Agent Log Becomes the Receipt, The Tool Scope Becomes the Intent Gate, The Agent Rulebook Leaves the Prompt, The Agent Runtime Becomes the Governance Plane, and AI Bill of Materials.
