Traceability and Repair

Agent Audit and Incident Review

A protocol for making AI agent work reviewable after the run is over. Prompt hardening sets the instruction boundary. Tool permissioning sets the action boundary. Audit practice makes the boundary visible.

The institution should assume that any AI agent with tools will eventually do something surprising. The question is not whether surprise can be eliminated. The question is whether the surprise leaves enough evidence for correction, care, and accountability.

An agent run without a trace is rumor. An agent run with a trace can be reviewed.

The Rule

No consequential agent action is complete until it can be reconstructed.

The record must be good enough for a reviewer to answer what was asked, what the agent was permitted to do, what it actually did, which sources and records it touched, and who approved each consequential step.

If the answer depends on memory, vibes, or a copied chat fragment, the workflow is not mature enough for consequential use.

Minimum Run Record

Every agent-assisted workflow should create a run record.

Record:

Run ID: Unique identifier for the agent run
Date and operator: Who initiated the run
Workflow: Research, drafting, build, support, CRM, archive, media, finance
Agent identity: Model, app, account, service account, or vendor
Task brief: The original human request
Permission class: The class from the Agent Tool Permission Protocol
Allowed tools: Tools the agent was permitted to use
Sources touched: Public URLs, local docs, internal records, restricted records
Tool calls: Search, read, edit, send, deploy, database, shell, MCP, connector
Human gates: Approvals requested, approved, denied, or skipped
Outputs: Drafts, files, messages, summaries, tickets, commits, deployments
Exceptions: Errors, refusals, guardrail trips, suspected injection, drift
Reviewer: Person responsible for closing the run

For low-risk public drafting, this can be a small note in the work log. For agent runs with write, send, publish, payment, CRM, archive, or shell access, it should be a durable record.
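For durable records, the fields above can be kept as a structured object rather than free text. A minimal sketch in Python; the field names mirror the table, while the dataclass shape and JSON serialization are assumptions, not a mandated format:

```python
from dataclasses import dataclass, field, asdict
from datetime import date
import json
import uuid

@dataclass
class AgentRunRecord:
    """One reviewable record per agent run; fields mirror the table above."""
    run_id: str
    run_date: str
    operator: str
    workflow: str
    agent_identity: str
    task_brief: str
    permission_class: str
    allowed_tools: list = field(default_factory=list)
    sources_touched: list = field(default_factory=list)
    tool_calls: list = field(default_factory=list)
    human_gates: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    exceptions: list = field(default_factory=list)
    reviewer: str = ""

    def to_json(self) -> str:
        # Serialize for a durable, reviewer-readable log entry.
        return json.dumps(asdict(self), indent=2)

# Illustrative values only.
record = AgentRunRecord(
    run_id=str(uuid.uuid4()),
    run_date=date.today().isoformat(),
    operator="j.doe",
    workflow="research",
    agent_identity="vendor-model via service account",
    task_brief="Summarize three public reports",
    permission_class="read-only public",
)
```

A record created this way can be serialized at run close and attached to the work log, so the reviewer field is filled in before the run is considered complete.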

Trace Requirements

Where the platform supports traces, preserve the structure of the run rather than only the final answer.

Capture the sequence of tool calls in order, the sources each call touched, the approvals requested and their outcomes, and any exceptions raised along the way.

Do not capture more sensitive material than needed. Tracing can itself become a data store. If a trace includes testimony, private contact records, donor records, care-circle notes, minor material, credentials, or incident records, the trace inherits the classification of the most sensitive material it contains.

Default position: traces for public research may contain source text and tool summaries. Traces for restricted workflows should avoid raw content where a reference, hash, record ID, or redacted excerpt is enough.
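One way to keep raw restricted content out of a trace while preserving review value is to log a record reference and a content hash instead of the text itself. A minimal sketch; the record-ID scheme and field names are illustrative assumptions:

```python
import hashlib

def trace_reference(record_id: str, raw_content: str) -> dict:
    """Store a pointer and a content hash instead of the restricted text.

    A reviewer can confirm which record was read and that it was unchanged,
    without the trace becoming a second copy of the sensitive material.
    """
    digest = hashlib.sha256(raw_content.encode("utf-8")).hexdigest()
    return {
        "record_id": record_id,
        "sha256": digest,
        "length": len(raw_content),
        # Deliberately no raw content: the trace stays below the
        # classification of the record it points to.
    }

entry = trace_reference("rec-2024-0173", "restricted note text ...")
```

Because the hash is deterministic, a later reviewer with legitimate access to the source record can recompute it and verify the trace entry matches what the agent actually read.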

Redaction Standard

The audit trail should not become a second breach.

Redact or avoid storing testimony, private contact records, donor records, care-circle notes, material involving minors, credentials, and incident records.

When redaction changes the review value of the record, note that a fuller restricted record exists and identify who may access it.

Incident Triggers

Escalate an agent run to incident review when any exception surfaces: a guardrail trip, suspected prompt injection, drift outside the task brief, data touched beyond the permission class, or a tool action taken without a required human gate.

When in doubt, open a small incident note. A small note can be closed. An unlogged failure cannot be repaired.

Incident Review Form

Use this form for agent incidents.

Incident ID:
Date opened:
Reporter:
Workflow:
Agent/system:
Operator:
Permission class:
Tools involved:
Data involved:
External destination:
What happened:
Expected behavior:
Actual behavior:
Human gates present:
Human gates missed:
Immediate containment:
People affected:
Records preserved:
Root cause:
Policy change:
Prompt change:
Permission change:
Tool change:
Reviewer:
Date closed:

Do not turn incident review into blame theater. The purpose is to preserve evidence, contain harm, repair what can be repaired, and lower the chance of repeat failure.

Weekly Agent Review

Each week that agents are used, review a small sample.

Review the run record, the trace, and the final output for each sampled run.

Ask:

  1. Did the run stay inside the original task?
  2. Did every tool call match the allowlist?
  3. Was the permission class correct?
  4. Were approvals specific enough?
  5. Did the trace omit necessary evidence?
  6. Did the trace store unnecessary sensitive data?
  7. Did the output cite sources honestly?
  8. Did the agent claim authority it did not have?
  9. Did the operator rely on the agent past the verification boundary?
  10. What should change before the next run?

The review should produce changes to prompts, permissions, tool registers, or training notes. If review produces no changes for months, the review is probably too passive.
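Picking the weekly sample can be as simple as a seeded random draw over the week's run IDs. A sketch; the sample size of three and the seeding scheme are assumptions:

```python
import random

def weekly_sample(run_ids, k=3, seed=None):
    """Draw a small, reproducible sample of the week's runs for review.

    Passing a fixed seed (e.g. the ISO week number) makes the draw itself
    auditable: anyone can re-run it and confirm which runs were selected.
    """
    rng = random.Random(seed)
    ids = list(run_ids)
    return rng.sample(ids, min(k, len(ids)))

sample = weekly_sample(["run-01", "run-02", "run-03", "run-04", "run-05"], seed=2024)
```

A deterministic draw also prevents the quiet temptation to review only the runs that went well.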

Guardrail Feedback Loop

Audit is not an archive of embarrassment. It is a tuning loop.

For each material failure, decide whether the correction belongs in:

Do not fix a permission failure only with better wording. If an agent had a tool it should not have had, remove or narrow the tool. If an agent touched data it should not have touched, change access. If an agent repeatedly drifts when reading untrusted material, add blocking checks before the tool call, not only after the final answer.
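The "blocking check before the tool call" pattern can be sketched as a gate that refuses out-of-allowlist calls up front, rather than flagging them in review afterward. All names here are illustrative:

```python
class ToolNotPermitted(Exception):
    """Raised when an agent requests a tool outside the run's allowlist."""

def guarded_call(tool_name, args, allowlist, execute):
    """Enforce the allowlist *before* execution, not only in post-hoc audit.

    `execute` is whatever dispatcher actually runs the tool; the gate sits
    in front of it so a blocked call never reaches the tool at all.
    """
    if tool_name not in allowlist:
        # Refuse and surface the attempt; the refusal itself belongs
        # in the run record's exceptions field.
        raise ToolNotPermitted(f"tool {tool_name!r} is outside the run allowlist")
    return execute(tool_name, args)
```

A denied call raises instead of silently continuing, so the attempt is visible in the trace and can trigger incident review when warranted.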

Public Correction Rule

When agent-assisted work creates a public error, correct the public record.

Correction notes should state what was published, what was wrong, and what the corrected record now says.

Do not use “AI error” as a way to avoid responsibility. The institution published the work; the institution corrects it.

Retention

Suggested retention:

Low-risk public drafting run: 90 days
Public research trace used for publication: 1 year
Consequential action run: 3 years
Finance, donor, legal, or governance agent run: Match the governing record schedule
Incident review: Permanent or board-defined archival term
Restricted testimony trace: Avoid if possible; otherwise follow testimony consent and privacy policy

Retention must follow Privacy and Data Stewardship. Traces should not silently outlive the record class they contain.
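The schedule above can be expressed as a small lookup, with `None` marking classes governed by an external schedule that must never be auto-expired. The class keys are illustrative:

```python
from datetime import date, timedelta

# Default retention per record class, mirroring the table above.
# None means the class follows a governing schedule or policy and is
# never expired automatically here.
RETENTION_DAYS = {
    "low_risk_public_drafting": 90,
    "public_research_trace": 365,
    "consequential_action": 3 * 365,
    "finance_donor_legal_governance": None,
    "incident_review": None,
    "restricted_testimony_trace": None,  # avoid creating at all if possible
}

def is_expired(record_class: str, created: date, today: date) -> bool:
    """True when a record has outlived its default retention window."""
    days = RETENTION_DAYS.get(record_class)
    if days is None:
        # Governed elsewhere; never silently expired by this check.
        return False
    return (today - created) > timedelta(days=days)
```

Running a check like this on a schedule keeps traces from silently outliving the record class they contain.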

Spiralism Policy

Spiralism agents with tool access must leave a reviewable run record. Any agent that can publish, deploy, send, delete, modify records, change permissions, make purchases, contact outsiders, or access restricted material must have explicit trace, approval, and incident-review handling before use.

This protocol pairs with the Agent Tool Permission Protocol and with Privacy and Data Stewardship.

Sources Checked