Wiki · Concept · Last reviewed June 25, 2026

OpenInference

OpenInference is a semantic-convention and instrumentation project for AI application observability, built on OpenTelemetry so LLM calls, agent steps, tool invocations, retrieval operations, and related workloads can be represented as distributed traces.

Category: Concept
Published: June 25, 2026
Modified: June 25, 2026
Last reviewed: June 25, 2026
Tags: AI agents, observability, OpenInference, OpenTelemetry, tracing, audit trails

Definition

OpenInference is an OpenTelemetry-aligned vocabulary for AI application traces. Its specification defines how an AI run should be represented when the run includes model calls, tool calls, retrieval, embeddings, reranking, guardrails, prompt rendering, evaluation, or agent control flow. OpenTelemetry supplies the tracing model and transport compatibility; OpenInference adds AI-specific span kinds and attribute names.

The project is especially relevant to agentic systems because ordinary request logs rarely explain how a model moved from a user instruction to a tool action. A trace can connect a root request to child spans for model calls, retrieval steps, tool execution, evaluation, and final response, making the path inspectable after the fact.

Snapshot

Core artifact: semantic conventions for AI application traces, plus instrumentation libraries in the public Arize-ai/openinference repository.
Built on: OpenTelemetry tracing and OTLP-compatible trace export.
Span kinds: LLM, Agent, Chain, Tool, Retriever, Reranker, Embedding, Guardrail, Evaluator, and Prompt.
Not the same as: the OpenTelemetry GenAI semantic conventions, a vendor dashboard, an audit trail by itself, or proof that an AI system is safe.
Main governance value: gives incident reviewers and operators common names for model calls, tool boundaries, retrieved context, token use, and errors.

How It Works

An OpenInference trace is a tree of spans. The root span usually represents the whole agent turn, chatbot request, or pipeline invocation. Child spans represent operations inside that run: a language-model call, a vector-store retrieval, a tool execution, an embedding generation, or an evaluator checking an output.
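The tree shape can be sketched in plain Python. This is an illustration only: the Span class below is a simplified stand-in rather than the OpenTelemetry SDK type, the span names and ids are hypothetical, and the uppercase kind strings are assumed to match the values used for openinference.span.kind.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    """Simplified stand-in for a trace span (not the OpenTelemetry SDK class)."""
    span_id: str
    name: str
    kind: str  # assumed value of the openinference.span.kind attribute
    parent_id: Optional[str] = None
    children: List["Span"] = field(default_factory=list)


def build_tree(spans: List[Span]) -> Span:
    """Attach each span to its parent and return the single root span."""
    by_id: Dict[str, Span] = {s.span_id: s for s in spans}
    roots = []
    for s in spans:
        if s.parent_id is None:
            roots.append(s)
        else:
            by_id[s.parent_id].children.append(s)
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root span, found {len(roots)}")
    return roots[0]


def render(span: Span, depth: int = 0) -> List[str]:
    """Return one indented line per span, visiting the tree depth-first."""
    lines = [f"{'  ' * depth}{span.kind}: {span.name}"]
    for child in span.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical agent turn; every name and id here is illustrative.
spans = [
    Span("1", "handle_user_request", "AGENT"),
    Span("2", "search_knowledge_base", "RETRIEVER", parent_id="1"),
    Span("3", "embed_query", "EMBEDDING", parent_id="2"),
    Span("4", "chat_completion", "LLM", parent_id="1"),
    Span("5", "lookup_order", "TOOL", parent_id="1"),
    Span("6", "check_answer", "EVALUATOR", parent_id="1"),
]

tree_lines = render(build_tree(spans))
print("\n".join(tree_lines))
```

The printed outline places the embedding span under the retrieval step and the model, tool, and evaluator spans directly under the agent root, which is the parent-child structure a tracing backend would display.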

The key classifier is openinference.span.kind. It tells a tracing backend whether a span is an LLM operation, an agent reasoning block, a retrieval step, a tool call, or another AI-specific unit of work. OpenInference attributes then carry structured context such as input and output values, model names, invocation parameters, token counts, retrieved documents, tool arguments, tool results, embedding metadata, and prompt-template information.
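As a rough sketch of how the classifier and attributes might be used, the example below treats each span as a flat attribute dictionary, groups spans by openinference.span.kind, and sums token counts across LLM spans. The attribute keys follow the OpenInference semantic conventions as best understood here and should be checked against the current specification; the values, the model name, and both helper functions are invented for illustration.

```python
import json
from collections import defaultdict
from typing import Any, Dict, List

SPAN_KIND = "openinference.span.kind"

# Flat attribute dictionaries, roughly as exported spans might carry them.
# All values are made up; attribute keys should be verified against the spec.
exported_spans: List[Dict[str, Any]] = [
    {
        SPAN_KIND: "LLM",
        "llm.model_name": "example-model",
        "llm.invocation_parameters": json.dumps({"temperature": 0.2}),
        "input.value": "Where is my order?",
        "output.value": "Let me look that up.",
        "llm.token_count.prompt": 42,
        "llm.token_count.completion": 9,
    },
    {
        SPAN_KIND: "TOOL",
        "tool.name": "lookup_order",
        "input.value": json.dumps({"order_id": "A-100"}),
        "output.value": json.dumps({"status": "shipped"}),
    },
    {
        SPAN_KIND: "LLM",
        "llm.model_name": "example-model",
        "input.value": "Tool says: shipped",
        "output.value": "Your order has shipped.",
        "llm.token_count.prompt": 60,
        "llm.token_count.completion": 15,
    },
]


def group_by_kind(spans: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Bucket spans by their span-kind attribute; unlabeled spans go to UNKNOWN."""
    groups: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for span in spans:
        groups[span.get(SPAN_KIND, "UNKNOWN")].append(span)
    return dict(groups)


def total_tokens(spans: List[Dict[str, Any]]) -> int:
    """Sum prompt and completion token counts over LLM spans only."""
    return sum(
        span.get("llm.token_count.prompt", 0) + span.get("llm.token_count.completion", 0)
        for span in spans
        if span.get(SPAN_KIND) == "LLM"
    )


groups = group_by_kind(exported_spans)
tokens = total_tokens(exported_spans)
print({kind: len(items) for kind, items in groups.items()}, tokens)
```

Because the kind is a single well-known attribute, a backend or a review script can separate model calls from tool calls without parsing span names.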

This is useful because AI failures often occur between components. A bad answer may come from a weak retrieval query, stale source material, an unsafe tool result, a malformed tool argument, a model refusal, a guardrail rewrite, or a human approval gap. A flat log transcript blurs those boundaries. A structured trace keeps them separate enough to diagnose.

Governance and Safety

OpenInference can support AI Agent Observability and AI Audit Trails, but it should be treated as evidence plumbing rather than governance by itself. A trace helps answer what happened; it does not decide whether the system was authorized to act, whether the action was lawful, or whether the model's output was reliable.

Traces create a direct privacy problem because they can capture the content of an interaction. The OpenInference configuration page lists controls for hiding inputs, outputs, input messages, output messages, tool definitions, prompt text, input images, embedding vectors, and embedding text. When content is hidden, the specification uses a redaction placeholder so downstream consumers know the field was intentionally withheld. That design acknowledges the central tradeoff: observability must preserve enough evidence for review without turning prompts, files, tools, images, and embeddings into a surveillance archive.
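The placeholder idea can be illustrated with a small sketch. Everything here is an assumption made for illustration: the switch names, the prefix table, and the redact function are hypothetical, and the placeholder string is assumed rather than quoted from the specification. The real instrumentation libraries expose their own configuration and may drop some attributes instead of replacing them.

```python
from typing import Any, Dict, Iterable, Tuple

REDACTED = "__REDACTED__"  # assumed placeholder value; confirm against the spec

# Hypothetical switches mapped to the attribute-key prefixes they would cover.
HIDE_PREFIXES: Dict[str, Tuple[str, ...]] = {
    "hide_inputs": ("input.value",),
    "hide_outputs": ("output.value",),
    "hide_input_messages": ("llm.input_messages.",),
    "hide_output_messages": ("llm.output_messages.",),
}


def redact(attributes: Dict[str, Any], enabled: Iterable[str]) -> Dict[str, Any]:
    """Return a copy in which covered attributes hold the placeholder.

    Keys are kept so a reviewer can see that a field existed and was withheld.
    """
    prefixes = tuple(p for switch in enabled for p in HIDE_PREFIXES[switch])
    return {
        key: (REDACTED if key.startswith(prefixes) else value)
        for key, value in attributes.items()
    }


raw = {
    "openinference.span.kind": "LLM",
    "llm.model_name": "example-model",
    "input.value": "Where is my order?",
    "llm.input_messages.0.message.content": "Where is my order?",
    "output.value": "Your order shipped.",
}

safe = redact(raw, {"hide_inputs", "hide_input_messages"})
print(safe)
```

Keeping the key while replacing the value is the design point: a missing attribute is ambiguous, while a placeholder tells the reader the content was deliberately withheld.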

For serious deployments, OpenInference traces should be tied to retention classes, access controls, redaction policy, sampling rules, incident procedures, and system inventory records. Debug traces, security evidence, user-facing receipts, legal holds, and aggregate metrics should not all share one storage path.

Limits

OpenInference is not a safety certification. A complete-looking trace can still omit decisive context if instrumentation is partial, sampling drops the relevant spans, a tool runs outside the trace boundary, or sensitive fields are redacted without preserving useful references. The trace vocabulary also does not settle conflicts with the newer OpenTelemetry GenAI semantic-conventions work; teams need to document which convention, version, and mapping they use.
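One way to make such gaps visible is a completeness check over exported spans. The sketch below is a hypothetical review helper, not part of OpenInference: it flags spans whose parent is absent from the export (a sign of sampling loss or work outside the trace boundary) and span kinds that a team has decided a workflow must contain. The required-kind set is an example policy, not a recommendation from the specification.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# (span_id, parent_id, kind); a minimal record for illustration only.
SpanRecord = Tuple[str, Optional[str], str]


def find_gaps(spans: List[SpanRecord], required_kinds: Iterable[str]) -> Dict[str, List[str]]:
    """Report orphaned spans and required span kinds that never appear."""
    ids = {span_id for span_id, _, _ in spans}
    orphans = sorted(
        span_id
        for span_id, parent_id, _ in spans
        if parent_id is not None and parent_id not in ids
    )
    seen_kinds = {kind for _, _, kind in spans}
    missing = sorted(set(required_kinds) - seen_kinds)
    return {"orphan_spans": orphans, "missing_kinds": missing}


# Hypothetical export in which span "5" was dropped before reaching the backend.
exported = [
    ("1", None, "AGENT"),
    ("4", "1", "LLM"),
    ("7", "5", "TOOL"),
]

report = find_gaps(exported, required_kinds={"AGENT", "LLM", "TOOL", "GUARDRAIL"})
print(report)
```

A clean report does not prove the trace is complete, since an uninstrumented tool leaves no orphan behind; the check only catches gaps that leave structural evidence.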

The harder institutional risk is trace theater. A dashboard can display impressive spans while no one has authority to interrupt an unsafe run, roll back side effects, notify affected people, or change the deployment. The convention helps memory; governance still needs owners, thresholds, and consequences.

Source Discipline

Use OpenInference primary sources for claims about its span kinds, attributes, trace model, and privacy configuration. Use OpenTelemetry sources for claims about traces, spans, context propagation, and OTLP compatibility. Use product documentation only for vendor support claims. Do not cite an OpenInference trace as proof of model correctness or regulatory compliance; cite it as structured evidence about a specific run.

Spiralist Reading

OpenInference is grammar for delegated machine work. It marks the difference between the model's sentence, the retrieved source, the tool call, the guardrail check, and the action that left the system. For Spiralism, that matters because responsibility needs handles. A trace is not the truth, but it can stop a machine act from dissolving into fog.

Open Questions

Which OpenInference span kinds should be mandatory for high-stakes agent workflows?
How should systems map between OpenInference and OpenTelemetry GenAI conventions as both ecosystems evolve?
What fields should user-facing run receipts expose without leaking secrets, personal data, or internal instructions?
How should traces represent multi-agent delegation when one agent invokes another through a tool or protocol?

Sources

OpenInference, OpenInference Specification, reviewed June 25, 2026.
OpenInference, Semantic Conventions, reviewed June 25, 2026.
OpenInference, Traces, reviewed June 25, 2026.
OpenInference, Configuration and redaction controls, reviewed June 25, 2026.
Arize-ai, openinference repository, reviewed June 25, 2026.
OpenTelemetry, Traces, reviewed June 25, 2026.
