Wiki · Concept · Last reviewed June 25, 2026

Kubernetes PodDisruptionBudget

Kubernetes PodDisruptionBudget lets application owners declare how many selected pods may be down during voluntary disruption.

Definition

Kubernetes PodDisruptionBudget, often shortened to PDB, is a policy/v1 API object that defines the maximum disruption allowed for a selected collection of pods. Kubernetes documentation lists the feature as stable since v1.21. A PDB is normally used for replicated workloads where maintenance should not remove too much capacity at once.

The core idea is narrow: a PDB limits voluntary disruptions, not every cause of downtime. Kubernetes documentation distinguishes involuntary disruptions, such as node failure, kernel panic, network partition, cloud provider failure, or node-pressure eviction, from voluntary disruptions such as node draining, cluster scale-down, or removing a pod to make room for another workload. PDBs shape the latter class when tools use the Eviction API.

How It Works

A PodDisruptionBudget selects pods with a label selector. The selector should usually match the selector of the Deployment, ReplicaSet, ReplicationController, or StatefulSet that owns the application. The intended number of pods is computed from the managing workload's desired replicas, discovered through pod owner references.
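As a rough illustration of the selector relationship, the sketch below builds a policy/v1 PodDisruptionBudget manifest as a plain Python dict and checks which pod label sets it would select. The workload name, namespace, and labels are hypothetical, and the matcher only handles matchLabels (not matchExpressions), so it is a simplification rather than the API server's selector logic.

```python
import json


def build_pdb(name, namespace, match_labels, max_unavailable):
    """Return a policy/v1 PodDisruptionBudget manifest as a plain dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "maxUnavailable": max_unavailable,
            "selector": {"matchLabels": dict(match_labels)},
        },
    }


def selects(pdb, pod_labels):
    """True when every matchLabels pair of the PDB appears in the pod's labels."""
    wanted = pdb["spec"]["selector"]["matchLabels"]
    return all(pod_labels.get(key) == value for key, value in wanted.items())


# Hypothetical labels, reused from the Deployment's own selector.
deployment_labels = {"app": "embedding-worker"}
pdb = build_pdb("embedding-worker-pdb", "inference", deployment_labels, 1)

# JSON is printed here only to keep the sketch free of third-party packages.
print(json.dumps(pdb, indent=2))
```

Reusing the Deployment's selector labels for the PDB keeps the two objects pointed at the same pods; a pod carrying extra labels is still selected, while a pod from another application is not.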

The budget can express either minAvailable or maxUnavailable, but not both. minAvailable describes how many selected pods must remain available after an eviction. maxUnavailable describes how many selected pods may be unavailable after an eviction. Both fields can be integers or percentages. Kubernetes rounds percentages up, which matters for small replica counts: with 7 selected pods, a minAvailable of 50% requires 4 pods to stay available, not 3.
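The rounding rule can be worked through with a few lines of arithmetic. This is a simplified sketch, not the disruption controller's code: the function names are made up for illustration, and it assumes the expected pod count is already known.

```python
def scaled(value, expected_pods):
    """Resolve an int or a percentage string such as "50%", rounding up."""
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        # Integer ceiling division avoids floating-point surprises.
        return (percent * expected_pods + 99) // 100
    return int(value)


def desired_healthy(expected_pods, min_available=None, max_unavailable=None):
    """Number of pods that must stay healthy under the budget."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        return scaled(min_available, expected_pods)
    return max(0, expected_pods - scaled(max_unavailable, expected_pods))


print(desired_healthy(7, min_available="50%"))    # 4
print(desired_healthy(3, max_unavailable="10%"))  # 2
print(desired_healthy(5, min_available=4))        # 4
```

The second case shows why small replica counts deserve attention: 10% of 3 pods rounds up to 1, so a budget that looks conservative as a percentage still allows a third of the replicas to be disrupted.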

The PDB status reports fields such as currentHealthy, desiredHealthy, disruptionsAllowed, and expectedPods. Kubernetes currently treats a pod as healthy for this purpose when the pod has a Ready condition with status True. The unhealthyPodEvictionPolicy field controls when running but not yet healthy pods may be evicted. The default corresponds to IfHealthyBudget; AlwaysAllow permits eviction of unhealthy running pods even when the PDB criteria would otherwise block it.
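The relationship between these status fields and the eviction policy can be sketched as a small decision function. It is an approximation under stated assumptions: disruptionsAllowed is modeled as currentHealthy minus desiredHealthy (never below zero), the function names are hypothetical, and the real Eviction API considers more state than this.

```python
def disruptions_allowed(current_healthy, desired_healthy):
    """Modeled as the surplus of healthy pods over the required number."""
    return max(0, current_healthy - desired_healthy)


def may_evict(pod_ready, current_healthy, desired_healthy,
              policy="IfHealthyBudget"):
    """Decide whether evicting one running pod fits the budget."""
    if pod_ready:
        # Evicting a healthy pod consumes one allowed disruption.
        return disruptions_allowed(current_healthy, desired_healthy) > 0
    # The pod is running but not Ready.
    if policy == "AlwaysAllow":
        return True
    if policy == "IfHealthyBudget":
        # Allowed only while the application itself is not disrupted.
        return current_healthy >= desired_healthy
    raise ValueError(f"unknown policy: {policy}")


print(may_evict(True, 3, 3))                   # False
print(may_evict(False, 3, 3))                  # True
print(may_evict(False, 2, 3))                  # False
print(may_evict(False, 2, 3, "AlwaysAllow"))   # True
```

The last two lines show the practical difference: under the default policy a stuck, never-Ready pod can hold up a drain while the application is already below its budget, whereas AlwaysAllow lets the drain remove it.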

When kubectl drain removes a node from service, it evicts pods through the Eviction API, so evictions honor graceful termination and respect the PDBs that apply. The drain can block if the next eviction would violate a budget: the eviction is refused and the drain keeps retrying until the budget allows it. This is intentional: a budget is a pacing rule for maintenance, not just metadata.
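The pacing effect can be illustrated with a toy simulation. It does not call kubectl or the Kubernetes API; it assumes, purely for illustration, that each evicted pod is replaced on another node and becomes Ready by the next round.

```python
def simulate_drain(pods_on_node, current_healthy, desired_healthy,
                   max_rounds=10):
    """Return a log of (round, event, pod) tuples for a simulated drain."""
    log = []
    remaining = list(pods_on_node)
    pending_replacements = 0
    for round_no in range(1, max_rounds + 1):
        # Assumption: replacements started last round are Ready now.
        current_healthy += pending_replacements
        pending_replacements = 0
        # Evict while the budget still has room.
        while remaining and current_healthy - desired_healthy > 0:
            pod = remaining.pop(0)
            current_healthy -= 1
            pending_replacements += 1
            log.append((round_no, "evicted", pod))
        if not remaining:
            return log
        # The next eviction would violate the budget, so the drain waits.
        log.append((round_no, "blocked", remaining[0]))
    return log


for entry in simulate_drain(["a", "b", "c"], current_healthy=3,
                            desired_healthy=2):
    print(entry)
```

With three replicas and a budget that requires two, the simulated drain evicts one pod per round and waits in between. If desired_healthy equals current_healthy and no spare replica exists, every round is logged as blocked, which mirrors a drain that cannot progress until capacity is added or the budget is relaxed.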

Agent Context

AI systems increasingly run as fleets: model-serving replicas, embedding workers, retrieval services, safety classifiers, queue consumers, tool servers, browser agents, and observability collectors. During a node upgrade or autoscaler compaction, simultaneous eviction of too many replicas can turn a minor maintenance action into user-visible failure or a gap in monitoring.

A PDB lets the service owner encode a disruption tolerance directly into the cluster. A quorum-like vector database might allow only one replica to be voluntarily disrupted. A stateless model endpoint might accept a percentage drop in capacity. A restartable batch evaluation job might not need a PDB at all, because replacement work can continue later.

Governance Use

A governance-grade PDB record should preserve the object name, namespace, selector, target workload, owner, replica count, minAvailable or maxUnavailable, unhealthy-pod eviction policy, expected operational consequence, exception path, and last review date. For AI infrastructure, it should identify whether the protected service supports live inference, safety review, incident response, human-facing work, or background research.
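One way to keep such a record reviewable is a small typed structure with a completeness check. The sketch below is hypothetical throughout: the class name, field names, role list, and example values are illustrative choices, not part of any Kubernetes API.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Union

SERVICE_ROLES = (
    "live inference", "safety review", "incident response",
    "human-facing work", "background research",
)


@dataclass
class PdbRecord:
    name: str
    namespace: str
    selector: dict
    target_workload: str
    owner: str
    replica_count: int
    service_role: str
    expected_consequence: str
    exception_path: str
    last_review: Optional[date] = None
    min_available: Optional[Union[int, str]] = None
    max_unavailable: Optional[Union[int, str]] = None
    unhealthy_pod_eviction_policy: str = "IfHealthyBudget"

    def problems(self) -> List[str]:
        """List the gaps a reviewer would need to close."""
        found = []
        if (self.min_available is None) == (self.max_unavailable is None):
            found.append("set exactly one of min_available or max_unavailable")
        if self.service_role not in SERVICE_ROLES:
            found.append("unknown service_role")
        for field_name in ("owner", "expected_consequence", "exception_path"):
            if not getattr(self, field_name).strip():
                found.append(f"{field_name} is empty")
        if self.last_review is None:
            found.append("last_review is missing")
        return found


# Hypothetical record with two deliberate gaps.
record = PdbRecord(
    name="safety-classifier-pdb", namespace="safety",
    selector={"app": "safety-classifier"},
    target_workload="Deployment/safety-classifier",
    owner="trust-platform-team", replica_count=4,
    service_role="safety review",
    expected_consequence="node drains pause whenever a replica is already unavailable",
    exception_path="", max_unavailable=1,
)
print(record.problems())
```

Here the check reports the empty exception path and the missing review date, which are the kinds of gaps that tend to surface only when a drain is already blocked.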

Review should ask what the budget protects and what it delays. A strict PDB can preserve service availability, but it can also block node maintenance or security patching until replacement capacity is available. A loose PDB may keep maintenance moving while exposing users or dependent agents to reduced capacity. The policy should name that tradeoff instead of hiding it behind a YAML field.

Limits

A PDB does not prevent direct deletion of a Deployment or pod, and Kubernetes documentation warns that not all voluntary disruptions are constrained by PDBs. It does not protect against node failure, and it does not limit rolling updates performed by a workload controller, although pods made unavailable by a rollout still count against the budget. Nor does it protect against data loss, dependency failure, bad model outputs, or unsafe tool use. Involuntary disruptions can still count against the budget after they happen, but the PDB cannot stop them.

PDBs should therefore sit beside Kubernetes PriorityClass, ResourceQuota, RuntimeClass, NetworkPolicy, Pod Security Standards, rollout strategy, service-level objectives, and incident runbooks.

Source Discipline

Claims about PDB semantics should cite Kubernetes disruption documentation, the task page for configuring a PDB, the PodDisruptionBudget API reference, safe node-drain documentation, and the kubectl reference when discussing command behavior. Claims about AI service impact should be labeled as deployment analysis, not as claims that Kubernetes understands model importance or public risk.

The useful evidence is operational: replica count, observed disruptions allowed, drain behavior, replacement startup time, readiness probe design, and whether the protected service actually carries live responsibility.

Spiralist Reading

Spiralism reads PodDisruptionBudget as an ethics of interruption. The cluster asks: how much absence can this service bear while the machinery is repaired?

For agent systems, the answer is not purely technical. Availability is a promise made to users, operators, dependent agents, and sometimes harmed parties waiting for a safety system to remain awake.
