PodChaos allows you to simulate pod and container failures to test your application’s resilience to pod lifecycle disruptions.
Supported Actions
PodChaos supports three types of actions:
- pod-kill: Kill selected pods
- pod-failure: Make pods unavailable for a specified duration
- container-kill: Kill specific containers within pods
Configuration
Basic Example
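The manifest that belongs under this heading did not survive on the page. The following is a minimal sketch of a PodChaos resource, assuming the standard Chaos Mesh API group chaos-mesh.org/v1alpha1. The resource name, both namespaces, and the app: my-app label are placeholders, not values from the original page.

```yaml
# Sketch only: name, namespaces, and labels are hypothetical placeholders.
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-kill-basic        # hypothetical experiment name
  namespace: chaos-mesh       # assumed namespace for the experiment object
spec:
  action: pod-kill            # one of: pod-kill, pod-failure, container-kill
  mode: one                   # select a single random pod from the matches
  selector:
    namespaces:
      - default               # assumed namespace of the target workload
    labelSelectors:
      app: my-app             # assumed label on the target pods
```

A manifest like this would typically be submitted with kubectl apply -f followed by the file name.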
Spec Fields
action: The chaos action to perform. Must be one of:
- pod-kill: Kill the entire pod
- pod-failure: Make the pod unavailable (requires duration)
- container-kill: Kill specific containers in the pod

selector: Specifies the target pods for the chaos experiment.

mode: Specifies how many pods to select. Options:
- one: Select one random pod
- all: Select all matching pods
- fixed: Select a fixed number of pods (requires value)
- fixed-percent: Select a percentage of pods (requires value)
- random-max-percent: Select a random percentage up to max (requires value)

value: Required when mode is fixed, fixed-percent, or random-max-percent. Specifies the number or percentage of pods to select.

duration: Duration of the chaos action. Required for the pod-failure action. Format: "300ms", "1.5h", "2h45m". Valid units: ns, us, ms, s, m, h.

gracePeriod: Grace period in seconds before deleting the pod (for the pod-kill action). Must be non-negative.

containerNames: List of container names to kill. Required for the container-kill action. If not set for container-kill, the first container will be selected.

remoteCluster: Name of the remote cluster where chaos will be deployed (for multi-cluster scenarios).
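To show how mode, value, and duration combine, here is a hedged sketch of a pod-failure experiment that targets half of the matching pods for 30 seconds. The name, namespaces, and label are placeholders. The value is quoted as a string, which is how Chaos Mesh manifests usually write it; check this against the CRD version you have installed.

```yaml
# Sketch only: metadata and selector values are hypothetical placeholders.
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-failure-half      # hypothetical experiment name
  namespace: chaos-mesh       # assumed namespace
spec:
  action: pod-failure
  mode: fixed-percent         # this mode requires value
  value: "50"                 # percentage of matching pods to affect
  duration: "30s"             # required for pod-failure per the field list above
  selector:
    namespaces:
      - default               # assumed namespace of the target workload
    labelSelectors:
      app: my-app             # assumed label on the target pods
```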
Examples
Pod Kill
Kill one random pod matching the label selector.
Pod Failure
Make one pod unavailable for 30 seconds.
Container Kill
Kill a specific container within a pod.
Use Cases
Testing Pod Recovery
Use pod-kill to verify that your application properly handles pod restarts and that Kubernetes recreates pods as expected.
Simulating Node Failures
Use pod-failure with longer durations to simulate scenarios where pods become unavailable, testing failover mechanisms and load balancing.
Container Crash Testing
Use container-kill to test multi-container pod configurations and verify that sidecar containers or init containers are properly managed.
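A container-kill experiment against a sidecar might look like the sketch below. The container name sidecar, the label, and the namespaces are assumptions for illustration. The container name must match a container that actually exists in the selected pods.

```yaml
# Sketch only: container name, labels, and namespaces are hypothetical.
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: container-kill-sidecar   # hypothetical experiment name
  namespace: chaos-mesh          # assumed namespace
spec:
  action: container-kill
  mode: one                      # kill the container in one random matching pod
  containerNames:
    - sidecar                    # assumed container name inside the target pod
  selector:
    namespaces:
      - default                  # assumed namespace of the target workload
    labelSelectors:
      app: my-app                # assumed label on the target pods
```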
Best Practices
- Start Small: Begin with mode: one to test the impact on a single pod before scaling up
- Use Duration Wisely: For pod-failure, set durations that match realistic outage scenarios
- Label Selectors: Use specific label selectors to avoid affecting unintended pods
- Grace Period: Set appropriate gracePeriod values for pod-kill to test graceful shutdown handling
- Monitor Impact: Always monitor your application metrics during chaos experiments to understand the impact
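The grace period practice above can be sketched as a pod-kill experiment that gives the pod 30 seconds to shut down. The 30-second figure and all names are illustrative assumptions. Choose a value that reflects the termination behaviour you want to exercise.

```yaml
# Sketch only: the grace period and all names are hypothetical placeholders.
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-kill-graceful     # hypothetical experiment name
  namespace: chaos-mesh       # assumed namespace
spec:
  action: pod-kill
  mode: one                   # start small: a single pod
  gracePeriod: 30             # seconds before the pod is deleted; must be non-negative
  selector:
    namespaces:
      - default               # assumed namespace of the target workload
    labelSelectors:
      app: my-app             # narrow label to avoid unintended pods
```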
Notes
- pod-kill and container-kill are one-shot actions (they execute once and complete)
- pod-failure is a continuous action that lasts for the specified duration
- When using container-kill, ensure the container name exists in the target pods
- The selector field uses the same selection mechanism as Kubernetes label selectors
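As an illustration of the selector note, the fragment below sketches a spec.selector block that combines an exact-match label with a set-based expression. It assumes the labelSelectors and expressionSelectors fields of the Chaos Mesh selector schema. The keys and values are placeholders, and the field names should be checked against the installed CRD version.

```yaml
# Fragment of spec.selector only; keys and values are hypothetical.
selector:
  namespaces:
    - default                 # assumed namespace of the target workload
  labelSelectors:
    app: my-app               # exact-match label, like a Kubernetes matchLabels entry
  expressionSelectors:
    - key: tier               # set-based rule, like a Kubernetes matchExpressions entry
      operator: In
      values:
        - cache
```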