Chaos Mesh supports 14 different chaos types for comprehensive fault injection across Kubernetes infrastructure and applications.

Overview

Each chaos type is implemented as a Kubernetes Custom Resource Definition (CRD) with its own controller and injection mechanisms.
  • CRD Definitions: api/v1alpha1/
  • Controller Implementations: controllers/chaosimpl/
  • Daemon Implementations: pkg/chaosdaemon/

Pod-Level Chaos

PodChaos

Inject faults into pod lifecycle operations.

CRD: podchaos_types.go
Actions: pod-kill, pod-failure, container-kill
Actions:

pod-kill

Kills entire pods by deleting them from Kubernetes.
Use Cases:
  • Test pod restart and recovery
  • Validate StatefulSet resilience
  • Test readiness probes
Parameters:
  • gracePeriod: Seconds before forced deletion (default: 0)
Example:
action: pod-kill
gracePeriod: 30
Source: api/v1alpha1/podchaos_types.go:43-54
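The fragment above shows only the action-specific fields. As a rough sketch, a complete manifest might wrap it with the usual Kubernetes envelope and the common mode and selector fields. The resource name, the namespaces, and the app: demo label are assumptions chosen for illustration.

```yaml
# Sketch of a complete PodChaos manifest (names and labels are hypothetical).
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-kill-example      # hypothetical experiment name
  namespace: chaos-mesh       # assumed namespace for the experiment object
spec:
  action: pod-kill
  mode: one                   # pick a single random pod among the matches
  gracePeriod: 30             # seconds before forced deletion
  selector:
    namespaces:
      - default
    labelSelectors:
      app: demo               # hypothetical label on the target pods
```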

Network Chaos

NetworkChaos

Inject network-level faults including latency, packet loss, and partitions.

CRD: networkchaos_types.go
Actions: netem, delay, loss, duplicate, corrupt, partition, bandwidth
Actions:

delay

Add latency to network packets.
Parameters:
  • latency: Delay duration (e.g., “100ms”)
  • jitter: Variation in delay (e.g., “10ms”)
  • correlation: Correlation percentage (0-100)
Example:
action: delay
delay:
  latency: "100ms"
  jitter: "10ms"
  correlation: "50"
Implementation: pkg/chaosdaemon/tc_server.go, pkg/chaosdaemon/netem/

loss

Drop network packets randomly.
Parameters:
  • loss: Percentage of packets to drop (0-100)
  • correlation: Correlation percentage
Example:
action: loss
loss:
  loss: "25"
  correlation: "25"

duplicate

Duplicate network packets.
Parameters:
  • duplicate: Percentage of packets to duplicate
  • correlation: Correlation percentage
Example:
action: duplicate
duplicate:
  duplicate: "10"
  correlation: "25"

corrupt

Corrupt packet data.
Parameters:
  • corrupt: Percentage of packets to corrupt
  • correlation: Correlation percentage
Example:
action: corrupt
corrupt:
  corrupt: "5"

partition

Block network traffic between pods.
Parameters:
  • direction: to, from, or both
  • target: Target pod selector
Example:
action: partition
direction: both
target:
  selector:
    namespaces: ["default"]
    labelSelectors:
      app: "target-app"

bandwidth

Limit network bandwidth.
Parameters:
  • rate: Bandwidth limit (e.g., “1mbps”)
  • limit: Queue size in bytes
  • buffer: Token bucket buffer size
Example:
action: bandwidth
bandwidth:
  rate: "1mbps"
  limit: 20000
  buffer: 10000

netem

Combine multiple network chaos effects.
Parameters: Merge of delay, loss, duplicate, corrupt specs
Example:
action: netem
delay:
  latency: "100ms"
loss:
  loss: "10"
Source: api/v1alpha1/networkchaos_types.go:48-73
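A partition involves two sets of pods, so it may help to see both sides in one manifest. The sketch below assumes the top-level selector picks the source pods and target picks the peers. The mode entry inside target, the resource name, and the source-app label are assumptions for illustration and should be checked against the CRD.

```yaml
# Sketch: partition traffic between two groups of pods (labels are hypothetical).
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: partition-example       # hypothetical experiment name
spec:
  action: partition
  mode: all                     # affect every pod matched by the selector below
  selector:                     # source side of the partition
    namespaces: ["default"]
    labelSelectors:
      app: "source-app"         # hypothetical label
  direction: both               # block traffic in both directions
  target:                       # peer side of the partition
    mode: all                   # assumed: target takes its own mode
    selector:
      namespaces: ["default"]
      labelSelectors:
        app: "target-app"
  duration: "30s"
```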

I/O Chaos

IOChaos

Inject file system I/O faults.

CRD: iochaos_types.go
Actions: latency, fault, attrOverride, mistake
Actions:

latency

Add delay to I/O operations.
Parameters:
  • delay: Duration (e.g., “100ms”)
  • path: File path pattern
  • methods: I/O methods to affect
  • percent: Percentage of operations (0-100)
action: latency
delay: "100ms"
path: "/var/data/*"
percent: 50

fault

Return errors on I/O operations.
Parameters:
  • errno: Error code (e.g., 5 for EIO)
  • path: File path pattern
  • methods: I/O methods to affect
action: fault
errno: 5
path: "/data/file.txt"

attrOverride

Override file attributes.
Parameters:
  • attr: Attribute overrides
  • path: File path pattern
action: attrOverride
attr:
  perm: 0000

mistake

Inject incorrect data into I/O.
Parameters:
  • mistake: Mistake specification
  • path: File path pattern
action: mistake
path: "/data/*"
Implementation: FUSE-based interception (pkg/chaosdaemon/iochaos_server.go)
Source: api/v1alpha1/iochaos_types.go:50-51
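Because the interception is FUSE-based, a full experiment presumably also has to say which mounted volume to intercept. The sketch below places the latency fragment in a complete manifest. The volumePath field, the resource name, and the label are assumptions not stated on this page, so verify them against iochaos_types.go before use.

```yaml
# Sketch of a complete IOChaos latency experiment (volumePath is an assumed field).
apiVersion: chaos-mesh.org/v1alpha1
kind: IOChaos
metadata:
  name: io-latency-example    # hypothetical experiment name
spec:
  action: latency
  mode: one
  selector:
    labelSelectors:
      app: demo               # hypothetical label
  volumePath: /var/data       # assumed: mount point of the volume to intercept
  path: "/var/data/*"         # file path pattern inside that volume
  delay: "100ms"
  percent: 50                 # affect half of the matching operations
  duration: "5m"
```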

Stress Chaos

StressChaos

Generate CPU or memory stress on pods.

CRD: stresschaos_types.go
Stressors: CPU, Memory
Stressor Types:

CPU

Stress CPU cores.
Parameters:
  • workers: Number of CPU workers
  • load: Load percentage per worker (0-100)
  • options: Additional stress-ng options
Example:
stressors:
  cpu:
    workers: 2
    load: 50
Source: api/v1alpha1/stresschaos_types.go:183-196
Implementation: Uses stress-ng (pkg/chaosdaemon/stress_server_linux.go)
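Only the CPU stressor is shown above. As a hedged sketch, a manifest combining both stressor kinds might look like the following. The memory block with its workers and size fields is an assumption based on the stressor list, not something this page documents, and the name and label are hypothetical.

```yaml
# Sketch: CPU and memory stress in one experiment (memory fields are assumed).
apiVersion: chaos-mesh.org/v1alpha1
kind: StressChaos
metadata:
  name: stress-example        # hypothetical experiment name
spec:
  mode: one
  selector:
    labelSelectors:
      app: demo               # hypothetical label
  stressors:
    cpu:
      workers: 2              # two CPU workers
      load: 50                # 50% load per worker
    memory:
      workers: 1              # assumed field
      size: "256MB"           # assumed field: memory consumed per worker
  duration: "1m"
```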

Time Chaos

TimeChaos

Simulate clock skew by offsetting system time.

CRD: timechaos_types.go
Mechanism: vDSO clock interception
Parameters:
  • timeOffset: Offset duration (e.g., “-1h”, “30m”, “5s”)
  • clockIds: Clock IDs to affect (CLOCK_REALTIME, CLOCK_MONOTONIC, etc.)
Example:
timeOffset: "-1h"
clockIds:
  - CLOCK_REALTIME
  - CLOCK_MONOTONIC
Use Cases:
  • Test time-sensitive logic
  • Validate timeout handling
  • Test distributed system clock drift
  • Certificate expiration testing
Implementation: Syscall interception (pkg/chaosdaemon/time_server_linux.go)
Source: api/v1alpha1/timechaos_types.go:48-56

HTTP Chaos

HTTPChaos

Manipulate HTTP requests and responses.

CRD: httpchaos_types.go
Target: Request or Response
Actions:
  • Abort: Abort the HTTP session (interrupt the connection)
  • Delay: Add latency
  • Replace: Modify request/response body
  • Patch: Modify headers
Matching Criteria:
  • port: Target port
  • path: URI path pattern
  • method: HTTP method (GET, POST, etc.)
  • code: Response status code
  • requestHeaders: Request header matchers
  • responseHeaders: Response header matchers
Example:
target: Request
port: 8080
path: "/api/*"
method: GET
abort: true
Implementation: Transparent proxy (pkg/chaosdaemon/httpchaos_server.go)
Source: api/v1alpha1/httpchaos_types.go:40-86

DNS Chaos

DNSChaos

Inject DNS resolution errors.

CRD: dnschaos_type.go
Actions: error, random
Actions:

error

Return DNS resolution errors.
Example:
action: error
patterns:
  - "google.com"
  - "github.*"
Pattern Matching:
  • Exact match: "google.com"
  • Wildcard suffix: "github.*" (matches github.com, github.io)
  • Placeholder: "chaos-mes?.org" (matches chaos-mesh.org)
Implementation: Custom DNS server (pkg/chaosdaemon/dns_server.go)
Source: api/v1alpha1/dnschaos_type.go:26-34
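To show where the patterns list sits relative to the common fields, here is a minimal sketch of a full manifest. The resource name and namespace are assumptions for illustration.

```yaml
# Sketch: fail DNS lookups for matching domains from all pods in a namespace.
apiVersion: chaos-mesh.org/v1alpha1
kind: DNSChaos
metadata:
  name: dns-error-example     # hypothetical experiment name
spec:
  action: error
  mode: all
  patterns:
    - "google.com"            # exact match
    - "github.*"              # wildcard suffix
  selector:
    namespaces:
      - default               # assumed namespace of the affected pods
  duration: "5m"
```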

JVM Chaos

JVMChaos

Inject faults into JVM applications using Byteman.

CRD: jvmchaos_types.go
Actions: latency, return, exception, stress, gc, ruleData, mysql
Actions:

latency

Add delay to Java method invocations.
Parameters:
  • class: Java class name
  • method: Method name
  • latency: Delay in milliseconds
action: latency
class: "com.example.Service"
method: "getData"
latency: 1000

return

Override method return values.
Parameters:
  • class: Java class name
  • method: Method name
  • returnValue: Value to return
action: return
class: "com.example.Service"
method: "isEnabled"
returnValue: "false"

exception

Throw exceptions from methods.
Parameters:
  • class: Java class name
  • method: Method name
  • exception: Exception to throw
action: exception
class: "com.example.Service"
method: "process"
exception: "java.io.IOException"

stress

Generate CPU or memory stress within JVM.
Parameters:
  • cpuCount: Number of CPU cores to stress
  • memoryType: “stack” or “heap”
action: stress
cpuCount: 2
memoryType: "heap"

gc

Trigger garbage collection.
action: gc

ruleData

Execute custom Byteman rule.
Parameters:
  • ruleData: Raw Byteman rule
action: ruleData
ruleData: |
  RULE custom rule
  CLASS com.example.Service
  METHOD process
  AT ENTRY
  IF true
  DO traceln("Method called")
  ENDRULE

mysql

Inject faults into MySQL JDBC operations.
Parameters:
  • mysqlConnectorVersion: “5” or “8”
  • database: Database name pattern
  • table: Table name pattern
  • sqlType: SQL type (select, insert, update, delete, replace)
  • exception: Exception to throw
  • latency: Delay in milliseconds
action: mysql
mysqlConnectorVersion: "8"
database: "test"
table: "users"
sqlType: "select"
latency: 1000
Implementation: Byteman agent (pkg/chaosdaemon/jvm_server.go)
Source: api/v1alpha1/jvmchaos_types.go:44-69

Kernel Chaos

KernelChaos

Inject faults into kernel functions using BPF.

CRD: kernelchaos_types.go
Mechanism: BPF-based fault injection
Fail Types:
  • 0: slab allocation failures (kmalloc)
  • 1: page allocation failures
  • 2: bio (block I/O) failures
Parameters:
  • failtype: What to fail (0, 1, or 2)
  • headers: Required kernel headers
  • callchain: Specific call chain to target
  • probability: Percentage (0-100)
  • times: Maximum failure count
Example:
failKernRequest:
  failtype: 0  # slab allocation
  probability: 50
  times: 100
  callchain:
    - funcname: "ext4_mount"
      predicate: 'STRNCMP(name->name, "bananas", 8)'
Use Cases:
  • Test memory allocation failures
  • Validate error handling in file system operations
  • Test resilience to kernel-level faults
Source: api/v1alpha1/kernelchaos_types.go:58-98

Block Chaos

BlockChaos

Inject delays into block device I/O operations.

CRD: blockchaos_types.go
Actions: delay
Parameters:
  • volumeName: Name of the volume
  • delay.latency: I/O delay duration
  • delay.jitter: Jitter amount
  • delay.correlation: Correlation percentage
Example:
action: delay
volumeName: "data-volume"
delay:
  latency: "100ms"
  jitter: "10ms"
  correlation: "50"
Use Cases:
  • Test application behavior under slow disks
  • Validate timeout handling
  • Test I/O-bound applications
Implementation: pkg/chaosdaemon/blockchaos_server_linux.go
Source: api/v1alpha1/blockchaos_types.go:35-73

Cloud Provider Chaos

AWSChaos

Inject faults into AWS infrastructure.

CRD: awschaos_types.go
Actions: ec2-stop, ec2-restart, detach-volume
Actions:

ec2-stop

Stop EC2 instances.
Parameters:
  • awsRegion: AWS region
  • ec2Instance: Instance ID
  • secretName: AWS credentials secret
action: ec2-stop
awsRegion: "us-west-2"
ec2Instance: "i-1234567890abcdef0"
Source: api/v1alpha1/awschaos_types.go:43-76
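The example above omits secretName even though it is listed as a parameter. A fuller sketch might look like this. The Secret name and experiment name are hypothetical, and selector and mode are left out on the assumption that the instance ID alone identifies the target.

```yaml
# Sketch: stop one EC2 instance for five minutes (Secret name is hypothetical).
apiVersion: chaos-mesh.org/v1alpha1
kind: AWSChaos
metadata:
  name: ec2-stop-example        # hypothetical experiment name
spec:
  action: ec2-stop
  secretName: aws-credentials   # hypothetical Secret holding the AWS credentials
  awsRegion: "us-west-2"
  ec2Instance: "i-1234567890abcdef0"
  duration: "5m"
```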

GCPChaos

Inject faults into GCP infrastructure.

CRD: gcpchaos_types.go
Actions: node-stop, node-reset, disk-loss
Actions:
  • node-stop: Stop Compute Engine instances
  • node-reset: Reset Compute Engine instances
  • disk-loss: Detach persistent disks
Parameters:
  • project: GCP project ID
  • zone: GCP zone
  • instance: Instance name
  • deviceNames: Disk device names (for disk-loss)
Example:
action: disk-loss
project: "my-project"
zone: "us-central1-a"
instance: "my-instance"
deviceNames: ["disk-1", "disk-2"]
Source: api/v1alpha1/gcpchaos_types.go:44-77

AzureChaos

Inject faults into Azure infrastructure.

CRD: azurechaos_types.go
Actions: vm-stop, vm-restart, disk-detach
Actions:
  • vm-stop: Stop Azure VMs
  • vm-restart: Restart Azure VMs
  • disk-detach: Detach managed disks
Parameters:
  • subscriptionID: Azure subscription ID
  • resourceGroupName: Resource group name
  • vmName: VM name
  • diskName: Disk name (for disk-detach)
  • lun: Logical unit number (for disk-detach)
Example:
action: disk-detach
subscriptionID: "sub-123"
resourceGroupName: "my-rg"
vmName: "my-vm"
diskName: "data-disk"
lun: 0
Source: api/v1alpha1/azurechaos_types.go:43-68

Physical Machine Chaos

PhysicalMachineChaos

Inject faults into physical machines or VMs outside Kubernetes.

CRD: physical_machine_chaos_types.go
Actions: 40+ actions across network, disk, process, JVM, and more
Action Categories:
  • stress-cpu: CPU stress
  • stress-mem: Memory stress
  • disk-read-payload: Disk read stress
  • disk-write-payload: Disk write stress
  • disk-fill: Fill disk space
  • network-corrupt: Packet corruption
  • network-duplicate: Packet duplication
  • network-loss: Packet loss
  • network-delay: Network latency
  • network-partition: Network partition
  • network-dns: DNS chaos
  • network-bandwidth: Bandwidth limitation
  • network-flood: Network flooding
  • network-down: Disable network interface
  • process: Kill or signal processes
  • jvm-exception: Throw exceptions
  • jvm-gc: Trigger GC
  • jvm-latency: Method latency
  • jvm-return: Override return values
  • jvm-stress: JVM stress
  • jvm-rule-data: Custom Byteman rules
  • jvm-mysql: MySQL JDBC faults
  • clock: Clock offset
  • redis-expiration: Set key expiration
  • redis-penetration: Cache penetration
  • redis-cacheLimit: Limit cache size
  • redis-restart: Restart Redis Sentinel
  • redis-stop: Stop Redis Sentinel
  • kafka-fill: Fill Kafka topic
  • kafka-flood: Flood Kafka
  • kafka-io: Kafka I/O faults
  • http-abort: Abort HTTP requests
  • http-delay: HTTP latency
  • http-config: HTTP config manipulation
  • http-request: Send HTTP requests
  • file-create: Create files
  • file-modify: Modify file permissions
  • file-delete: Delete files
  • file-rename: Rename files
  • file-append: Append to files
  • file-replace: Replace file content
  • vm: VM-level operations
  • user_defined: Execute custom commands
Source: api/v1alpha1/physical_machine_chaos_types.go:22-69
PhysicalMachineChaos requires the chaosd agent to be installed on each target machine, because it operates outside the Kubernetes cluster.

Common Fields

All chaos types share common fields:

Selector

Defines which pods/resources to target. See Selectors for details.

Mode

How many targets to affect:
  • one: Random single target
  • all: All matching targets
  • fixed: Fixed number (set value)
  • fixed-percent: Fixed percentage (set value)
  • random-max-percent: Random up to max percentage (set value)
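The modes marked "set value" read their number from a companion value field. A minimal sketch, assuming value is given as a string and using a hypothetical label:

```yaml
# Sketch: affect half of the matching pods.
mode: fixed-percent
value: "50"                   # interpreted as a percentage because of the mode
selector:
  namespaces:
    - default
  labelSelectors:
    app: demo                 # hypothetical label
```

With mode: fixed the same field would instead be read as an absolute count of targets.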

Duration

How long the chaos lasts:
duration: "30s"  # 30 seconds
duration: "5m"   # 5 minutes
duration: "1h"   # 1 hour
Omit for indefinite duration (manual deletion required).

Remote Cluster

Target pods in remote clusters:
remoteCluster: "cluster-2"

Status Tracking

All chaos experiments track status with:
Conditions (api/v1alpha1/common_types.go:45-52):
  • Selected: Targets have been selected
  • AllInjected: All targets are injected
  • AllRecovered: All targets are recovered
  • Paused: Experiment is paused
Records (api/v1alpha1/common_types.go:78-88):
  • Track injection/recovery events
  • Count successful operations
  • Record timestamps and error messages
