AzureChaos allows you to simulate Microsoft Azure infrastructure failures by manipulating Virtual Machines and managed disks through the Azure API.
Actions
AzureChaos supports the following actions:
- vm-stop: Stop an Azure Virtual Machine
- vm-restart: Restart an Azure Virtual Machine
- disk-detach: Detach a managed disk from a Virtual Machine
Spec Fields
- action: The Azure chaos action to perform. Options: vm-stop, vm-restart, disk-detach. Default: vm-stop.
- Azure subscription ID where the resources are located.
- Name of the Azure resource group containing the Virtual Machine.
- Name of the Virtual Machine to target.
- diskName: Name of the managed disk to detach. Required when action is disk-detach.
- lun: Logical Unit Number (LUN) of the data disk. Required when action is disk-detach. The LUN identifies which disk attachment to detach (VMs can have multiple data disks).
- duration: Duration of the chaos action. For vm-stop and disk-detach, resources are affected for this duration and then recovered. Not applicable to vm-restart (oneshot action).
- secretName: Name of the Kubernetes secret containing Azure service principal credentials. If not specified, the default Azure credential chain is used.
- Remote cluster name where the chaos will be deployed.
Azure Credentials Setup
You need to provide Azure credentials to Chaos Mesh.
Create Service Principal
- Create a service principal, for example with the Azure CLI command az ad sp create-for-rbac.
- Note the output values: appId (Client ID), password (Client Secret), and tenant (Tenant ID).
Create Kubernetes Secret
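As a minimal sketch, a manifest for this secret might look like the following. The secret name cloud-key-secret, the chaos-mesh namespace, and the key names client_id, client_secret, and tenant_id are assumptions that this page does not state; verify them against your Chaos Mesh version before relying on them.

```yaml
# Sketch only: metadata and key names are assumptions, not confirmed by this page.
apiVersion: v1
kind: Secret
metadata:
  name: cloud-key-secret          # hypothetical name, referenced from the AzureChaos spec via secretName
  namespace: chaos-mesh           # assumed: same namespace as the AzureChaos object
type: Opaque
stringData:
  client_id: "<appId>"            # Client ID from the service principal output
  client_secret: "<password>"     # Client Secret from the service principal output
  tenant_id: "<tenant>"           # Tenant ID from the service principal output
```

Using stringData lets you paste the values as plain text; Kubernetes stores them base64-encoded.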
The secret must hold the service principal credentials noted above: the client ID, client secret, and tenant ID.
Required Azure Permissions
The service principal needs the following permissions:
- Microsoft.Compute/virtualMachines/start/action
- Microsoft.Compute/virtualMachines/powerOff/action
- Microsoft.Compute/virtualMachines/restart/action
- Microsoft.Compute/virtualMachines/read
- Microsoft.Compute/disks/read
- Microsoft.Compute/virtualMachines/write (for disk operations)
The Contributor or Virtual Machine Contributor role on the resource group provides these permissions.
Alternative: Managed Identity (AKS)
If running on AKS, you can use Managed Identity:
- Enable managed identity on your AKS cluster
- Grant the identity appropriate permissions on the target resources
- Don’t specify secretName in the AzureChaos spec
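A spec that relies on managed identity could then look like the sketch below, which simply leaves out secretName. The apiVersion chaos-mesh.org/v1alpha1 and the field names subscriptionID, resourceGroupName, and vmName are assumptions (this page describes those fields without naming them), and all values are placeholders.

```yaml
# Sketch only: apiVersion and the three selector field names are assumptions.
apiVersion: chaos-mesh.org/v1alpha1
kind: AzureChaos
metadata:
  name: vm-stop-managed-identity      # hypothetical name
  namespace: chaos-mesh               # hypothetical namespace
spec:
  action: vm-stop
  # No secretName: the default Azure credential chain (managed identity) is used.
  subscriptionID: "<subscription-id>"
  resourceGroupName: "<resource-group>"
  vmName: "<vm-name>"
  duration: "5m"
```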
Examples
Stop Virtual Machine
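A minimal sketch of a vm-stop experiment: the VM is powered off for the given duration and then started again. The apiVersion chaos-mesh.org/v1alpha1, the field names subscriptionID, resourceGroupName, and vmName, and the secret name cloud-key-secret are assumptions not stated on this page; all values are placeholders.

```yaml
# Sketch only: apiVersion, selector field names, and secret name are assumptions.
apiVersion: chaos-mesh.org/v1alpha1
kind: AzureChaos
metadata:
  name: vm-stop-example               # hypothetical name
  namespace: chaos-mesh               # hypothetical namespace
spec:
  action: vm-stop
  secretName: cloud-key-secret        # hypothetical secret with service principal credentials
  subscriptionID: "<subscription-id>"
  resourceGroupName: "<resource-group>"
  vmName: "<vm-name>"
  duration: "5m"                      # VM stays stopped for this long, then is started again
```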
Restart Virtual Machine
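A minimal sketch of a vm-restart experiment. Because vm-restart is a oneshot action, no duration is set. The apiVersion chaos-mesh.org/v1alpha1, the field names subscriptionID, resourceGroupName, and vmName, and the secret name cloud-key-secret are assumptions not stated on this page; all values are placeholders.

```yaml
# Sketch only: apiVersion, selector field names, and secret name are assumptions.
apiVersion: chaos-mesh.org/v1alpha1
kind: AzureChaos
metadata:
  name: vm-restart-example            # hypothetical name
  namespace: chaos-mesh               # hypothetical namespace
spec:
  action: vm-restart                  # oneshot: runs once, with no recovery step
  secretName: cloud-key-secret        # hypothetical secret with service principal credentials
  subscriptionID: "<subscription-id>"
  resourceGroupName: "<resource-group>"
  vmName: "<vm-name>"
```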
Detach Managed Disk
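A minimal sketch of a disk-detach experiment, which needs both diskName and lun. The apiVersion chaos-mesh.org/v1alpha1, the field names subscriptionID, resourceGroupName, and vmName, and the secret name cloud-key-secret are assumptions not stated on this page; all values, including the LUN, are placeholders.

```yaml
# Sketch only: apiVersion, selector field names, and secret name are assumptions.
apiVersion: chaos-mesh.org/v1alpha1
kind: AzureChaos
metadata:
  name: disk-detach-example           # hypothetical name
  namespace: chaos-mesh               # hypothetical namespace
spec:
  action: disk-detach
  secretName: cloud-key-secret        # hypothetical secret with service principal credentials
  subscriptionID: "<subscription-id>"
  resourceGroupName: "<resource-group>"
  vmName: "<vm-name>"
  diskName: "<data-disk-name>"        # managed data disk to detach (not the OS disk)
  lun: 0                              # placeholder: LUN of that data disk on the VM
  duration: "5m"                      # disk is reattached after this duration
```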
Implementation Details
AzureChaos uses the Azure SDK to:
- Authenticate using service principal credentials or managed identity
- Call Azure Compute APIs to manipulate resources:
  - vm-stop: Calls the PowerOff API, then Start after the duration
  - vm-restart: Calls the Restart API (oneshot)
  - disk-detach: Removes the disk from the VM configuration, then reattaches it after the duration
api/v1alpha1/azurechaos_types.go:43-102
Oneshot Behavior
The vm-restart action is marked as a oneshot action, meaning:
- It executes once immediately
- No recovery action is performed
- The duration field is ignored
- The experiment completes after the restart command is sent
api/v1alpha1/azurechaos_types.go:28
Finding LUN for Disk Detach
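To find the LUN of a data disk, list the data disks attached to the VM, for example with az vm show and a query on storageProfile.dataDisks; each entry reports its lun.
Important Notes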
- Ensure your Azure credentials have appropriate permissions
- Be cautious when targeting production VMs
- The VM must be in a state that allows the requested operation
- For disk-detach, ensure you’re not detaching the OS disk (only data disks can be detached from running VMs)
- You need both diskName and lun for disk detach operations
- Test in non-production environments first
- Azure API rate limits and quotas apply