Job Title: Developer – Azure Chaos Studio Specialist
Job Type: Contract
Experience Level: Mid-Level
Department: Cloud Engineering / Site Reliability Engineering (SRE)
Job Summary:
We are seeking a skilled and proactive Developer with hands-on experience in Azure Chaos Studio to join our cloud engineering team. The ideal candidate will be responsible for designing, implementing, and managing chaos engineering experiments to improve the resilience and reliability of our cloud-based applications and infrastructure.
Key Responsibilities:
- Design and execute chaos experiments using Azure Chaos Studio to simulate real-world outages and failures.
- Collaborate with development, DevOps, and SRE teams to identify critical systems and define failure scenarios.
- Analyze experiment results and provide actionable insights to improve system resilience.
- Automate chaos testing as part of CI/CD pipelines.
- Monitor and report on system behavior during and after chaos experiments.
- Develop custom fault injection scripts and integrate them with other Azure services.
- Stay up to date with the latest features and best practices in Azure Chaos Studio and chaos engineering.
Required Skills & Qualifications:
- Proven experience with Azure Chaos Studio and the broader Azure ecosystem.
- Strong programming/scripting skills in Python, PowerShell, or C#.
- Experience with Azure DevOps, ARM templates, or Bicep.
- Familiarity with resilience engineering, fault tolerance, and disaster recovery principles.
- Experience with monitoring tools such as Azure Monitor, Application Insights, or Log Analytics.
- Understanding of microservices architecture and distributed systems.
- Excellent problem-solving and communication skills.