Deployment & Automation
- Implement CI/CD pipelines using tools such as GitHub Actions, AWS CodePipeline, and Jenkins.
- Automate infrastructure provisioning through Infrastructure-as-Code (IaC) using Terraform, CloudFormation, or AWS CDK.
- Develop automation scripts and self-service tools to enhance operational efficiency.
Leveraging Dynatrace Observability Platform
Demonstrated expertise in:
- Standardizing installation through automation.
- Integrating with CI/CD pipelines.
- Enforcing tagging and metadata standards.
- Using environment-aware configuration.
- Implementing distributed tracing with appropriate context propagation.
- Optimizing alerts and creating dashboards and anomaly detectors.
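As an illustration of the context propagation item above, here is a minimal sketch in Python. It assumes services pass trace context in the W3C Trace Context `traceparent` header (version 00 layout only). The function names `parse_traceparent` and `child_traceparent` are hypothetical and are not part of any Dynatrace API.

```python
import re
import secrets

# W3C Trace Context "traceparent" layout: version-traceid-parentid-flags
_TRACEPARENT_RE = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


def parse_traceparent(header):
    """Return (trace_id, parent_id, flags), or None if the header is invalid."""
    match = _TRACEPARENT_RE.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version "ff" and all-zero IDs are invalid per the specification.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return trace_id, parent_id, flags


def child_traceparent(incoming=None):
    """Build the header for an outgoing call, continuing the trace if possible."""
    parsed = parse_traceparent(incoming) if incoming else None
    if parsed is None:
        # No usable context: start a new trace. Marking it sampled ("01")
        # is an assumption made for this sketch.
        trace_id, flags = secrets.token_hex(16), "01"
    else:
        trace_id, _, flags = parsed
    span_id = secrets.token_hex(8)  # new ID for the current operation
    return f"00-{trace_id}-{span_id}-{flags}"
```

A service would call `child_traceparent(request_headers.get("traceparent"))` before each downstream request, so the trace ID stays constant across hops while each hop gets its own span ID.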
Skills/Experience:
Incident Management & Response
- Proficient in the ITIL framework and ITSM tools such as ServiceNow.
- Production on-call responder with strong troubleshooting capabilities.
- Develop root cause analysis (RCA) documentation and knowledge articles.
- Apply SRE principles, including SLIs, SLOs, and error budgets.
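To make the SLI, SLO, and error budget item concrete, here is a minimal sketch in Python for a request-based availability SLO. The function name `error_budget` and the field names in its result are hypothetical; the figures in the usage note are illustrative only.

```python
def error_budget(slo_target, total_requests, failed_requests):
    """Summarize error-budget usage for a request-based availability SLO.

    slo_target is a fraction such as 0.999 (i.e. 99.9% of requests succeed).
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    if failed_requests > total_requests:
        raise ValueError("failed_requests cannot exceed total_requests")

    # SLI: the measured fraction of successful requests.
    sli = 1 - failed_requests / total_requests if total_requests else 1.0
    # Error budget: the number of failures the SLO tolerates in this window.
    allowed_failures = (1 - slo_target) * total_requests
    consumed = failed_requests / allowed_failures if allowed_failures > 0 else 0.0

    return {
        "sli": sli,
        "allowed_failures": allowed_failures,
        "budget_consumed": consumed,  # 1.0 means the budget is fully spent
        "slo_met": sli >= slo_target,
    }
```

For example, with a 99.9% target over 1,000,000 requests, the budget is roughly 1,000 failed requests; 250 failures would consume about a quarter of it.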
Capacity Planning & Performance
- Implement operational cost optimization initiatives.
- Configure and maintain auto-scaling policies and thresholds.
- Develop resiliency test plans and support performance testing.
Security & Compliance Implementation
- Manage service accounts and access permissions.
- Create, deploy, and manage digital certificates.
- Respond to security incidents and execute remediation tasks effectively.
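As an illustration of routine certificate management, here is a minimal sketch in Python that flags certificates approaching expiry. It assumes the expiry date is available in the `notAfter` text format that Python's `ssl` module reports for a peer certificate. The function names and the 30-day threshold are hypothetical.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Days left before a certificate expires.

    not_after uses the format of the 'notAfter' field returned by
    ssl.SSLSocket.getpeercert(), e.g. "Jan  5 09:34:43 2018 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    now = now or datetime.now(timezone.utc)
    return (expires - now).total_seconds() / 86400


def needs_renewal(not_after, threshold_days=30, now=None):
    """True when the certificate expires within threshold_days (assumed policy)."""
    return days_until_expiry(not_after, now) <= threshold_days
```

A scheduled job could run `needs_renewal` against each managed certificate and open a ticket or trigger renewal when it returns `True`.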
Education & Experience
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 2 to 4 years of experience in DevOps, SRE, or infrastructure roles.
- Mid-level proficiency in Python or other scripting languages.
- Mid-level proficiency in configuration management tools such as Ansible.