Overview
On Site
Accepts corp to corp applications
Contract - W2
Contract - Independent
Skills
AWS
Dynatrace
CI/CD
Site reliability
Job Details
Job Title: SRE Lead Dynatrace Expert
Location: Atlanta, GA (Hybrid); face-to-face interview required
Contract
Job Description:
We are seeking an experienced Site Reliability Engineering (SRE) Lead with strong expertise in Dynatrace to lead observability, reliability, and performance initiatives across enterprise platforms. The ideal candidate will drive monitoring strategy, lead incident management, and mentor SRE teams to ensure high system availability and resilience.
Key Responsibilities:
- Lead the design, implementation, and optimization of Dynatrace across applications, infrastructure, cloud, and Kubernetes platforms
- Define and enforce SRE best practices, SLIs, SLOs, SLAs, and error budgets
- Own end-to-end observability strategy including APM, Infrastructure, Logs, RUM, and Synthetic Monitoring
- Act as the technical escalation point for production issues and major incidents
- Perform root cause analysis (RCA) and drive permanent fixes
- Partner with Engineering, DevOps, Cloud, and Security teams to improve platform reliability
- Drive automation for monitoring, alerting, and incident response
- Build executive dashboards, metrics, and reports using Dynatrace
- Mentor and guide junior SREs and monitoring engineers
- Participate in on-call leadership and incident reviews
Required Skills & Qualifications:
- Extensive IT experience, including 4+ years in SRE / Reliability Engineering
- Deep hands-on experience with Dynatrace (APM, Infra Monitoring, Logs, RUM, Synthetics)
- Strong understanding of Linux/Unix systems
- Experience with cloud platforms (AWS / Azure / Google Cloud Platform)
- Hands-on experience with Kubernetes and Docker monitoring
- Strong scripting skills (Python, Shell, PowerShell)
- Experience with CI/CD pipelines and DevOps tools
- Strong troubleshooting and performance engineering skills
Nice to Have:
- Dynatrace certification
- Experience with Prometheus, Grafana, Splunk
- ITIL / SRE certifications
- Experience in large-scale enterprise or financial services environments
Soft Skills:
- Strong leadership and mentoring abilities
- Excellent communication and stakeholder management
- Ability to handle high-pressure production environments