Overview
Hybrid: Candidate needs to be in the office 3-4 days every week
Depends on Experience
Accepts corp to corp applications
Contract - W2
Contract - 12 Month(s)
Skills
Dynatrace
application performance monitoring (APM)
AIOps
Splunk
ELK
Grafana
Prometheus
MTTR
SLI/SLO
Terraform
Jenkins
TeamCity
Octopus
Bamboo
U-Deploy
DevOps pipelines
AWS
Azure
GCP
UML
Job Details
Role: Site Reliability Engineer (SRE)
Location: Fort Mill, SC (Hybrid Onsite)
Duration: 12+ Months Contract
Note: Candidate needs to be in the office 3-4 days every week.
Updated JD:
Role Overview:
- We are looking for a Site Reliability Engineer with deep expertise in Dynatrace and a strong background in observability, automation, and cloud operations.
- This role focuses on designing and implementing highly reliable, scalable solutions while driving proactive monitoring and operational excellence.
Key Responsibilities:
- Lead the design and implementation of full-stack observability solutions with Dynatrace as the primary platform.
- Configure Dynatrace for application performance monitoring (APM), infrastructure monitoring, and intelligent alerting.
- Build advanced dashboards and integrate Dynatrace with event management systems to enable proactive incident prevention and root cause analysis.
- Collaborate with teams to optimize Dynatrace usage for AIOps-driven insights and automated anomaly detection.
- Provide oversight for production operations to maximize reliability and automation.
- Develop and evolve SRE best practices, runbooks, and tooling to ensure high availability and resilience.
- Implement data-driven operational strategies to improve decision-making and reduce MTTR.
- Work hands-on with Dynatrace, Splunk, ELK, Grafana, Prometheus, and (in the future) ThousandEyes.
- Build and manage CI/CD pipelines and Infrastructure as Code (IaC) solutions using Terraform, Jenkins, TeamCity, Octopus, Bamboo, and U-Deploy across hybrid/multi-cloud environments.
- Develop and manage DevOps pipelines in AWS, Azure, and Google Cloud Platform using Terraform and cloud-native tooling.
- Apply a strong developer background to understand application layers and infrastructure interactions.
- Define and document standard operating procedures, architecture diagrams, and system documentation using Jira, Confluence, and UML.
- Identify areas for process and efficiency improvement within Platform Services Operations; recommend and implement solutions.
- Drive automation initiatives across all operational processes.
- Proactively monitor system capacity and health indicators; provide analytics and forecasts for scaling.
Preferred Qualifications:
- Expert-level experience with Dynatrace, including dashboard creation, alert configuration, and integration with other observability tools.
- Strong knowledge of AIOps, performance tuning, and proactive incident management.
- Familiarity with hybrid/multi-cloud environments and modern DevOps practices.
- Excellent problem-solving skills and ability to work in a fast-paced, collaborative environment.