Site Reliability Engineer (SRE)
Contract W2
12 Months
No Travel Required
Remote
$55 - $60/hr


Ohm Systems, Inc
Job Details
Skills
- Google Cloud Platform
- Data Collection
- DevOps
- FOCUS
- GitHub
- Google Cloud
- Continuous Delivery
- Continuous Integration
- Dashboard
- Reliability Engineering
- SRE
- OCP
- Programming Languages
- Provisioning
- Kubernetes
- Management
- Microsoft Azure
- Orchestration
- Cloud Computing
- IaaS
- Incident Management
- Java
- Ansible
- AppDynamics
- Budget
- Capacity Management
- Splunk
- ARM
- Grafana
- Jenkins
- Python
- Root Cause Analysis
- Stacks Blockchain
- Terraform
Summary
Role Summary
- We are seeking a Site Reliability Engineer (SRE) with strong expertise in observability, monitoring, and distributed tracing to join our SRE team.
- The ideal candidate will help us design, build, and scale an observability framework that provides end-to-end visibility into our systems and applications.
- A strong focus will be placed on OpenTelemetry as we continue to standardize our telemetry pipeline across logs, metrics, and traces.
Key Responsibilities
- Design, implement, and maintain observability solutions using OpenTelemetry, Prometheus, Grafana, AppDynamics, and Splunk.
- Build and manage telemetry pipelines (metrics, logs, traces) ensuring reliable data collection, transformation, and export.
- Lead initiatives to improve incident detection, response, and post-incident analysis, with a strong emphasis on root cause analysis (RCA).
- Define and maintain SLIs, SLOs, and error budgets to measure and improve system reliability.
- Partner with development and operations teams to instrument applications and services for better monitoring and tracing coverage.
- Develop dashboards, alerts, and visualizations to provide actionable insights into system health and performance.
- Contribute to automation and self-healing practices that improve uptime and reduce operational toil.
- Stay current with trends in observability and advocate best practices across the engineering organization.
Requirements
- 7+ years of SRE/DevOps/Cloud/Infrastructure engineering experience with a focus on monitoring and observability.
- Hands-on experience with OpenTelemetry SDKs, collectors, and exporters.
- Proficiency with observability stacks such as Prometheus, Grafana, Loki, Tempo, Elastic Stack, or Splunk Observability (Splunk/AppDynamics).
- Strong knowledge of cloud platforms (Google Cloud Platform or Azure).
- Hands-on experience with container orchestration using Kubernetes (OCP, GKE, AKS).
- Familiarity with CI/CD pipelines (Jenkins, GitHub Actions) and infrastructure as code (Terraform/Ansible/ARM/CloudFormation).
- Experience provisioning infrastructure and capacity planning.
- Hands-on skills in programming languages such as Java and Python.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
- Dice Id: 10111826
- Position Id: 9013184
- Posted 2 hours ago
Company Info
Ohm Systems, Inc. specializes in IT and Healthcare staffing services, dedicated to linking highly skilled professionals with our public and private clients across the United States. Our track record showcases our commitment to delivering outstanding staffing and consultancy solutions to our clients. We prioritize diversity and inclusivity and take pride in being an employer that promotes equal opportunities and affirmative action. Our goal is to foster an inclusive work environment that embraces individuals from all backgrounds, irrespective of their gender, race, or orientation.

