Site Reliability Engineer

Columbus, OH, US • Posted 1 hour ago • Updated 1 hour ago

Contract W2

Contract Corp To Corp

On-site

Depends on Experience

Job Details

Skills

Site Reliability Engineer
Grafana
Dynatrace
Infrastructure as Code
Terraform

Summary

As a Site Reliability Engineer (SRE) Level II, you will play a key role in maintaining the availability, scalability, and performance of critical infrastructure and services. You will be responsible for building and automating solutions that enhance system reliability and support continuous delivery. In this role, you will handle more complex operational tasks and incidents, provide mentorship to junior SREs, and collaborate with development teams to ensure systems are designed for reliability from the ground up.
Incident Management:
complex incidents, and ensure service uptime.
Lead troubleshooting efforts for high-impact production issues, providing detailed root cause analysis (RCA) and preventative measures.
Participate in on-call rotations, acting as an escalation point for Level 1 SREs during major incidents.
Automation & Infrastructure as Code (IaC):
Develop and maintain automation scripts and infrastructure using tools like Terraform, Ansible, or CloudFormation.
Implement automation solutions to eliminate manual tasks and improve system reliability, scalability, and performance.
Performance & Scalability:
Analyze system performance and recommend optimizations for scalability and reliability.
Support capacity planning efforts by monitoring system metrics, traffic patterns, and usage trends to predict future resource needs.
System Design & Architecture:
Collaborate with software engineering teams to influence the design of new services and applications, ensuring they are scalable, reliable, and resilient from the start.
Contribute to architectural decisions, ensuring alignment with best practices in fault tolerance, redundancy, and recovery.
Monitoring & Observability:
Build and maintain robust monitoring, alerting, and observability solutions to proactively detect and resolve issues before they impact end users.
Optimize existing monitoring tools (e.g., Prometheus, Grafana, Datadog, Dynatrace) and build custom dashboards for better visibility into system health.
Security & Compliance:
Ensure systems and infrastructure are secure, compliant, and aligned with organizational policies and industry best practices.
Assist with vulnerability management, system patching, and implementing security measures to protect the integrity and availability of services.
Continuous Improvement:
Lead efforts to continuously improve operational processes, tools, and workflows.
Implement and enforce best practices in deployment, monitoring, and incident management to improve overall system reliability and reduce downtime.
Basic Qualifications
Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent work experience.
3 years of experience in site reliability engineering, application monitoring, systems administration, or related roles.
Proven track record of managing complex infrastructure, troubleshooting production issues, and optimizing system performance.
Preferred Qualifications
Strong experience with Linux/Unix administration and proficiency in scripting (e.g., Python, Bash, Go).
5 years of experience in site reliability engineering, DevOps, systems administration, or related roles.
Experience with containerization and orchestration technologies like Docker and Kubernetes.
Familiarity with distributed systems and microservices architecture.
Excellent problem-solving and troubleshooting skills, especially in diagnosing production issues in high-scale environments.
Experience with Microsoft Office.
Experience working in multi-platform environments.
Ability to balance both development and support roles.
Experience working on projects that involve business segments.
Strong analytical and troubleshooting skills, and excellent communication skills.
Strong interpersonal skills, a focus on customer service, and the ability to work well with other IT, vendor, and business groups.

Dice Id: 10119106
Position Id: 8931319