Site Reliability Engineer

Hybrid in Laurel, MD, US • Posted 30+ days ago • Updated 9 days ago

Contract W2

No Travel Required

On-site

Depends on Experience

Job Details

Skills

Amazon Web Services
Bash
Google Cloud Platform
Docker
DevOps
IaaS
Python
SAFE
Terraform
Performance Tuning
Management
Kubernetes
Load Testing
Linux
Microsoft Azure
Reliability Engineering
Regulatory Compliance
Scalability
Unix
System Security

Summary

Job Description

Position: Site Reliability Engineer (SRE)

Role Summary

We are looking for a skilled Site Reliability Engineer (SRE) to ensure the reliability, availability, performance, and scalability of critical systems. The SRE will work closely with development and operations teams to build resilient infrastructure, automate operations, and improve system observability while maintaining strong SLAs/SLOs.

Key Responsibilities

• Design, build, and maintain highly available, scalable, and reliable systems.

• Define and manage SLIs, SLOs, and SLAs to ensure system reliability and performance.

• Automate infrastructure provisioning and configuration using Infrastructure as Code (Terraform, CloudFormation).

• Implement and manage CI/CD pipelines to enable safe and frequent deployments.

• Monitor system health using tools such as Prometheus, Grafana, Datadog, Splunk, and ELK.

• Handle incident response, on-call rotations, root cause analysis (RCA), and post-mortems.

• Improve system resilience through capacity planning, load testing, and chaos engineering.

• Collaborate with engineering teams to improve application reliability and reduce operational toil.

• Manage cloud infrastructure on AWS / Azure / Google Cloud Platform.

• Ensure that system security and compliance requirements are met and that best practices are followed.

• Support production deployments, upgrades, and performance tuning.

Required Skills & Experience

• 3+ years of experience as an SRE / DevOps / Production Engineer.

• Strong knowledge of Linux/Unix systems and networking fundamentals.

• Proficiency in scripting or programming (Python, Go, Bash).

• Experience with containers and orchestration (Docker, Kubernetes).

• Hands-on experience with monitoring, logging, and alerting tools.

• Strong understanding of cloud platforms (AWS, Azure, or Google Cloud Platform).

• Experience implementing high availability, fault tolerance, and disaster recovery strategies.

• Excellent problem-solving and troubleshooting skills.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions, and AI may have been used to create this description. The position description has been reviewed for accuracy, and Dice believes it correctly reflects the job opportunity.

Dice Id: 10477291
Position Id: 8870172
