Site Reliability Engineer Jobs in District of Columbia

1 - 20 of 226 Jobs

SRE

Oraapps Inc

Reston, Virginia, USA

Contract

We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) with a strong background in observability, telemetry, and monitoring to join our development team. In this role, you will be responsible for implementing and maintaining observability solutions using OpenTelemetry (OTel) and Splunk, ensuring the reliability, scalability, and performance of our systems. Key Responsibilities: Design and implement observability strategies using OpenTelemetry for distributed tracing, metr

Site Reliability Engineer / SRE

Unisys

Reston, Virginia, USA

Full-time, Contract

Description: We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) with a strong background in observability, telemetry, and monitoring to join our development team. In this role, you will be responsible for implementing and maintaining observability solutions using OpenTelemetry (OTel) and Splunk, ensuring the reliability, scalability, and performance of our systems. Key Responsibilities: Design and implement observability strategies using OpenTelemetry for distributed

SRE

Ajace Inc

Reston, Virginia, USA

Full-time

Key Responsibilities: 1. Cloud Infrastructure & Automation: Design, implement, and manage cloud-based infrastructure using platforms like AWS, Azure, or Google Cloud Platform. Utilize Infrastructure-as-Code (IaC) tools such as Terraform, CloudFormation, and Ansible to automate deployments and configurations. Create robust automation targeted at anomaly detection, toil reduction, recovery processes, and self-healing mechanisms, and optimize cloud costs. 2. DevSecOps & CI/CD: Deep understanding of

Site Reliability Engineer

Motion Recruitment Partners, LLC

Arlington, Virginia, USA

Full-time

Site Reliability Engineer As the Senior or Staff SRE on the Platform Engineering team, you'll be joining at a foundational stage and play a key role in building and shaping a secure, resilient, and high-performance platform that powers engineering capabilities. The company is located in New York and will remain 100% remote. What You Will Be Doing: Drive Platform Excellence: Continuously improve the platform's reliability, scalability, and deployment efficiency through innovative solutions and r

FLEX Senior Systems Engineer - SRE

Marriott International

Bethesda, Maryland, USA

Full-time

Job Description The Senior Systems Engineer - Site Reliability Engineering (SRE) is responsible for the reliability, scalability, and performance of mission-critical cloud and on-prem services that support millions of Marriott customers globally. This role involves overseeing incident management, driving automation efforts, and working closely with cross-functional teams to ensure alignment between SRE strategy and business objectives. Partners closely with Product Teams, Applications teams, Inf

Site Reliability Engineer, Kubernetes Platform (Starshield)

SpaceX

Washington, District of Columbia, USA

Full-time

SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today SpaceX is actively developing the technologies to make this possible, with the ultimate goal of enabling human life on Mars. SITE RELIABILITY ENGINEER, KUBERNETES PLATFORM (STARSHIELD) At SpaceX we're leveraging our experience in building rockets and spacecraft to deploy the Starshield constellation. Starshield is the world's largest US gov

SRE DevOps Engineer - Mexico

Spiceorb

Remote

Contract

Hello, SpiceOrb is looking for an SRE DevOps Engineer in Mexico. We are looking for consultants who are able to work in Mexico. Role: SRE DevOps Engineer Location: Mexico (Remote) Duration: 12+ Months Contract Job Description: Responsibilities: Develop and maintain monitoring tools, alerts, and dashboards to provide visibility into system health and performance. Proactively gather and analyze both metric and log data from systems and applications to perform anomaly detection, performance tuning, c

Site Reliability Engineer

Motion Recruitment Partners, LLC

Fort Meade, Maryland, USA

Full-time

A mission-focused technology start-up based out of Arlington is seeking a Site Reliability Engineer (SRE) to support the deployment and stability of AI-powered cyber applications running in secure AWS enclave environments at Fort Meade. This role is ideal for candidates with deep DevOps, AWS, and containerization expertise who are passionate about maintaining high-performance systems that support national security objectives. You'll serve as the operational bridge between the deployment site and

Site Reliability Engineer (SRE) - Senior

Electronic Consulting Services, Inc (ECS Federal)

Arlington, Virginia, USA

Full-time

Job Description ECS is seeking a Site Reliability Engineer (SRE) - Senior to work in our Arlington, VA office. Please Note: This position is contingent upon contract award. Program Description ECS is seeking talented professionals to join our successful and growing team in building the next-generation Threat Intelligence Enterprise Service (TIES) solution. The TIES Program is the Cybersecurity and Infrastructure Security Agency's (CISA) dynamic approach to fulfilling its federally mandated cy

Lead Systems Engineer (Datadog, AWS & ServiceNow Integration)

Lumen Solutions Group Inc.

Washington, District of Columbia, USA

Contract

Job Description: Lead Systems Engineer (Datadog, AWS & ServiceNow Integration) Job Summary: We are seeking a seasoned Lead Systems Engineer with deep expertise in Datadog, AWS, and ServiceNow integration. In this leadership role, you will oversee the design, implementation, and maintenance of comprehensive monitoring, observability, and incident management solutions for cloud-based infrastructure and applications. You will play a key role in guiding the team to ensure operational excellence, system

Lead Observability Engineer Sumo Logic & SRE Location: Remote

NeoTech Solutions

US

Third Party, Contract

Role: Lead Observability Engineer Sumo Logic & SRE Location: Remote Hire type: Contract JD: Experience: 10+ years (with 3+ years in Sumo Logic & Cloud-native observability) Job Summary: We are seeking a highly skilled Lead Observability Engineer to lead a critical implementation of Sumo Logic for a client migrating from Dynatrace. This role requires deep expertise in Sumo Logic, Site Reliability Engineering (SRE) practices, and Kubernetes (EKS) observability. The ideal candidate will de

Site Reliability Engineer

Zachary Piper Solutions, LLC

Remote

Full-time

Piper Companies is seeking a Remote Site Reliability Engineer to join a leading cybersecurity and cloud consulting firm. The Site Reliability Engineer will play a key role in building and maintaining secure, scalable infrastructure while supporting automation, compliance, and operational excellence across client environments. Responsibilities of the Site Reliability Engineer include: Develop and deploy automation scripts, tooling, and infrastructure to meet client needs. Manage patching processes

Site Reliability Engineer

Madison-Davis, LLC

Remote

Contract

Role: Drive the technical implementation of monitoring and alerting strategies across enterprise-scale applications and infrastructure. Collaborate directly with development teams to ensure each new initiative includes the correct telemetry, log tagging, and alert payloads. Act as a liaison to Level 2 and Level 3 support teams to maintain and enhance monitoring dashboards used by the enterprise command center (EMC). Standardize alert formats to ensure consistency with SRE policies and support downs

Site Reliability Engineer

Nightwing

Sterling, Virginia, USA

Full-time

Nightwing provides technically advanced full-spectrum cyber, data operations, systems integration and intelligence mission support services to meet our customers' most demanding challenges. Our capabilities include cyber space operations, cyber defense and resiliency, vulnerability research, ubiquitous technical surveillance, data intelligence, lifecycle mission enablement, and software modernization. Nightwing brings disruptive technologies, agility, and competitive offerings to customers in th

Site Reliability Engineer

Akamai Technologies

Cambridge, England, United Kingdom

Full-time

Do you enjoy collaborating with teams to solve complex challenges? Do you have a passion for cutting edge technologies and tackling system problems? Join our highly skilled Site Reliability team Our team monitors and measures the reliability of our suite of Compute products and platform. In collaboration with Engineering and Product teams, we improve the performance and reliability of the products we support. Partner with the best You will apply statistical data analysis and an understandin

Site Reliability Engineer

General Dynamics

Remote or Aurora, Colorado, USA

Full-time

Basic Qualifications Bachelor's degree in Computer Science, a related field or equivalent experience is required plus a minimum of 5 years of relevant experience; or Master's degree plus 3 years of relevant experience. CLEARANCE REQUIREMENTS: Department of Defense TS/SCI security clearance is required at time of hire. Applicants selected will be subject to a U.S. Government security investigation and must meet eligibility requirements for access to classified information. Due to the nature of

Site Reliability Engineer (with strong dev background)

Artech, LLC

Remote

Contract

Summary Our organization builds and provides systems and infrastructure that fuel our core services. We are the foundation on which our software developers build the products that our customers love. We are looking for passionate and dedicated Site Reliability Engineers to continue our focus on providing our customers the highest quality Services experience. Our services have to scale globally, stay highly available, and "just work." If you love designing, engineering and running systems and inf

Site Reliability Engineer II - Real-Time

Esri

Vienna, Virginia, USA

Full-time

Overview Join us to work collaboratively with our talented team of dynamic and passionate engineers to deliver capabilities that enable our customers to make a difference. You'll deploy and operate ArcGIS Velocity and ArcGIS Workflow Manager SaaS solutions. You will also have the opportunity to design, deploy, and operate next-generation real-time and big data GIS software-as-a-service (SaaS) capabilities for thousands of cloud users worldwide. Our teams have a broad mix of experience levels a

Site Reliability Engineer

McKesson Corporation

Remote or Columbus, Ohio, USA

Full-time

McKesson is an impact-driven, Fortune 10 company that touches virtually every aspect of healthcare. We are known for delivering insights, products, and services that make quality care more accessible and affordable. Here, we focus on the health, happiness, and well-being of you and those we serve - we care. What you do at McKesson matters. We foster a culture where you can grow, make an impact, and are empowered to bring new ideas. Together, we thrive as we shape the future of health for patien

Principal Site Reliability Engineer (Prisma Access)

Palo Alto Networks

Reston, Virginia, USA

Full-time

Company Description Our Mission At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we're looking for innovators who are as committed to shaping the future of cybersecurity as we are. Who We Are We take our mission of