Cloud Site Reliability Engineer - Azure/AWS (34084) - Myticas LLC

Overview

Remote

On Site

Contract - W2

Skills

Recovery

Management

Real-time

Scripting

Cloud Computing

Auditing

Access Control

Operational Excellence

Innovation

Reliability Engineering

DevOps

Systems Engineering

Linux

IaaS

Amazon Web Services

Microsoft Azure

Kubernetes

Orchestration

Ansible

Incident Management

Workflow

Continuous Integration

Continuous Delivery

Lifecycle Management

Communication

Collaboration

Value Engineering

Apache Kafka

Job Details

Cloud Site Reliability Engineer - AWS & Azure

Responsibilities

Oversee the design and improvement of infrastructure using SRE best practices, including IaC, recovery automation, and systems that detect and resolve issues independently.
Manage and fine-tune critical services across both cloud and on-prem environments: Kubernetes clusters, CI/CD pipelines, artifact registries, and custom workloads.
Enhance observability through intelligent logging, metrics, tracing, and alerting, ensuring systems are transparent and actionable in real time.
Champion automation by eliminating repetitive tasks, from deployment workflows to security audits, through scripting and tooling.
Elevate the developer experience for 80+ engineers and researchers by streamlining secure, reliable workflows across hybrid and cloud-native platforms.
Take ownership of IAM governance across platforms like Azure AD and AWS IAM. Implement lifecycle automation, auditing, and access controls.
Foster a culture of operational excellence with strong practices around security, incident management, and resilience engineering.
Act as a trusted partner to developers and researchers, enabling their speed and innovation without compromising stability.

Experience

Experience in Site Reliability Engineering, DevOps, or Systems Engineering within fast-paced, technically demanding environments.
Strong background in Linux systems and cloud infrastructure, with hands-on experience in AWS (primary) and Azure environments.
Solid command of Kubernetes and container orchestration in production environments.
Expertise in Infrastructure as Code tools such as Ansible; building reproducible, scalable infrastructure is second nature to you.
Deep experience in observability and incident response: you know how to set up effective monitoring, handle incidents, and lead blameless post-mortems.
A security-first mindset, especially when it comes to protecting distributed systems and developer workflows.
Proven ability to support and optimize CI/CD pipelines, container image builds, and artifact lifecycle management.
Strong communication and collaboration skills. You build trust across teams and advocate for thoughtful, scalable solutions.
Bonus if you've worked with event-driven architectures using technologies like Kafka.
