Senior Site Reliability Engineer

Dallas, TX, US • Posted 1 day ago • Updated 8 hours ago

Full Time

On-site

Job Details

Skills

Network
Cost Reduction
Decision-making
WINS
Motivation
Innovation
Leadership
Recruiting
Pivotal
Regulatory Compliance
Cloud Computing
FOCUS
Disaster Recovery
Business Continuity Planning
Capacity Management
System On A Chip
SAFE
Mentorship
Operational Excellence
Computer Science
Information Technology
Amazon Web Services
Google Cloud
Google Cloud Platform
Grafana
Management
Budget
Incident Management
Terraform
Continuous Integration
Continuous Delivery
DevOps
GitHub
Health Care
HIPAA
Scripting
Python
Bash
Windows PowerShell
Communication
Collaboration
CHAOS
Testing
Microsoft Azure
Kubernetes
Health Insurance
Insurance
Life Insurance
SAP BASIS
Law
ProVision

Summary

About Lantern

Lantern is the specialty care platform connecting people with the best care when they need it most. By curating a Network of Excellence comprised of the nation's top specialists for surgery, cancer care, infusions and more, Lantern delivers excellent care with significant cost savings to employers and their workforces. Lantern also pairs members with a dedicated care team, including Care Advocates and nurses, for the entirety of their care journey, helping them get back to good health, back to their families and back to work. With convenient access to specialists nationwide, Lantern means quality care is within driving distance for most. Lantern is trusted by the nation's largest employers to deliver care to more than 6 million members across the country. Learn more about us at lanterncare.com.

About You:

You use LOGIC in your decision-making and understand that progress is critical to making change. You focus on the execution of your content while balancing a fast-paced environment and you take the time to celebrate both the small & big wins.
INCLUSION is a core tenet of your personal beliefs. A diverse and inclusive environment is incredibly important to you. You want to be part of a diverse team with different experiences and perspectives, and you cherish the differences in each individual you interact with.
You have the GRIT, drive and ambition to tackle big problems. Big problems require big ideas and a team that supports new ideas.
You care deeply for your customers and are driven to keep HUMANITY in all decisions. Your customers aren't just the individuals using your product. They are the driving factor in your motivation to make a change.
Integrity guides you in life. You focus on the TRUTH rather than giving people the answers they want to hear.
You thrive in a Team Environment. Collaboration is key in innovation and creating change.

These pillars of LIGHT are a reminder to our team that we are making a difference by providing guidance and support in navigating the often complex and confusing landscape of healthcare. We hope that through this LIGHT, individuals can find their way to the best care, resources, and support they need to get back to life.

If this sounds like you, we would love to connect to speak further about career opportunities at Lantern.

Please apply to our role & someone from our Talent Acquisition Team will reach out to help you navigate our interview process.

Lantern is seeking an experienced Senior Site Reliability Engineer to champion the reliability, availability, and performance of our Azure-based healthcare platform. In this pivotal role, you will define and implement SRE practices, drive incident management processes, build observability frameworks, and ensure our systems meet stringent uptime and compliance requirements. You will collaborate with platform engineers, application developers, and security teams to embed reliability into every layer of our infrastructure. This role is ideal for an SRE expert with deep experience in production operations, monitoring, incident response, and automation in cloud environments.

You will work on the Platform Engineering team, partnering with application developers, infrastructure engineers, and security teams to establish SRE best practices across Lantern. Your focus will be on building resilience, reducing toil through automation, and creating a culture of reliability that ensures our healthcare platform delivers consistent, high-quality service to our users.

Location: Hybrid - at least 3 days/wk in our Dallas, TX offices

On-Call: This position requires being on-call 1 week per month

Responsibilities:

Define and track SLOs/SLIs/error budgets for critical healthcare services
Build and maintain observability platforms (monitoring, logging, alerting, tracing) using Datadog and Azure Monitor
Lead incident management processes using Rootly, including on-call rotations, runbooks, and post-incident reviews
Automate operational toil through Infrastructure-as-Code (Terraform) and custom tooling
Design and implement disaster recovery and business continuity strategies
Collaborate with development teams to improve service reliability through architecture reviews and chaos engineering
Optimize system performance, capacity planning, and cost efficiency for Azure infrastructure
Ensure production systems meet HIPAA, SOC 2, and other regulatory requirements
Maintain and improve CI/CD pipelines to support safe, rapid deployments
Mentor junior engineers and foster a culture of reliability and operational excellence

Requirements:

Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field, or equivalent practical experience.
4+ years in SRE, DevOps, or production operations roles
3+ years with Microsoft Azure (AWS/Google Cloud Platform a plus)
Strong experience with observability tools (Datadog, Azure Monitor, Prometheus, Grafana, or similar)
Experience defining and managing SLOs/SLIs and error budgets
Proven incident management and on-call experience (Rootly or similar incident management platforms)
Hands-on with Infrastructure as Code (Terraform) and CI/CD (Azure DevOps, GitHub Actions)
Experience in regulated environments (healthcare/HIPAA preferred)
Strong scripting skills (Python, Bash, PowerShell)
Excellent communication and collaboration skills
If you don't meet every requirement listed, we still encourage you to apply.

Strong Candidates Will Have:

Deep experience with chaos engineering and reliability testing
Experience with Azure Kubernetes Service and containerized workloads
Relevant certifications (Azure, SRE, Kubernetes)

Benefits

Medical Insurance
Dental Insurance
Vision Insurance
Short & Long Term Disability
Life Insurance
401k with company match
Flexible Time Off
Paid Parental Leave

Lantern does not discriminate on the basis of race, sex, color, religion, age, national origin, marital status, disability, veteran status, genetic information, sexual orientation, gender identity or any other reason prohibited by law in provision of employment opportunities and benefits.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

Dice Id: RTX162c0a
Position Id: bb7cf6c3b6bb997ec6116ff8d7d4c171