Site Reliability Engineer

Blue Bell, PA, US • Posted 2 days ago • Updated 8 hours ago

Full Time

On-site

Job Details

Skills

IT Security
Scalability
Operational Excellence
Problem Solving
Conflict Resolution
Provisioning
Dashboard
Production Support
DevOps
Software Engineering
Scripting
FOCUS
Terraform
Ansible
Cloud Computing
Amazon Web Services
Google Cloud
Google Cloud Platform
Kubernetes
Scripting Language
Python
Bash
Software Development
Change Management
Dynatrace
Continuous Integration
Continuous Delivery
Incident Management
Communication
Collaboration
Artificial Intelligence
Documentation
Audio Engineering
Apache Kafka
Java
Customer Facing
JIRA
Git
Workflow
Auditing

Job Description

Locations and Workstyle:

Blue Bell, PA: Primarily remote; candidates should be within commuting distance of the Blue Bell office and able to work onsite as needed. Option to come onsite more frequently if desired.
Irving, TX and Boca Raton, FL: Hybrid schedule - onsite a minimum of four days per week, with one remote day. Five days onsite may be required based on business needs.

What You'll Do:

Work closely with Infrastructure and Development teams to keep the ADT platform running and customers protected, while collaborating with cross-functional partners (IT, Security, DevOps, Engineering) to improve operational health and apply SRE best practices
Support the reliability, availability, scalability, and performance of large-scale distributed systems
Drive operational excellence through problem-solving, performance improvements, and resilient production environments
Use tools such as Terraform, Ansible, Kubernetes, and Dynatrace to support mission-critical applications
Work within cloud environments (AWS, Google Cloud Platform) and Kubernetes-based infrastructure, receiving guidance on complex design decisions
Identify performance bottlenecks and reliability gaps, and implement improvements
Build and maintain infrastructure as code (Terraform, Ansible) for provisioning, configuration, patching, and releases
Contribute to observability and monitoring (Dynatrace, Prometheus), including dashboards, alerts, runbooks, and tuning
Support software releases, including validation, rollback planning, and post-change verification across ADT+ and legacy platforms
Provide production support, including on-call participation, incident response, remediation follow-through, and support for customer-impacting issues during major incidents

What You'll Need:

3+ years of experience in SRE, DevOps, platform engineering, software engineering, or related roles with production and on-call responsibility
Background in systems or operations with progression toward engineering work (automation, scripting, IaC, observability)
Focus on production operations and reliability for distributed applications
Experience with infrastructure as code (Terraform, Ansible), including building and maintaining environments
Experience working in cloud environments (AWS and/or Google Cloud Platform)
Familiarity with Kubernetes in production environments
Proficiency in at least one programming or scripting language (Python, Java, Bash, or similar), including working with existing codebases
Understanding of software development and change management practices
Experience with monitoring and observability tools (Dynatrace, Prometheus, or similar)
Ability to diagnose and resolve production issues with sound judgment around risk, rollback, and escalation
Experience with CI/CD pipelines and automation tools
Familiarity with incident response and post-incident follow-up
Strong communication skills and ability to collaborate across teams
Comfortable learning complex systems and seeking guidance when needed
Comfortable using AI tools to accelerate investigation, automation, and documentation while maintaining sound engineering judgment

Preferred Qualifications:

Experience with Kafka, Java/JVM ecosystems, or large customer-facing platforms
Experience with security remediation at scale (patch SLAs, CVE response, OS upgrades)
Experience working with Jira-driven workflows and cross-team escalation
Familiarity with Harness, enterprise Git workflows, and audit-driven change controls

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

Dice Id: 10507796
Position Id: 3094e4fc0902f6f1fa2057c281860d07

More jobs at ADT LLC in Blue Bell, PA