Site Reliability Engineer || Remote || Contract

Overview

Remote

Depends on Experience

Contract - W2

Contract - Independent

Skills

Agile

Amazon Web Services

AppDynamics

Build Automation

Cloud Computing

Collaboration

Communication

Conflict Resolution

Continuous Delivery

Continuous Integration

Debugging

DevOps

Good Clinical Practice

Google Cloud Platform

High Availability

Incident Management

Java

Kubernetes

Management

Microservices

Microsoft Azure

Operational Efficiency

Performance Monitoring

Problem Solving

Python

Reliability Engineering

Root Cause Analysis

Scalability

Software Architecture

Software Engineering

Splunk

Telecommunications

Terraform

Job Details

Role: Site Reliability Engineer

Experience: 8 Years

Location: Remote

Job Type: Contract

Key Responsibilities:

Develop and maintain reliable, scalable, and secure systems in Java, Go, and Python.
Design, implement, and manage Kubernetes clusters and associated microservices.
Build automation and monitoring tools to enhance system reliability and operational efficiency.
Use observability tools such as Splunk and AppDynamics to detect and resolve incidents proactively.
Collaborate with development and operations teams to ensure end-to-end system reliability.
Perform root cause analysis, contribute to postmortems, and implement long-term fixes.
Participate in on-call rotations and drive incident response and resolution.
Support application deployment and integration in a large-scale telecom environment.

Required Skills & Qualifications:

5+ years of experience as a Site Reliability Engineer.
Strong proficiency in Java and Go (must-have); experience with Python is a plus.
Hands-on experience with Kubernetes, CI/CD, containerization, and service mesh.
Expertise in observability tools: Splunk, AppDynamics, or similar platforms.
Solid background in software engineering and application architecture.
Experience in application performance monitoring, scalability, and high availability.
Knowledge of telecom systems and domain-specific challenges is highly preferred.
Strong problem-solving and debugging skills across distributed systems.
Excellent communication and collaboration abilities in a remote work environment.

Nice to Have:

Experience with cloud platforms (AWS, Google Cloud Platform, or Azure).
Familiarity with Infrastructure as Code (Terraform, Helm).
Exposure to agile and DevOps best practices.
Experience with telecom service environments.

