SRE Engineer / DevOps SRE Engineer

Overview

On Site

Contract - W2

Contract - 31 day(s)

Skills

DevOps

Terraform

Ansible

Jenkins

Docker

Kubernetes

Automation

Infrastructure as Code (IaC)

Architecture

Ruby

Cloud Computing

Quality Management

Metrics

Python (Programming Language)

Communication Skills

Microsoft Azure

Stakeholder Management

Team Working

Amazon Web Services

Analytical Thinking

Computer Programming

Java (Programming Language)

Software Engineering

Testing Skills

Problem Solving

Reliability

Continuous Integration

Administration of Computer Systems

Containerisation

Networking Skills

Perseverance

Self Motivation

Backlogs

C++ (Programming Language)

Incident Response

Instrumentation

Reliability Engineering

Software Requirements Analysis

Spinnaker

Job Details

Job Role: SRE Engineer / DevOps SRE Engineer
Location: Dallas, Texas (Hybrid)
Duration: FTE
Experience: 10+ Years

Job Description
Job Summary
The Site Reliability Engineer (SRE) role bridges software engineering and systems administration. Beyond ensuring the reliability and performance of platforms, the role also focuses on working with Development and Architecture teams to address:

quality (gates and measurement criteria)
foundational architecture and stack components
metrics, trackers, and baselines
automated operations

Key Responsibilities and Skills of an SRE:

Automation - Automate tasks (scripts, triggers, and workflow automations) for deployment, monitoring, and incident response to improve efficiency and reduce manual effort.
Monitoring and Observability - Design instrumentation, identify KPIs/metrics, and define events/alerting to track system health and identify potential issues proactively.
Incident Response - Respond to and resolve incidents that have exceeded L1/L2 thresholds. Work with L3 teams to ensure minimal downtime and a quick return to normal operations, and identify and follow up on problem backlogs and shift-left initiatives.
Infrastructure as Code (IaC) - Use tools like Terraform or Ansible to manage infrastructure as code, enabling repeatable and scalable deployments.
Collaboration - Work closely with architecture, development, QA and Testing, and Operations teams to understand system requirements and contribute to the overall resilience of the software/platform.
Problem-Solving - Apply strong analytical and problem-solving skills to diagnose and resolve complex issues.
Communication - Communicate effectively with both technical and non-technical stakeholders, translating technical details into actionable insights.
Soft Skills - Work well in a team, manage time effectively, and be proactive in identifying and addressing potential problems.

Technical Skills:

Programming - Experience with languages such as Python, Java, C/C++, or Ruby is beneficial, along with IaC tooling (Ansible, Terraform, and cloud-native options).
Cloud Platforms - Knowledge of cloud platforms like AWS, Azure, or Google Cloud Platform is highly valued.
Containerization - Familiarity with container technologies like Docker and Kubernetes is essential.
Networking and System Administration - Strong understanding of networking and system administration principles is crucial.
CI/CD - Experience with CI/CD tools like Jenkins, Harness, or Spinnaker is valuable.

About DRC Systems USA LLC