SRE Operations Engineer

Dallas, TX, US • Posted 3 hours ago • Updated 3 hours ago

Contract W2

Contract Independent

No Travel Required

On-site

$52 - $58/hr

Job Details

Skills

API
Amazon EC2
Amazon Web Services
Apache Kafka
Artificial Intelligence
Bash
Cloud Computing
Computer Networking
Dashboard
Data Link Layer
DevOps
Good Clinical Practice
Google Cloud Platform
Incident Management
Kubernetes
Grafana
Firewall
Documentation
JIRA
Physical Layer
IT Service Management
Microsoft Azure
Network Security
NOC
NoSQL
Python
Proxies
Scripting
Windows PowerShell
ServiceNow
SQL
Storage
Splunk
Linux
IT Operations
Database
DNS
Virtual Machines
Workflow
WAF

Summary

L1 SRE Operations Engineer in Dallas, TX. The start date is 5/11 and the contract will last 1 year. This is a temp-to-perm role.

The role can be based in Dallas, TX or Overland Park, KS.

Pay rate is $52.00-$58.00/hr.

Skills (EXPERT/ADVANCED/NONE):
System & Infrastructure Monitoring
Runbook Execution
Incident Triage & Communication
Kubernetes (cloud or on-prem) operations knowledge
Scripting (Python, Bash, PowerShell)
Networking & Security Awareness
Documentation & Knowledge Capture

Questions (must reply YES to ALL):
Do you have System & Infrastructure Monitoring experience?
Do you have Runbook Execution experience?

Description:
SRE Operations Engineer

The L1 SRE is the first line of defense in monitoring, triaging, and executing standardized operational tasks for all enterprise applications running on standard patterns and platforms like Kubernetes, APIs, WAF, databases, API Proxy (Gloo, APIGEE), Kafka, and Cloud (AWS/Azure/Google Cloud Platform). They will follow runbooks, leverage automation, and escalate appropriately to minimize downtime.

Skills
Mandatory Skills (Must-Have)
1. System & Infrastructure Monitoring
Expectation: Ability to use monitoring dashboards (e.g., Grafana, Datadog, Splunk, Argos, AIOps) to identify anomalies, follow alert workflows, and escalate when thresholds are breached.
Example: When a Kubernetes pod crash-loop is flagged in Prometheus, L1 should validate it against runbooks, check pod logs, and escalate if restart attempts fail.
2. Runbook Execution
Expectation: Strictly follow documented steps to resolve standard incidents, and escalate when steps do not apply or fail.
Example: Use a provided runbook to restart a failed API proxy service; if the error persists beyond the documented steps, escalate to L2.
3. Incident Triage & Communication
Expectation: Perform first-line triage of alerts, gather logs/metrics, categorize severity, and notify stakeholders in clear, concise language.
Example: For a database connection timeout, collect error logs, verify service reachability, and provide a detailed incident note to L2 before escalation.
4. Kubernetes (cloud or on-prem) operations knowledge
Expectation: Ability to check pod status, understand logs, and verify service endpoints using kubectl and monitoring tools.
Example: Run kubectl get pods -n <namespace> to verify that deployments are healthy.
5. Scripting (Python, Bash, PowerShell)
Expectation: Able to read and make small edits to scripts to automate repetitive checks.
Example: Modify a Bash script to include an additional log path in a health check.
6. Networking & Security Awareness
Expectation: Understand basic troubleshooting tools (ping, netstat, curl, traceroute) and know when issues may be related to a firewall, WAF, or proxy.
Example: For an unreachable service, confirm DNS resolution and connectivity before escalating to L2.
7. Documentation & Knowledge Capture
Expectation: Accurately record steps taken during incidents, and suggest runbook updates where gaps exist.
Example: After handling an alert for disk usage, note missing cleanup steps in the runbook and flag them for update.
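
To make the Kubernetes and networking examples above more concrete, here is a minimal Python sketch of two first-line checks. It is illustrative only and is not the employer's tooling. The function names and the restart threshold are hypothetical. The first function assumes the parsed JSON output of kubectl get pods -n <namespace> -o json. The second confirms DNS resolution and then TCP connectivity for a single endpoint.

```python
from __future__ import annotations

import socket

# Hypothetical threshold: the posting does not say how many restarts
# should trigger escalation to L2.
RESTART_ESCALATION_THRESHOLD = 5


def crash_looping_pods(pod_list: dict) -> list[dict]:
    """List containers stuck in CrashLoopBackOff.

    `pod_list` is the parsed JSON from
    `kubectl get pods -n <namespace> -o json`.
    """
    findings = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        for status in pod.get("status", {}).get("containerStatuses", []):
            waiting = status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                restarts = status.get("restartCount", 0)
                findings.append({
                    "pod": name,
                    "container": status["name"],
                    "restarts": restarts,
                    "escalate": restarts >= RESTART_ESCALATION_THRESHOLD,
                })
    return findings


def check_reachability(host: str, port: int, timeout: float = 3.0) -> dict:
    """Confirm DNS resolution, then TCP connectivity, for one endpoint."""
    result = {"host": host, "port": port, "dns_ok": False, "tcp_ok": False}
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["dns_ok"] = True
    except socket.gaierror:
        return result  # name does not resolve, so skip the connect attempt
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["tcp_ok"] = True
    except OSError:
        pass  # resolved but unreachable: possibly a firewall, WAF, or proxy
    return result
```

A result with dns_ok true and tcp_ok false is the kind of detail worth putting in the incident note before escalating to L2, since it points toward a firewall, WAF, or proxy rather than DNS.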

Preferred Skills (Nice-to-Have)
1. Cloud Platform Familiarity (AWS, Azure, Google Cloud Platform)
Expectation: Understand basics of cloud services (VMs, load balancers, storage) and how to navigate a cloud console.
Example: Use AWS Console to check EC2 instance health status when a service alert is triggered.
2. Database Basics (SQL/NoSQL)
Expectation: Run simple queries to validate DB connectivity and health.
Example: Execute SELECT 1; to verify a database is reachable.
3. Automation & Self-Service Mindset
Expectation: Identify repetitive manual steps and propose candidates for automation.
Example: Flag that manual log collection during outages could be replaced with a script.
4. Exposure to Incident Management Tools (xMatters, ServiceNow, Jira, etc.)
Expectation: Comfortable working within ITSM/incident workflows.
Example: Log incident details in ServiceNow with accurate categorization and timestamps.
5. AI/Chatbot-Assisted Ops (emerging skill)
Expectation: Use AI assistants to search runbooks or suggest remediation steps.
Example: Ask an AI ops assistant to summarize logs before escalation.
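
As a rough illustration of the SELECT 1 check in the Database Basics item, the sketch below wraps it in a small Python function. The function name is hypothetical, and sqlite3 stands in for whatever database driver the real environment uses. Any driver that follows Python's DB-API conventions (connection, cursor, execute, fetchone) should work the same way.

```python
import sqlite3
from typing import Callable


def database_reachable(connect: Callable[[], object]) -> bool:
    """Open a connection, run SELECT 1, and report whether it returned 1."""
    try:
        conn = connect()
    except Exception:
        return False  # could not connect at all
    try:
        cur = conn.cursor()
        cur.execute("SELECT 1")
        row = cur.fetchone()
        return row is not None and row[0] == 1
    except Exception:
        return False  # connected, but the query failed
    finally:
        conn.close()


if __name__ == "__main__":
    # In-memory SQLite is used here only so the example runs anywhere.
    print(database_reachable(lambda: sqlite3.connect(":memory:")))  # True
```

Passing the connection factory in as an argument keeps the check independent of any one database, so the same function could sit behind a runbook step for several services.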

Qualifications
2–5 years in an IT operations, NOC, or SRE/DevOps engineering role.
Kubernetes 101, Linux 101, Networking 101
Understanding of cloud-ready applications
Understanding of observability tools (Prometheus, Grafana, ELK, Splunk, etc.).
Strong troubleshooting mindset and ability to follow structured workflows, e.g., 5 Whys and Fishbone diagrams.

Dice Id: csema
Position Id: bh19038
