SRE Operations Engineer with Kubernetes, APIs, WAF, databases, API Proxy (Gloo, APIGEE), Kafka, and Cloud (AWS/Azure/Google Cloud Platform) - Onsite work - Dallas TX or Overland Park KS

Dallas, TX, US • Posted 1 hour ago • Updated 1 hour ago
Contract W2
Contract Independent
Contract Corp To Corp
On-site
Depends on Experience

Job Details

Skills

  • SRE Operations Engineer
  • Kubernetes
  • APIs
  • WAF
  • databases
  • API Proxy
  • Gloo
  • APIGEE
  • Kafka
  • AWS
  • Azure
  • GCP

Summary

C&DE-CMT-SRE Operations Engineer with Kubernetes, APIs, WAF, databases, API Proxy (Gloo, APIGEE), Kafka, and Cloud (AWS/Azure/Google Cloud Platform) - Onsite work - Dallas TX or Overland Park KS

Requisition Name: C&DE-CMT-SRE Operations Engineer

Start Date: 5/11/2026

Duration: 55 Weeks

Services Location: TX/Dallas

Max Rate: $57/hr on W2 or $63/hr on Corp-to-Corp, all-inclusive

Description Of Services:
SRE Operations Engineer

The L1 SRE is the first line of defense in monitoring, triaging, and executing standardized operational tasks for all enterprise applications running on standard patterns and platforms such as Kubernetes, APIs, WAF, databases, API Proxy (Gloo, APIGEE), Kafka, and Cloud (AWS/Azure/Google Cloud Platform). They will follow runbooks, leverage automation, and escalate appropriately to minimize downtime.

Skills

Mandatory Skills (Must-Have)

  1. System & Infrastructure Monitoring

Expectation: Ability to use monitoring dashboards (e.g., Grafana, Datadog, Splunk, Argos, AIOps) to identify anomalies, follow alert workflows, and escalate when thresholds are breached.

Example: When a Kubernetes pod crash-loop is flagged in Prometheus, L1 should validate it against runbooks, check pod logs, and escalate if restart attempts fail.

  2. Runbook Execution

Expectation: Strictly follow documented steps to resolve standard incidents, and escalate when steps do not apply or fail.

Example: Use a provided runbook to restart a failed API proxy service; if error persists beyond documented steps, escalate to L2.

  3. Incident Triage & Communication

Expectation: Perform first-line triage of alerts, gather logs/metrics, categorize severity, and notify stakeholders in clear, concise language.

Example: For a database connection timeout, collect error logs, verify service reachability, and provide a detailed incident note to L2 before escalation.

  4. Kubernetes (Cloud or on-prem) operations knowledge

Expectation: Ability to check pod status, understand logs, and verify service endpoints using kubectl and monitoring tools.

Example: Run kubectl get pods -n <namespace> to verify that deployments are healthy.
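
The same check can be scripted. The sketch below is illustrative only: it assumes the JSON produced by kubectl get pods -o json has already been captured, and the helper name find_unhealthy_pods and the sample pod names are hypothetical.

```python
import json


def find_unhealthy_pods(pod_list):
    """Return (pod name, reason) pairs for pods that do not look healthy."""
    unhealthy = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        phase = status.get("phase", "Unknown")
        # Completed Jobs report "Succeeded"; anything else but "Running" is flagged.
        if phase not in ("Running", "Succeeded"):
            unhealthy.append((name, phase))
            continue
        for container in status.get("containerStatuses", []):
            waiting = container.get("state", {}).get("waiting")
            if waiting:
                # A crash-looping container shows up here as CrashLoopBackOff.
                unhealthy.append((name, waiting.get("reason", "Waiting")))
                break
            if phase == "Running" and not container.get("ready", False):
                unhealthy.append((name, "NotReady"))
                break
    return unhealthy


# Hypothetical sample shaped like `kubectl get pods -o json` output.
sample = json.loads("""
{"items": [
  {"metadata": {"name": "api-7d9f"},
   "status": {"phase": "Running",
              "containerStatuses": [{"ready": true, "state": {"running": {}}}]}},
  {"metadata": {"name": "worker-5c2b"},
   "status": {"phase": "Running",
              "containerStatuses": [{"ready": false,
                                     "state": {"waiting": {"reason": "CrashLoopBackOff"}}}]}}
]}
""")

print(find_unhealthy_pods(sample))  # [('worker-5c2b', 'CrashLoopBackOff')]
```

An empty result suggests the namespace is healthy; anything else is a candidate for the runbook or for escalation.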

  5. Scripting (Python, Bash, PowerShell)

Expectation: Able to read and make small edits to scripts to automate repetitive checks.

Example: Modify a Bash script to include an additional log path in a health check.
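
As a rough illustration of that kind of edit (shown in Python rather than Bash), the sketch below keeps the monitored log files in one list so that adding a path is a one-line change. The paths and the function name check_log_paths are hypothetical, not taken from any real runbook.

```python
import os

# Hypothetical log files inspected by the health check.
LOG_PATHS = [
    "/var/log/app/service.log",
    "/var/log/app/proxy.log",  # the additional log path added by the small edit
]


def check_log_paths(paths):
    """Map each path to 'ok', 'missing', or 'empty'."""
    results = {}
    for path in paths:
        if not os.path.isfile(path):
            results[path] = "missing"
        elif os.path.getsize(path) == 0:
            results[path] = "empty"
        else:
            results[path] = "ok"
    return results


if __name__ == "__main__":
    for path, state in check_log_paths(LOG_PATHS).items():
        print(f"{path}: {state}")
```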

  6. Networking & Security Awareness

Expectation: Understand basic network troubleshooting tools (ping, netstat, curl, traceroute) and know when issues may be related to a firewall, WAF, or proxy.

Example: For an unreachable service, confirm DNS resolution and connectivity before escalating to L2.
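
A minimal sketch of that two-step check, using only the Python standard library. The function names are hypothetical, and a real runbook may prescribe different tools or timeouts.

```python
import socket


def resolve_host(hostname):
    """Return the sorted unique IP addresses for hostname, or [] if DNS fails."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def triage_unreachable(hostname, port):
    """Report which step fails first: DNS resolution or the TCP connection."""
    if not resolve_host(hostname):
        return "DNS resolution failed"
    if not can_connect(hostname, port):
        return "DNS ok, TCP connection failed"
    return "DNS and TCP ok"
```

Knowing which step failed makes the escalation note more useful: a DNS failure points toward name resolution, while a refused or timed-out connection points toward the service, a firewall, a WAF, or a proxy.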

  7. Documentation & Knowledge Capture

Expectation: Accurately record steps taken during incidents, and suggest runbook updates where gaps exist.

Example: After handling an alert for disk usage, note missing cleanup steps in the runbook and flag them for update.

Preferred Skills (Nice-to-Have)

  1. Cloud Platform Familiarity (AWS, Azure, Google Cloud Platform)

Expectation: Understand the basics of cloud services (VMs, load balancers, storage) and how to navigate a cloud console.

Example: Use AWS Console to check EC2 instance health status when a service alert is triggered.

  2. Database Basics (SQL/NoSQL)

Expectation: Run simple queries to validate DB connectivity and health.

Example: Execute SELECT 1; to verify a database is reachable.
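
A sketch of how that probe might be wrapped in a script. It uses an in-memory SQLite database from the Python standard library purely as a stand-in; a production database would need its own driver and connection settings, and the function name database_is_reachable is hypothetical.

```python
import sqlite3


def database_is_reachable(conn):
    """Run SELECT 1 and report whether the expected single row came back."""
    try:
        cursor = conn.cursor()
        cursor.execute("SELECT 1")
        row = cursor.fetchone()
    except sqlite3.Error:
        # Any driver-level failure (closed connection, I/O error) counts as unreachable.
        return False
    return row == (1,)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # stand-in for the real database
    print(database_is_reachable(connection))
    connection.close()
```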

  3. Automation & Self-Service Mindset

Expectation: Identify repetitive manual steps and propose candidates for automation.

Example: Flag that manual log collection during outages could be replaced with a script.
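
One possible shape for such a script is sketched below. The function name collect_logs is hypothetical, and the set of files to gather would come from the relevant runbook.

```python
import tarfile
from pathlib import Path


def collect_logs(log_paths, archive_path):
    """Bundle the existing files in log_paths into a gzip-compressed tar archive.

    Returns the paths that were skipped because they do not exist.
    """
    skipped = []
    with tarfile.open(str(archive_path), "w:gz") as archive:
        for raw in log_paths:
            path = Path(raw)
            if path.is_file():
                # Store only the file name so the archive has a flat layout.
                archive.add(str(path), arcname=path.name)
            else:
                skipped.append(str(raw))
    return skipped
```

Returning the skipped paths lets the engineer record in the incident note which logs were unavailable instead of silently omitting them.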

  4. Exposure to Incident Management Tools (xMatters, ServiceNow, Jira, etc.)

Expectation: Comfortable working within ITSM/incident workflows.

Example: Log incident details in ServiceNow with accurate categorization and timestamps.

  5. AI/Chatbot-Assisted Ops (emerging skill)

Expectation: Use AI assistants to search runbooks or suggest remediation steps.

Example: Ask an AI ops assistant to summarize logs before escalation.

Qualifications

2-5 years in an IT operations, NOC, or SRE/DevOps engineer role.

Kubernetes 101, Linux 101, Networking 101

Understanding of cloud-ready applications

Understanding of observability tools (Prometheus, Grafana, ELK, Splunk, etc.).

Strong troubleshooting mindset and ability to follow structured workflows, e.g., 5 Whys and Fishbone diagrams.

Deliverables:
  • Monitor system health, alerts, dashboards, and logs across cloud and on-prem infrastructure.
  • Isolate functional issues in the application versus the platform.
  • Execute standardized runbooks for incident resolution, deployments, and routine tasks.
  • Perform initial triage of incidents and escalate to L2/L2+ as needed to mitigate the issue and get to a bypass.
  • Document new issues, gaps in runbooks, and automation opportunities.
  • Provide excellent communication to stakeholders during incidents.
  • Support onboarding of new applications into the operations framework.

  • Dice Id: 10370547
  • Position Id: 8964643

Company Info

About Vinsari LLC

Vinsari is a leading IT solution provider with vast experience in designing, developing and implementing mission-critical solutions. Vinsari has been successful in helping clients achieve their strategic objectives including: 
• Improving company revenues and cutting costs by optimization of workflow 
• Reaching operational excellence 
• Developing ideas into viable solutions 
• Increasing customer satisfaction 

In addition to being a premier provider of solutions for complex information technology issues faced by businesses today, Vinsari's solutions include the following:
• Application Development and Management 
• Web and User Experience Design 
• Quality Assurance 
• Database Administration 
• Data Warehousing and Business Intelligence 
• Infrastructure Management 
• Professional IT Staffing 
• Virtualization 
• Information Security 

We view our business as successful only when our clients are successful. We regard it as a privilege to serve our customers and are committed to doing whatever it takes to ensure they are 100 percent satisfied. All of us at Vinsari gauge our success by our customers' success. We judge ourselves individually and collectively based on how our customers are fulfilling their management objectives. 
