AI-Driven SRE / DevOps Engineer – AWS, Bedrock, Kubernetes

Hybrid in Ashburn, VA, US • Posted 1 day ago • Updated 1 day ago
Full Time
No Travel Required
Able to Sponsor
Hybrid
Depends on Experience

Job Details

Skills

  • AI/ML
  • LLM
  • DevSecOps

Summary

Job Title

AI-Driven SRE / DevOps Engineer – AWS, Bedrock, Kubernetes

Location

Ashburn, VA (Hybrid / Onsite)

Job Type

Contract / C2C / W2

Experience

5+ Years


Job Description

We are seeking a Senior AI-Driven Site Reliability Engineer (SRE) / DevOps Engineer to design and build intelligent automation systems that improve platform reliability and operational efficiency.

The ideal candidate will combine cloud infrastructure expertise with AI-powered automation, enabling faster incident detection, root cause analysis, and automated remediation for production systems.


Key Responsibilities

AI-Driven SRE Automation

  • Design and develop AI agents for incident detection, root cause analysis (RCA), auto-remediation, and post-incident reporting.

  • Integrate LLM capabilities using Amazon Bedrock and Claude for:

    • Log summarization

    • Anomaly detection

    • Intelligent alert correlation

    • Change impact analysis

    • Runbook automation

AI Developer Tooling

  • Implement AI-assisted developer workflows using:

    • Cursor

    • GitHub Copilot

  • Build internal AI-enhanced platforms to support SRE and DevOps teams.

Cloud Infrastructure

  • Deploy and manage cloud infrastructure using Infrastructure-as-Code tools such as Terraform and AWS CloudFormation.

  • Manage scalable cloud environments on Amazon Web Services.

Reliability & Observability

  • Implement AI-powered observability pipelines.

  • Develop self-healing infrastructure frameworks.

  • Reduce MTTR using predictive and generative AI models.

  • Work with monitoring platforms such as Prometheus, Grafana, and Amazon CloudWatch.


Required Qualifications

  • 5+ years of experience in Site Reliability Engineering / DevOps

  • 2+ years of experience supporting AWS production workloads

  • Hands-on experience with:

    • Amazon Bedrock

    • AI agent frameworks such as LangChain

    • Programming with Python, Go, or TypeScript

  • Strong experience with:

    • Kubernetes (preferably Amazon EKS)

    • CI/CD tools such as GitHub Actions or Jenkins

    • Monitoring tools (Prometheus, Grafana, CloudWatch)

  • Background in platform engineering

  • Experience building automation frameworks for infrastructure operations


Preferred Qualifications

  • Experience building AI agents for infrastructure automation

  • Exposure to Generative AI and LLM-based workflows

  • Experience designing observability and reliability platforms


Contact Information

Vamshi Vanam

  • Dice Id: 91171479
  • Position Id: AT519

Company Info

About Astratek

At Astratek, we provide cutting-edge IT services, including AI, cybersecurity, risk management, strategic planning, data analytics and data transmission, to drive your business forward.
