DevOps Engineer

Overview

On Site
Depends on Experience
Full Time

Skills

scripting languages (Python, Bash)
cloud provider (AWS, Azure, or GCP)
CI/CD pipelines
containerization and orchestration (Docker, Kubernetes)

Job Details

Sr. DevOps Engineer

Job Type: Full-time employment

Location: Onsite, McLean, VA

Job Description:

Looking for a strong DevOps engineer; AWS experience is highly preferred.

The client provides AI-driven solutions specifically for revenue cycle management in the dental and behavioural health sectors. Their platform uses AI agents to automate tasks like billing, collections, and accounting, aiming to improve efficiency, reduce errors, and boost cash flow. They emphasize easy integration with existing systems and offer support to their clients. The goal is to free up healthcare finance teams to focus on other important areas.

Role Overview:

We are seeking a highly experienced and skilled Infrastructure & Site Reliability Engineer to join our team and take full ownership of the infrastructure, site reliability, and the entire production system for our cutting-edge Agentic AI Platform. You will be responsible for designing, building, maintaining, and scaling our critical systems, ensuring their reliability, performance, security, and cost-efficiency. This role requires a deep understanding of system architecture, automation, and a proactive approach to preventing and resolving production issues.

Responsibilities:

  • Own the design, implementation, and management of scalable, reliable, and secure cloud infrastructure across the entire production environment on platforms like AWS, Azure, or Google Cloud Platform.
  • Be responsible for the overall site reliability and performance of the platform, implementing SLOs/SLAs and ensuring high availability.
  • Develop and maintain robust CI/CD pipelines for automated building, testing, and deployment of our AI platform components.
  • Implement and manage infrastructure as code (IaC) using tools like Terraform or CloudFormation.
  • Design, set up, and maintain comprehensive monitoring, logging, alerting, and tracing systems to gain deep visibility into system health and performance.
  • Proactively identify and resolve complex infrastructure and production issues, often before they impact users.

Qualifications:

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
  • Minimum of 7 years of professional experience in Infrastructure Engineering, Site Reliability Engineering (SRE), DevOps, or a related role with significant production system ownership.
  • Extensive experience designing, building, and managing infrastructure on at least one major cloud provider (AWS, Azure, or Google Cloud Platform).
  • Proven experience with infrastructure as code tools (Terraform, CloudFormation, etc.).
  • Strong experience designing and implementing robust CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions, CircleCI, etc.).
  • Deep experience with containerization and orchestration (Docker, Kubernetes).
  • Proficiency in scripting languages (Python, Bash).
  • Extensive experience with monitoring, logging, alerting, and tracing tools (Prometheus, Grafana, ELK stack, Datadog, New Relic, etc.).

About eSolutionsFirst, LLC