Job Title: DevOps Engineer
Location: Rockville, MD or Tysons Corner, VA
Duration: 6+ months
The DevOps/Infrastructure Engineer builds and maintains cloud infrastructure and deployment pipelines for a generative AI platform. This role is responsible for environment provisioning, container orchestration, CI/CD automation, and infrastructure scaling to support a microservice-based architecture. The DevOps/Infrastructure Engineer takes a broad perspective on infrastructure problems and exercises independent judgment in selecting techniques and evaluation criteria to obtain results.
Infrastructure & Cloud Architecture
Provision and manage AWS cloud environments across development, staging, and production tiers
Deploy and operate containerized services and tool servers within a microservice architecture
Manage container orchestration with blue-green and canary deployment strategies
Support infrastructure for platform dependencies including search engines, caching layers, and relational databases
Exercise independent judgment in evaluating and selecting infrastructure approaches and tooling
Build and manage infrastructure on AWS, leveraging services such as ECS, ECR, IAM, VPC, S3, and CloudWatch
DevOps & CI/CD
Maintain and extend CI/CD pipelines using shared library patterns for service builds, testing, and deployment across a microservice fleet
Collaborate with developers on container optimization, startup configuration, and environment management
Identify gaps between system components and their designs, and deliver solutions that enable team autonomy
Analyze infrastructure trends and DevOps best practices to develop actionable insights, and communicate recommendations to management
Monitoring, Scaling & Reliability
Implement infrastructure monitoring, alerting, and scaling strategies for AI workloads and supporting services
Ensure platform reliability through capacity planning, performance testing, and incident response processes
Optimize resource utilization and cost across the cloud environment
Quality & Testing
Implement and maintain automated infrastructure testing including smoke tests, health checks, and deployment verification
Design and execute load testing and performance benchmarking for platform services and AI workloads
Ensure CI/CD pipelines enforce quality gates including linting, security scanning, and test execution before deployment
Validate infrastructure-as-code changes through automated testing and peer review processes
Mentorship & Collaboration
Guide team members in infrastructure best practices, deployment patterns, and operational procedures
Partner within and across teams to meet shared goals and priorities around platform stability and delivery velocity
Champion collaborative resolution of infrastructure issues and contribute to internal process improvement initiatives
Security & Compliance
Assist with adherence to technology policies and comply with all security controls
Implement secure coding practices, particularly when handling personally identifiable information (PII) and sensitive regulatory data
Participate in threat modeling and security discussions for API and infrastructure components
Understand and apply CLIENT's security standards and best practices for regulated financial environments