Role Overview
The client is seeking an AWS AI Engineer to design, deploy, and manage scalable AI/ML and Kubernetes-based solutions in a production environment. This role focuses on EKS operations, LLMOps, and building robust AI platforms with governance and performance optimization. The position is open to candidates within a 100-mile radius of Herndon, VA.
Key Responsibilities
Deploy, manage, and troubleshoot Kubernetes clusters, including disconnected installations
Lead EKS cluster deployment, upgrades, and production support activities
Implement and manage LLMOps workflows, including monitoring and scaling generative AI systems
Build and maintain agentic workflows using frameworks such as LangChain, CrewAI, or AutoGen
Design and optimize RAG pipelines with vector databases for low-latency performance
Implement AI governance practices, including guardrails and cost optimization strategies
Develop Internal Developer Platforms (IDP) for AI experimentation and deployment
Required Skills (Years of Experience)
Kubernetes Administration & Troubleshooting: 5+ years
AWS EKS Deployment & Operations: 5+ years
LLMOps / Generative AI Systems: 4+ years
Agentic Frameworks (LangChain, CrewAI, AutoGen): 3+ years
Vector Databases & RAG Optimization: 3+ years
AI Governance & Security Guardrails: 3+ years
Preferred Skills
Internal Developer Platform (IDP) Development for AI
Cost Optimization Strategies for LLM Consumption
Production-level AI System Monitoring and Scaling
Preferred Certifications
AWS Certified Solutions Architect (Active Certification Required)
AWS Certified AI Practitioner or an AWS machine learning specialty certification
Certified Kubernetes Administrator (CKA) (Active Certification Required)