Role: AI Engineer Level III
Location: Washington, DC (Onsite)
Position Summary
As a senior AI Engineer, you will architect and lead the delivery of scalable, secure GenAI systems with enterprise-grade performance. Your focus will include RAG pipelines, agentic orchestration, and cloud-native ML infrastructure across Azure and AWS. You'll own solution architecture, direct engineering execution, and align technical delivery to strategic business outcomes.
Key Responsibilities
AI Solution Architecture & Delivery
- Lead end-to-end design of RAG pipelines using Azure AI Search and vector stores and indexes (Redis, FAISS, HNSW).
- Deliver multi-turn, retrieval-grounded conversational systems with robust prompt lifecycle management, guardrails, and telemetry.
- Drive integration of multi-modal LLMs (Azure OpenAI, Claude, Llama, OSS models) with dynamic model routing for cost and safety.
AI Infrastructure Leadership
- Architect and deploy Model Context Protocol (MCP) servers with RBAC, versioning, audit logging, validation, and rate limiting.
- Build policy-compliant agent ecosystems using Azure AI Agent Service: registry, broker, telemetry, governance enforcement.
- Manage high-throughput inference pipelines using Azure Batch and distributed AI data flows with AWS EMR.
Enterprise Data & Feature Pipelines
- Oversee RAG data ingestion and enrichment: doc normalization, PII redaction, metadata tagging, SLA/SLO monitoring, lineage.
- Lead vectorization workflows with drift monitoring and quality gates.
- Architect and optimize Azure Data Factory, Databricks, and AWS EMR data engineering for scalable AI features.
Agentic AI Systems
- Engineer and govern secure tool-calling and multi-agent orchestration using Semantic Kernel, AutoGen, Microsoft Agent Framework, CrewAI, Agno, and LangChain.
- Enforce MCP-based controls for heterogeneous agents across runtimes, ensuring safety and traceability.
Model Operations & Governance
- Evaluate, fine-tune, and optimize models for quality, safety, cost, and latency using A/B and offline evaluation suites.
- Define CI/CD pipelines for AI workloads including automated tests, scans, safety tools, and trace logging.
- Ensure security posture of AI/LLM workloads via threat modeling and secure software practices.
Engineering & Leadership Core
- Strong CS fundamentals: distributed systems, concurrency, networking, complexity.
- Expert-level SDLC: clean architecture, SOLID, layered testing, DevSecOps.
- Secure AI app development: sandboxed tools, secrets hygiene, RBAC.
- Performance engineering: latency profiling, cost tuning (token, embedding, GPU), vector DB indexing.
- Lead agile ceremonies, cross-functional delivery, and roadmap execution with RACI clarity.
Cloud AI Tech Stack
Azure: Azure OpenAI, Azure AI Search, AML, AKS, Azure Batch, ADF, Azure Databricks, Azure Functions, API Management, Key Vault, App Insights
AWS: SageMaker, Bedrock, Lambda, API Gateway, Comprehend, S3, CloudWatch, EMR, EKS
Vector DBs: Azure AI Search, Redis, FAISS/HNSW
Frameworks: Semantic Kernel, AutoGen, Microsoft Agent Framework, CrewAI, Agno, LangChain
Inference: Docker/Ollama, vLLM, Triton, quantized Llama (GGUF), edge inference, GPU provisioning
Qualifications
Education: Bachelor's in CS, Engineering, or a related field; Master's preferred
Experience: 8+ years in software engineering, 2+ years in applied GenAI (RAG, agent systems, model safety/eval)
Required Skills/Abilities:
- GenAI architecture mastery: RAG, vector DBs, embeddings, transformer internals, multi-modal pipelines.
- Agentic systems: Azure AI Agent Service patterns, MCP servers, registry/broker/governance, secure tool-calling.
- Languages: C#/.NET and Python (production-grade), plus TypeScript for service/UI work when needed.
- Azure & AWS services (see Cloud AI Tech Stack) with hands-on implementation and operations.
- Model ops: eval suites, safety tooling, fine-tuning, guardrails, traceability.
- Business & delivery: solution architecture, stakeholder alignment, roadmap planning, measurable impact.
Desired Skills/Abilities (not required but a plus):
- LangChain, Hugging Face, MLflow; Kubernetes + GPU scheduling; vector search tuning (HNSW/IVF).
- Responsible AI: policy mapping, red-team playbooks, incident response for AI.
- Hybrid/multi-cloud deployments using Azure Arc and AWS Outposts; CI/CD for AI workloads across Azure DevOps and AWS CodePipeline.
Certifications (Required)
- Azure AI Fundamentals (AI-900) and Azure Data Fundamentals (DP-900)
- Responsible AI Certifications
- AWS Certified Machine Learning - Specialty
- TensorFlow Developer Certificate
- Kubernetes CKA or CKAD
- SAFe Agile Software Engineering
Certifications (Preferred)
- Azure AI Engineer Associate (AI-102)
- Azure Data Scientist Associate (DP-100)
- Azure Solutions Architect Expert (AZ-305)
- Azure Developer Associate (AZ-204)
Ready to lead AI at scale? Apply now and architect the future of enterprise intelligence.