Sr AI Platform Engineer - Sr Director Level (must be within commuting distance to Centreville, VA)

Hybrid in Centreville, VA, US • Posted 3 hours ago • Updated 3 hours ago
Full Time
No Travel Required
On-site
Depends on Experience

Job Details

Skills

  • API
  • Access Control
  • Amazon Web Services
  • Artificial Intelligence
  • Budget
  • Cloud Computing
  • Collaboration
  • Communication
  • Computer Science
  • Continuous Delivery
  • Continuous Integration
  • Data Security
  • Design Controls
  • Docker
  • Evaluation
  • FISMA
  • FedRAMP
  • Good Clinical Practice
  • Google Cloud Platform
  • Hardening
  • ISO 9000
  • IT Management
  • Incident Management
  • Kubernetes
  • Leadership
  • LlamaIndex
  • Legal
  • Machine Learning (ML)
  • Management
  • Mentorship
  • Microsoft Azure
  • Microsoft Certified Professional
  • NIST 800-53
  • NIST SP 800 Series
  • OCI
  • Optimization
  • Oracle Cloud
  • Orchestration
  • Partnership
  • Privacy
  • Prompt Engineering
  • Python
  • RMF
  • Reasoning
  • Regulatory Compliance
  • Reliability Engineering
  • Return On Investment
  • Risk Management Framework
  • Root Cause Analysis
  • Routing
  • SAFE
  • Scalability
  • Semantics
  • Stacks Blockchain
  • System Administration
  • System Integration Testing
  • System On A Chip
  • Systems Design
  • Terraform
  • Use Cases
  • Vector Databases
  • Workflow
  • AI Platform
  • AI
  • Sr Director
  • Director

Summary

Sr AI Platform Engineer (Sr Director Level)

Type: W2 with benefits or W2 hourly. No C2C; no relocation available.

Location: Centreville, VA - Hybrid

Senior AI Platform Engineer (Sr Director Level)

Agentic AI Enterprise Infrastructure, Governance & Observability

Top Technical Skills:

  1. + years of software/platform engineering experience, including 2+ years building and operating LLM-based or AI/ML systems in production.
  2. Strong programming skills in Python.
  3. Hands-on experience building agentic systems or LLM applications: orchestration, tool use, RAG, prompt engineering, and evaluation.
  4. Cloud AI services: hands-on experience building and deploying with managed AI services on AWS and/or Azure (e.g., managed model, agent, and search/retrieval offerings), with the judgment to design cloud-agnostic where it matters.
  5. Multi-cloud flexibility: comfortable working across cloud platforms, primarily AWS and Azure, with Oracle Cloud (OCI) or Google Cloud Platform used as needed, and able to avoid hard vendor lock-in in platform design.
  6. Deep production/LLMOps/AgentOps experience: CI/CD, containerization (Docker), orchestration (Kubernetes), and infrastructure-as-code (Terraform, Bicep, or equivalent).
  7. Experience operating production systems: on-call/SRE practices, incident response, and reliability engineering against SLAs/SLOs.
  8. Demonstrated technical leadership and mentorship at a senior/staff level.
  9. Proven track record with observability and reliability: logging, distributed tracing, metrics, alerting, and SLAs/SLOs for live systems.
  10. Experience designing for governance, auditability, security, or compliance in regulated or enterprise environments.
  11. Strong API and systems design skills; comfort with distributed, scalable architectures and event-driven systems.
  12. Excellent communication and the ability to explain complex technical and risk trade-offs to technical and non-technical audiences.
  13. Bachelor's or Master's in Computer Science, Engineering, or a related field, or equivalent practical experience.

Preferred Skills:

  1. Cloud AI/ML platforms: depth in the AWS and/or Azure AI stacks, including managed agent, model-catalog, evaluation, and retrieval/search services, plus ML platforms, API gateways, and cloud identity for governance.
  2. Experience with agent frameworks and tooling (e.g., LangChain/LangGraph, LlamaIndex, Semantic Kernel, Model Context Protocol, OpenAI/Anthropic SDKs).
  3. Vector databases and embeddings (e.g., Pinecone, Weaviate, FAISS, pgvector, Azure AI Search) and retrieval system design.
  4. LLM evaluation/observability tooling (e.g., LangSmith, Arize, Langfuse, OpenTelemetry for LLMs).
  5. Familiarity with AI governance frameworks and standards (e.g., NIST AI RMF, ISO/IEC 42001, SOC 2 in an AI context).
  6. Federal/government regulatory experience: working in or alongside U.S. federal compliance regimes (e.g., NIST 800-53, NIST 800-171, CMMC, FedRAMP, FISMA), including Azure Government or other government-cloud environments.
  7. Experience with fine-tuning, model routing, cost optimization, or self-hosted/open-weight model deployment.

Job Description:

We are looking for a Senior AI Platform Engineer to build and own the enterprise infrastructure that lets our teams design, deploy, and operate AI agents at scale. This is a dual-mandate role: you will build the platform (the reusable infrastructure, frameworks, governance controls, and observability tooling that make agentic AI safe and reliable in production), and you will use that platform to build agentic solutions for real enterprise use cases. You sit at the intersection of platform engineering, LLMOps/AgentOps, and applied AI. You care as much about auditability, traceability, and reliability as you do about capability. Your north star is moving the organization beyond isolated LLM demos toward governed, observable, production-grade agent systems that deliver measurable business value. Our platform is built on major cloud providers, primarily AWS and Azure, and is designed to remain flexible across clouds (including Oracle Cloud and others) as the business requires.

Role on the Team:

Senior role. You will interface with the company's senior leadership.

Why It Will Be Fun:

You will define the foundation for how an entire organization builds and trusts AI agents, not just one application but the governed, observable platform underneath all of them. If you want your work deployed, trusted, and used in real customer and enterprise environments rather than left in notebooks and demos, this is that role.

Platform & Infrastructure

  • Build the agent platform: design and operate the shared infrastructure, SDKs, and orchestration layer that engineering teams use to build, deploy, and run AI agents across the enterprise.
  • Agent orchestration: implement multi-agent and single-agent runtimes: planner/supervisor patterns, routing, tool use, intent classification, reasoning/planning chains, memory, retrieval (RAG), guardrails, and human-in-the-loop checkpoints.
  • Integration layer: connect agents securely to enterprise systems, APIs, data stores, and tools (e.g., via Model Context Protocol / MCP and custom connectors).
  • Scalability & reliability: design for scale and resilience: latency, throughput, cost, graceful degradation, retry, and failure isolation across distributed services.
  • Cloud-flexible foundation: build on managed AI and cloud services across AWS and Azure, while keeping the platform portable enough to run workloads on Oracle Cloud (OCI) or other providers as use cases demand.
  • Cost & FinOps: manage and optimize AI platform spend: token/inference cost, model routing and right-sizing, and usage budgeting across teams.

Governance, Auditability & Observability

  • Governance: implement guardrails, policy enforcement, access controls, prompt/response filtering, and approval workflows so agents operate within defined boundaries.
  • Auditability & traceability: ensure every agent decision, tool call, and data access is logged, traceable, and explainable for compliance, security, and incident review.
  • Observability: build monitoring, tracing, evaluation, and alerting for agent behavior: quality, drift detection, model/prompt versioning, cost, token usage, hallucination/failure detection, and SLAs.
  • Compliance: design controls that satisfy federal and enterprise regulatory requirements (e.g., NIST 800-53, NIST 800-171, NIST AI RMF, CMMC, FedRAMP, FISMA, as well as SOC 2 and ISO/IEC 42001), partnering with Risk, Security, and Legal in regulated and government environments.
  • Responsible & secure AI: embed safety, data privacy, and risk controls into the platform; partner with security and compliance to meet enterprise standards.

Applied Agentic Solutions

  • Ship use cases: build reliable, scalable agentic solutions for prioritized enterprise workflows, from prototype through production hardening.
  • Evaluation: define metrics and automated evaluation harnesses to measure accuracy, reliability, and ROI before and after deployment.
  • Cross-functional partnership: work with product, data, security, risk, and business stakeholders to translate needs into governed, production-ready systems.

Operations & Leadership

  • Operate in production: own reliability of live agent systems: on-call participation, incident response, root-cause analysis, and continuous hardening against SLOs.
  • Technical leadership: set engineering standards and best practices for agentic AI, and mentor engineers across teams adopting the platform.

Benefits:

SES hires W2 benefitted and non-benefitted consultants. Our contract employee benefits include group medical, dental, vision, life, and long-term and short-term disability insurance; 21 days of accrued paid time off; 401(k); tuition reimbursement; performance bonuses; paid overtime; and more.

Please contact me to discuss the details of this position further.

Please forward your resume directly to rstarinieri at sesc .com for immediate consideration.

I look forward to speaking with you soon!

Robin Starinieri

Director of Recruiting

Systems Engineering Services

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 10112536
  • Position Id: 8997938

Company Info

About SES

SES is in the business of helping IT organizations operate more productively by providing a full spectrum of IT Services through a client-friendly delivery model. Whether delivering consultants on a flexible staffing basis or assuming total project control, we are so confident in our delivery that we back our services with a Money-Back Guarantee.

Since our inception, SES has been providing Technology Talent and Solutions to our Clients. Our reputation as a dependable partner is built on the proven experience we’ve gained as a full-spectrum source for IT Services and the custom approach we take with each of our Clients. Whether it’s delivering complex systems on site with our clients through Collaborative InSourcing or delivering Ready-Made Project Teams through our Remote Development Site, our Clients view us as a partner because we deliver what we promise and we are committed to their success.

Headquartered in Reston, VA with offices around the US, SES has partnered with more than 100 of the Fortune 500 during our 30-year history. Our experience, flexibility, and immense talent pool allow us to serve our Clients’ needs at an enterprise level. If you value a partner who considers quality, value, and integrity as the key ingredients to a successful partnership, then let SES make your next IT initiative a success. Now there is a better way…
