AI SME/Architect (w/ AWS)

Overview

On Site
Part Time
Contract - Independent
Contract - W2

Skills

AWS
Python
Lambda
Security
Optimization
Design
SQS
SNS
TensorFlow
PyTorch
Governance
EKS
Implementation
SageMaker
AI/ML
NumPy
Pandas
Query
Vector
LLM
RAG
GenAI
Bedrock
Dify
OpenWebUI
LlamaIndex
LangGraph
Synthesis
Fine-tuning
LoRA
QLoRA
LangChain
Multi-agent
Qdrant
OpenSearch
pgvector

Job Details

AI SME/Architect - HYBRID

Auburn Hills, MI - long term (3 days a week in office)

The Role:

Our client is seeking an AI SME/Architect with AWS experience to join the team in Jersey City, NJ (onsite day 1; hybrid, 3 days a week from the office).

Our challenge

We are seeking an exceptional hands-on Technical Lead to spearhead our enterprise GenAI engineering program. This is a unique opportunity for a seasoned technologist who combines deep AI/ML expertise with practical engineering skills to build and operationalize cutting-edge generative AI solutions. The candidate will lead the development of AI agents and platforms while remaining deeply involved in the technical implementation.

Responsibilities:

Technical Leadership & Development

  • Lead the design, development, and deployment of enterprise-scale GenAI solutions using a hybrid of custom-developed solutions and open-source platforms (Dify, OpenWebUI, etc.)
  • Architect and implement AI agents using Python frameworks including LlamaIndex and LangGraph
  • Drive hands-on development while providing technical guidance to the engineering team
  • Establish best practices for GenAI development, deployment, and operations

AI/ML Engineering

  • Design and implement LLM-based solutions with deep understanding of model architectures, fine-tuning, and prompt engineering
  • Apply classical machine learning techniques where appropriate to complement GenAI solutions
  • Optimize AI pipelines for performance, cost, and scalability
  • Implement RAG (Retrieval Augmented Generation) patterns and vector databases

Context Engineering & Advanced RAG

  • Design and implement sophisticated context engineering strategies for optimal LLM performance
  • Build advanced RAG systems including multi-hop reasoning, hybrid search, and re-ranking mechanisms
  • Develop agentic RAG architectures where agents dynamically query, synthesize, and validate information
  • Implement context window optimization techniques and dynamic context selection strategies
  • Create self-improving RAG systems with feedback loops and quality assessment

LLM Optimization & Fine-tuning

  • Lead fine-tuning initiatives for domain-specific LLMs using techniques like LoRA, QLoRA, and full fine-tuning
  • Implement performance optimization strategies including quantization, pruning, and distillation
  • Design and execute benchmark suites to measure and improve model performance
  • Optimize inference latency and throughput for production workloads
  • Implement prompt optimization and few-shot learning strategies

Platform & Infrastructure

  • Design event-driven architectures for asynchronous AI processing at scale
  • Build and deploy containerized AI applications using Kubernetes on AWS
  • Implement AWS services (SageMaker, Bedrock, Lambda, EKS, SQS, SNS, etc.) for AI workloads
  • Establish CI/CD pipelines for AI model and application deployment

Security & Governance

  • Implement secure design principles for AI systems including data privacy and model security
  • Establish AI security frameworks covering prompt injection prevention, model access controls, and data governance
  • Ensure compliance with enterprise security standards and AI ethics guidelines
  • Design audit trails and monitoring for AI system behavior

Requirements:

AI/ML Expertise

  • Deep understanding of LLMs: Architecture, training, fine-tuning, and deployment strategies
  • Context engineering proficiency: Expert-level understanding of context window management, prompt engineering, and context optimization techniques
  • Advanced RAG implementation: Hands-on experience building sophisticated RAG systems with hybrid search, metadata filtering, and agentic capabilities
  • Fine-tuning expertise: Proven experience fine-tuning LLMs for specific domains using modern techniques (LoRA, PEFT, etc.)
  • Performance optimization: Track record of optimizing LLM inference for latency, throughput, and cost
  • Classical ML proficiency: Strong foundation in traditional machine learning algorithms and applications
  • Python mastery: Expert-level Python with extensive experience in ML libraries (PyTorch, TensorFlow, Pandas, NumPy)
  • GenAI frameworks: Hands-on experience with LlamaIndex, LangChain, LangGraph, or similar frameworks
  • Open-source GenAI platforms: Experience with Dify, OpenWebUI, or comparable platforms

Engineering Excellence

  • Cloud architecture: Proven experience designing and implementing AWS solutions using multiple services
  • Event-driven systems: Expertise in asynchronous, event-driven architectures for scalable AI processing
  • Containerization: Advanced knowledge of Docker, Kubernetes, and container orchestration
  • DevOps/MLOps: Experience with CI/CD, infrastructure as code, and ML model lifecycle management

Security & Enterprise Standards

  • Secure development: Strong understanding of secure coding practices and security design patterns
  • AI security: Knowledge of AI-specific security concerns (adversarial attacks, data poisoning, prompt injection)
  • Enterprise integration: Experience with enterprise authentication, authorization, and compliance requirements

Leadership & Communication

  • 12+ years of hands-on technical experience with at least 5 years in AI/ML
  • Proven track record of leading technical teams while remaining hands-on
  • Excellent communication skills to articulate complex technical concepts to diverse stakeholders
  • Experience working in enterprise environments with multiple stakeholders

Preferred Qualifications

  • Experience with multi-agent systems and agent orchestration
  • Knowledge of vector databases (Qdrant, OpenSearch, pgvector)
  • Expertise in embedding models and semantic search optimization
  • Contributions to open-source AI/ML projects
  • Experience with model quantization and edge deployment
  • Knowledge of graph-based RAG and knowledge graph integration
  • Certifications in AWS, Kubernetes, or ML platforms

Please ensure that you use the template format below when submitting profiles. Only the following details, along with the resume, should be shared:

Do not submit any personal documents along with the profile.

Please always reply on the same email thread and keep all points of contact (POCs) in CC.

Submission Template

Full Name

Contact Number

Email Address

Current Location

Work Authorization

LinkedIn

Expected Compensation

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.