AI SME/Architect (w/ AWS)

Overview

On Site
Part Time
Contract - Independent
Contract - W2

Skills

AWS
Python
Lambda
Security
Optimization
Design
SQS
SNS
TensorFlow
PyTorch
Governance
EKS
Implementation
SageMaker
AI/ML
NumPy
Pandas
Query
Vector
LLM
RAG
GenAI
Bedrock
Dify
OpenWebUI
LlamaIndex
LangGraph
Synthesis
Fine-tuning
LoRA
QLoRA
LangChain
Multi-agent
Qdrant
OpenSearch
pgvector

Job Details

AI SME/Architect - HYBRID

Auburn Hills, MI - long term (3 days a week in office)

The Role:

Our client is seeking an AI SME/Architect with AWS experience to join the team in Jersey City, NJ (onsite day 1; hybrid, 3 days a week from the office).

Our challenge

We are seeking an exceptional hands-on Technical Lead to spearhead our enterprise GenAI engineering program. This is a unique opportunity for a seasoned technologist who combines deep AI/ML expertise with practical engineering skills to build and operationalize cutting-edge generative AI solutions. The candidate will lead the development of AI agents and platforms while remaining deeply involved in the technical implementation.

Responsibilities:

Technical Leadership & Development

  • Lead the design, development, and deployment of enterprise-scale GenAI solutions using a hybrid of custom-developed solutions and open-source platforms (Dify, OpenWebUI, etc.)
  • Architect and implement AI agents using Python frameworks including LlamaIndex and LangGraph
  • Drive hands-on development while providing technical guidance to the engineering team
  • Establish best practices for GenAI development, deployment, and operations

AI/ML Engineering

  • Design and implement LLM-based solutions with deep understanding of model architectures, fine-tuning, and prompt engineering
  • Apply classical machine learning techniques where appropriate to complement GenAI solutions
  • Optimize AI pipelines for performance, cost, and scalability
  • Implement RAG (Retrieval Augmented Generation) patterns and vector databases

Context Engineering & Advanced RAG

  • Design and implement sophisticated context engineering strategies for optimal LLM performance
  • Build advanced RAG systems including multi-hop reasoning, hybrid search, and re-ranking mechanisms
  • Develop agentic RAG architectures where agents dynamically query, synthesize, and validate information
  • Implement context window optimization techniques and dynamic context selection strategies
  • Create self-improving RAG systems with feedback loops and quality assessment

LLM Optimization & Fine-tuning

  • Lead fine-tuning initiatives for domain-specific LLMs using techniques like LoRA, QLoRA, and full fine-tuning
  • Implement performance optimization strategies including quantization, pruning, and distillation
  • Design and execute benchmark suites to measure and improve model performance
  • Optimize inference latency and throughput for production workloads
  • Implement prompt optimization and few-shot learning strategies

Platform & Infrastructure

  • Design event-driven architectures for asynchronous AI processing at scale
  • Build and deploy containerized AI applications using Kubernetes on AWS
  • Implement AWS services (SageMaker, Bedrock, Lambda, EKS, SQS, SNS, etc.) for AI workloads
  • Establish CI/CD pipelines for AI model and application deployment

Security & Governance

  • Implement secure design principles for AI systems including data privacy and model security
  • Establish AI security frameworks covering prompt injection prevention, model access controls, and data governance
  • Ensure compliance with enterprise security standards and AI ethics guidelines
  • Design audit trails and monitoring for AI system behavior

Requirements:

AI/ML Expertise

  • Deep understanding of LLMs: Architecture, training, fine-tuning, and deployment strategies
  • Context engineering proficiency: Expert-level understanding of context window management, prompt engineering, and context optimization techniques
  • Advanced RAG implementation: Hands-on experience building sophisticated RAG systems with hybrid search, metadata filtering, and agentic capabilities
  • Fine-tuning expertise: Proven experience fine-tuning LLMs for specific domains using modern techniques (LoRA, PEFT, etc.)
  • Performance optimization: Track record of optimizing LLM inference for latency, throughput, and cost
  • Classical ML proficiency: Strong foundation in traditional machine learning algorithms and applications
  • Python mastery: Expert-level Python with extensive experience in ML libraries (PyTorch, TensorFlow, Pandas, NumPy)
  • GenAI frameworks: Hands-on experience with LlamaIndex, LangChain, LangGraph, or similar frameworks
  • Open-source GenAI platforms: Experience with Dify, OpenWebUI, or comparable platforms

Engineering Excellence

  • Cloud architecture: Proven experience designing and implementing AWS solutions using multiple services
  • Event-driven systems: Expertise in asynchronous, event-driven architectures for scalable AI processing
  • Containerization: Advanced knowledge of Docker, Kubernetes, and container orchestration
  • DevOps/MLOps: Experience with CI/CD, infrastructure as code, and ML model lifecycle management

Security & Enterprise Standards

  • Secure development: Strong understanding of secure coding practices and security design patterns
  • AI security: Knowledge of AI-specific security concerns (adversarial attacks, data poisoning, prompt injection)
  • Enterprise integration: Experience with enterprise authentication, authorization, and compliance requirements

Leadership & Communication

  • 12+ years of hands-on technical experience with at least 5 years in AI/ML
  • Proven track record of leading technical teams while remaining hands-on
  • Excellent communication skills to articulate complex technical concepts to diverse stakeholders
  • Experience working in enterprise environments with multiple stakeholders

Preferred Qualifications

  • Experience with multi-agent systems and agent orchestration
  • Knowledge of vector databases (Qdrant, OpenSearch, pgvector)
  • Expertise in embedding models and semantic search optimization
  • Contributions to open-source AI/ML projects
  • Experience with model quantization and edge deployment
  • Knowledge of graph-based RAG and knowledge graph integration
  • Certifications in AWS, Kubernetes, or ML platforms

Please ensure that you use the template format below when submitting profiles. Only the following details, along with the resume, should be shared:

Do not submit any personal documents along with the profile.

Please always reply on the same email thread and keep all points of contact (POCs) in CC.

Submission Template

Full Name

Contact Number

Email Address

Current Location

Work Authorization

LinkedIn

Expected Compensation

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.