AI SME/Architect with AWS Experience :: Local Candidate of NYC/NJ

  • New York, NY
  • Posted 8 hours ago | Updated 8 hours ago

Overview

Hybrid
Depends on Experience
Contract - W2
Contract - Independent
Contract - 12 Month(s)

Skills

AI/ML
AWS
LLM

Job Details

Role: AI SME/Architect with AWS Experience

Location: NYC, NY (Hybrid, 3 days onsite)

Duration: Long Term Contract

Our challenge

We are seeking an exceptional hands-on Technical Lead to spearhead our enterprise GenAI engineering program. This is a unique opportunity for a seasoned technologist who combines deep AI/ML expertise with practical engineering skills to build and operationalize cutting-edge generative AI solutions. Candidate will lead the development of AI agents and platforms while remaining deeply involved in the technical implementation.

Responsibilities:

Technical Leadership & Development

  • Lead the design, development, and deployment of enterprise-scale GenAI solutions using a hybrid of custom developed solutions and open-source platforms (Dify, OpenWebUI, etc.)
  • Architect and implement AI agents using Python frameworks including LlamaIndex and LangGraph
  • Drive hands-on development while providing technical guidance to the engineering team
  • Establish best practices for GenAI development, deployment, and operations

AI/ML Engineering

  • Design and implement LLM-based solutions with deep understanding of model architectures, fine-tuning, and prompt engineering
  • Apply classical machine learning techniques where appropriate to complement GenAI solutions
  • Optimize AI pipelines for performance, cost, and scalability
  • Implement RAG (Retrieval Augmented Generation) patterns and vector databases

Context Engineering & Advanced RAG

  • Design and implement sophisticated context engineering strategies for optimal LLM performance
  • Build advanced RAG systems including multi-hop reasoning, hybrid search, and re-ranking mechanisms
  • Develop agentic RAG architectures where agents dynamically query, synthesize, and validate information
  • Implement context window optimization techniques and dynamic context selection strategies
  • Create self-improving RAG systems with feedback loops and quality assessment

LLM Optimization & Fine-tuning

  • Lead fine-tuning initiatives for domain-specific LLMs using techniques like LoRA, QLoRA, and full fine-tuning
  • Implement performance optimization strategies including quantization, pruning, and distillation
  • Design and execute benchmark suites to measure and improve model performance
  • Optimize inference latency and throughput for production workloads
  • Implement prompt optimization and few-shot learning strategies

Platform & Infrastructure

  • Design event-driven architectures for asynchronous AI processing at scale
  • Build and deploy containerized AI applications using Kubernetes on AWS
  • Implement AWS services (SageMaker, Bedrock, Lambda, EKS, SQS, SNS, etc.) for AI workloads
  • Establish CI/CD pipelines for AI model and application deployment

Security & Governance

  • Implement secure design principles for AI systems including data privacy and model security
  • Establish AI security frameworks covering prompt injection prevention, model access controls, and data governance
  • Ensure compliance with enterprise security standards and AI ethics guidelines
  • Design audit trails and monitoring for AI system behavior

Requirements:

AI/ML Expertise

  • Deep understanding of LLMs: Architecture, training, fine-tuning, and deployment strategies
  • Context engineering proficiency: Expert-level understanding of context window management, prompt engineering, and context optimization techniques
  • Advanced RAG implementation: Hands-on experience building sophisticated RAG systems with hybrid search, metadata filtering, and agentic capabilities
  • Fine-tuning expertise: Proven experience fine-tuning LLMs for specific domains using modern techniques (LoRA, PEFT, etc.)
  • Performance optimization: Track record of optimizing LLM inference for latency, throughput, and cost
  • Classical ML proficiency: Strong foundation in traditional machine learning algorithms and applications
  • Python mastery: Expert-level Python with extensive experience in ML libraries (PyTorch, TensorFlow, Pandas, NumPy)
  • GenAI frameworks: Hands-on experience with LlamaIndex, LangChain, LangGraph, or similar frameworks
  • Open-source GenAI platforms: Experience with Dify, OpenWebUI, or comparable platforms
  • Engineering Excellence
  • Cloud architecture: Proven experience designing and implementing AWS solutions using multiple services
  • Event-driven systems: Expertise in asynchronous, event-driven architectures for scalable AI processing
  • Containerization: Advanced knowledge of Docker, Kubernetes, and container orchestration
  • DevOps/MLOps: Experience with CI/CD, infrastructure as code, and ML model lifecycle management
  • Security & Enterprise Standards
  • Secure development: Strong understanding of secure coding practices and security design patterns
  • AI security: Knowledge of AI-specific security concerns (adversarial attacks, data poisoning, prompt injection)
  • Enterprise integration: Experience with enterprise authentication, authorization, and compliance requirements

Leadership & Communication

  • 12+ years of hands-on technical experience with at least 5 years in AI/ML
  • Proven track record of leading technical teams while remaining hands-on
  • Excellent communication skills to articulate complex technical concepts to diverse stakeholders
  • Experience working in enterprise environments with multiple stakeholders

Preferred Qualifications

  • Experience with multi-agent systems and agent orchestration
  • Knowledge of vector databases (Qdrant, OpenSearch, pgvector)
  • Expertise in embedding models and semantic search optimization
  • Contributions to open-source AI/ML projects
  • Experience with model quantization and edge deployment
  • Knowledge of graph-based RAG and knowledge graph integration
  • Certifications in AWS, Kubernetes, or ML platforms
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.