Lead AI Engineer (FM Hosting, LLM Inference) – Remote

Remote • Posted 2 hours ago • Updated 2 hours ago
Contract: W2, Independent, or Corp-to-Corp
Duration: 1 year
Travel: None required
Compensation: Depends on experience

Job Details

Skills

  • Lead AI Engineer (FM Hosting, LLM Inference)

Summary

Job Title: Lead AI Engineer – Foundation Model Hosting & LLM Inference

Location: Remote

Job Summary

We are looking for an experienced Lead AI Engineer to design, deploy, and optimize large-scale Foundation Model (FM) hosting and LLM inference platforms. The ideal candidate will lead AI infrastructure initiatives, improve model serving performance, and build scalable, secure, and cost-efficient AI systems for enterprise applications.

Key Responsibilities

  • Design and manage scalable infrastructure for hosting foundation models and LLMs.
  • Develop and optimize high-performance inference pipelines for low latency and high throughput.
  • Deploy and manage models using containerized and distributed environments.
  • Work with GPU acceleration, model quantization, batching, caching, and inference optimization techniques.
  • Implement APIs and microservices for AI model serving.
  • Monitor system reliability, availability, scalability, and cost efficiency.
  • Collaborate with AI/ML teams to productionize machine learning and generative AI models.
  • Lead architecture decisions for model deployment, orchestration, and observability.
  • Ensure security, governance, and compliance for AI infrastructure.
  • Mentor engineering teams and drive AI platform best practices.
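Several of the responsibilities above (low-latency pipelines, batching, caching) center on request micro-batching. As an illustrative stdlib-only sketch — the function name and batch-size threshold are assumptions for this example, not part of any specific serving framework, and real stacks such as vLLM's continuous batching are far more sophisticated:

```python
from collections import deque

def drain_batches(requests, max_batch_size=4):
    """Group pending inference requests into fixed-size micro-batches.

    Illustrative only: production schedulers also weigh token budgets,
    sequence lengths, and arrival deadlines when forming each batch.
    """
    queue = deque(requests)
    batches = []
    while queue:
        batch = []
        while queue and len(batch) < max_batch_size:
            batch.append(queue.popleft())
        batches.append(batch)
    return batches

# Ten queued prompts become three micro-batches of at most four requests.
batches = drain_batches([f"prompt-{i}" for i in range(10)], max_batch_size=4)
```

Batching amortizes per-forward-pass overhead across requests, which is why it appears alongside quantization and caching as a core throughput lever.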

Required Skills

  • Strong expertise in Python and backend system development.
  • Hands-on experience with LLM serving frameworks such as vLLM, TensorRT-LLM, or Text Generation Inference.
  • Experience with distributed computing, GPU infrastructure, and Kubernetes.
  • Knowledge of transformer architectures, model optimization, and inference tuning.
  • Experience with cloud platforms such as Amazon Web Services, Microsoft Azure, or Google Cloud.
  • Familiarity with Docker, CI/CD pipelines, and infrastructure automation.
  • Understanding of vector databases, embeddings, and retrieval systems.
  • Strong debugging, performance tuning, and problem-solving skills.
  • Excellent leadership and stakeholder communication abilities.
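The model-optimization and inference-tuning skills above can be illustrated with the simplest quantization scheme, symmetric absmax int8. This stdlib sketch is a toy example only — production systems use calibrated methods such as GPTQ or AWQ, and operate on tensors, not Python lists:

```python
def quantize_int8(weights):
    """Symmetric absmax quantization: map floats to int8 via one scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize_int8(quantized, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [q * scale for q in quantized]

weights = [0.42, -1.27, 0.0, 0.89]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Round-trip error is bounded by half a quantization step (scale / 2).
```

The int8 representation cuts weight memory to a quarter of float32, which is the trade-off behind the quantization work mentioned in the responsibilities.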

Preferred Qualifications

  • Bachelor’s or Master’s degree in Computer Science, AI, Machine Learning, or related field.
  • Experience deploying open-source or enterprise LLMs in production environments.
  • Knowledge of MLOps and observability tools.
  • Exposure to RAG architectures, fine-tuning, and AI agents is a plus.

Tools & Technologies

  • Python, FastAPI
  • vLLM / TensorRT-LLM
  • Kubernetes, Docker
  • PyTorch, CUDA
  • Ray, Triton Inference Server
  • Vector Databases (Pinecone, Milvus, FAISS)
  • Amazon Web Services / Microsoft Azure / Google Cloud
  • CI/CD & Monitoring Tools
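The vector-database items above (Pinecone, Milvus, FAISS) all reduce to nearest-neighbour search over embeddings. A stdlib-only sketch of brute-force cosine-similarity retrieval — the corpus and helper names are invented for illustration; the listed databases replace this linear scan with indexed approximate search:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def top_k(query, corpus, k=2):
    """Rank stored (doc_id, embedding) pairs by similarity to the query."""
    scored = sorted(corpus, key=lambda item: cosine(query, item[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

corpus = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [0.7, 0.7])]
result = top_k([1.0, 0.1], corpus, k=2)  # "a" is closest, then "c"
```

In a RAG pipeline, the top-k documents retrieved this way are injected into the LLM prompt before inference, connecting the retrieval and serving halves of the stack.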
  • Dice Id: 10513292
  • Position Id: 72535-12895-