Senior AI Engineer (NVIDIA NIM & Triton)
Job Details
Skills
- Amazon Web Services
- Artificial Intelligence
- Banking
- CUDA
- Cloud Computing
- Data Science
- Docker
- Energy
- Financial Services
- Generative Artificial Intelligence (AI)
- Health Care
- Insurance
- Kubernetes
- LangChain
- Large Language Models (LLMs)
- Machine Learning (ML)
- Machine Learning Operations (ML Ops)
- Management
- Microservices
- Microsoft Azure
- NIM
- Pharmaceutics
- Public Sector
- Python
- Retail
- Telecommunications
Summary
Job Title: Senior AI Engineer (NVIDIA NIM & Triton)
Location: Open Across USA (Remote)
Job Summary
We are seeking a Senior AI Engineer with strong experience in NVIDIA AI technologies, specifically NVIDIA NIM Microservices and Triton Inference Server. The ideal candidate will be responsible for designing, deploying, optimizing, and scaling Generative AI and LLM-based applications in enterprise environments.
Required Skills
- Hands-on experience with NVIDIA NIM Microservices
- Strong experience with NVIDIA Triton Inference Server
- Experience deploying and serving Large Language Models (LLMs)
- Knowledge of TensorRT-LLM and CUDA optimization
- Experience with Kubernetes and Docker containerization
- Strong Python programming skills
- Experience building AI/ML applications in AWS, Azure, or Google Cloud Platform
- Understanding of model inference, model serving, and performance tuning
- Experience with REST APIs and microservices architecture
Preferred Skills
- Experience with NVIDIA NeMo
- Experience with RAG (Retrieval-Augmented Generation) architectures
- Familiarity with LangChain or LlamaIndex
- Exposure to MLOps/LLMOps practices
- Experience with monitoring and observability tools
Responsibilities
- Design and deploy AI applications using NVIDIA NIM Microservices
- Build and optimize model serving infrastructure using Triton Inference Server
- Deploy and manage LLM workloads in Kubernetes environments
- Optimize inference performance using TensorRT-LLM and CUDA
- Collaborate with Data Science, MLOps, and Platform Engineering teams
- Implement scalable, secure, and production-ready AI solutions
- Troubleshoot and improve AI application performance and reliability
- Support cloud-based AI deployments across AWS, Azure, or Google Cloud Platform
About AgreeYa:
AgreeYa is a global systems integrator delivering a competitive advantage for its customers through software, solutions, and services. Established in 1999, AgreeYa is headquartered in Folsom, California, with a global footprint and a team of more than 1,800 professionals across its offices. AgreeYa works with 550+ organizations ranging from Fortune 100 firms to small and large businesses across industries such as Telecom, Banking, Financial Services & Insurance, Healthcare, Utility & Energy, Technology, Public Sector, Pharma & Biotech, Retail, Client, and others. Please visit our website for more information.
Equal Opportunity:
AgreeYa is an equal opportunity employer. We evaluate qualified applicants without regard to race, color, religion, gender identity, sexual orientation, national origin, disability, veteran status, or other protected characteristics. Visit our website to learn about our Career & Culture.
- Dice Id: swapps
- Position Id: 8998393
- Posted 2 days ago