Job Details
Job Title: AI Engineer, Video & Multimodal AI
Location: USA-Remote
Experience Level: 10+ Years
About the Role:
We are hiring a Senior AI Engineer to spearhead the design, fine-tuning, and scalable deployment of cutting-edge AI systems, with a focus on deep learning, video intelligence, and multi-modal (vision + language) models. The ideal candidate has a strong academic foundation, preferably from an Ivy League institution, and proven experience driving innovative AI solutions from research to production.
Key Responsibilities:
Architect and lead the development of large-scale video AI and vision-language models (VLMs).
Fine-tune and optimize Large Language Models (LLMs) and Multi-modal Large Language Models (MLLMs) for task-specific applications.
Scale model training and evaluation across distributed systems with an emphasis on GPU/accelerated environments.
Build and maintain robust AI pipelines for training, evaluation, benchmarking, and deployment using state-of-the-art MLOps tools.
Drive performance optimization of models for real-time inference using tools like TensorRT, ONNX, and NVIDIA Triton.
Collaborate cross-functionally with data scientists, researchers, and platform engineers to align model development with business goals.
Publish internal/external papers and contribute to IP creation and thought leadership in AI innovation.
Minimum Qualifications:
MS or other postgraduate degree in Computer Science or a related field (PhD preferred); strong preference for Ivy League graduates.
10+ years of industry or research experience in AI/ML, with a focus on Deep Learning, Video AI, and multi-modal systems.
Advanced proficiency in Python and DL frameworks such as PyTorch and TensorFlow.
Deep expertise in fine-tuning LLMs and MLLMs, including prompt engineering, transfer learning, and embedding-based techniques.
Proven experience scaling AI model training and inference across multi-GPU and distributed compute platforms.
Strong hands-on knowledge of MLOps practices, including Docker, Kubernetes, MLflow, and model serving.
Preferred Skills:
Familiarity with NVIDIA's AI ecosystem (TensorRT, Triton Inference Server, DeepStream SDK).
Experience with retrieval-augmented generation (RAG), attention-based models, and real-time video inference.
Prior experience in leading AI teams or projects and mentoring junior researchers/engineers.
Publications, patents, or open-source contributions in the field of AI/ML.
Skills:
RAG, LLM, VLM, Multimodal AI, Video