Job: AI/ML Engineer
Duration: 6-Month Contract-to-Hire (C2H)
Locals Only / Out of Area / Remote? Remote
3-5 Must Haves
Strong local (non-cloud) AI experience
Deep experience with AI frameworks + pipelines
RAG + vector database expertise
Infrastructure + performance optimization (GPU / Kubernetes)
Local-First AI Expertise: Proven track record deploying and optimizing open-source LLMs (e.g., LLaMA, Mistral) on non-cloud, restricted, or air-gapped private infrastructure
Deep Framework Proficiency: Heavy hands-on experience with PyTorch, Hugging Face, and orchestration layers like LangChain, LlamaIndex, or equivalent frameworks
Vector and Retrieval Mastery: Direct experience engineering production-grade RAG architectures, embeddings, semantic search, and local vector databases (e.g., FAISS, Qdrant, Milvus, Chroma)
Containerization and Compute Infrastructure: Strong experience containerizing AI workloads via Docker/Kubernetes and managing dedicated GPU-based compute environments
Advanced ML Concepts: Solid understanding of fine-tuning techniques (LoRA/QLoRA) versus prompt engineering, and of model quantization formats (GGUF, AWQ, EXL2)
Autonomy: Ability to build, test, and iterate rapidly in an isolated development sandbox with zero dependency on third-party cloud APIs
Experience operating within heavily regulated or compliance-driven industries (e.g., high-governance data environments, fintech, or legal-tech)
Familiarity with local-first agentic workflows and the Model Context Protocol (MCP), or experience building fully internal developer copilots and autonomous knowledge systems