Overview
Remote
Depends on Experience
Accepts corp to corp applications
Contract - W2
Contract - 10 Month(s)
Skills
LLAMA
LLM
Large Language Model Meta AI
AI/ML
generative AI
Job Details
Role: LLaMA Software Engineers
Location: Remote or Bay Area
Duration: 10+ Months
We are seeking skilled LLaMA Software Engineers to join a high-impact team working on cutting-edge Large Language Model (LLM) solutions for our strategic client. You will work on the implementation, fine-tuning, deployment, and integration of the client's open-source LLaMA (Large Language Model Meta AI) models in real-world production environments.
As part of this role, you will collaborate closely with the client's AI/ML researchers, product teams, and infrastructure engineers to develop scalable, safe, and responsible generative AI applications.
Key Responsibilities
- Design, develop, and deploy applications using LLaMA and other open-source LLM architectures.
- Fine-tune and optimize large models for specific tasks using reinforcement learning, prompt engineering, LoRA, or QLoRA.
- Collaborate with client stakeholders to integrate LLMs into products such as search, recommendation, support automation, and content generation.
- Build robust APIs, pipelines, and tools to support LLM inference, performance monitoring, and scalability.
- Address bias and toxicity in LLM outputs, and manage latency and cost-efficiency, in compliance with the client's responsible AI guidelines.
- Contribute to open-source efforts or internal model innovation, benchmarking, and performance improvements.
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Machine Learning, Artificial Intelligence, or a related field.
- 3-7 years of experience in software engineering, with at least 1-2 years working on LLMs, transformers, or generative AI.
- Hands-on experience with LLaMA, GPT, PaLM, Mistral, or similar models using libraries like Hugging Face Transformers, PyTorch, DeepSpeed, Ray, or Accelerate.
- Solid knowledge of fine-tuning techniques, distributed training, and inference optimization (e.g., quantization, model pruning).
- Proficiency in Python and strong engineering practices (version control, CI/CD, unit testing).
- Familiarity with the client's ecosystem or equivalent (FAIR, PyTorch/xFormers, FBGEMM, etc.) is a plus.
Preferred Skills
- Experience deploying LLMs in production environments at scale.
- Familiarity with MLOps platforms (e.g., MLflow, SageMaker, Weights & Biases).
- Experience with multimodal models, agent frameworks (AutoGPT, LangChain, Open Agents), or retrieval-augmented generation (RAG) pipelines.
- Background in privacy-preserving ML, RLHF, or embedding-based search.