Location: Plano, TX
Salary: $160,000.00 USD Annually - $185,000.00 USD Annually
Description: Location: Plano, TX or Camas, WA

The Opportunity
We are seeking an AI Engineer to design, build, and optimize production-grade generative AI solutions. In this role, you'll collaborate closely with AI leaders, ML engineers, and platform teams to deliver scalable systems powered by LLMs, GPU-accelerated frameworks, and cloud-native microservices. You'll ensure these solutions are secure, high-performing, and fully integrated within enterprise environments.
What You'll Do
- Architect, develop, fine-tune, and deploy generative AI models into scalable production environments
- Build and maintain APIs and microservices (primarily using FastAPI) to enable AI capabilities across the organization
- Partner with AI Infrastructure teams to design robust LLM pipelines, including training workflows and RAG (retrieval-augmented generation) systems
- Integrate AI solutions into enterprise applications using secure, cloud-native patterns and best practices
- Ensure models meet standards for explainability, reliability, security, and regulatory compliance
- Monitor and improve model performance through evaluation frameworks, observability tools, and continuous fine-tuning
What You Bring
- 8+ years of experience in the IT industry
- 2+ years of hands-on AI development experience
- 3+ years of professional Python programming
- Strong proficiency with LLMs, embeddings, vector databases, and RAG architectures
- Proven experience working with generative AI models, including multimodal systems
- Practical expertise with cloud-native AI platforms such as Azure AI Foundry, AWS Bedrock, OpenAI models, and AI governance frameworks
- Bachelor's degree in Computer Science, AI, or a related discipline, or equivalent professional experience
Preferred Qualifications
- Experience with GPU-accelerated training and inference using NVIDIA technologies (e.g., NIM, NeMo)
- Ability to optimize and scale AI models with NVIDIA NIM and fine-tune via NeMo services
- Familiarity with agentic AI frameworks and deploying production AI agents
- Experience delivering low-latency, high-throughput model deployments using tools such as vLLM and GPU-optimized inference frameworks
- Background in CI/CD pipelines for ML and Generative AI, including containerization and orchestration with Docker and Kubernetes
By providing your phone number, you consent to: (1) receive automated text messages and calls from the Judge Group, Inc. and its affiliates (collectively "Judge") to such phone number regarding job opportunities, your job application, and for other related purposes. Message & data rates apply and message frequency may vary. Consistent with Judge's Privacy Policy, information obtained from your consent will not be shared with third parties for marketing/promotional purposes. Reply STOP to opt out of receiving telephone calls and text messages from Judge and HELP for help.
Contact: This job and many more are available through The Judge Group. Please apply with us today!