AI-ML Quality Engineer

Overview

On Site
Full Time
Part Time
Accepts corp to corp applications
Contract - W2
Contract - Independent

Job Details

Hiring: W2 Candidates Only

Visa: Open to any visa type with valid work authorization in the USA

Job Overview

We are seeking a highly skilled Senior AI-ML LLM Quality Engineer with strong expertise in testing and validating large language models (LLMs) and generative AI products. The ideal candidate will have hands-on experience with Python scripting, automation frameworks, and LLM evaluation in enterprise environments. This role involves building test strategies, validating model performance, and guiding customers on automation strategies.

Key Responsibilities

Support testing and validation of Large Language Model (LLM)-powered applications.

Design and implement test strategies, evaluation workflows, and automation frameworks for generative AI systems.

Perform model performance validation across diverse generative AI use cases.

Collaborate with cross-functional teams to ensure reliable, transparent, and scalable AI solutions.

Guide customers on automation strategy, relevant tools, and best practices.

Must-Have Skills

Strong experience with Python scripting, REST APIs, and YAML.

Hands-on experience with testing Generative AI / ML products and evaluating LLMs in enterprise environments.

Experience with LLM testing tools (e.g., LangSmith, Promptfoo).

Strong understanding of LLM behavior and evaluation workflows.

Proficiency with PyTest, Selenium, or similar test automation frameworks.

Strong experience with test automation and the ability to advise customers on relevant technologies.

Nice-to-Have Skills

Experience with advanced testing frameworks.

Experience testing RAG pipelines and LLM agent systems.

Familiarity with LangChain, LlamaIndex, or Haystack.

Knowledge of AI/ML model evaluation metrics.

Experience with red teaming (preferred but not mandatory).

Familiarity with AWS cloud services and MLOps tooling (e.g., MLflow).

Ideal Candidate Profile

6 years of relevant AI/ML and software testing experience.

Deep understanding of LLM testing methodologies and automation strategies.

Strong problem-solving and communication skills to collaborate effectively with product and engineering teams.

Passion for ensuring AI model quality, transparency, and reliability in real-world enterprise applications.
