AI/ML QA Engineer (RAG & Evaluation focus)

Overview

Remote
Depends on Experience
Full Time
No Travel Required

Skills

Artificial Intelligence
Machine Learning (ML)
Quality Assurance

Job Details

AI/ML QA Engineer (RAG & Evaluation focus) - Full-Time | Fully Remote

Qualifications:
3-5 years of professional experience in software QA, AI/ML evaluation, or intelligent system testing.
Proficiency in Python and scripting for automation and data analysis.
Hands-on experience with AI evaluation tools (e.g., Ragas, LangChain evaluation tools, Weights & Biases, PromptLayer).
Solid understanding of retrieval-augmented generation (RAG) systems, vector databases, and embedding evaluation.
Experience with modern MLOps tools, CI/CD systems, and version control (e.g., GitHub, Jenkins).
Strong analytical and problem-solving skills, with a data-driven approach to quality assurance.
Excellent communication skills and ability to collaborate cross-functionally with AI engineers, data scientists, and product managers.
Experience in user feedback loop integration and reinforcement learning from human feedback (RLHF).

Nice to Have:
Familiarity with retrieval pipelines, document ranking, and semantic search systems.
Exposure to real estate or SaaS platforms is a plus.
