Role Overview
We are looking for a highly skilled QA professional to build and scale a next-generation Agentic AI Quality Engineering function. This role goes beyond traditional QA, focusing on validating autonomous AI systems, designing evaluation frameworks, and ensuring high-quality outputs across multiple AI-driven products.
You will play a critical role in shaping how quality is defined, measured, and improved for agentic systems that operate with minimal human intervention.
Key Responsibilities
1. Agentic QA Strategy & Scaling
Design and scale an agentic QA model for autonomous AI systems
Move QA from human-driven validation to AI-led evaluation and continuous quality monitoring
Establish best practices for testing AI agents across lifecycle stages
2. Product Quality Ownership
Own QA for three core AI products:
AI Contact Center solutions
AI Chat & Form-based interaction systems
AI Assistants (autonomous / semi-autonomous agents)
Define quality benchmarks, SLAs, and success metrics for each product
Proactively identify quality gaps before they impact customers
3. Metrics, Observability & Evaluation
Define and track performance metrics for agentic systems (accuracy, latency, resolution quality, hallucination rate, etc.)
Build frameworks for:
Evals & graders (LLM evaluation pipelines)
Output scoring and benchmarking
Continuous feedback loops
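The evals-and-graders idea above can be sketched in a few lines of Python. This is a minimal illustration, not any specific product's pipeline: every name here (`exact_match_grader`, `keyword_grader`, `run_evals`, the case schema) is a hypothetical assumption, and a real pipeline would typically add LLM-as-judge graders and persistence.

```python
from dataclasses import dataclass

@dataclass
class EvalResult:
    case_id: str
    scores: dict  # grader name -> score in [0.0, 1.0]

def exact_match_grader(output: str, expected: str) -> float:
    """Simplest grader: 1.0 if the output matches the reference answer."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0

def keyword_grader(output: str, required_keywords: list) -> float:
    """Fraction of required keywords present in the output."""
    if not required_keywords:
        return 1.0
    hits = sum(1 for kw in required_keywords if kw.lower() in output.lower())
    return hits / len(required_keywords)

def run_evals(cases, agent_fn):
    """Run every case through the agent and score it with each grader."""
    results = []
    for case in cases:
        output = agent_fn(case["input"])
        results.append(EvalResult(
            case_id=case["id"],
            scores={
                "exact_match": exact_match_grader(output, case["expected"]),
                "keywords": keyword_grader(output, case.get("keywords", [])),
            },
        ))
    return results

# Usage with a stubbed agent standing in for a real LLM call:
cases = [{"id": "t1", "input": "reset password",
          "expected": "Use the reset link", "keywords": ["reset", "link"]}]
results = run_evals(cases, agent_fn=lambda q: "Use the reset link")
print(results[0].scores)  # {'exact_match': 1.0, 'keywords': 1.0}
```

Aggregating these per-case scores over time is what turns one-off testing into the continuous feedback loop described above.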
Leverage tools like Langfuse for:
LLM observability and tracing
Prompt monitoring and performance analysis
Debugging agent behavior in production
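Tools like Langfuse are built around a trace/span model: one trace per user request, one span per agent step. The stdlib-only sketch below illustrates that model conceptually; it is not the Langfuse SDK, and every class and field name here is a hypothetical assumption.

```python
import time
import uuid

class Trace:
    """Minimal trace recorder: one trace per request, one span per step."""
    def __init__(self, name):
        self.id = str(uuid.uuid4())
        self.name = name
        self.spans = []

    def start_span(self, name, metadata=None):
        record = {"name": name, "start": time.time(), "metadata": metadata or {}}
        self.spans.append(record)
        return record

    def end_span(self, record, output=None):
        record["end"] = time.time()
        record["latency_s"] = record["end"] - record["start"]
        record["output"] = output

# One span per agent step (LLM call, tool call), so production
# behavior can be replayed and debugged step by step.
trace = Trace("contact_center_request")
llm_call = trace.start_span("llm.generate", metadata={"model": "some-model"})
trace.end_span(llm_call, output="Hello, how can I help?")
print(len(trace.spans), llm_call["output"])
```

In a real deployment, the SDK would export these spans to the observability backend, where latency and prompt-level regressions become queryable.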
Analyze:
Downstream issues
Production tickets
Failure patterns
4. Automation & Testing Frameworks
Build and scale automation across:
Regression testing
Smoke testing
End-to-end agent workflows
Develop and maintain Playwright-based automation scripts
Integrate QA into CI/CD pipelines for continuous validation
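A smoke layer wired into CI/CD can be as small as the sketch below. The `call_agent` function is a hypothetical stub standing in for an HTTP call to the deployed product; thresholds and field names are illustrative assumptions, and browser-level flows would be covered separately by the Playwright suite.

```python
def call_agent(message: str) -> dict:
    # Stub standing in for an HTTP call to the deployed agent endpoint.
    return {"status": "ok", "reply": f"Echo: {message}", "latency_ms": 42}

def smoke_test_agent() -> bool:
    """Fail fast if the agent is down, malformed, or slow."""
    resp = call_agent("hello")
    assert resp["status"] == "ok", "agent endpoint unhealthy"
    assert isinstance(resp["reply"], str) and resp["reply"], "empty reply"
    assert resp["latency_ms"] < 2000, "latency above smoke threshold"
    return True

print(smoke_test_agent())  # True when all checks pass
```

Running a check like this as a pipeline gate before deploy is one common way to make validation continuous rather than release-driven.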
5. Agentic Testing & Validation
Design testing approaches for:
Multi-step agent workflows
Context retention and reasoning
Tool usage by agents
Work with orchestration frameworks like Temporal to:
Validate long-running workflows
Test retries, state transitions, and failure handling in agent pipelines
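The retry and state-transition testing described above can be illustrated without the Temporal SDK itself. The sketch below is a generic stand-in: a flaky step should end in a "completed" state after retries, and the observed state sequence should match expectations. All names here are hypothetical.

```python
def run_with_retries(step, max_attempts=3):
    """Run a step with retries, recording each state transition."""
    states = ["scheduled"]
    for attempt in range(1, max_attempts + 1):
        states.append("running")
        try:
            result = step(attempt)
            states.append("completed")
            return result, states
        except RuntimeError:
            states.append("retrying" if attempt < max_attempts else "failed")
    return None, states

def flaky_tool(attempt):
    # Simulated transient failure: fails on the first attempt only.
    if attempt < 2:
        raise RuntimeError("transient tool error")
    return "tool-output"

result, states = run_with_retries(flaky_tool)
print(result, states)
# tool-output ['scheduled', 'running', 'retrying', 'running', 'completed']
```

Against a real Temporal deployment, the same assertions would be made over workflow histories rather than an in-process list.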
Account for non-deterministic behavior in AI systems
Invest additional effort in agentic validation, recognizing its higher complexity compared with traditional QA
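One common way to handle non-determinism is to stop asserting on a single run and instead sample the agent many times, requiring a minimum pass rate. The sketch below is illustrative: the stubbed agent and the 0.8 threshold are assumptions, not product values.

```python
import random

def nondeterministic_agent(rng):
    # Stub: answers correctly about 90% of the time.
    return "correct" if rng.random() < 0.9 else "wrong"

def pass_rate(agent, checker, runs=100, seed=0):
    """Sample the agent `runs` times with a seeded RNG for reproducibility."""
    rng = random.Random(seed)
    passes = sum(1 for _ in range(runs) if checker(agent(rng)))
    return passes / runs

rate = pass_rate(nondeterministic_agent, checker=lambda out: out == "correct")
print(rate)
assert rate >= 0.8, "pass rate below acceptable threshold"
```

Seeding the RNG keeps the test itself deterministic while still exercising the variability of the system under test.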
6. Continuous Improvement & Innovation
Define frameworks to predict and prevent failures before customer exposure
Continuously improve QA processes using AI and automation
Partner with Product, Engineering, and AI teams to improve system quality
Required Skills & Experience
5-10+ years in QA / Quality Engineering, with strong automation experience
Hands-on experience with:
Test automation tools (Playwright preferred)
API and system testing
Strong understanding of:
AI/ML systems (LLMs, conversational AI preferred)
Evaluation frameworks and benchmarking
Experience with:
Temporal (workflow orchestration, stateful systems testing)
Langfuse (LLM observability, tracing, and evaluation)
Experience in:
Building QA frameworks from scratch
Working with production data, logs, and issue triaging
Good to Have
Experience with LLM eval frameworks, prompt testing, or AI red-teaming
Familiarity with agentic architectures / autonomous systems
Exposure to observability and analytics platforms
Working Model
Prefer candidates with EST time zone overlap
Ability to work closely with global product and engineering teams
What Success Looks Like
A scalable, automated QA system for agentic products
Measurable improvement in AI output quality and reliability
Reduced production issues and faster detection of failures
QA evolving from reactive testing to proactive quality intelligence