job summary:
1. Deterministic Testing & Data Validation
- Validate generative AI tool outputs for structured, rules-based use cases by reconciling results against trusted data sources and established SQL-based metrics
- Ensure consistency, explainability, and auditability of outputs by confirming alignment with existing data pipelines and query logic
- Expand and maintain test coverage across prioritized use cases to establish a robust, high-confidence baseline for the platform
- Partner with data engineering and analytics teams to identify and resolve discrepancies in underlying data or logic
2. Non-Deterministic Testing & Scenario Evaluation
- Design and execute scenario-based testing for more complex, AI-driven outputs where direct validation is not always possible
- Evaluate results based on intent accuracy, reasonableness, and confidence thresholds rather than exact match validation
- Prioritize testing across higher-risk and high-impact use cases using curated question sets and real-world scenarios
- Identify patterns in output variability and drive iterative refinement to improve reliability and user trust
3. Human-in-the-Loop Review & Continuous Monitoring
- Conduct ongoing review of generative AI tool interactions post-launch, validating outputs and ensuring quality across all user scenarios
- Identify edge cases, inconsistencies, and emerging risks, and escalate findings to product and engineering teams
- Synthesize insights from testing and live usage to inform enhancements, training data improvements, and governance practices
- Serve as an accountable reviewer, providing a critical control point for responsible AI deployment and continuous improvement
location: Malvern, Pennsylvania
job type: Contract
salary: $55 - $60 per hour
work hours: 8am to 5pm
education: Bachelor's degree
qualifications:
Required Skills & Experience
Strong SQL skills required
Strong analytical background with experience in data validation and analytics workflows
Ability to assess outputs both quantitatively (data accuracy) and qualitatively (reasonableness, business context)
Demonstrated critical thinking and sound judgment, especially in ambiguous or non-deterministic environments
Experience working with large datasets, reporting tools, or analytics platforms
Preferred Qualifications
Exposure to AI/ML or generative AI tools and associated testing or validation frameworks
Experience in scenario-based testing, UAT, or model validation
Familiarity with financial services, retirement, or plan sponsor analytics
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.
At Randstad Digital, we welcome people of all abilities and want to ensure that our hiring and interview process meets the needs of all applicants. If you require a reasonable accommodation to make your application or interview experience a great one, please contact
Pay offered to a successful candidate will be based on several factors including the candidate's education, work experience, work location, specific job duties, certifications, etc. In addition, Randstad Digital offers a comprehensive benefits package, including: medical, prescription, dental, vision, AD&D, and life insurance offerings, short-term disability, and a 401K plan (all benefits are based on eligibility).
This posting is open for thirty (30) days.
Any consideration of a background check would be an individualized assessment based on the applicant or employee's specific record and the duties and requirements of the specific job.