Job Description -
Local candidates only; no third-party candidates.
Sr. Engineer - AI Evaluation & Reliability (W2)
Hybrid - Redwood City, CA
Skype
JD
• 6+ years building evaluation or testing infrastructure for ML/LLM systems or large-scale distributed systems.
• Proven ability to translate product requirements into measurable metrics and test plans.
• Strong Python skills.
• Strong experience with modern data tooling.
• Hands-on experience running A/B tests, canaries, or experiment frameworks.
• Experience defining and maintaining operational reliability metrics (SLIs/SLOs) for AI-driven systems.
• Familiarity with large-scale distributed or streaming systems serving AI/agent workflows (millions of events or alerts/day).
• Excellent communication skills -- able to clearly convey technical results and