Washington • Today
AI systems are only as trustworthy as the methods used to evaluate them. At Apple, where AI powers experiences for billions of people, getting evaluation right is not a support function; it is a foundational science. Our team, part of Apple Services Engineering, is building that scientific foundation: rigorous, scalable evaluation methodology for LLMs, agentic systems, and human-AI interaction.

What makes this team unusual is its interdisciplinary core. You will work alongside measurement sc
Full-time




