The Health Sensing Machine Learning Interpretability & Analytics (MLIA) team ensures clinical rigor and contextual trust are at the foundation of Apple's health sensing features. We are looking for an exceptional ML Engineer to help us build the next generation of scalable evaluation infrastructure and lead rigorous investigations into model performance. You will develop cutting-edge tools, synthetic data pipelines, and automated frameworks that ensure our health features are mathematically sound, demographically equitable, and clinically safe. If you are passionate about AI safety, robust software architecture, and pushing the boundaries of ML innovation, come join us!
In this role, you will architect and build large-scale evaluation frameworks to interrogate unimodal ML systems and multi-modal foundation models. Beyond infrastructure, you will lead deep-dive ML evaluations, performing failure analysis to uncover performance gaps, reasoning flaws, and edge cases. You will translate findings into actionable insights and work directly with algorithm teams to improve the safety and reliability of our health features. Your work will empower teams across Apple to rapidly evaluate multi-modal sensor fusion while upholding Apple's privacy standards.

Minimum qualifications:
- BS in Computer Science, Machine Learning, Statistics, or a related field
- 3+ years of experience in ML Engineering or Applied ML
- Strong experience evaluating supervised and unsupervised models, deep learning models, and LLMs
- Proficiency in Python, with the ability to write production-grade code (OOP, CI/CD, Git)
- Hands-on experience in failure analysis, evaluating LLMs, and driving subsequent model improvements
- Experience building data pipelines, inference frameworks, and automated evaluation systems
- Strong communication skills, with the ability to explain complex technical concepts to both technical and non-technical audiences

Preferred qualifications:
- MS/PhD in Computer Science, Machine Learning, Statistics, or a related field
- Experience evaluating LLMs or agentic systems (e.g., LLM-as-a-judge, RAG evaluation)
- Experience with synthetic data generation and prompt engineering
- Experience in parallel data processing (Spark, Kubernetes, Airflow) or privacy-preserving ML (Federated Learning)
- Background in AI Safety, model interpretability, or adversarial testing
- Interest in digital health and clinical rigor
