AIML - Sr Machine Learning Engineer, Evaluation

Cupertino, CA, US • Posted 30+ days ago • Updated 10 hours ago

Full Time

On-site

Job Details

Skills

Collaboration
Spectrum
Artificial Intelligence
Research
Failure Analysis
Large Language Models (LLMs)
Computer Science
Evaluation
Modeling
Optimization
Training
Python
PyTorch
Privacy
Machine Learning (ML)
Conflict Resolution
Problem Solving
Communication

Summary

We are seeking a highly skilled and experienced machine learning engineer to join AIML Evaluation to build the systems that evaluate and refine Apple's foundation models and agents. As a key member of the team, you will help design and develop benchmarks, evaluators, simulation environments, and prompt and context optimization pipelines that drive quality improvements across Apple's AI experiences.

You will collaborate with product teams and the foundation model team to close the loop between observation and improvement, contributing datasets, environments, and reward signals that drive model and agent quality.

Description

Our team builds the benchmarks, environments, and tooling that power model and agent refinement, and turns observations into actionable opportunities for the next model and agent iteration. We work across the full spectrum of evaluation: offline benchmarks, device-in-the-loop simulation, and on-device observation in production. We develop LLM-as-judge evaluators, train reward models calibrated against human feedback, optimize prompts and context for agents, and contribute targeted datasets and reward signals to foundation model post-training.

In this role, you will design and develop evaluation and refinement infrastructure that supports a broad range of AI products at Apple.

You will work on agent and model evaluation across offline, device-in-the-loop, and on-device settings; build automated prompt and context optimization pipelines; and partner with product and research teams to translate failure analysis into measurable model and agent improvements.

You will also have the opportunity to engage with product teams across Apple and contribute to advancements in large language models and agentic systems that will reach millions of users.

To succeed in this role, you should have a strong background in machine learning systems and distributed infrastructure, along with a proven track record of building and maintaining ML evaluation or training infrastructure.

You should be a proactive problem solver with excellent communication skills and the ability to work effectively across multiple codebases, teams, and organizations. Experience with LLM evaluation, reward modeling, prompt optimization, or agentic systems is highly desirable.

Minimum Qualifications

Strong background in machine learning and distributed systems

Experience building and maintaining ML infrastructure for evaluation, training, or deployment

Ability to work effectively across multiple codebases, teams, and organizations

8+ years of professional experience as a software engineer, preferably in machine learning or a related field

Bachelor's or Master's degree in Computer Science or a related field

Preferred Qualifications

Experience with LLM evaluation, LLM-as-judge, or reward modeling

Experience with prompt optimization, agent harness development, or post-training (SFT, DPO, RLHF)

Proficiency in Python and ML frameworks such as PyTorch

Experience with agentic systems, simulation environments, or trajectory-based data generation

Familiarity with on-device or privacy-preserving ML

Proactive, determined approach to problem solving

Excellent communication skills

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

Dice Id: 90733111
Position Id: 48ce6808096ec1de19f4f08e5a2c4971
