The System Intelligence Machine Learning (SIML) organization is looking for Research Engineers with a strong foundation in Machine Learning and Computer Vision to develop the next generation of multi-modal Human Sensing technologies. You will be part of a fast-paced, impact-driven Applied Research organization building foundation models for facial and full-body perception, working on cutting-edge machine learning at the heart of the most loved features on Apple platforms, including Apple Intelligence, Camera, Photos, and Visual Intelligence. These innovations form the foundation of the seamless, intelligent experiences our users enjoy every day!
As a Machine Learning Research Engineer, you will design and develop cutting-edge AI/ML models for Human Sensing, with a focus on building robust cross-domain identity recognition systems. Multi-modal Human Sensing is a foundational capability that powers intelligent experiences based on key human traits such as identity, expression, clothing, action, gesture, gaze, and human-object interaction. Major Apple Intelligence experiences such as personalized Natural Language Search, Memories Creation, and personalized Image Generation are powered by our ability to learn robust representations of visual human traits. Efficient real-time visual human sensing powers flagship Photography experiences such as Cinematic mode and Photographic Styles, communication experiences such as Center Stage, and paves the way for more natural human-device interactions, e.g., with the DockKit framework.

YOUR PRIMARY RESPONSIBILITIES WILL INCLUDE:
- Designing, implementing, and deploying state-of-the-art visual recognition systems.
- Building foundation models for facial and full-body perception.
- Driving data quality excellence through strategic dataset curation, validation, and generation to support world-class model development.
- Building tools and frameworks for systematic failure analysis, identifying edge cases, and driving continuous model improvement.
- Interacting directly with cross-functional stakeholders to gather product requirements and translate them into actionable plans for ML research and development.
- Communicating results and insights effectively to partners and senior leaders, providing clear and actionable recommendations.
- Staying current with the latest trends, technologies, and standard methodologies in machine learning, multi-modal foundation models, computer vision, and natural language understanding.
- Actively contributing to Apple's ML community by disseminating research ideas and results, enhancing shared infrastructure, and mentoring fellow practitioners.
MINIMUM QUALIFICATIONS:
- Master's or Ph.D. in Computer Science, Computer Engineering, or a related field; or equivalent professional experience in Computer Vision (CV) and Machine Learning (ML) research and development.
- Proven track record of training, fine-tuning, and evaluating deep learning models for vision tasks using modern ML architectures.
- Background in research and innovation, demonstrated through publications in top-tier journals or conferences, patents, or impactful software developments.
- Proficiency in Python and PyTorch or equivalent deep learning frameworks.
PREFERRED QUALIFICATIONS:
- Expert-level knowledge of state-of-the-art methods in face recognition or other facial analysis and biometric systems.
- Hands-on experience training and scaling multi-modal large language models (LLMs) or large-scale vision-language models (VLMs).
- Experience with on-device ML, model optimization (knowledge distillation, quantization, pruning), or production-grade ML pipelines.