Applied Machine Learning Research Engineer - Multimodal LLMs for Human Understanding

Sunnyvale, CA, US • Posted 11 hours ago • Updated 11 hours ago

Full Time

On-site

Job Details

Skills

Real-time
Product Requirements
Research
Algorithms
Large Language Models (LLMs)
Video
Collaboration
Innovation
Training
Python
Natural Language Processing
Fusion
Generative Artificial Intelligence (AI)
Deep Learning
JAX
PyTorch
Computer Science
Electrical Engineering
FOCUS
Artificial Intelligence
Machine Learning (ML)
Computer Vision

Summary

We're starting to see the incredible potential of multimodal foundation and large language models, and many applications in the computer vision and machine learning domain that previously appeared infeasible are now within reach. We are looking for a highly motivated and skilled Applied Machine Learning Research Engineer to join our team in the Video Computer Vision group and help us push the boundaries of human understanding. The Video Computer Vision org has pioneered human-centric real-time features such as FaceID, FaceKit, and gaze and hand gesture control, which have changed the way millions of users interact with their devices. We balance research and product requirements to deliver Apple-quality, pioneering experiences, innovating through the full stack, and partnering with HW, SW and AI teams to shape Apple's products and bring our vision to life.

You'll work on groundbreaking research projects to advance our AI and computer vision capabilities, contribute to both foundational research and practical applications on multimodal large language models, and design, implement, and evaluate algorithms and models for human understanding. You have a strong background in developing and exploring multimodal large language models that integrate diverse data modalities such as text, image, video, and audio. You'll have the opportunity to collaborate with multi-functional teams, including researchers, data scientists, software engineers, human interface designers and application domain experts. You'll stay up to date on the latest advancements in AI, machine learning, and computer vision and apply this knowledge to drive innovation within the company.

Experience developing and training/tuning multimodal LLMs.
Programming skills in Python.
Master's degree with a minimum of 3 years of relevant industry experience.

Expertise in one or more of: computer vision, NLP, multimodal fusion, Generative AI.
Experience with at least one deep learning framework such as JAX, PyTorch, or similar.
Publication record in relevant venues.
PhD in Computer Science, Electrical Engineering, or a related field with a focus on AI, machine learning, or computer vision.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

Dice Id: 90733111
Position Id: 508f1970922665b3840640fb724fe42c
