Job Details
A globally leading technology company is looking for a highly skilled Data Scientist + ML Engineer (Generative AI) to join the team. In this role, you will be responsible for developing, fine-tuning, and applying advanced generative AI models including diffusion models, large language models (LLMs), and other state-of-the-art architectures. You will collaborate closely with cross-functional partners in research, data engineering, and operations to deliver high-quality machine learning solutions and scalable datasets.
This position requires a balance of technical depth and creative problem-solving. You should be comfortable working with large, complex datasets and possess a strong grasp of modern ML frameworks, distributed computing environments, and end-to-end data pipelines.
RESPONSIBILITIES:
- Design and implement LLM-driven synthetic data pipelines: build workflows using LLMs and generative AI techniques to create high-volume, high-quality synthetic data for model training and testing
- Design, implement, and deploy machine learning models with a focus on generative AI (diffusion models, LLMs, and related architectures)
- Fine-tune, evaluate, and optimize large language models for specific downstream tasks and data needs
- Develop and maintain scalable data pipelines supporting training, evaluation, and inference workflows
- Conduct exploratory data analysis to surface insights and identify opportunities for model or data improvement
- Partner cross-functionally with researchers, engineers, and data program managers to define requirements and deliver high-impact ML solutions
- Build and enhance internal tools, libraries, and automation workflows to accelerate experimentation and iteration
QUALIFICATIONS:
- Bachelor's degree in Computer Science or related field from an accredited U.S. institution
- 2+ years of experience in Machine Learning or Software Engineering
- Expert-level proficiency in Python and familiarity with deep learning frameworks such as PyTorch
- Strong foundation in machine learning algorithms, data preprocessing, and evaluation techniques
- Demonstrated experience working with diffusion models (such as Stable Diffusion) or large language models (LLMs)
- Excellent analytical, problem-solving, and debugging skills
- Strong communication and documentation skills with the ability to explain complex concepts clearly
- Ability to work independently in a fast-paced, iterative development environment
Type: Contract
Duration: 12+ months
Work Location: Cupertino, CA (Hybrid)
Pay range: $65-75 (DOE)