Job Details
A globally leading technology company is looking for an Applied ML Scientist to design and evaluate next-generation machine learning and GenAI systems that power real-world applications. In this role, you'll develop advanced LLM-based evaluation frameworks, prototype AI-driven solutions, and collaborate closely with cross-functional teams to bring cutting-edge research into production. If you're passionate about building impactful ML systems and driving innovation at scale, we invite you to apply!
Job Responsibilities:
Design, develop, and evaluate advanced machine learning models and AI-driven evaluation systems (e.g., LLM-as-a-judge, automated evaluation frameworks) to enhance model performance and reliability.
Prototype, test, and deploy scalable GenAI, LLM, and NLP/NLU solutions for real-world applications, ensuring strong alignment between research insights and production impact.
Collaborate cross-functionally with research, product, and engineering teams to define requirements, identify opportunities for innovation, and translate business needs into technical solutions.
Implement robust software engineering and MLOps practices to ensure reproducible and maintainable ML workflows.
Minimum Qualifications:
Strong foundation in machine learning fundamentals with the ability to tackle complex ML challenges.
Experience or proven interest in designing and implementing AI-driven approaches to evaluation (e.g., LLM-as-a-judge, automated evaluation).
Demonstrated ability to develop high-impact language model systems for real-world applications.
Expertise in GenAI, LLM, and/or NLP/NLU evaluation.
Demonstrated ability to identify research directions, rapidly prototype solutions, and drive them to practical impact.
Proficient in software engineering best practices (e.g., modular software design, testing).
Strong proficiency in Python.
Strong proficiency in PyTorch, TensorFlow, or JAX.
Excellent communication skills with a proven ability to engage diverse stakeholders.
Experience with MLOps standards, including containerization, orchestration (e.g., Kubernetes), and CI/CD.
Preferred Qualifications:
Depth in one or more areas is acceptable; candidates are not expected to excel in every listed skill:
Proven experience developing and owning high-impact, developer-facing systems and tools.
Experience developing and evaluating complex agentic systems using LLMs.
Experience adapting and aligning LLMs through various training strategies (e.g., continued pre-training, supervised fine-tuning, and reinforcement learning).
Expertise in uncertainty estimation and calibration, active learning, or related problem spaces.
Experience with ML platform design or ownership.
Hands-on experience with large-scale data processing frameworks (e.g., Spark, PySpark, Dask, or Ray).
Track record of contributions to open-source ML projects or publications in top-tier ML conferences (e.g., NeurIPS, ICML, ACL).
Type: Contract (W2)
Duration: 12 months (with extension possible)
Work Location: Seattle, WA (remote)
Pay range: $100.00 - $120.00/hr