Join Apple Services Engineering to build the next generation of AI evaluation systems. We are seeking machine learning platform engineers at multiple levels (Mid-Level to Principal) to architect and build high-availability services and internal tools that enable self-service evaluation at scale. You will partner with researchers to operationalize their innovations, transforming complex workflows into intuitive, developer-first platforms. We are looking for builders who thrive in the ambiguity of new initiatives and are passionate about creating scalable infrastructure.
You will join the engineering team responsible for democratizing AI evaluation across the organization. Your focus will be on the developer experience: architecting and implementing the APIs, SDKs, and platform services that turn complex evaluation metrics into simple, self-service calls. You will work hand-in-hand with researchers to operationalize sophisticated measurement techniques, ensuring they scale reliably within our high-availability infrastructure. In this role, you will drive the engineering standards for a new organization, upholding the code quality, automation, and testing rigor required to support the rapid evolution of Generative AI and Agentic systems.
- 2+ years of hands-on software engineering experience (or a Master's degree with relevant project experience). Note: we are hiring across multiple seniority levels; expectations will scale with experience.
- Strong proficiency in the Python ecosystem (e.g., FastAPI, Pydantic, Pandas). You are capable of writing production-grade code and contributing to architectural discussions on day one.
- Customer Obsession & Product Thinking: experience acting as a technical partner to internal customers. You can translate vague requirements from other teams into concrete engineering specifications.
- Demonstrated experience partnering with Data Scientists or Researchers: you can navigate the ambiguity of research workflows and operationalize scientific code.
- Functional literacy in AI/ML concepts: you understand the fundamental lifecycle of machine learning (datasets, training vs. inference, evaluation metrics) and can discuss the engineering challenges involved in serving models.
- Strong expertise in API Design & Internal Tools: you have built APIs that other developers rely on, with a focus on versioning, backward compatibility, and developer experience.
- Operational excellence background: practical experience with CI/CD pipelines, containerization (Docker/Kubernetes), and monitoring (Datadog/Prometheus).
- BS in Computer Science; Master's preferred.
- Experience building MLOps & Platform Infrastructure: you have architected foundational AI infrastructure such as model registries, inference services, or feature stores (using tools like Kubernetes, Ray, or Kubeflow).
- Deep familiarity with AI Evaluation Frameworks: you have used or contributed to modern evaluation tools like DeepEval, Ragas, TruLens, or LangSmith, and understand how to implement and scale model-based evaluation workflows.
- Deep understanding of Generative AI & Agents: you understand the engineering challenges of relying on LLMs and agents as software components, specifically managing token economics, handling rate limits, and evaluating non-deterministic, multi-step reasoning capabilities.
- Builder Experience: you have thrived in startup-like environments, navigating high ambiguity to deliver complex technical roadmaps from scratch.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
- Dice Id: 90733111
- Position Id: 566f3d7b75c8f673fdad8de2f9ee2ed4
- Posted 20 hours ago