We're building the next generation of AI evaluation systems, and we're looking for a hands-on engineer who can bridge ML, software, and product to make AI systems more measurable, testable, and trustworthy.

We're part of the AI/ML Evaluation organization, seeking a Senior or Staff-level Applied ML Engineer with strong software engineering skills and a solid understanding of machine learning. In this hands-on role, you'll help design and build intelligent systems that simulate complex interactions (including agentic workflows powered by LLMs), develop tools for extracting structured insights, and create robust evaluation datasets.

You'll also contribute to building scalable platforms for simulation and behavior analysis. This role sits at the intersection of ML, engineering, and product, and is ideal for someone passionate about bringing clarity and rigor to real-world AI performance.

We're looking for a pragmatic engineer who thrives at the intersection of machine learning and software development, capable of building robust, scalable systems that support evaluation and development of advanced AI capabilities, including large language models and agentic behaviors.

A successful candidate is comfortable navigating ML, systems, and product domains. You bring strong software engineering fundamentals, experience building and maintaining end-to-end pipelines, and a practical understanding of how to evaluate AI systems in real-world contexts. You're curious about how LLMs behave in interactive or agentic settings, thoughtful about evaluation design, and eager to build tools that improve visibility and trust in AI. Above all, you enjoy collaborating across disciplines and bringing structure to complex, evolving problems.

- 8+ years of experience in software engineering, ML engineering, or applied ML roles
- Proficiency in Python or another modern programming language (e.g., Java, Go, Swift)
- Experience building and maintaining production-grade systems
- Solid understanding of machine learning concepts, especially LLMs and their applications
- Excellent communication and collaboration skills with cross-functional partners

- Experience working on AI evaluation systems, LLM-based simulations, or agentic AI frameworks
- Background in building tools for data analysis, model evaluation, or synthetic data generation
- Familiarity with metrics instrumentation and observability in ML systems
- Experience designing pipelines for AI/ML workflows
- Exposure to applied research, generative models, or real-time systems
- Understanding of how model quality connects to product outcomes and user experience
- Dice Id: 90733111
- Position Id: 4ceba099d3e493d9344e2826bf993ef6
- Posted 20 hours ago