Overview
On Site
USD 119,800.00 - 234,700.00 per year
Full Time
Skills
Research
Microsoft Office
Innovation
Use Cases
Reasoning
Orchestration
Accountability
Routing
Dashboard
A/B Testing
Regulatory Compliance
Caching
Root Cause Analysis
Collaboration
Screening
PASS
Computer Science
C
C++
JavaScript
C#
Java
Python
TypeScript
Large Language Models (LLMs)
Artificial Intelligence
Evaluation
Cloud Computing
Microsoft Azure
Amazon Web Services
Google Cloud Platform
Google Cloud
Software Engineering
IC
Internal Communications
Integrated Circuit
SAP BASIS
Microsoft
Immigration
Military
Job Details
Overview
Copilot usage is growing across Microsoft 365 and custom agent experiences. To keep pace with diverse customer needs, regulatory requirements, and rapid innovation in the model ecosystem, we're expanding our model choice across multiple providers and modalities. A robust, data-driven evaluation and observability platform ensures we select the right model for each scenario, balancing quality, safety, latency, and cost. It also de-risks vendor lock-in while increasing resilience and agility.
Within Microsoft, our teams already compare models on capabilities, cost, and latency and visualize responsible AI metrics (e.g., groundedness, coherence, relevance, similarity) in integrated dashboards. This role accelerates and productizes those patterns for Copilot Studio makers and platform teams. You will build the backend systems, APIs, and evaluation pipelines that let Copilot and Copilot Studio safely and efficiently route requests across multiple model providers. You'll partner with platform PMs, applied scientists, and reliability engineers to instrument end-to-end quality signals, govern rollouts, and create decisioning frameworks that map model/provider selection to Copilot core use cases (authoring, reasoning, retrieval-augmented generation, multi-agent orchestration, and domain-specific tasks).
Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities
- Build multi-provider model routing services with policy-driven selection, traffic splitting, and observability.
- Develop evaluation pipelines for offline and online experiments; automate metrics like task success, groundedness, latency, and cost.
- Operationalize model comparisons through scorecards, dashboards, and A/B testing for rollout decisions.
- Harden backend infrastructure for scale, security, and compliance; implement caching, rate limiting, and safety filters.
- Instrument telemetry and monitoring using OpenTelemetry; enable RCA and performance insights.
- Collaborate cross-functionally with PMs, scientists, and Responsible AI teams; document and evangelize best practices.
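For candidates who want a concrete picture of the first responsibility, here is a minimal sketch of policy-driven selection with deterministic traffic splitting. It is an illustration only, not Microsoft's implementation: the provider names, the `POLICY` table, the `Route` fields, and the `pick_route` function are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    provider: str        # hypothetical provider label
    weight: int          # relative share of traffic for this use case
    max_latency_ms: int  # assumed worst-case latency for this provider


# Hypothetical policy table mapping a use case to weighted candidate routes.
POLICY = {
    "reasoning": [Route("provider-a", 90, 4000), Route("provider-b", 10, 6000)],
    "authoring": [Route("provider-b", 100, 2000)],
}


def pick_route(use_case: str, request_id: str, latency_budget_ms: int) -> Route:
    """Filter routes by policy, then split traffic by hashing the request ID."""
    # Policy step: keep only routes that fit the caller's latency budget.
    candidates = [
        r for r in POLICY[use_case] if r.max_latency_ms <= latency_budget_ms
    ]
    if not candidates:
        raise LookupError(f"no route for {use_case!r} within {latency_budget_ms} ms")

    # Traffic-splitting step: hash the request ID into a bucket so the same
    # request always lands on the same provider (useful for A/B comparisons).
    total = sum(r.weight for r in candidates)
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    for route in candidates:
        if bucket < route.weight:
            return route
        bucket -= route.weight
    raise AssertionError("unreachable: bucket is always below the total weight")
```

Hashing the request ID rather than drawing a random number keeps assignment stable across retries, which makes it easier to attribute quality and latency metrics to a single provider in an experiment.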
Qualifications
Required Qualifications:
- Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Preferred Qualifications:
- Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- OR Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- OR equivalent experience.
- Experience building backend services or APIs in a production environment using at least one modern programming language (e.g., C#, Java, Python, or TypeScript).
- Experience working with large language models (LLMs) or similar AI systems in any capacity (development, integration, or evaluation).
- Experience deploying or operating distributed systems or cloud-based services (e.g., Azure, AWS, or Google Cloud Platform) in a production environment.
#MCSJObs
Software Engineering IC4 - The typical base pay range for this role across the U.S. is USD $119,800 - $234,700 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $158,400 - $258,000 per year.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.