Overview
USD 119,800.00 - 234,700.00 per year
Full Time
Skills
Reliability Engineering
Customer Service
Roadmaps
Computer Hardware
Innovation
Collaboration
Accountability
Systems Management
Operational Excellence
Build Tools
Debugging
Microsoft Azure
Artificial Intelligence
Storage
Management
IaaS
Screening
PASS
Computer Science
C
C++
C#
Java
JavaScript
Python
Machine Learning (ML)
GPU
High Performance Computing
HPC
Cloud Computing
Virtualization
Software Engineering
IC
Integrated Circuit
Internal Communications
Legal
Recruiting
Microsoft
Job Details
The Microsoft Azure Artificial Intelligence and High Performance Computing (AI and HPC) team is seeking systems engineers to support customers in deploying, monitoring, profiling, and debugging their applications on hyperscale cloud infrastructure. Azure is enabling some of the largest supercomputing deployments in the public cloud, as demonstrated by its presence in rankings such as Top500, Machine Learning Performance (MLPerf), and Graph500.
Operating at supercomputing scale requires specialized tools and techniques to ensure system reliability, runtime performance, and job health, while continuing to meet customer Service Level Agreements (SLAs). In this role, you will develop and apply advanced tools, identify operational gaps, and implement features that support the smooth operation of cloud-native supercomputers.
As a Senior Software Engineer, you will help establish best practices, influence architectural decisions, and contribute to the roadmap of key software and hardware components. Your work will directly impact a broad range of users and drive the next wave of innovation in artificial intelligence and high performance computing in the cloud.
Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities:
- Be part of a comprehensive systems management team focused on operational excellence and customer success.
- Build tools and analyze key system metrics and telemetry to proactively identify and debug HPC system issues.
- Partner with customers, vendors, and other teams within Azure to drive comprehensive solutions for operating world-class supercomputers in the public cloud.
- Help ensure the Azure platform performs consistently, scales on demand, and is engineered to withstand the unparalleled computing demand of customer workloads.
- Contribute to a test-driven engineering culture that reduces regressions and bugs in production and sets a higher bar for infrastructure quality.
Qualifications:
Required Qualifications:
- Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- OR equivalent experience.
- 3+ years of experience in operating AI/HPC systems, developing and running AI/HPC applications on clusters, or operating Cloud Infrastructure.
- 2+ years of specialized experience in one of the following: AI/HPC system management, high-speed networks, HPC storage, or cloud infrastructure management.
- Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
Preferred Qualifications:
- Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- OR Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
- OR equivalent experience.
- 1+ year(s) of experience running and troubleshooting machine learning workloads on Graphics Processing Unit (GPU)-based High Performance Computing (HPC) systems, including familiarity with the HPC software stack.
- 1+ year(s) of experience with cloud computing, virtualization, and container technologies.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:
Microsoft will accept applications for the role until November 6, 2025.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.
Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
#azurecorejobs