Senior Principal Software Engineer - AI GPU Innovation

Overview

On Site
USD 96,800.00 per year
Full Time

Skills

Oracle Cloud
OCI
Prototyping
Design Review
System Integration
Performance Testing
Management
Machine Learning (ML)
Deep Learning
Operational Excellence
Evaluation
Systems Architecture
Hardware Development
Collaboration
Oracle Engineering and Construction
Benchmarking
Performance Analysis
Training
Performance Tuning
Algorithms
Computer Science
Software Development
Java
Golang
C#
Object-Oriented Programming
GPU
Open Source
x86
ARM
Debugging
ROOT
Conflict Resolution
Problem Solving
Communication
IT Management
Amazon Web Services
Microsoft Azure
Performance Appraisal
Artificial Intelligence
Optimization
Computer Hardware
BMC
Firmware
UEFI
BIOS
Linux
Scripting
Customer Facing
Recruiting
Health Care
Taxes
Financial Planning
Legal
Insurance
Integrated Circuit
IC
Internal Communications
Cloud Computing
Value Engineering
Innovation
Life Insurance
Accessibility
Oracle
Law

Job Details

Job Description

Oracle Cloud Infrastructure's (OCI) architecture development engineering team is seeking a highly driven GPU platform software & system development engineer at the Principal Engineer level. We are at the forefront of AI innovation, exploring the next generation of AI accelerators and hardware solutions.

As a Senior Principal Software Engineer on our growing team, you will evaluate, prototype, and optimize cutting-edge AI hardware and accelerators, including custom-designed AI chips, systems, and software, to drive next-generation Cloud AI Infrastructure platforms.

You will contribute to platform definition, platform development oversight, in-house development, design reviews, system integration, and performance testing and characterization. You will work closely with third-party GPU IC suppliers and partners, as well as internal hardware and software development teams, to help drive Oracle's AI Cloud platform solution space. You will be a critical part of the team developing Oracle's growing Cloud AI infrastructure solutions.

You will work with the latest AI hardware architectures, benchmark their performance, and collaborate with software engineers to ensure tight integration with AI workloads. You'll have a direct impact on shaping the future of AI hardware for machine learning and deep learning applications.

Career Level - IC5

Responsibilities

Our Senior Principal engineers work independently and provide technical leadership to the broader organization. You should have experience developing AI infrastructure and operating high-scale services, and an understanding of how to make these cloud-scale services resilient. The ideal candidate will be technically strong and productive: someone who knows how to balance speed and quality with iterative and incremental improvements. You understand operational excellence and know how to instill a proactive culture within your team. You recommend and justify major changes to new and existing products and establish consensus with data-driven approaches.
  • Evaluate system architectures and analyze proposed implementation paths.
  • Work directly with hardware design and development teams on architecture, implementation, development, deployment, and troubleshooting of AI hardware platforms. Collaboration is also expected with the wider Oracle engineering and operations functional groups as well as our external partners.
  • Conduct comprehensive benchmarking and performance analysis of AI accelerators from emerging hardware vendors (e.g., SambaNova, Groq).
  • Compare and contrast new AI accelerators with industry-standard hardware (e.g., NVIDIA GPUs) for training and inference workloads.
  • Develop tools and processes for evaluating the performance of hardware in real-world AI applications.
  • Contribute to the design and improvement of performance optimization algorithms for AI models running on the hardware.

Basic Qualifications
  • BS or MS degree in Computer Science or relevant technical field involving coding or equivalent practical experience.
  • 10+ years of total experience in software development.
  • Demonstrated ability to write great code using Java, GoLang, C#, or similar OO languages.
  • Solid knowledge of AI/GPU platform architectures and their capabilities.
  • Experience working on large-scale, highly distributed services infrastructure.
  • Solid working experience with GPU supplier test code as well as open-source AI test / characterization tools.
  • Experience with the architecture, design, and implementation of modern server platforms consisting of multiple architectures and vendors, including x86 and ARM server architectures.
  • Demonstrated experience debugging and root-causing complex issues that may have a mix of hardware and software causes.
  • Systematic problem-solving approach, strong communication skills, a sense of ownership, and drive.

Preferred Qualifications
  • Experience as technical lead on a large-scale cloud service.
  • Hands-on experience developing and maintaining services on a public cloud platform (e.g., AWS, Azure, Oracle).
  • Experience with AI accelerator chips (e.g., SambaNova, Groq).
  • Knowledge of AI accelerator benchmarks and tools for performance evaluation (e.g., MLPerf, DeepBench).
  • Understanding of AI model optimization techniques for hardware acceleration.
  • Strong understanding of, and experience running, firmware and system diagnostics tools using BMC firmware, UEFI/BIOS, and Linux tools. Skilled in scripting to customize tests.

Qualifications

Disclaimer:

Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates.

Range and benefit information provided in this posting is specific to the stated locations only.

US: Hiring Range in USD from: $96,800 to $251,600 per annum. May be eligible for bonus, equity, and compensation deferral.

Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle's differing products, industries and lines of business.
Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.

Oracle US offers a comprehensive benefits package which includes the following:
1. Medical, dental, and vision insurance, including expert medical opinion
2. Short term disability and long term disability
3. Life insurance and AD&D
4. Supplemental life insurance (Employee/Spouse/Child)
5. Health care and dependent care Flexible Spending Accounts
6. Pre-tax commuter and parking benefits
7. 401(k) Savings and Investment Plan with company match
8. Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
9. 11 paid holidays
10. Paid sick leave: 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
11. Paid parental leave
12. Adoption assistance
13. Employee Stock Purchase Plan
14. Financial planning and group legal
15. Voluntary benefits including auto, homeowner and pet insurance

The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.

About Us

As a world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's challenges. We've partnered with industry leaders in almost every sector and continue to thrive after 40+ years of change by operating with integrity.

We know that true innovation starts when everyone is empowered to contribute. That's why we're committed to growing an inclusive workforce that promotes opportunities for all.

Oracle careers open the door to global opportunities where work-life balance flourishes. We offer competitive benefits based on parity and consistency and support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.

We're committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing or by calling +1 in the United States.

Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans' status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.