Senior QA Engineer, Performance & Reliability

San Jose, CA, US • Posted 26 days ago • Updated 26 days ago
Contract Independent
Contract W2
On-site
$50 - $70/hr

Job Details

Skills

  • Quality Assurance
  • TPM
  • ROOT
  • Linux
  • Firmware
  • Embedded Systems
  • Semiconductor
  • PCI

Summary

Job Description

We are hiring a Senior QA Engineer, Performance & Reliability, to lead the performance characterization and reliability validation of our Secure TCU System, ensuring it meets rigorous data center standards.

In this role, you will own the test design, execution, and deep-dive analysis for performance and reliability, working closely with development teams to identify bottlenecks and resolve complex system-level issues.

Key Responsibilities

Performance & Reliability Strategy

  • Test Design & Execution: Design and execute comprehensive test plans for performance benchmarking, stress testing, longevity/endurance testing, and thermal/power characterization of TCU/BMC systems.

  • Workload Analysis: Analyze system behavior under various heavy workloads to identify performance bottlenecks in throughput, latency, and resource utilization (CPU, Memory, PCIe).

  • Reliability Validation: Conduct Mean Time Between Failures (MTBF) prediction, long-duration stability tests, and error injection campaigns to validate system robustness.

Deep Dive & Issue Resolution

  • Root Cause Analysis: Lead the deep-dive investigation of performance degradation and reliability failures. Use advanced debugging tools (oscilloscopes, logic analyzers, firmware traces) to isolate issues.

  • Developer Collaboration: Work directly with firmware and hardware engineers to reproduce complex bugs, analyze crash dumps, and verify fixes.

  • Infrastructure Enhancement: Develop and maintain automated performance testing frameworks and reporting dashboards to track regression and trends over time.

Reporting & Leadership

  • Reporting: Generate detailed performance assessment reports and reliability analysis metrics for stakeholders.

  • Mentorship: Mentor junior engineers on performance testing methodologies and system debugging techniques.

Qualifications

  • Experience: 5+ years of experience in embedded system testing, with a strong focus on performance verification and reliability engineering.

  • System Knowledge: Deep understanding of TCU, BMC, HMC, RoT (Root of Trust), Secure Boot, TPM, HSM, PCIe (Gen4/5), DDR memory, and networking protocols.

  • Performance Tools: Proficiency with performance profiling tools, traffic generators, and standard benchmarks (e.g., SPEC, IOzone, iperf). Experience with thermal and power measurement tools.

  • Programming: Strong scripting skills in Python for test automation and data analysis; familiarity with C/C++ for code analysis is a plus.

  • Operating Systems: Strong Linux/Unix skills, including kernel tuning, system monitoring, and log analysis.

  • Tools: Experience with CI/CD pipelines (Jenkins, GitLab CI) and version control (Git).

  • Education: BS/MS degree in Computer Science, Electrical Engineering, or a related field.

Ways to Stand Out

  • Experience with AI-driven log analysis or anomaly detection tools to predict reliability issues.

  • Background in validation of high-speed interfaces (PCIe, CXL) and memory subsystems (DDR5/LPDDR5).

  • Experience with data center server architecture and thermal management.

  • Knowledge of industry reliability standards (e.g., Telcordia, JEDEC).

Note: We do not expect candidates to meet every single requirement. A strong core of these skills with a problem-solving mindset is what we value most.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 10426227
  • Position Id: 8878197

Company Info

About Aziro Technologies LLC

Aziro (formerly MSys Technologies; pronounced "Ah-zee-roh") is an AI-native product engineering company driving innovation-led transformation for global enterprises, high-growth ISVs, and AI-first pioneers.
