Senior DevOps Engineer

Palo Alto, CA, US • Posted 7 hours ago • Updated 3 hours ago
Contract W2
On-site
$95/hr

Job Details

Skills

  • Senior DevOps Engineer

Summary

Job description
Company is helping our client find a Senior DevOps Engineer to provide follow-the-sun coverage for the ADAS line of business, ensuring platform stability, SLA compliance, and rapid incident response during West Coast business hours.
In this role, you'll deliver critical coverage during Japan's off-hours, enabling 24/7 SLA adherence for ADAS production systems running on a multi-tenant ecosystem. You'll partner closely with Japan-based DevOps teams, ADAS engineering (19+ engineers, scaling), and platform teams to maintain uptime, reliability, and operational excellence across multiple production environments.
The ideal candidate is an experienced DevOps/SRE professional who thrives in high-availability environments, is comfortable owning incidents end-to-end, and brings a strong automation-first mindset.

As a Senior DevOps Engineer, you'll:
  • Provide SEV-1/SEV-2 incident coverage during PST/PDT hours, ensuring 24/7 SLA adherence and meeting the 2-hour initial response target.
  • Deploy, manage, and scale AWS infrastructure (S3, Aurora/Postgres, IAM, Route 53, WAF, CloudFront) using Terraform and infrastructure-as-code best practices.
  • Build, maintain, and optimize CI/CD pipelines (GitHub Actions) to support consistent, reliable deployments across multiple environments (dev, stage, pre-production, production).
  • Monitor and support AWS and Kubernetes-based workloads (Stargate platform) to meet a 99.5% uptime target.
  • Track application and infrastructure performance using tools such as Prometheus, Grafana, Sentry, and related observability platforms.
  • Respond to incidents, perform root cause analysis, and implement corrective actions to improve system reliability.
  • Identify manual processes and implement automation to improve efficiency, reduce deployment times, and minimize operational overhead.
  • Integrate security best practices into CI/CD pipelines and infrastructure, ensuring compliance with DevSecOps and IaC standards.
  • Develop and maintain runbooks, documentation, and best practices to support knowledge sharing and seamless operations across global teams.
Ideal candidate profile
Nice to Have (WANT)
  • Solid experience working with a service mesh (e.g., Istio).
  • Security Knowledge: Understanding of DevSecOps principles, including secure deployment practices, vulnerability scanning, and incident response.
  • Solid understanding of the software development lifecycle (SDLC) and agile delivery (Scrum / Kanban).
  • Prior experience in multi-tenant or enterprise-scale platforms.
  • Experience with backup/restore automation and disaster recovery procedures.
Daily tasks
Responsibilities
  • Follow-the-Sun Incident Response: Provide SEV-1/SEV-2 incident coverage during PST/PDT hours (JP team off-hours), ensuring the contracted 2-hour initial response SLA is met around the clock.
  • Infrastructure Management: Deploy and maintain cloud-based infrastructure on AWS (S3, Aurora/Postgres, IAM, Route 53, WAF, CloudFront), leveraging IaC practices (Terraform) for scalability and reliability.
  • Pipeline Management: Build, maintain, and improve CI/CD pipelines (GitHub Actions) to ensure efficient and consistent delivery of software across ADAS tenant environments (dev, stage, pre-production, production).
  • Platform Stability & Uptime: Monitor cross-account AWS infrastructure and Kubernetes workloads (Stargate-based) to maintain the 99.5% monthly uptime target aligned with AWS/Stargate SLAs.
Required skills
Required (MUST)
  • 3+ years of professional DevOps / SRE experience.
  • CI/CD Pipelines: Proficiency in setting up, maintaining, and troubleshooting CI/CD pipelines (e.g., GitHub Actions).
  • Containerization: Solid experience with Docker and Kubernetes, including deployment, scaling, and management.
  • Cloud Providers: Hands-on experience with AWS (strongly preferred). Strong understanding of IaaS and PaaS offerings, IAM, and networking within cloud environments.
  • Infrastructure as Code (IaC): Proficiency with Terraform for managing cloud infrastructure.
  • Monitoring & Logging: Experience with monitoring and logging tools (e.g., Prometheus, Grafana, Sentry, ELK stack) for performance tracking and troubleshooting.
  • Incident Management: Experience with on-call rotations, incident triage, and follow-the-sun support models.
  • Communication: Strong communication skills in cross-functional environments involving engineers, product owners, and leadership, particularly across time zones (JP/NA coordination).
  • Dice Id: comrise
  • Position Id: 2026-27286