Senior Site Reliability Engineer

  • Bothell, WA
  • Posted 16 hours ago | Updated moments ago

Overview

On Site
Accepts corp to corp applications
Contract - Independent
Contract - W2
Contract - 6+ month(s)

Skills

Java
DevOps
Kubernetes
SRE
Prometheus

Job Details

Job Title: Senior Site Reliability Engineer / DevOps Engineer

Location: Bothell, WA

Duration: Contract

Term: 6+ months

Job Description:

Experience Desired: 7+ Years.

Key Responsibilities
Platform Reliability & Operations

  • Own reliability, availability, scalability, and performance of API Gateway services running on Kubernetes
  • Design and implement SRE best practices including SLIs, SLOs, SLAs, error budgets, and incident management
  • Lead production readiness reviews, root cause analysis (RCA), and post-incident improvements
  • Drive capacity planning, performance tuning, and resilience testing

Kubernetes & Cloud Engineering

  • Manage and optimize Kubernetes clusters (EKS / AKS / GKE / On-prem)
  • Develop and maintain Helm charts, manifests, and deployment strategies
  • Implement rollout strategies such as blue-green, canary, and rolling deployments
  • Collaborate with development teams to ensure cloud-native design patterns

Observability & Monitoring (Strong Focus)

  • Build and maintain enterprise-grade observability (O11y) solutions:
      • Prometheus & Grafana for metrics and dashboards
      • Splunk for centralized logging and alerting
      • OpenTelemetry for distributed tracing
  • Define actionable alerts and dashboards for platform and application health
  • Improve MTTR through better visibility and automation

CI/CD & Automation

  • Design and maintain CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, etc.)
  • Automate infrastructure using Infrastructure as Code (Terraform, CloudFormation, etc.)
  • Develop automation scripts using Python, Bash, or Groovy

Security & Compliance

  • Implement DevSecOps practices including secrets management, image scanning, and RBAC
  • Work closely with security teams on vulnerability remediation and compliance controls

Innovation & POCs

  • Actively contribute to POCs for AI Gateway / Intelligent API Gateway initiatives
  • Evaluate and prototype integrations with AI/ML-driven routing, observability, and security features
  • Stay current with emerging SRE, cloud, and AI gateway technologies

Soft Skills

  • Strong troubleshooting and problem-solving skills
  • Ability to work cross-functionally with developers, architects, and security teams
  • Proactive mindset with a passion for automation and reliability
  • Good documentation and communication skills

Key Skills:

SRE, DevOps, Java, Kubernetes, Observability

About VDart, Inc.