Infrastructure Data Engineer (Kubernetes-Golang)
W2 Contract
Salary Range: $135,200 - $156,000 per year
Location: Cupertino, CA - Remote
Job Summary:
We are looking for a highly skilled Senior Kubernetes & Data Infrastructure Engineer with deep expertise in distributed systems, Kubernetes (on-prem), cloud platforms, API development, and Golang-based software engineering. The ideal candidate has hands-on experience building and operating large-scale distributed systems, developing migration tooling, and supporting production-grade Kubernetes clusters.
Duties and Responsibilities:
- Own daily operational support for on-premise Kubernetes clusters, ensuring reliability, availability, scalability, and performance.
- Develop and maintain a large-scale batch orchestration platform running 100,000+ jobs per day.
- Build migration tooling to facilitate the transfer of job configurations and workloads to new data platforms.
- Design and implement distributed data systems for high availability, resilience, and performance.
- Write efficient, high-performance Golang code for automation, tooling, and platform services.
- Create operational scripts and tooling (Bash, Python) for automation, observability, and infrastructure workflows.
- Implement and improve CI/CD pipelines and DevOps standards across Kubernetes environments.
- Set up, manage, and optimize Prometheus, Grafana, and monitoring/alerting pipelines for full-stack observability.
- Troubleshoot distributed systems and Kubernetes workloads in production environments.
- Participate in system design interviews, Kubernetes design reviews, and cross-team technical discussions.
- Collaborate with platform, SRE, and data engineering teams to enhance platform resilience and operational efficiency.
Requirements and Qualifications:
- 4-8+ years of experience in Data Infrastructure, Platform Engineering, DevOps, or SRE roles.
- Deep hands-on operational experience with Kubernetes, especially in on-premise environments.
- Strong proficiency in Golang development (core language + concurrency + distributed systems patterns).
- Experience designing and supporting distributed systems at scale.
- Strong scripting skills (Bash/Shell; Python is a plus).
- Experience with Prometheus, Grafana, alerting systems, and end-to-end observability stacks.
- Solid understanding of Linux systems, networking, container runtimes, and CI/CD pipelines.
- Experience building or maintaining systems that run large-scale workloads (tens of thousands of jobs).
- Strong analytical, debugging, and distributed system troubleshooting skills.
- Excellent communication skills and ability to work across engineering teams.
- High ownership mindset with a focus on reliability, sustainability, and operational excellence.
- Ability to work in a fast-paced, high-impact environment.
Preferred Qualifications:
- Experience with hybrid cloud + on-prem Kubernetes architectures.
- Familiarity with service mesh (Istio, Linkerd) and advanced Kubernetes networking.
- Exposure to data engineering workflows or batch processing frameworks.
- Experience with GitOps tooling (ArgoCD, Helm, Kustomize).
- Knowledge of infrastructure security, RBAC, certificates, and cluster hardening.
- Previous experience supporting critical production systems at a very large scale.
Bayside Solutions, Inc. is not able to sponsor any candidates at this time. Additionally, candidates for this position must qualify as W2 candidates.
Bayside Solutions, Inc. may collect your personal information during the position application process. Please reference Bayside Solutions, Inc.'s CCPA Privacy Policy.