A fast-growing AI data platform is currently looking for a Site Reliability Engineer to join their U.S.-based infrastructure team. This individual will play a key role in building and securing the infrastructure behind a platform that powers complex data workflows for the aerospace and defense industry.
In this high-impact role, you'll work closely with a global engineering team to design, build, and scale cloud-native infrastructure supporting AI/ML workloads. The ideal candidate will bring strong hands-on experience across Google Cloud Platform, Kubernetes, and infrastructure as code, with a focus on reliability, security, and performance. You'll help define best practices around CI/CD, observability, and incident response while contributing to the evolution of a modern, multi-cloud platform. This is an opportunity to take ownership in a growing environment where you'll have real influence over how systems are built and operated at scale.
This is a fully remote position open to candidates based in the U.S., with occasional travel to a Chicago office (1-2 weeks per quarter).
Required Skills & Experience
- 5+ years of experience in Site Reliability, DevOps, or Cloud Engineering
- Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent professional experience)
- Hands-on experience with Google Cloud Platform, Kubernetes, and Terraform
- Strong understanding of CI/CD pipelines (GitHub Actions, Jenkins, or similar)
- Experience building and maintaining scalable, highly available cloud infrastructure
- Familiarity with observability tools such as Prometheus, Grafana, ELK, or similar
- Experience with incident response, monitoring, and production support environments
- Strong scripting or programming experience (Python, Go, Bash, or similar)
- Understanding of security best practices in cloud environments
Desired Skills & Experience
- Experience working in a startup or product-focused environment
- Exposure to multi-cloud infrastructure (AWS + Google Cloud Platform)
- Experience supporting AI/ML workloads or data platforms
- Familiarity with DevSecOps practices and automated security tooling
- Experience improving developer experience through platform or "golden path" tooling
What You Will Be Doing
Tech Breakdown
- 100% Google Cloud Platform
Daily Responsibilities
The Offer
You will receive the following benefits:
- Medical, Dental, and Vision Insurance
- Vacation Time
Applicants must be authorized to work in the U.S. on a full-time basis, now and in the future.