AWS SRE Cloud Administrator
Assignment Duration: 12+ Months
Engagement Type: Contract
Work Location: Santa Monica, CA & McLean, VA - Onsite (Hybrid)
Define and operationalize SLIs, SLOs, error budgets, and reliability
governance models.
Lead incident command, RCA, and long-term reliability remediation for
large-scale systems.
Engineer and tune Java-based microservices (JVM internals, tuning
strategies, memory profiling).
Design and operate GKE (Google Kubernetes Engine) at scale, including
multi-cluster and fleet management.
Implement Google Cloud Platform-native architectures using:
o GKE, Compute Engine, Cloud Load Balancing
o Cloud Spanner, Bigtable, Cloud SQL
o Pub/Sub, Cloud Storage
o IAM, VPC Service Controls
Build secure and repeatable infrastructure using Terraform and
policy-as-code.
Design advanced service mesh and traffic management using Istio / Anthos
Service Mesh.
Implement advanced Kubernetes storage using Portworx for stateful
workloads.
Support event-driven architectures using Kafka, Kafka Streams, ksqlDB,
and Spark Streaming.
Integrate Google Cloud Platform-native streaming solutions such as Pub/Sub.
Optimize systems for low-latency, high-throughput workloads.
Implement advanced observability using Prometheus, Datadog, Splunk,
Kiali.
Leverage eBPF for kernel-level tracing, networking diagnostics, and
performance tuning.
Manage advanced ingress, load balancing, and traffic shaping using Nginx
Controller and Seesaw.
Architect high-scale CI/CD pipelines using GitLab CI/CD, Jenkins, and
Google Cloud Platform-native tooling.
Build internal developer platforms (PaaS) to standardize deployments and
reduce toil.
Automate operations using Python, Go, Bash, and custom reliability
tooling.
Required Technical Expertise:
Java (Advanced JVM internals, performance tuning)
Google Cloud Platform (Professional-level depth)
GKE/Kubernetes (CKA/CKS depth)
Docker, Terraform
CI/CD: GitLab CI/CD, Jenkins
Streaming: Kafka, Kafka Streams, ksqlDB, Spark
Service Mesh: Istio, Anthos Service Mesh
Monitoring & Logging: Prometheus, Datadog, Splunk, Kiali
OS & Scripting: Linux/Unix, Bash
Programming: Python or Go
Virtualization: VMware
Networking & Performance: eBPF, Nginx Controller, Seesaw
Multi-cluster Kubernetes governance
Internal platform engineering (PaaS)
High-traffic SaaS or consumer-scale platforms
Real-time streaming & event-driven architectures
Deep observability and kernel-level tracing
GKE fleet & Anthos multi-cluster architectures
JVM performance engineering at hyperscale
Service mesh traffic shaping & zero-downtime releases
eBPF-based observability & kernel tracing
Platform engineering / internal PaaS design
Real-time streaming & event-driven systems
Certifications
Required:
Google Professional Cloud Architect or Professional Cloud DevOps Engineer
Certified Kubernetes Administrator (CKA) or Certified Kubernetes