Job Title: SRE Infra Consultant with Google Cloud Platform
Location: Remote (anywhere in the USA)
Employment Type: Full-time
Job Summary:
We are seeking a highly skilled Site Reliability Engineering (SRE) Infrastructure Consultant with strong expertise in Google Cloud Platform (GCP). The ideal candidate will design, implement, automate, and maintain scalable, secure, and highly available cloud infrastructure. This role requires hands-on experience with core Google Cloud Platform services, Infrastructure as Code, containerization, and production operations support.
Key Responsibilities:
Design, deploy, and manage secure cloud infrastructure on Google Cloud Platform.
Implement and manage core platform components including VPC networking, IAM policies, Secret Manager, and Cloud Logging.
Develop and maintain Infrastructure-as-Code (IaC) using Terraform.
Containerize applications using Docker and manage artifacts in Artifact Registry.
Support CI/CD automation and ensure infrastructure reliability, scalability, and security.
Implement monitoring, alerting, and observability solutions to meet defined SLOs and SLAs.
Collaborate with DevOps, Data Engineering, and Application teams for cloud-native deployments.
Participate in incident management, root cause analysis (RCA), and continuous improvement initiatives.
Required Skills (Absolutely Must Have):
Strong hands-on experience with Google Cloud Platform services including:
VPC (networking and firewall rules)
IAM (role-based access control)
Secret Manager
Cloud Logging
Artifact Registry
Proficiency in Terraform for infrastructure provisioning.
Strong experience with Docker for containerization.
Experience working in SRE/Operations teams supporting production environments.
Nice to Have:
Experience with Google Kubernetes Engine (GKE).
Hands-on experience with the Google Cloud Platform data engineering stack:
Google Cloud Composer
Google Cloud Dataflow
Google Cloud Dataproc
BigQuery
Pub/Sub, Cloud Functions, Cloud Run
Experience with API management tools such as Apigee.
Familiarity with monitoring tools such as Prometheus and Grafana.
Exposure to Apache Flink or streaming frameworks.
Additional Preferred Qualifications:
Hands-on experience with automation using Python and SQL.
Experience with ITSM platforms such as ServiceNow.
Knowledge of Pub/Sub messaging, Datastream, and cloud-native integration tools.
Strong troubleshooting, problem-solving, and communication skills.
Experience Requirements:
6+ years of experience in Cloud Infrastructure / SRE / DevOps roles.
3+ years of hands-on experience with Google Cloud Platform.
Proven experience managing production-grade cloud environments.