Description:
Short Description :-
The Platform Scripting Engineer supports the H100 platform team by developing and maintaining automation scripts, CI/CD workflows, and operational tooling. The role focuses on translating infrastructure requirements into reliable, repeatable scripts and pipelines that the team uses for day-to-day operations, runner management, and platform maintenance. Working knowledge of Kubernetes is required.
Key Responsibilities :-
- Develop and maintain Bash and Python scripts for infrastructure provisioning, validation, and operational tasks.
- Build and maintain GitHub Actions workflows for CI/CD pipelines (Terraform validation, Config Connector (KCC) manifest checks, container builds).
- Create and update runner image build pipelines for Linux, Windows, Android, and iOS self-hosted runners.
- Automate Google Cloud Platform Secret Manager operations: secret creation, rotation, access policy management.
- Build scripts for GKE operational tasks: node pool scaling, cluster upgrades, Config Sync status checks, log collection.
- Develop JFrog Artifactory automation: artifact promotion, cleanup policies, repository management.
- Create monitoring scripts: health checks, resource utilization reports, cost analysis.
- Maintain and extend Kustomize overlays and KCC YAML manifests under guidance of senior engineers.
- Write unit tests and validation checks for all scripts and automation.
- Document scripts with usage guides, examples, and troubleshooting steps.
- Support incident response with diagnostic scripts and log analysis tools.
Required Skills & Qualifications :-
- 5+ years in platform/infrastructure scripting and automation.
- Strong Bash scripting: functions, error handling, logging, idempotent operations.
- Python scripting for API integrations, data processing, and tooling.
- PowerShell for Windows automation tasks.
- GitHub Actions workflow authoring and debugging.
- Familiarity with Google Cloud Platform and Kubernetes CLI tools (gcloud, gsutil, kubectl).
- Basic Terraform understanding (reading and modifying HCL, running plan/apply).
- YAML proficiency (Kubernetes manifests, Kustomize, GitHub Actions workflows).
- Linux systems administration (file systems, networking, process management, systemd).
- Git workflows: branching strategies, PR reviews, conventional commits.
Preferred / Nice to Have Skills :-
- Experience with GKE and Kubernetes (kubectl, pods, deployments, services).
- Familiarity with JFrog Artifactory or Nexus.
- Experience building Docker images and container workflows.
- Familiarity with OPA/Rego for policy validation.
- Windows Server administration for runner image management.
Technology Stack :-
- Scripting: Bash, Python, PowerShell.
- CI/CD: GitHub Actions, JFrog Artifactory.
- Cloud: Google Cloud Platform (GKE, Compute, Secret Manager, IAM, Logging).
- Containers: Docker, GKE, Actions Runner Controller (ARC) runners.
- IaC: Terraform (basic), Kustomize, YAML.
- OS: Linux (Ubuntu), Windows Server.