Overview
On Site
Hybrid
Full Time
Part Time
Accepts corp to corp applications
Contract - W2
Contract - Independent
Contract - Permanent
100% Travel
Skills
Azure
GCP
IaC
Platform
Reliability
Job Details
Role: Platform Reliability Engineer
Location: Dallas, TX (On Site)
Core Responsibilities
- Implement automated testing frameworks for infrastructure changes
- Build APIs for common platform operations, with authentication and authorization
- Create sandbox environments with guardrails for experimentation
- Develop templated project provisioning workflows
- Create runbooks for common failure scenarios
- Implement automated scaling policies based on usage patterns
- Set up cost anomaly detection alerts
- Implement lifecycle policies for cost-effective data storage
- Build example implementations and starter kits
Technical Skills
- Google Cloud Platform Services: Expert in BigQuery, Dataproc, Dataflow, Data Fusion, Pub/Sub, Cloud Storage, and Cloud Run
- Infrastructure as Code: Advanced Terraform skills with module development
- Containers & Orchestration: Deep experience with Docker, Kubernetes, and GKE
- CI/CD: Proficient with Cloud Build and GitLab CI/CD pipelines
- Data Security: Strong understanding of data governance, security, and compliance
- Data Processing: Expert in data pipeline design using Apache Spark and Apache Beam
- Machine Learning Ops: Working knowledge of Vertex AI, AI Platform, and TensorFlow/PyTorch deployment
- Streaming: Experience with Kafka and Pub/Sub architectures
- API Design: RESTful API design with authentication and authorization patterns
- Workflow Management: Experience with Astronomer/Airflow for orchestration
- Data Transformation: dbt implementation experience
- GitOps: Proficient with GitOps principles and tools (ArgoCD, Flux)
Soft Skills
- Strategic thinking and architectural vision
- Advanced communication and stakeholder management
- Ability to balance technical excellence with business needs
- Strong mentorship and knowledge sharing capabilities
- Collaborative approach to architectural decisions
Certifications
- Google Cloud Professional Cloud Architect (Required)
- Google Cloud Professional Data Engineer (Required)
- Google Cloud Professional Cloud Security Engineer (Recommended)
- Kubernetes CKA/CKAD (Recommended)