Job Details
We are seeking a Senior Data & ML Infrastructure Engineer to join our growing team focused on enabling scalable, reliable, and cost-efficient data and machine learning pipelines. This is an exciting opportunity to work at the intersection of data engineering, machine learning, and platform reliability.
In this role, you'll design and enhance Python frameworks that support key components such as cost tracking, data quality, lineage, governance, and MLOps. You'll work closely with cross-functional teams, including ML Engineers, Data Scientists, and Platform Engineers, to ensure seamless integration and operation of scalable batch and stream pipelines on Google Cloud Platform.
Key Responsibilities:
Design and enhance Python libraries that support robust data and ML operations, including governance, lineage, and cost tracking.
Implement data processing optimizations to reduce cost and improve performance of large-scale ML pipelines.
Develop scalable feature and training data pipelines using BigQuery, Dataflow, and Cloud Composer on Google Cloud Platform.
Build and maintain monitoring, logging, and alerting systems to ensure data pipeline reliability and visibility.
Lead infrastructure rollouts with careful planning, phased deployment strategies, validation steps, and rollback plans.
Serve as the primary point of contact for cross-team coordination during updates, deployments, and incident handling.
Work closely with ML platform teams to ensure seamless integration of enhancements and changes.
Create detailed runbooks, documentation, and handoffs for operational support.
Requirements:
5+ years of experience in data engineering, ML infrastructure, or related roles.
Strong proficiency in Python and experience building reusable libraries/frameworks.
Hands-on experience with BigQuery, Dataflow, and Cloud Composer.
Solid understanding of data pipeline orchestration, MLOps, and cloud-native architectures.
Experience implementing monitoring and observability for pipelines and infrastructure.
Strong communication and coordination skills, especially in cross-functional environments.
Familiarity with ML workflows and how infrastructure supports model development and deployment.
Nice to Have:
Experience with CI/CD for data pipelines
Exposure to data quality tools, lineage tracking frameworks, or ML feature stores
Google Cloud certification (e.g., Professional Data Engineer, ML Engineer)
Skills:
Python, Data Engineering, Machine Learning, MLOps, BigQuery, Google Cloud Platform, Dataflow, Cloud Composer, Data Pipelines, Infrastructure as Code, Data Governance, Data Quality, Lineage, Monitoring, Logging, ML Infrastructure, CI/CD, Feature Pipelines, Cost Optimization, Data Orchestration, Airflow, ML Engineering, Runbooks, Observability