Job Details
Job Title: ML Engineer Lead - Technical
Job Location: Vancouver, BC, Canada (hybrid model; 3 days in office mandatory)
Employment Type: Full-time, hybrid position
Experience: 13+ years
Retail domain experience is a must.
Profile:
We are looking for a Technical MLOps Lead to drive the architecture, implementation, and operationalization of advanced forecasting models and machine learning systems in the retail ecommerce domain. The ideal candidate will combine deep technical expertise in MLOps, data platforms, and model lifecycle management with a strong business understanding of demand forecasting, planning, and allocation use cases.
This role will lead the design, delivery, and optimization of robust ML pipelines using Databricks, MLflow, Unity Catalog, and cloud-native tools on AWS or Azure, enabling scalable, traceable, and cost-effective ML solutions across the retail landscape.
This role requires excellent communication and organizational skills and the ability to work closely with stakeholders.
Key Responsibilities:
- Lead the design of end-to-end CI/CD pipelines for model training, deployment, monitoring, and retraining using Databricks, MLflow, Unity Catalog, and Airflow.
- Implement version control, model lineage, and governance practices aligned with enterprise architecture.
- Drive infrastructure automation using Terraform, Docker, Kubernetes, and cloud-native services (SageMaker, Azure ML).
- Optimize cost, performance, and scalability of ML pipelines and runtime environments.
- Partner with data scientists, product owners, and business analysts to translate forecasting and planning requirements into scalable, reusable pipelines.
- Support hierarchical time-series models, multi-location clustering, and demand prediction at SKU/location granularity.
- Define monitoring metrics for model drift, accuracy decay, and business impact KPIs.
- Lead the standardization of model lifecycle practices including approvals, rollback strategies, and audit logging.
- Manage and maintain a feature store for consistent reuse across models.
- Collaborate with business, data engineering, and DevOps teams to ensure enterprise-grade solutions.
- Mentor junior MLOps engineers and establish internal documentation, templates, and reusable components.
- Translate user stories and epics into technical specifications and working pipelines.
- Collaborate across agile squads to ensure timely delivery with high quality and traceability.
- Lead incident resolution and RCA when ML models fail in production or regress in performance.
- Drive post-deployment validation and facilitate user acceptance testing (UAT) for ML integrations.
Qualifications:
- 10+ years of experience in MLOps, data engineering, or ML infrastructure roles.
- Strong proficiency in Databricks, MLflow, Unity Catalog, Airflow, and feature store solutions.
- Programming proficiency in Python and PySpark for building reusable ML components and ETL flows.
- Proven success operationalizing ML models for forecasting, planning, or optimization use cases in retail or ecommerce.
- Experience with AWS (S3, EC2, EKS, SageMaker) or Azure (ML Studio, Synapse, AKS).
- Strong skills in SQL for data analysis, ETL validation, and reporting (Snowflake/Databricks preferred).
- Hands-on experience with CI/CD pipelines, GitOps practices, and infrastructure-as-code (Terraform, CloudFormation).
- Familiarity with containerization (Docker), orchestration (Kubernetes), and observability tools (Grafana, Prometheus, ELK).
- Ability to author technical documentation and translate business needs into MLOps solution architecture.
Nice to Have:
- Experience with demand forecasting, inventory optimization, or promotion effectiveness in retail.
- Familiarity with Agile/Scrum processes and tools (JIRA, Confluence, ServiceNow).
- Exposure to data science workflows, including model explainability and bias detection.
- Understanding of data governance, security, and compliance in enterprise ML systems.