Job title: Azure MLOps Engineer
Location: Remote, USA
Experience: 5+ years
Must-Have
- Hands-on experience with Azure Machine Learning (workspaces, compute, environments, datasets, pipelines).
- Proficiency in building end-to-end MLOps workflows: data prep, training, model packaging, registration, deployment, and automated retraining.
- Strong Python skills for ML engineering and automation (SDK v2/CLI v2).
- Experience deploying models to AKS/Managed Online Endpoints/Batch Endpoints with blue-green or canary strategies.
- CI/CD using Azure DevOps Pipelines or GitHub Actions (multi-stage YAML, approvals, artifacts).
- Infrastructure as Code using Bicep or Terraform, including networking (VNets, subnets, private endpoints).
- Model governance: versioning, lineage, reproducibility, data/model drift monitoring, and audit readiness.
- Observability using Azure Monitor, Log Analytics, and Application Insights.
- Security best practices: Microsoft Entra ID (formerly Azure AD), RBAC, Key Vault, Private Link, managed identities, and compliance standards.
- Experience with Git, code reviews, branching strategies, and automated testing (unit/integration).
Good-to-Have
- Experience with Databricks, Delta Lake, and Feature Store patterns.
- Familiarity with MLflow tracking/registry and model evaluation frameworks.
- Experience with Kubernetes (AKS) and Helm for advanced deployments.
- Knowledge of data engineering on Azure (ADF/Synapse/ADLS Gen2/Event Hubs).
- Experience implementing Responsible AI: bias checks, explainability, content filters, guardrails.
- Exposure to Azure OpenAI/agentic patterns and RAG architectures for production workloads.
- Professional certifications: Microsoft Certified: Azure AI Engineer Associate, Azure Data Scientist Associate, or DevOps Engineer Expert.
- Strong documentation and stakeholder communication skills.
Responsibilities / Expectations from the Role
- Design and implement scalable MLOps architectures on Azure ML aligned to enterprise standards.
- Build automated pipelines for data prep, training, evaluation, and deployment across Dev/Test/Prod.
- Create reusable templates (YAML, CLI/SDK scripts, IaC modules) to accelerate project onboarding.
- Implement model governance: approvals, model cards, champion/challenger, and release management.
- Operationalize monitoring for performance, drift, data quality, latency, and cost; define SLOs/SLAs.
- Partner with Data Scientists to productionize notebooks into robust, testable services.
- Troubleshoot production incidents and drive root-cause analysis and postmortems.
- Collaborate with Security/Networking teams to enable private, secure deployments.
- Document runbooks, playbooks, and architecture diagrams; conduct knowledge transfer sessions.
- Continuously evaluate new Azure ML/MLOps features and recommend best practices.
USP of the Role: Build reliable, secure, and automated ML platforms on Azure at enterprise scale.
Project Details: End-to-end MLOps for AI/ML products, including automated training, deployment, and monitoring on Azure.