Qualifications / What you bring (Must Haves) – Highlight Top 3-5 skills
- 10-15 years of software engineering experience focused on cloud infrastructure or ML platform operations
- 5+ years hands-on with AWS, including deep expertise in Amazon SageMaker (Studio, Pipelines, Model Registry, Endpoints, Feature Store)
- 3+ years building and operating production MLOps pipelines: training, versioning, deployment, monitoring, rollback
- Experience with SageMaker Unified Studio or Studio Classic: domain/project setup, blueprints, multi-tenant configuration
- Infrastructure-as-Code with Terraform, CDK, or CloudFormation
- IAM design for ML platforms: execution roles, service roles, cross-account access, Lake Formation, SSO/SAML
- MLflow or equivalent experiment tracking
- SageMaker Pipelines or similar workflow orchestration (Airflow, Step Functions)
- Model serving: real-time endpoints, batch transform, auto-scaling, endpoint monitoring
- Snowflake as a data source for ML pipelines
- Kubernetes (EKS) and container orchestration
- Networking and security: VPC, security groups, private endpoints, cross-account connectivity
Added bonus if you have (Preferred):
- SageMaker Unified Studio domain provisioning, custom blueprints, project standardization
- SageMaker Feature Store for online/offline feature management
- SageMaker Model Monitor: data quality checks, bias detection, drift detection
- AWS Machine Learning Specialty certification