Job Details
Technical Skills:
o Amazon SageMaker: In-depth knowledge of SageMaker, including domain setup, configuration, and infrastructure management.
o Cloud Knowledge: A deep understanding of cloud computing concepts, especially related to Amazon Web Services (AWS).
o Infrastructure Design: Ability to design and implement MLOps cloud solutions, considering scalability, security, and performance.
o Experience: Hands-on experience with cloud MLOps and Data Analytics platforms, preferably AWS SageMaker, Glue, EMR, and Athena.
o Best Practices: Familiarity with best practices for MLOps and Data Engineering.
o EC2 Instances: Understanding of EC2 instance types and their suitability for AWS SageMaker.
o S3: Proficiency in using Amazon S3 for data storage and SageMaker input/output.
o IAM: Ability to manage permissions and access control using Identity and Access Management.
o Lambda: Knowledge of serverless computing for automating tasks.
o ML & Data Pipelines: Experience with creating data pipelines using AWS SageMaker services integrated with Glue and EMR.
o Monitoring and Troubleshooting: Proficiency in monitoring SageMaker cluster health, identifying bottlenecks, and resolving issues.
o Cost Optimization: Ability to apply tagging strategies to SageMaker resources to support cost optimization and observability.
Security and Compliance:
o Encryption: Understanding of data encryption at rest and in transit to ensure a secure cloud data analytics environment.
o Security Groups and VPC: Knowledge of network security and virtual private clouds.
o Compliance Controls: Ensuring compliance with industry standards and regulations.
Scripting and Automation:
o Language Proficiency: Proficiency in Python, R, Spark, and SQL for scripting and automating tasks.
o MLOps: Ability to collaborate with the business to optimize the MLOps process and model lifecycle using SageMaker.
o Infrastructure as Code (IaC): Ability to assist DevOps engineers in developing Terraform templates used to provision AWS analytics infrastructure.
Backup and Disaster Recovery:
o Snapshotting: Familiarity with taking EMR cluster snapshots for backup and recovery.
o High Availability: Implementing strategies for fault tolerance and disaster recovery.
Experience and Certifications:
o Experience: Senior AWS Cloud Engineers must have 3 to 5 years of hands-on experience designing and building cloud MLOps and Data Analytics applications.