Job Details
Must have experience with similar platform engineering/management solutions:
Building/optimizing data lakehouses with open table formats
Kubernetes deployments/cluster administration
Transitioning on-premises big data platforms to scalable cloud-based platforms like AWS
Distributed Systems, Microservice architecture, and containers
Cloud Streaming use cases in Big Data Ecosystems (e.g., EMR, EKS, Hadoop, Spark, Hudi, Kafka/Kinesis)
Must have hands-on experience with the tech stack below:
GitHub Actions
AWS
IAM
API Gateway
Lambda
Step Functions
Lake Formation
EKS & Kubernetes
Glue: Catalog, ETL, Crawler
Athena
S3 (strong foundational concepts such as object storage vs. block storage, encryption/decryption, storage tiers, etc.)
Apache Hudi
Apache Flink
PostgreSQL and SQL
RDS (Relational Database Service)
Python
Java
Terraform Enterprise
Must be able to explain what Terraform (TF) is used for
Understand and explain basic principles (e.g., modules, providers, functions)
Must be able to write and debug Terraform code
Helpful tech stack experience would include:
Helm, Kafka and Kafka Schema Registry, Iceberg, Vault
AWS services: CloudTrail, SNS, SQS, CloudWatch, Aurora, EMR, Redshift, Secrets Manager