Key Responsibilities:
Lead the design, development, and maintenance of scalable and secure enterprise data lake solutions.
Architect data ingestion pipelines (batch and streaming) using modern frameworks and cloud-native services.
Collaborate with data architects, data scientists, and business stakeholders to define data models and integration strategies.
Implement data governance, lineage, metadata management, and cataloging best practices.
Ensure data quality, scalability, and high availability across distributed environments.
Optimize performance for ETL/ELT workflows and large-scale data processing.
Manage security, access controls, and compliance (GDPR, HIPAA, etc.).
Lead and mentor junior engineers, providing technical direction and best practices.
Stay current with emerging data lake technologies and cloud innovations.
Required Qualifications:
15+ years of IT experience with a focus on data engineering and data platforms.
7+ years of hands-on experience building and managing data lake ecosystems (on AWS, Azure, or Google Cloud Platform).
Strong expertise in big data frameworks (Apache Spark, Hadoop, Hive, Presto, Iceberg, Delta Lake, Hudi, etc.).
Proficiency with ETL/ELT tools and orchestration frameworks (Airflow, AWS Glue, Azure Data Factory, Informatica, etc.).
Strong knowledge of streaming platforms (Kafka, Kinesis, Pub/Sub).
Solid understanding of data governance, security, and compliance frameworks.