Job Description
- Advanced expertise in ETL / ELT pipeline design
o Batch data processing
o Near-real-time / streaming data
- Experience with structured and semi-structured data
- Strong knowledge of:
o Incremental loading
o CDC (Change Data Capture)
- Pipeline orchestration and dependency management
- Strong programming skills in Python or Scala (Java nice to have)
- Performance optimization for large-scale data processing
- Solid understanding of:
o Dimensional modeling (Star / Snowflake)
o Normalized and denormalized models
- Strong experience with Azure, AWS, or Google Cloud Platform
- Hands-on experience with data warehouses (Snowflake, Synapse, BigQuery, Redshift)
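By way of illustration only (not part of the role description), the incremental loading / CDC knowledge listed above often comes down to the high-water-mark pattern: pull only rows changed since the last successful run, then advance the watermark. A minimal sketch, with hypothetical field names (`updated_at`), not tied to any specific stack:

```python
from datetime import datetime

def incremental_load(rows, last_watermark):
    """Select only rows changed since the last run (high-water-mark pattern)."""
    new_rows = [r for r in rows if r["updated_at"] > last_watermark]
    # If nothing changed, keep the previous watermark rather than regressing it.
    new_watermark = max((r["updated_at"] for r in new_rows), default=last_watermark)
    return new_rows, new_watermark
```

In practice the watermark is persisted between runs (e.g. in a control table), and log-based CDC tools replace the timestamp comparison with the database's change stream.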
Data Architecture & Solution Design
- Design end-to-end data engineering architectures
- Define scalable solutions for:
o Data lakes / lakehouses
o Data warehouses
o Streaming and real-time systems
- Ensure alignment with enterprise architecture, security, and compliance standards
- Review and approve technical designs
Data Pipeline Development & Management
- Lead development of ETL / ELT pipelines
- Handle:
o Batch and real-time ingestion
o Structured and semi-structured data
- Optimize pipelines for performance, reliability, and cost
- Manage schema evolution and data dependencies
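As an illustrative sketch of the schema-evolution responsibility above (hypothetical helper names, assuming additive-only evolution, which is the common safe policy): new fields from incoming records are added as nullable columns, and existing columns are never dropped or retyped.

```python
def evolve_schema(schema, record):
    """Add any new fields from an incoming record to the known schema as
    nullable columns; never drop or retype existing columns."""
    evolved = dict(schema)
    for field, value in record.items():
        if field not in evolved:
            evolved[field] = type(value).__name__  # simplistic type inference
    return evolved

def conform(record, schema):
    """Project a record onto the schema, filling missing fields with None."""
    return {field: record.get(field) for field in schema}
```

Real pipelines delegate this to the table format or warehouse (e.g. schema merge options in lakehouse table formats), but the policy decision is the same.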
Data Quality, Reliability & Operations
- Establish data quality standards and validation rules
- Implement monitoring, alerting, and observability
- Perform root cause analysis for data incidents
- Drive operational excellence and stability
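For illustration, the "data quality standards and validation rules" item above is often implemented as declarative column-level checks evaluated against each batch. A minimal sketch with made-up rule names (dedicated tools add severity levels, quarantining, and reporting on top of this core loop):

```python
def validate(rows, rules):
    """Apply column-level validation rules; return (row_index, column) violations."""
    violations = []
    for i, row in enumerate(rows):
        for column, check in rules.items():
            if not check(row.get(column)):
                violations.append((i, column))
    return violations
```

A run with zero violations gates promotion of the batch; any violation feeds the monitoring and alerting described above.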
DevOps / DataOps Enablement
- Build CI/CD pipelines for data workloads
- Automate testing, deployment, and rollback
- Improve reliability through automation
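To make the "automate testing" item concrete: CI/CD for data workloads typically means transformation logic is covered by plain unit tests that run before deployment. A hypothetical example (the `transform` function and its dedupe-by-latest-version behavior are invented for illustration):

```python
def transform(rows):
    """Example transformation under test: deduplicate by id, keep latest version."""
    latest = {}
    for row in rows:
        key = row["id"]
        if key not in latest or row["version"] > latest[key]["version"]:
            latest[key] = row
    return sorted(latest.values(), key=lambda r: r["id"])

def test_transform_keeps_latest_version():
    rows = [
        {"id": 1, "version": 1},
        {"id": 1, "version": 2},
        {"id": 2, "version": 1},
    ]
    assert transform(rows) == [{"id": 1, "version": 2}, {"id": 2, "version": 1}]
```

In a CI pipeline such tests run on every commit; rollback automation then only ever deploys artifacts that passed them.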