Key Responsibilities:
• Design and develop scalable data pipelines using Databricks (Delta Lake, Spark); see the sketch after this list
• Build and maintain ETL/ELT workflows for large-scale data processing
• Optimize data architecture for performance, reliability, and cost-efficiency
• Work with structured and unstructured data from multiple sources
• Collaborate with data scientists, analysts, and business stakeholders
• Implement data quality, validation, and governance frameworks
• Troubleshoot and resolve data pipeline and performance issues
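For illustration, the kind of batch pipeline this role involves might look like the minimal sketch below: ingest raw events, apply a basic data-quality gate, and append to a Delta table. All paths, table names, and columns are hypothetical assumptions, not a prescribed design.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()  # provided by the Databricks runtime

    # Ingest raw JSON events from a hypothetical landing zone
    raw = spark.read.json("/mnt/landing/events/")

    # Basic data-quality gate: require a primary key and an event timestamp
    clean = (raw
             .filter(F.col("event_id").isNotNull() & F.col("event_ts").isNotNull())
             .withColumn("event_date", F.to_date("event_ts"))
             .withColumn("ingested_at", F.current_timestamp()))

    # Append to a Delta table, partitioned by date for downstream query performance
    (clean.write
          .format("delta")
          .mode("append")
          .partitionBy("event_date")
          .saveAsTable("analytics.events_bronze"))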
Required Qualifications:
• Bachelor’s degree in Computer Science, Engineering, or related field
• Databricks Certification (Associate or Professional level) – required
• 3+ years of experience in data engineering or related roles
• Strong experience with:
    • Apache Spark (PySpark/Scala)
    • Databricks platform
    • SQL and data modeling (see the SQL sketch after this list)
• Experience with cloud platforms (AWS, Azure, or Google Cloud Platform)
• Knowledge of data warehousing concepts and big data technologies
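To ground the SQL and data-modeling expectation, here is a minimal sketch of a warehouse-style aggregation on Databricks. The table and column names carry over from the hypothetical pipeline above and are assumptions, not a required schema.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Roll the bronze events table up into a daily summary ("gold") table
    spark.sql("""
        CREATE OR REPLACE TABLE analytics.daily_event_counts
        USING DELTA
        AS SELECT event_date,
                  event_type,
                  COUNT(*) AS event_count
           FROM analytics.events_bronze
           GROUP BY event_date, event_type
    """)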
Preferred Qualifications:
• Experience with Delta Lake and Lakehouse architecture
• Familiarity with workflow orchestration tools (Airflow, Azure Data Factory)
• Experience with streaming data (Kafka, Spark Streaming); see the sketch after this list
• Knowledge of CI/CD pipelines and DevOps practices
• Exposure to machine learning pipelines is a plus
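As a final illustration, a minimal sketch of the streaming work mentioned above, using Spark Structured Streaming (the current streaming API) to consume a Kafka topic and land it in a Delta table. The broker address, topic, checkpoint path, and table name are illustrative assumptions.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Subscribe to a hypothetical Kafka topic
    stream = (spark.readStream
                   .format("kafka")
                   .option("kafka.bootstrap.servers", "broker:9092")
                   .option("subscribe", "events")
                   .load())

    # Kafka values arrive as bytes; cast to string before parsing
    parsed = stream.select(F.col("value").cast("string").alias("payload"))

    # Continuously append to a Delta table, with checkpointing for fault tolerance
    (parsed.writeStream
           .format("delta")
           .option("checkpointLocation", "/mnt/checkpoints/events")
           .toTable("analytics.events_stream"))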