Job Title: Lead Data Engineer - Databricks
Location: Milpitas, CA (5 days on-site)
Job Summary
We are seeking a Lead Data Engineer with deep expertise in Databricks to architect, build, and lead scalable data engineering solutions on cloud-based lakehouse platforms. The role combines hands-on technical leadership with solution design, mentoring, and close collaboration with architects, BI, and AI teams.
Key Responsibilities
Technical Leadership & Architecture
- Lead the design and implementation of Databricks Lakehouse architectures
- Define medallion architecture (Bronze, Silver, Gold layers) using Delta Lake
- Drive architectural decisions for batch and streaming data pipelines
- Establish coding standards, best practices, and reusable frameworks
Data Engineering & Databricks
- Design and build scalable ETL/ELT pipelines using Databricks (PySpark/SQL/Scala)
- Optimize Spark jobs for performance, reliability, and cost
- Implement Delta Lake features (ACID transactions, time travel, schema enforcement)
- Develop and manage Databricks workflows, jobs, and clusters
Cloud & Platform Integration
- Architect Databricks solutions on Azure (preferred) or AWS
- Integrate Databricks with cloud storage and data services:
  - Azure: ADLS, ADF, Synapse
  - AWS: S3, Glue, Redshift
- Enable BI and analytics consumption (Power BI, Tableau)
Governance, Security & DevOps
- Implement data governance using Unity Catalog
- Define RBAC, data access controls, and security best practices
- Enable CI/CD for Databricks using GitHub / Azure DevOps
- Use Infrastructure-as-Code (Terraform) for environment management
Leadership & Collaboration
- Lead, mentor, and grow data engineering teams
- Conduct design and code reviews
- Collaborate with Data Architects, Product Owners, and stakeholders
- Support production releases, monitoring, and incident resolution
Required Skills
Databricks & Big Data
- Expert-level Databricks experience (Azure or AWS)
- Strong Spark / PySpark / Spark SQL expertise
- Delta Lake and Lakehouse architecture
- Streaming (Structured Streaming) experience
Cloud & Data Platforms
- Strong experience with Azure or AWS cloud platforms
- Data orchestration tools (ADF, Airflow, or similar)
- Strong SQL and data modeling skills
DevOps & Automation
- Git-based version control
- CI/CD pipelines for data engineering workloads
- Terraform or similar IaC tools
Preferred Qualifications
- Experience with MLflow and MLOps workflows
- Exposure to Microsoft Fabric or Snowflake
- Databricks certifications (Data Engineer Professional / Architect)
- Experience working in Agile environments
Education
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
Quick Fit Indicators
✔ Leads Databricks lakehouse implementations
✔ Strong Spark optimization and governance expertise
✔ Mentors and scales engineering teams
✔ Owns delivery, quality, and platform reliability