Job Title: Lead Data Engineer / Data Architect
Location: San Jose, CA (ONSITE – LOCAL CANDIDATES ONLY – F2F)
Duration: 12+ Months
Employment Type: Contract
Experience Required: 8–10 Years
Job Summary
We are seeking an experienced Lead Data Engineer / Data Architect to design, scale, and optimize enterprise-grade Lakehouse data platforms. The ideal candidate will have deep expertise in Databricks, Apache Spark, Delta Lake, cloud data platforms, and enterprise data architecture, with proven experience building scalable, production-grade data ecosystems.
This role requires a strong technical leader who can drive architecture decisions, establish data engineering standards, optimize platform performance and cost, and mentor engineering teams.
Only candidates local to San Jose, CA who are available for onsite work will be considered.
Required Experience
- 8+ years of Data Engineering / Data Platform Engineering experience
- Proven experience in enterprise data architecture and large-scale data platform design
- Strong leadership experience guiding engineering teams and architecture decisions
- Hands-on experience building production-grade data platforms
Must Have Technical Skills
Databricks / Big Data Engineering
Strong hands-on expertise with:
- Databricks
- Apache Spark
- PySpark
- Scala
- Delta Lake
- Lakehouse Architecture
- Distributed data processing
- Large-scale ETL / ELT pipelines
Data Architecture
Strong experience in:
- Enterprise data platform architecture
- Scalable data pipeline design
- Data platform modernization
- Lakehouse implementation
- Performance optimization
- Cost optimization
- Production data platform engineering
Data Modeling
Strong understanding of:
- Medallion Architecture (Bronze, Silver, and Gold layers)
- Data warehousing concepts
- Dimensional modeling
- Data governance principles
- Metadata-driven architecture
Cloud Platforms
Hands-on experience with one or more of:
- AWS
- Azure
- Google Cloud Platform
Database / Query Optimization
Strong expertise in:
- SQL
- Query optimization
- Large-scale data transformations
- Data performance tuning
- Partitioning / indexing strategies
- Compute optimization
Engineering / DevOps
Experience with:
- CI/CD pipelines for data engineering
- Version control (Git)
- Deployment automation
- Monitoring / observability
- Production support best practices
Key Responsibilities
- Design and lead the implementation of enterprise Lakehouse architecture on Databricks
- Build scalable, secure, and production-ready data platforms
- Architect and optimize large-scale data pipelines and transformation frameworks
- Drive engineering standards, architecture governance, and best practices
- Optimize Spark / Databricks performance, scalability, and cost efficiency
- Design and implement Delta Lake-based data architectures
- Define and enforce Medallion architecture data modeling standards
- Collaborate with business, analytics, engineering, and architecture stakeholders
- Mentor and guide data engineering teams
- Support platform modernization and cloud data transformation initiatives
- Troubleshoot production issues and improve platform reliability
Preferred Qualifications
- Experience in enterprise-scale data modernization programs
- Architecture leadership experience
- Exposure to data governance and security frameworks
- Experience with streaming data platforms