Job Details
Role Name: Databricks Lead Engineer (Onsite/Hybrid)
Location: Chicago, New York, Atlanta, or Charlotte
Start date: 8/18/25
Duration: 12 Months
"Due to additional onboarding requirements, a meet-and-greet is required for all new hires.
Candidates must be willing to travel to the closest Capgemini, client, or onsite location, as indicated by the project team, to meet with a Capgemini team member prior to starting their assignment. If the candidate is not local, travel will be covered by Capgemini. If travel is involved and the candidate declines the offer after selection, costs will be paid by the vendor, not Capgemini."
JOB DESCRIPTION:
Data Engineering & Pipeline Development
Design and implement robust batch and streaming data pipelines using Databricks (PySpark/Scala) and Delta Lake.
Ingest data from diverse sources including APIs, FTP/SFTP, Dropbox, Event Hub, and Azure Data Factory.
Build reusable data processing frameworks for ingestion, transformation, and enrichment.
Medallion Architecture Implementation
Architect and manage data flows through the Databricks Medallion architecture:
Bronze Layer: Raw ingestion from external systems and APIs
Silver Layer: Cleaned, structured, and validated data
Gold Layer: Curated, business-ready datasets for analytics and operational use
Ensure data quality, lineage, and traceability across all layers.
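As a hypothetical illustration of the Bronze → Silver → Gold flow described above: on Databricks each stage would typically read and write Delta tables with PySpark, but plain Python is used here so the sketch is self-contained. All names (`bronze`, `silver`, `gold`, the record fields) are illustrative assumptions, not part of the role's actual codebase.

```python
# Minimal, language-agnostic sketch of the Medallion layering.
# On Databricks these stages would read/write Delta tables with PySpark;
# the function and field names below are illustrative assumptions.

def bronze(raw_records):
    """Bronze: land raw records as-is, tagging each with its source."""
    return [{"raw": r, "source": "api"} for r in raw_records]

def silver(bronze_records):
    """Silver: validate and structure - drop records missing an id."""
    cleaned = []
    for rec in bronze_records:
        raw = rec["raw"]
        if raw.get("id") is not None:
            cleaned.append({"id": raw["id"], "amount": float(raw.get("amount", 0))})
    return cleaned

def gold(silver_records):
    """Gold: curate a business-ready aggregate (total amount per id)."""
    totals = {}
    for rec in silver_records:
        totals[rec["id"]] = totals.get(rec["id"], 0.0) + rec["amount"]
    return totals

raw = [{"id": 1, "amount": "10"}, {"id": None}, {"id": 1, "amount": "5"}]
print(gold(silver(bronze(raw))))  # {1: 15.0}
```

Each layer only consumes the output of the layer before it, which is what gives the architecture its lineage and traceability guarantees.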
Streaming & Real-Time Enablement
Develop real-time data ingestion and processing solutions using Event Hub, Structured Streaming, and Delta Live Tables.
Enable near real-time data availability for downstream systems including Dynamics 365 CE & F&O and Power BI.
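A toy sketch of the pattern behind this: Structured Streaming maintains running state and folds each micro-batch into it. The event shape and the `process_micro_batch` helper below are illustrative assumptions, not a real Event Hub consumer or Delta Live Tables pipeline.

```python
# Toy sketch of stateful incremental aggregation, the pattern Structured
# Streaming applies per micro-batch. The event shape and running state
# are illustrative assumptions, not a real Event Hub consumer.

def process_micro_batch(state, batch):
    """Fold one micro-batch of events into the running count per event type."""
    for event in batch:
        state[event["type"]] = state.get(event["type"], 0) + 1
    return state

state = {}
for batch in ([{"type": "order"}, {"type": "click"}],
              [{"type": "order"}]):
    state = process_micro_batch(state, batch)

print(state)  # {'order': 2, 'click': 1}
```

Because state is updated per micro-batch rather than recomputed from scratch, downstream consumers see near-real-time results.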
Collaboration & Integration
Work closely with data architects, API teams, and integration engineers to align data flows with business and technical requirements.
Support integration with external systems such as Shelter LOBs, PetProPortal, SFCC, and Vetco.
Monitoring, Optimization & Governance
Monitor pipeline performance and implement optimizations for cost, speed, and reliability.
Ensure compliance with data governance, security, and privacy standards using Unity Catalog and Azure Purview.
Implement alerting, logging, and observability using Azure Monitor and Databricks tools.
Required Skills & Experience:
8+ years of experience in data engineering, with 3+ years in Databricks and Spark-based development.
Strong expertise in:
PySpark, Delta Lake, Structured Streaming
Azure Data Factory, Event Hub, Logic Apps
Data lakehouse architecture and Medallion framework
Experience with real-time data processing and large-scale ETL/ELT pipelines.
Familiarity with Dynamics 365 data structures and enterprise integration patterns.
Preferred Qualifications:
Databricks Certified Data Engineer or equivalent
Microsoft Certified: Azure Data Engineer Associate
Experience with Unity Catalog, Azure Purview, and CI/CD for data pipelines
Expected Outcomes:
Scalable and reliable batch and streaming pipelines supporting enterprise data needs
High-quality, curated datasets aligned with business domains
Real-time data availability for analytics, reporting, and operational systems
Optimized data infrastructure with strong governance and observability