Data Engineer / Databricks Lead (Hybrid, Dallas, TX)
Long-term Contract | Hybrid (3 days onsite in Addison, TX) | C2C
We're currently hiring 2-3 seasoned Data Engineers for long-term contract roles in the Dallas area. This opportunity is ideal for senior professionals with hands-on experience in Databricks, AWS, and data governance frameworks.
What makes this role exciting? The hiring process is moving fast and interviews are being scheduled quickly, so well-prepared candidates can expect a rapid turnaround.
Key Responsibilities:
Databricks Platform Oversight:
Lead the design and deployment of large-scale data pipelines on the Databricks platform.
Enforce best practices for notebook development, job orchestration, and cost-effective cluster management.
Data Ingestion & Streaming:
Build and optimize real-time and batch pipelines using Apache Kafka.
Integrate Kafka with Databricks and ensure reliable, high-volume data processing.
Data Governance (Unity Catalog):
Implement and manage access controls, lineage, and cataloging standards.
Drive compliance and data quality enforcement within a governed lakehouse.
Cloud Integration (AWS):
Architect solutions using AWS services such as S3, Lambda, EC2, and Glue.
Ensure secure, scalable integration with Databricks.
Performance & Cost Optimization:
Monitor and optimize cluster usage, storage, and DBU consumption.
Apply autoscaling, auto-termination, and resource tuning best practices.
Mentorship & Technical Leadership:
Guide junior engineers, perform code reviews, and standardize development practices.
Drive collaboration across teams to solve complex data challenges.
ETL Pipeline Development:
Develop and optimize robust ETL/ELT processes using PySpark/Spark SQL.
Resolve bottlenecks and enhance job performance.
Ideal Candidate Profile:
7+ years of experience in data engineering; 3+ years in a lead or senior-level capacity.
Proven hands-on expertise with Databricks and Apache Kafka.
Strong experience with Unity Catalog and AWS cloud infrastructure.
Advanced skills in Python, Spark (PySpark/Spark SQL), and Delta Lake.
Background in building governed, cost-efficient data architectures.
Excellent communication skills for interfacing with both technical and business stakeholders.
Bonus Skills (Nice to Have):
Experience with Apache Flink or similar stream-processing platforms.
Databricks certifications.
Familiarity with CI/CD for data engineering.
Exposure to Google Cloud Platform or Microsoft Azure.