Data Engineer

Overview

Hybrid
$50 - $65 per hour
Contract - W2
No Travel Required

Skills

Databricks
Data Engineering
Apache Kafka
AWS
S3
EC2
Lambda
Glue
VPC
IAM
Unity Catalog
Delta Lake
PySpark
Spark SQL
Python
ETL
ELT
Data Governance
Data Pipeline Development
Real-time Data Processing
Batch Processing
Cluster Management
Cloud Integration
Cost Optimization
Apache Flink
CI/CD
Data Lakehouse
Data Cataloging
Data Lineage
Data Security
Autoscaling
Auto-termination
Technical Leadership
Mentoring
Code Review
Streaming Data
Data Ingestion
Data Architecture
Data Compliance
Performance Tuning
Troubleshooting
Cloud Cost Management
GCP
Azure

Job Details

Data Engineer / Databricks Lead (Hybrid, Dallas, TX)

Long-term Contract | Hybrid (3 days onsite in Addison, TX) | C2C

We're currently hiring 2-3 seasoned Data Engineers for long-term contract roles in the Dallas area. This opportunity is ideal for senior professionals with hands-on experience in Databricks, AWS, and data governance frameworks.

What makes this role exciting? The search is moving fast and interviews are being scheduled quickly, so it's an excellent opportunity for candidates who are ready to move.


Key Responsibilities:

  • Databricks Platform Oversight:

    • Lead the design and deployment of large-scale data pipelines on the Databricks platform.

    • Enforce best practices for notebook development, job orchestration, and cost-effective cluster management.

  • Data Ingestion & Streaming:

    • Build and optimize real-time and batch pipelines using Apache Kafka.

    • Integrate Kafka with Databricks and ensure reliable, high-volume data processing (see the streaming sketch after this list).

  • Data Governance (Unity Catalog):

    • Implement and manage access controls, lineage, and cataloging standards.

    • Drive compliance and data quality enforcement within a governed lakehouse (see the governance sketch after this list).

  • Cloud Integration (AWS):

    • Architect solutions using AWS services like S3, Lambda, EC2, and Glue.

    • Ensure secure, scalable integration with Databricks.

  • Performance & Cost Optimization:

    • Monitor and optimize cluster usage, storage, and DBU consumption.

    • Apply autoscaling, auto-termination, and resource tuning best practices (see the cluster configuration sketch after this list).

  • Mentorship & Technical Leadership:

    • Guide junior engineers, perform code reviews, and standardize development practices.

    • Drive collaboration across teams to solve complex data challenges.

  • ETL Pipeline Development:

    • Develop and optimize robust ETL/ELT processes using PySpark/Spark SQL.

    • Resolve bottlenecks and enhance job performance.
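
To make the streaming bullet concrete, below is a minimal sketch of Kafka ingestion into Databricks with PySpark Structured Streaming. It is illustrative only; the broker address, topic, checkpoint path, and target table are hypothetical placeholders, not details of this engagement.

```python
# Minimal sketch: Kafka -> Delta ingestion with PySpark Structured Streaming.
# On Databricks the Kafka connector ships with the runtime; elsewhere you
# would add the spark-sql-kafka package. All names below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

# Subscribe to the raw event topic; startingOffsets controls replay behavior.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")  # hypothetical broker
    .option("subscribe", "orders")                       # hypothetical topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers key/value as binary; cast to strings before parsing downstream.
decoded = events.select(
    col("key").cast("string").alias("key"),
    col("value").cast("string").alias("payload"),
    col("timestamp"),
)

# Append into a Delta table; the checkpoint gives the sink exactly-once writes.
query = (
    decoded.writeStream
    .format("delta")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/orders")  # hypothetical path
    .outputMode("append")
    .toTable("raw.orders_events")  # hypothetical target table
)
```

The same select-and-cast pattern covers the batch ETL/ELT bullet as well: swap readStream/writeStream for read/write and the pipeline runs as a scheduled batch job.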

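For the governance sketch referenced above: Unity Catalog permissions are managed with standard SQL, which leads typically script for repeatability. This assumes a Databricks notebook where spark is provided by the runtime; every catalog, schema, table, and principal name is hypothetical.

```python
# Minimal Unity Catalog access-control sketch. Runs in a Databricks
# notebook, where `spark` is provided by the runtime; all object and
# principal names are hypothetical.

# Let an analyst group discover the catalog and query one schema.
spark.sql("GRANT USE CATALOG ON CATALOG analytics TO `data_readers`")
spark.sql("GRANT USE SCHEMA, SELECT ON SCHEMA analytics.sales TO `data_readers`")

# Let the ETL service principal write to a specific table.
spark.sql("GRANT MODIFY ON TABLE analytics.sales.orders TO `etl_service`")

# Audit who currently has access when reviewing compliance.
spark.sql("SHOW GRANTS ON TABLE analytics.sales.orders").show(truncate=False)
```

Unity Catalog records lineage for governed tables automatically, so the enforcement work is largely about keeping grants at the catalog and schema level rather than accumulating table-by-table exceptions.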

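Finally, the cluster configuration sketch for the cost-optimization bullet. The field names follow the Databricks Clusters API (clusters/create) payload shape; the cluster name, runtime version, node type, and sizes are hypothetical values.

```python
# Minimal cost-conscious cluster spec in the shape of a Databricks
# Clusters API (clusters/create) payload. Concrete values are hypothetical;
# the two knobs that matter here are autoscaling bounds and auto-termination.
import json

cluster_spec = {
    "cluster_name": "etl-nightly",          # hypothetical name
    "spark_version": "14.3.x-scala2.12",    # hypothetical Databricks runtime
    "node_type_id": "i3.xlarge",            # hypothetical AWS instance type
    "autoscale": {
        "min_workers": 2,   # floor for steady-state load
        "max_workers": 8,   # ceiling for peak backfills
    },
    "autotermination_minutes": 30,  # shut down idle clusters to stop DBU burn
}

print(json.dumps(cluster_spec, indent=2))
```

Autoscaling keeps the worker count matched to load, and auto-termination releases idle clusters; on shared environments the latter is often the single largest DBU saving.
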
Ideal Candidate Profile:

  • 7+ years of experience in data engineering; 3+ years in a lead or senior-level capacity.

  • Proven hands-on expertise with Databricks and Apache Kafka.

  • Strong experience with Unity Catalog and AWS cloud infrastructure.

  • Advanced skills in Python, Spark (PySpark/Spark SQL), and Delta Lake.

  • Background in building governed, cost-efficient data architectures.

  • Excellent communication skills to interface with technical and business stakeholders.


Bonus Skills (Nice to Have):

  • Experience with Apache Flink or similar stream-processing platforms.

  • Databricks certifications.

  • Familiarity with CI/CD for data engineering.

  • Exposure to Google Cloud Platform (GCP) or Microsoft Azure.


About Artius Solutions