Data Engineer at Dallas, TX & Middletown, NJ

Dallas, TX, US • Posted 4 hours ago • Updated 4 hours ago
Contract W2
On-site
Depends on Experience

Job Details

Skills

  • API
  • Access Control
  • Apache Kafka
  • Apache Spark
  • Broadcasting
  • Change Data Capture
  • Cloud Computing
  • Clustering
  • Concurrent Computing
  • Continuous Delivery
  • Continuous Integration
  • Customer Focus
  • Data Modeling
  • Databricks
  • DevOps
  • ELT
  • ETL (Extract, Transform, Load)
  • Git
  • GitHub
  • Grafana
  • JSON
  • Management
  • Microservices
  • Microsoft Azure
  • Optimization
  • Orchestration
  • Performance Tuning
  • PySpark
  • React.js
  • Real-time
  • Regulatory Compliance
  • SQL
  • Scala
  • Semantics
  • Soft Skills
  • Streaming
  • Unity Catalog
  • Workflow
  • struct

Summary

Fullstack Databricks Developer
Onsite Locations: Dallas TX, Middletown NJ
Duration: Long-term

Skills & Qualifications
1. Technical Core (Databricks & Spark)
Expert PySpark/Scala: Deep understanding of Spark internals, broadcast joins, and RDD/DataFrame partitioning.
Delta Lake Mastery: Proficiency in Delta features like Z-Ordering, Liquid Clustering, Change Data Feed (CDF), and Time Travel.
Streaming Patterns: Hands-on experience with Watermarking, Checkpoints, and handling late-arriving data in Structured Streaming.
2. Data Modeling & Languages
SQL: Expert-level SQL for complex transformations and window functions.
JSON/Semi-Structured Data: Mastery of parsing and generating complex nested JSON objects within Spark (e.g., struct, array, to_json, from_json).
Medallion Design: Proven experience moving data across Bronze, Silver, and Gold layers with clear "Data Contracts."
3. Full Stack & DevOps
CI/CD: Experience automating data pipeline deployments (Git-based workflows).
Observability: Ability to set up monitoring and alerts using Databricks SQL Alerts or Grafana to track pipeline lag.
4. Soft Skills
Architectural Thinking: Ability to decide when to use "Continuous" vs. "AvailableNow" streaming based on cost vs. latency requirements.
Client Focus: Understanding how an API client (e.g., a React app or a microservice) will consume the Gold layer JSON.

Job Title:
Data Engineer (Streaming & Full Stack Databricks)

Role Summary
We are seeking a high-performing Data Engineer to design and implement a real-time data platform using the Medallion Architecture.
You will be responsible for the end-to-end development of data pipelines: ingesting real-time source data into Bronze, transforming it into a relational Silver layer, and finally delivering high-concurrency, consumption-ready JSON Gold tables.
You will act as a "Full Stack" data professional, handling everything from infrastructure automation (DataOps) to complex nested data modeling.

Key Responsibilities
Real-Time Ingestion: Build scalable ingestion pipelines using Auto Loader and Spark Structured Streaming to capture data from Kafka, Event Hubs, or CDC sources into raw Delta tables.
Relational Transformation: Develop ELT logic to cleanse, deduplicate, and normalize data into a relational format. Ensure ACID compliance and "exactly-once" processing semantics.
JSON API Optimization: Design and build the Gold layer specifically for client consumption. This involves flattening/nesting data into optimized JSON structures within Delta tables to support low-latency API queries.
Advanced Orchestration: Implement and manage complex workflows using Delta Live Tables (DLT) or standard Streaming Live Tables, together with Databricks Workflows, to ensure data freshness and lineage.
Governance & Security: Use Unity Catalog to enforce fine-grained access control (row/column level) and maintain a searchable data catalog for consuming clients.
DataOps & Automation: Own the deployment lifecycle using Databricks Asset Bundles (DABs) and CI/CD pipelines (GitHub Actions/Azure DevOps) to ensure reproducible environments.
Performance Tuning: Optimize streaming triggers, watermarking, and stateful processing to minimize latency and manage cloud costs effectively.

  • Dice Id: 10113363
  • Position Id: 8923042