Data Engineer

Overview

On Site
Full Time
Part Time
Accepts corp to corp applications
Contract - Independent
Contract - W2

Skills

Data Flow
Real-time
Decision-making
Data Integration
Big Data
Google Cloud Platform
Google Cloud
Unstructured Data
Analytics
Reporting
Management
Optimization
Data Governance
Access Control
Agile
Continuous Improvement
Python
Scala
Data Processing
SQL
Database Design
MySQL
PostgreSQL
Microsoft SQL Server
Apache Spark
Apache Kafka
Apache Hadoop
Cloud Computing
Amazon Web Services
Amazon S3
Amazon Redshift
Electronic Health Record (EHR)
Microsoft Azure
ADF
Databricks
Extract, Transform, Load (ETL)
ELT
Data Modeling
Data Warehouse
NoSQL
Database
MongoDB
Apache Cassandra
Amazon DynamoDB
Docker
Kubernetes
Continuous Integration
Continuous Delivery
Apache NiFi
Workflow
Orchestration
Machine Learning (ML)
Data Quality
Analytical Skill
Problem Solving
Conflict Resolution
SANS
Communication
Collaboration

Job Details

Job Title: Data Engineer

Locations: Dallas, TX / Plano, TX / Austin, TX / Houston, TX / Richardson, TX
Experience Required: 6-10 Years



About the Role

We're seeking a skilled and passionate Data Engineer to design, build, and optimize data pipelines and architectures that enable efficient data processing and analytics.

You'll work closely with data scientists, analysts, and software engineers to ensure data flows are reliable, scalable, and secure, supporting real-time insights and enterprise-grade decision-making.

This role is ideal for someone with hands-on experience in data integration, ETL pipelines, big data platforms, and cloud data solutions (AWS, Azure, or Google Cloud Platform).



Key Responsibilities

  • Design, develop, and maintain ETL/ELT pipelines to process large volumes of structured and unstructured data.
  • Build and optimize data models and data warehouses/lakes for analytics and reporting.
  • Integrate data from multiple sources using tools such as Apache Spark, Kafka, or Airflow.
  • Implement and manage data pipelines on cloud platforms like AWS (Glue, Redshift, S3) or Azure (Data Factory, Synapse).
  • Ensure data quality, integrity, and performance through validation, monitoring, and optimization.
  • Collaborate with stakeholders to define data requirements and deliver robust data solutions.
  • Apply data governance, access control, and security best practices.
  • Troubleshoot performance issues and optimize query execution across large datasets.
  • Work in an Agile environment and contribute to continuous improvement and automation initiatives.



Primary Skills (Must Have)

  • Strong experience in Python or Scala for data processing.
  • Proficiency with SQL and database design (MySQL, PostgreSQL, SQL Server).
  • Hands-on experience with Apache Spark, Kafka, or Hadoop ecosystems.
  • Cloud experience with AWS (Glue, S3, Redshift, EMR) or Azure (ADF, Synapse, Databricks).
  • Expertise in ETL/ELT design, data modeling, and pipeline orchestration.
  • Understanding of data warehousing, data lakes, and distributed data systems.



Secondary Skills (Good to Have)

  • Familiarity with NoSQL databases (MongoDB, Cassandra, DynamoDB).
  • Experience with containerization (Docker, Kubernetes) and CI/CD pipelines.
  • Exposure to Airflow, NiFi, or dbt for workflow orchestration.
  • Knowledge of machine learning data pipelines or data quality frameworks.
  • Excellent analytical and problem-solving skills.
  • Strong communication and collaboration abilities.


About Purple Drive Technologies LLC