Data Engineer / Data Analyst

Irving, TX, US • Posted 1 day ago • Updated 1 day ago
Contract Independent
Contract W2
No Travel Required
On-site
$60 - $65/hr

Job Details

Skills

  • SQL Tuning
  • Amazon Web Services
  • Apache Hadoop
  • Apache Hive
  • Apache Kafka
  • Apache Spark
  • Artificial Intelligence
  • Business Intelligence
  • HDFS
  • Google Cloud Platform
  • JSON
  • Google Cloud
  • PySpark
  • Real-time
  • Python
  • SQL
  • Teradata
  • Data Structure
  • ELT
  • Data Flow
  • Data Processing

Summary

Role: Data Engineer/Analyst

Location: Irving, TX (Day 1 onsite)

Duration: 12+ Months

Role Overview 

We are seeking a highly skilled Data Engineer to lead the architecture, development, and optimization of our end-to-end data pipelines. A primary focus of this role will be driving our on-premise to cloud migration strategy, ensuring a seamless transition of legacy systems into a modern Google Cloud Platform ecosystem. You will be responsible for deep-dive data analysis, building robust ETL/ELT processes, and delivering actionable insights through advanced reporting.

 

Key Responsibilities

· Design and implement high-throughput streaming architectures using Kafka to capture and process event-driven data

· Lead the end-to-end migration of complex datasets and workloads from Hadoop, Hive, and Teradata to Google Cloud Platform (GCP)

· Build and maintain robust ETL/ELT pipelines using Python and PySpark, ensuring seamless integration of both batch and streaming data

· Conduct deep-dive SQL analysis to ensure data quality and consistency throughout the migration lifecycle and across JSON-based data structures

· Collaborate with stakeholders to deliver advanced ETL reporting, turning raw streams into actionable dashboards and performance metrics

· Perform complex SQL analysis to validate data integrity, identify patterns, and troubleshoot performance bottlenecks across distributed systems

· Develop automated ETL workflows using Python and PySpark, transforming raw data into structured formats for business intelligence and executive reporting.

· Mentor junior engineers and communicate technical concepts clearly to non-technical stakeholders

Technical Stack & Tools

· Cloud Platform: Google Cloud Platform (GCP) – specifically BigQuery, Dataflow, Cloud Functions, GCS, and Cloud Composer.

· Data Processing: Python (Expert), PySpark, Apache Spark.

· Streaming & Messaging: Apache Kafka (Real-time architecture design).

· Legacy Ecosystems: Hadoop (HDFS), Hive, and Teradata.

· Data Languages: Advanced SQL (Optimization, Window Functions, Performance Tuning).

· Data Formats: Handling and parsing complex JSON and semi-structured data.

Required Qualifications & Experience 

· Bachelor’s degree and five or more years of work experience

· Hands-on experience with Apache Kafka to design and implement real-time streaming architectures

· Proven experience leading large-scale migration strategies, successfully moving complex workloads and multi-terabyte datasets from Hadoop, Hive, and Teradata environments to Google Cloud Platform (GCP)

· Proven experience with Google Cloud Platform (BigQuery, Cloud Functions, Dataflow, and GCS)

· Mastery of Advanced SQL to perform deep-dive analysis, troubleshoot performance bottlenecks in distributed systems, and ensure 100% data integrity during migration phases

· Expert proficiency in Python and PySpark for building scalable ETL/ELT pipelines that seamlessly unify batch and streaming data sources

· Strong background in Hadoop, Hive, and Teradata to facilitate smooth legacy transitions.

· Expertise in handling and parsing JSON data for large-scale ingestion, with schemas optimized for downstream consumption and cloud-native storage

· Comprehensive understanding of end-to-end data lifecycles, from raw ingestion to the delivery of actionable dashboards and performance metrics

· Demonstrated ability to act as a bridge between technical and non-technical stakeholders, with a focus on mentoring junior talent and fostering a collaborative engineering culture

Preferred Qualifications

· Relevant Cloud Certifications (e.g., Google Cloud Platform Professional Data Engineer or AWS Certified Data Engineer).

· Experience using AI-assisted development tools (e.g., GitHub Copilot, Gemini) to accelerate delivery cycles and optimize code performance.

 

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 10519030
  • Position Id: 8939092
