Data Engineer, SQL, Spark

Overview

Hybrid (two weeks on site, then remote)
$40 - $50
Contract - W2
Contract - 6 Month(s)

Skills

Amazon Redshift
Amazon Web Services
Analytical Skills
Apache Kafka
Apache Spark
Business Intelligence
Cloud Computing
Conflict Resolution
Data Analysis
Data Architecture
Data Flow
Data Marts
Data Modeling
Data Processing
Data Quality
Data Warehouse
Databricks
ELT
Electronic Health Record (EHR)
Extract, Transform, Load (ETL)
Git
Good Clinical Practice
Google Cloud Platform
Microsoft Azure
Microsoft Power BI
Performance Tuning
PySpark
Python
Scala
Scalability
Scripting
SQL

Job Details

Duration: 3+ months (Project will extend quarterly based on budget and performance)

Work Location: Two weeks of onsite work in a nearby coworking space (such as Regus offices in your city), followed by remote work thereafter.

Job Overview:

We are seeking a highly skilled and detail-oriented Data Engineer with strong expertise in SQL and Apache Spark to join our growing data team. In this role, you will be responsible for building scalable and efficient data pipelines, transforming large volumes of data, and supporting data analytics initiatives across the organization.

 

Key Responsibilities:

  • Design, develop, and maintain robust data pipelines and ETL/ELT processes using Spark and SQL.
  • Process and transform large datasets from various sources to ensure high performance and data quality.
  • Optimize Spark jobs for performance, scalability, and cost-efficiency in distributed environments.
  • Work closely with data analysts, data scientists, and business stakeholders to understand data requirements and deliver solutions.
  • Build and manage data models and data marts in cloud data warehouses (e.g., Snowflake, Redshift, BigQuery).
  • Ensure data accuracy, integrity, and availability across systems.
  • Participate in code reviews, troubleshooting, and performance tuning of existing data processes.
  • Maintain documentation for data flows, transformations, and processes.

 

Required Skills & Qualifications:

  • 9+ years of hands-on experience as a Data Engineer or in a similar role.
  • Strong proficiency in SQL for data transformation, querying, and performance tuning.
  • Experience working with Apache Spark (PySpark or Scala) for large-scale data processing.
  • Familiarity with data lakes, data warehouses, and cloud data platforms (AWS, Google Cloud Platform, Azure).
  • Proficient in scripting languages such as Python or Scala.
  • Solid understanding of data modeling concepts and data architecture.
  • Experience with version control tools like Git.
  • Strong analytical and problem-solving skills.
  • Excellent communication and collaboration skills.

Preferred Qualifications:

  • Experience with tools like Airflow or other workflow orchestration platforms.
  • Knowledge of cloud-native services (e.g., AWS Glue, EMR, Databricks).
  • Experience with real-time data streaming tools (Kafka, Spark Streaming) is a plus.
  • Exposure to BI tools like Tableau, Power BI, or Looker.