Data Engineer II

Overview

Remote
$120,000 - $140,000
Full Time
No Travel Required

Skills

Apache Spark
Databricks
Synapse Serverless SQL Pool
Data Engineering
Extract, Transform, Load (ETL)
SQL
Metadata Management
Critical Thinking
Data Governance
HIPAA
Emerging Technologies

Job Details

Permanent, 100% Remote Position

Open to ONLY candidates residing in Alabama, Arkansas, Florida, Georgia, Illinois, Louisiana, Michigan, New Hampshire, North Carolina, Ohio, Pennsylvania, South Carolina, Tennessee, Texas, Virginia or Wisconsin

MUST HAVE: Synapse Serverless SQL Pool or Databricks; AND Apache Spark

The Data Engineer II will be a key contributor in a cross-functional project environment that includes Business Analysts, Project Managers, Information Architects, Data Analysts, and Database Administrators. This role works closely with researchers to identify and translate their data needs into informatics and technical solutions. The ideal candidate brings experience with complex data sets, emerging technologies, data pipeline development, and healthcare compliance standards. A solid background in biomedical informatics, metadata management, and technical documentation is required.

Responsibilities:
Collaboration and Project Support
-Serve as an integral member of data-centric project teams, ensuring deliverables meet scope, timeline, and budget requirements.
-Work directly with researchers to understand research needs, analyze requirements, and translate findings into informatics-driven technical solutions.
-Effectively manage workload and report task status to team leaders.
-Participate in technical team discussions and contribute to solution design and implementation.
Data Engineering and Analytics
-Design and maintain robust data pipelines and ETL processes to support research and operational initiatives.
-Develop and deliver complex reports that integrate data from multiple, disparate systems.
-Ensure data accuracy, consistency, and integrity across all systems.
-Apply biomedical informatics standards and principles to support research-specific goals and outputs.
Technology and Standards
-Evaluate emerging technologies and create proofs of concept for research and clinical applications.
-Follow standard operating procedures and ensure compliance with HIPAA and institutional data policies.
-Maintain data standards, metadata documentation, and governance strategies for complex datasets.
Documentation and Communication
-Gather user requirements and create clear and complete technical documentation.
-Translate technical concepts and data workflows into understandable formats for various stakeholders.
Minimum Qualifications
-Bachelor's degree in a related field (e.g., Computer Science, Biomedical Informatics, Data Science, or a healthcare-related discipline)
-At least 3 years of related experience

-MUST HAVE: Experience with Synapse Serverless SQL Pool or Databricks; AND Apache Spark

Preferred Qualifications

-Experience working with research data in academic or clinical environments
-Strong proficiency with data engineering tools and languages (e.g., Python, SQL, Apache Spark, Airflow)
-Familiarity with HIPAA and healthcare data compliance standards
-Knowledge of biomedical informatics methodologies
-Experience with metadata management and data governance frameworks
-Experience creating and maintaining documentation for complex data systems and research workflows

Knowledge, Skills, and Abilities Requirements
-Proven experience in designing and maintaining data pipelines and ETL processes
-Ability to manage and manipulate large and complex datasets
-Strong problem-solving and critical-thinking skills
-Excellent communication and collaboration abilities
-Proficient in data visualization and reporting tools (e.g., Tableau, Power BI)

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.