Data Engineer

Overview

Remote
$110,000 - $130,000
Full Time
No Travel Required

Skills

Apache Framework
Apache Hadoop
Big Data
Apache Kafka
Apache Spark
C
C++
Extract, Transform, Load (ETL)
SQL
Snowflake Schema Modeling
Data Storage
Data Quality
Data Engineering
Data Integration
Data Cleansing
Data Validation
Databricks
Data Warehouse
Microsoft Azure
Python
Data Management
Scalability

Job Details

This is a permanent, 100% remote opportunity.

Job Description:

The Data Engineer is responsible for designing, developing, and maintaining the infrastructure and systems required for data storage, processing, and analysis. They play a crucial role in building and managing the data pipelines that enable efficient and reliable data integration, transformation, and delivery for all data users across the enterprise. The Data Engineer is also responsible for creating BI solutions designed to gain insights, monitor key organizational and operational measures, and provide visibility into system performance throughout the organization and to our customers.

- Designs and develops data pipelines that extract data from various sources, transform it into the desired format, and load it into the appropriate data storage systems.

- Integrates data from different sources, including databases, data warehouses, APIs, and external systems.

- Analyzes, designs, develops, and documents BI solutions based on Information Services standards and best practices.

- Coordinates with the team to build and share knowledge, ensuring consistent delivery of information.

- Analyzes, diagnoses, and resolves reporting, ETL, and data issues.

- Ensures data consistency and integrity during the integration process, performing data validation and cleaning as needed.

- Transforms raw data into a usable format by applying data cleansing, aggregation, filtering, and enrichment techniques.

- Optimizes data pipelines and data processing workflows for performance, scalability, and efficiency.

- Monitors and tunes data systems, identifies and resolves performance bottlenecks, and implements caching and indexing strategies to enhance query performance.

- Implements data quality checks and validations within data pipelines to ensure the accuracy, consistency, and completeness of data.

Required:

Bachelor's degree in Computer Science, Information Systems, Mathematics, or a similar field, or equivalent experience.

At least six years of work experience in data management disciplines, including data integration, modeling, optimization, and data quality, or other areas directly relevant to data engineering responsibilities and tasks.

Proven project experience developing and maintaining data warehouses in big data solutions (e.g., Snowflake).

Ability to design, build, and deploy data solutions that capture, explore, transform, and utilize data to support AI, ML, and BI.

Experience in data science languages/tools such as SQL, Python, R, SAS, or Excel.

Proficiency in the design and implementation of modern data architectures and concepts, such as cloud services (AWS, Azure, Google Cloud Platform) and modern data warehouse tools (Snowflake, Databricks).

Experience with database technologies such as SQL, NoSQL, Oracle, Hadoop, or Teradata.

Ability to collaborate within and across teams with different levels of technical knowledge to support delivery and educate end users on data products.

Expert problem-solving skills, including debugging skills, allowing the determination of the sources of issues in unfamiliar code or systems, and the ability to recognize and solve repetitive problems.

Excellent business acumen and interpersonal skills; able to work across business lines at a senior level to influence and effect change to achieve common goals.

Ability to describe business use cases/outcomes, data sources and management concepts, and analytical approaches/options.

Ability to translate among the languages used by executive, business, IT, and quant stakeholders.

Preferred:

Master's degree in Information Systems, Business Intelligence Analytics, or similar field.

Knowledge of Apache technologies such as Kafka, Airflow, and Spark to build scalable and efficient data pipelines.

Experience in programming languages such as Java, Python, and C/C++.

Healthcare industry experience.
