Data Engineer with Cloud Data Integration & Transformation

Remote • Posted 22 hours ago • Updated 22 hours ago
Contract Independent
Contract W2
Contract Corp To Corp
No Travel Required
Remote
$80 - $90/hr

Job Details

Skills

  • Apache Spark
  • COBOL
  • Data Integration
  • Data Lake
  • Data Engineering
  • Databricks
  • Extract, Transform, Load
  • IBM DB2
  • Legacy Systems
  • Microsoft Azure
  • PySpark
  • SQL
  • Warehouse

Summary

Data Engineer – Cloud Data Integration & Transformation

Location: Remote (CST/EST hours)

Experience: 14+ years

Note: Candidates need 14+ years of experience and strong communication skills.

Note: Candidates must have hands-on experience migrating and integrating legacy systems (COBOL, Db2) to Azure Databricks.

  

About the Role

We are seeking a hands-on Data Engineer to develop and maintain scalable data pipelines and transformation routines within a modern Azure + Databricks environment. This role is focused on executing ingestion, cleansing, standardization, matching, merging, and enrichment of complex legacy datasets into a governed data lakehouse architecture.

The ideal candidate brings deep experience with Spark (PySpark), Delta Lake, Azure Data Factory, and data wrangling techniques — and is comfortable working in a structured, code-managed, team-based delivery environment.

 

Key Responsibilities

Pipeline Development & Maintenance

Build and maintain reusable data pipelines using Databricks, PySpark, and SQL.

Implement full and incremental loads from sources including VSAM, Db2 (LUW and z/OS), SQL Server, and flat files.
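Incremental loads of the kind described here typically track a high watermark per source table. A minimal sketch of that logic in plain Python (function names, column names, and the in-memory row representation are illustrative; in a real pipeline the filter would be pushed down to the source query):

```python
from datetime import datetime

def incremental_filter(rows, watermark, ts_key="updated_at"):
    """Keep only rows modified after the last high watermark.

    `rows` is a list of dicts here; a real pipeline would push this
    predicate down to the source (e.g. a WHERE clause on the Db2 query).
    """
    return [r for r in rows if r[ts_key] > watermark]

def next_watermark(rows, watermark, ts_key="updated_at"):
    """Advance the watermark to the newest timestamp seen in this batch."""
    return max([r[ts_key] for r in rows], default=watermark)

batch = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
]
wm = datetime(2024, 1, 2)
changed = incremental_filter(batch, wm)  # only rows newer than the watermark
wm = next_watermark(batch, wm)           # watermark advances past this batch
```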

Use Delta Lake on ADLS Gen2 to support ACID transactions, scalable upserts/merges, and time travel.
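Scalable upserts on Delta Lake are usually expressed as a MERGE statement passed to `spark.sql()`. A hedged sketch of how such a statement might be assembled (table and column names are placeholders, not from this posting):

```python
def build_merge_sql(target, source, keys, cols):
    """Build a Delta Lake MERGE (upsert) statement for spark.sql().

    Matched rows are updated, unmatched rows inserted; in practice the
    table and column names would come from pipeline metadata/config.
    """
    on = " AND ".join(f"t.{k} = s.{k}" for k in keys)
    sets = ", ".join(f"t.{c} = s.{c}" for c in cols)
    return (
        f"MERGE INTO {target} t USING {source} s ON {on} "
        f"WHEN MATCHED THEN UPDATE SET {sets} "
        f"WHEN NOT MATCHED THEN INSERT *"
    )

sql = build_merge_sql("silver.customers", "staging.customers_batch",
                      keys=["customer_id"], cols=["name", "email"])
# On a cluster this would be executed as: spark.sql(sql)
```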

Leverage Azure Data Factory for orchestration and triggering of Delta Live Tables and Databricks Jobs as part of nightly pipeline execution.


Data Cleansing & Transformation

Apply cleansing logic for deduplication, parsing, standardization, and enrichment based on business rule definitions.

Use Spark-Cobol Library to parse EBCDIC/COBOL-formatted VSAM files into structured DataFrames.
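Reading EBCDIC/COBOL files with the spark-cobol (Cobrix) reader comes down to pointing it at a copybook and the right encoding. A sketch of the reader configuration (option names follow the open-source Cobrix library; the exact set needed depends on the VSAM export, and the paths are placeholders):

```python
def cobol_reader_options(copybook_path):
    """Options for the spark-cobol (Cobrix) DataFrame reader.

    The copybook describes the fixed record layout; the encoding option
    tells the reader the source bytes are EBCDIC, not ASCII.
    """
    return {
        "copybook": copybook_path,            # COBOL copybook for the layout
        "encoding": "ebcdic",                 # source files are EBCDIC-encoded
        "schema_retention_policy": "collapse_root",
    }

# On a cluster this would be applied as (not runnable without Spark):
#   df = (spark.read.format("cobol")
#             .options(**cobol_reader_options("/mnt/copybooks/customer.cpy"))
#             .load("/mnt/raw/vsam/customer"))
opts = cobol_reader_options("/mnt/copybooks/customer.cpy")
```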

Maintain "bronze → silver → gold" structured layers and ensure quality during data transformations.
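In a medallion layout, each layer and dataset usually maps to a predictable storage path. A minimal path-resolution helper, assuming an ADLS Gen2 container (the storage account, container, and dataset names are placeholders):

```python
LAYERS = ("bronze", "silver", "gold")

def layer_path(layer, dataset,
               root="abfss://lake@account.dfs.core.windows.net"):
    """Resolve a Delta table path within a bronze → silver → gold layout.

    Rejects unknown layer names so pipelines fail fast on typos.
    """
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return f"{root}/{layer}/{dataset}"

p = layer_path("silver", "customers")
```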

Support classification and mapping logic in collaboration with analysts and architects.

Observability, Testing & Validation

Integrate robust logging and exception handling to enable observability and pipeline traceability.
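Per-step logging and exception capture is what makes a multi-stage pipeline traceable. A minimal sketch of such a wrapper (the step names and logger configuration are illustrative, not a prescribed framework):

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def run_step(name, fn, *args, **kwargs):
    """Run one pipeline step with structured logging and error capture,
    so failures are traceable per step rather than as one opaque job error."""
    log.info("step=%s status=started", name)
    try:
        result = fn(*args, **kwargs)
        log.info("step=%s status=succeeded", name)
        return result
    except Exception:
        log.exception("step=%s status=failed", name)
        raise

total = run_step("sum_amounts", sum, [10, 20, 30])
```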

Monitor job performance and cost with Azure Monitor and Log Analytics.

Support validation and testing using frameworks like Great Expectations or dbt tests to enforce expectations on nulls, ranges, and referential integrity.
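The kinds of checks mentioned (nulls, ranges) can be illustrated without the frameworks themselves. A hand-rolled sketch in the spirit of Great Expectations (column names and bounds are illustrative; real pipelines would use the library's own expectation suites):

```python
def check_expectations(rows, not_null, ranges):
    """Flag null violations and out-of-range values across a batch of rows.

    Returns a list of (row_index, column, failure_kind) tuples; an empty
    list means the batch passed all checks.
    """
    failures = []
    for i, row in enumerate(rows):
        for col in not_null:
            if row.get(col) is None:
                failures.append((i, col, "null"))
        for col, (lo, hi) in ranges.items():
            v = row.get(col)
            if v is not None and not (lo <= v <= hi):
                failures.append((i, col, "out_of_range"))
    return failures

rows = [
    {"customer_id": 1, "age": 42},
    {"customer_id": None, "age": 150},
]
failures = check_expectations(rows, not_null=["customer_id"],
                              ranges={"age": (0, 120)})
```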

Security, DevOps & Deployment

Store and manage credentials securely using Azure Key Vault during pipeline execution.
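Keeping credentials out of pipeline code usually means resolving them at runtime from a secret store. A sketch with the secret getter injected so it can be tested locally (on Databricks the getter would be `dbutils.secrets.get` backed by an Azure Key Vault scope; scope and key names here are placeholders):

```python
def jdbc_connection_properties(get_secret):
    """Build Db2 JDBC connection properties without hard-coding credentials.

    `get_secret(scope, key)` abstracts the secret store; on Databricks it
    would be dbutils.secrets.get with a Key-Vault-backed secret scope.
    """
    return {
        "user": get_secret("kv-scope", "db2-user"),
        "password": get_secret("kv-scope", "db2-password"),
        "driver": "com.ibm.db2.jcc.DB2Driver",
    }

# Stand-in secret store for local testing; production resolves from Key Vault.
fake_store = {("kv-scope", "db2-user"): "etl_svc",
              ("kv-scope", "db2-password"): "s3cret"}
props = jdbc_connection_properties(lambda scope, key: fake_store[(scope, key)])
```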

Maintain pipeline code using Azure DevOps Repos and participate in peer reviews and promotion workflows via Azure DevOps Pipelines.

Deploy notebooks, configurations, and transformations using CI/CD best practices in repeatable environments.

Collaboration & Profiling

Collaborate with architects to ensure alignment with data platform standards and governance models.

Work with analysts and SMEs to profile data, refine cleansing logic, and conduct variance analysis using Databricks Notebooks and Databricks SQL Warehouse.

Support metric publication and lineage registration using Microsoft Purview and Unity Catalog, and contribute to profiling datasets for Power BI consumption.

Required Skills & Experience

5+ years of experience in data engineering or ETL development roles.

Proficiency in:
  - Databricks, PySpark, SQL
  - Delta Lake and Azure Data Lake Storage Gen2
  - Azure Data Factory for orchestration and event-driven workflows

Experience with:
  - Cleansing, deduplication, parsing, and merging of high-volume datasets
  - Parsing EBCDIC/COBOL-formatted VSAM files using Spark-Cobol Library
  - Connecting to Db2 databases using JDBC drivers for ingestion
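Db2 ingestion over JDBC starts with the connection URL; Db2 LUW and Db2 for z/OS both use the `jdbc:db2://` scheme via the IBM JCC driver. A small sketch (host, port, and database names are placeholders):

```python
def db2_jdbc_url(host, port, database):
    """Build a Db2 JDBC URL for ingestion via the IBM JCC driver."""
    return f"jdbc:db2://{host}:{port}/{database}"

url = db2_jdbc_url("db2-luw.example.com", 50000, "CUSTDB")
# On a cluster: df = spark.read.jdbc(url, "SCHEMA.CUSTOMERS", properties=props)
```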

Familiarity with:
  - Git, Azure DevOps Repos & Pipelines
  - Great Expectations or dbt for validation
  - Azure Monitor + Log Analytics for job tracking and alerting
  - Azure Key Vault for secrets and credentials
  - Microsoft Purview and Unity Catalog for metadata and lineage registration

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 91134724
  • Position Id: 8959435