Data Engineer

Charlotte, NC, US • Posted 3 hours ago • Updated 3 hours ago

Contract W2

Contract Corp To Corp

12 Months

On-site

$55 - $60/hr

Rivago Infotech Inc

Job Details

Skills

python
Hadoop
pyspark
airflow

Summary

Required Skills & Experience

Programming: Python/PySpark; Scala is a plus

Big Data: Hadoop (HDFS, YARN), Hive, Spark (optimization, tuning)

Orchestration: Apache Airflow

Databases/ETL: MongoDB (indexing, sharding, tuning); SQL Server & SSIS (development, migration); strong SQL & stored procedures

Data Lake: HDFS, Hive, Parquet/ORC, partitioning, compaction

APIs: REST-based ingestion; reverse engineering & lineage tools

CI/CD & DevOps: Git, Jenkins, Docker, IaC

Monitoring: logging, metrics, lineage

Key Responsibilities

Reverse Engineering & Data Mapping

Reverse engineer ETL pipelines (SSIS, Spark, stored procedures) to document data flows, logic, and transformations.
Perform detailed source-to-target mappings with field-level transformations and business rules.
Build data dictionaries, lineage, and mapping artifacts.
Collaborate with SMEs to uncover undocumented logic.
Identify data model gaps and recommend remediation.

ETL Pipeline Remediation

Design and refactor pipelines aligned to new source APIs and data contracts.
Re-engineer ETL for 1:1 functional parity during migrations.
Implement schema evolution, transformations, and mapping changes (batch & streaming).
Eliminate redundancy and optimize legacy logic.
Build modular, reusable pipelines using Spark/PySpark/Scala.
Modernize SSIS and integrate with orchestration frameworks.
Orchestrate workflows in Airflow (DAGs, dependencies, SLAs).
Implement logging, error handling, alerting, and metadata capture.

Data Storage Optimization

Simplify schemas; remove redundant/obsolete data across Hive and MongoDB.
Optimize partitioning, clustering, and file formats (Parquet, ORC, Avro).
Redesign MongoDB indexing, sharding, and collections.
Tune HDFS, Hive, MongoDB, and SQL Server for performance and cost.
Implement lifecycle management, archival, and retention.

Functional Skills

Experience in ETL migration/remediation projects
Strong reverse engineering of legacy ETL (SSIS, Spark, scripts)
Expertise in source-to-target mapping (STM), transformation specs, and lineage artifacts
Data modeling (dimensional, normalized, denormalized)
Schema evolution and zero-downtime migrations
Performance tuning across compute and storage layers
Strong debugging and problem-solving for distributed systems

Preferred Qualifications

AI/ML-assisted ETL remediation or code conversion
Experience with Wiz or Palo Alto Prisma (APIs, data models, risk metrics)
Prior Prisma to Wiz (or similar CSPM/CNAPP) migrations
Knowledge of CSPM/CNAPP domains (vulnerabilities, identities, exposures)
Experience in regulated, compliance-heavy environments

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

Dice Id: 91131106
Position Id: 8980547

Company Info

About Rivago Infotech Inc

Rivago Infotech Inc has been a leader in IT staffing and software development for over 5 years and is one of the largest diversity and development firms in the industry. We are known for our high-touch, customer-centric approach, offering our clients unmatched quality, responsiveness, and flexibility. We are appreciated by our clients for our streamlined execution, highly efficient service, and exceptional talent management that go above and beyond traditional staffing services.

Contact the job poster

Garima Rajput

Recruiter @ Rivago Infotech Inc
