Position: Spark job migration specialist
San Francisco, CA
A Spark job migration specialist migrates data pipelines, JAR tasks, and analytics workloads from legacy systems (such as Hadoop/CDH or AWS EMR) to modern cloud platforms. This involves refactoring code (e.g., Hive to PySpark), performance testing, and upgrading from Spark 2.x to 3.x.
Key Job Responsibilities
- Workload Migration: Migrate JVM workloads and spark-submit jobs to Databricks JAR tasks or notebook tasks.
- Pipeline Re-engineering: Convert existing HiveQL scripts and Oozie workflows into optimized Spark SQL or PySpark applications (a conversion sketch follows this list).
- Refactoring: Adapt data pipelines from Azure Synapse to other cloud platforms, including updating library dependencies and notebook references.
- Performance Optimization: Implement Adaptive Query Execution (AQE) in Spark 3 to improve shuffle performance and mitigate skewed joins (see the AQE sketch below).
- Testing & Validation: Perform regression testing with validation scripts to ensure output consistency between the old and new systems (see the validation sketch below).
- Job Customization: Use spark.sparkContext.setJobDescription() to label, monitor, and troubleshoot specific Spark jobs in the Spark UI (see the labeling sketch below).
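To make the pipeline re-engineering concrete, here is a minimal sketch of a HiveQL aggregation rewritten as a PySpark DataFrame pipeline. The `sales` table and its columns are hypothetical stand-ins for a real legacy workload:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("hiveql-migration-sketch").getOrCreate()

# Legacy HiveQL (hypothetical table and columns):
#   SELECT region, SUM(amount) AS total
#   FROM sales
#   WHERE sale_date >= '2024-01-01'
#   GROUP BY region;
#
# Equivalent PySpark DataFrame pipeline:
sales = spark.read.table("sales")  # assumes the table is registered in the metastore
totals = (
    sales
    .where(F.col("sale_date") >= "2024-01-01")
    .groupBy("region")
    .agg(F.sum("amount").alias("total"))
)
totals.show()
```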
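The AQE work largely comes down to a handful of standard Spark 3 SQL configs. A minimal sketch of setting them at session build time (AQE is enabled by default from Spark 3.2 onward, but migrated jobs often pin it explicitly):

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("aqe-migration-sketch")
    # Enable Adaptive Query Execution (default since Spark 3.2).
    .config("spark.sql.adaptive.enabled", "true")
    # Coalesce small shuffle partitions at runtime to cut shuffle overhead.
    .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
    # Split skewed partitions during sort-merge joins to fix skew hotspots.
    .config("spark.sql.adaptive.skewJoin.enabled", "true")
    .getOrCreate()
)
```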
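A validation script for regression testing can be as simple as a schema check, a row-count check, and a symmetric row-level diff. A sketch, assuming both pipelines write Parquet to hypothetical paths:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("migration-validation-sketch").getOrCreate()

# Hypothetical output locations for the legacy and migrated pipelines.
legacy = spark.read.parquet("/mnt/legacy/output")
migrated = spark.read.parquet("/mnt/migrated/output")

# Cheap checks first: identical schema and row counts.
assert legacy.schema == migrated.schema, "schema drift detected"
assert legacy.count() == migrated.count(), "row count mismatch"

# Row-level check: the symmetric difference should be empty.
# exceptAll keeps duplicates, so duplicate-row regressions are caught too.
assert legacy.exceptAll(migrated).count() == 0, "rows missing from migrated output"
assert migrated.exceptAll(legacy).count() == 0, "unexpected rows in migrated output"
```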
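Job labeling uses the setJobDescription() method on the SparkContext; the description strings below are illustrative. Each Spark job triggered after a call appears under that label in the Spark UI's Jobs tab:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("job-labeling-sketch").getOrCreate()

# Label the actions that follow so they are easy to spot in the Spark UI
# while monitoring or troubleshooting.
spark.sparkContext.setJobDescription("migration: backfill sales partitions")
spark.range(1_000_000).count()  # this job shows up under the label above

# Relabel before the next logical phase of the pipeline.
spark.sparkContext.setJobDescription("migration: validate row counts")
```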
Job Description/Profile
- Role: Big Data Migration Engineer (Spark)
- Experience: 5+ years with Apache Spark (PySpark/Scala) and cloud platforms (Azure/AWS).
- Requirements:
  - Strong experience with HDFS and the Hadoop ecosystem (Hive, Spark, HBase, MapReduce)
  - Experience with data migration to cloud / enterprise data platforms
  - Knowledge of data ingestion tools (Sqoop, Kafka, NiFi, etc.), cloud storage (ADLS, S3, Blob Storage), and distributed processing frameworks
  - SQL and performance-tuning expertise
  - Experience in scripting (Python, Shell, Scala)
Key Migration Focus Areas
- Data Pipelines: Handling schema evolution, ensuring data correctness, and testing against golden datasets.
- Job Definitions: Reconfiguring job properties, cluster settings, and Spark configurations (a job-spec sketch follows this list).
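As one example of reconfiguring a job definition, a legacy spark-submit job can be recreated as a Databricks JAR task via the Jobs API 2.1. A sketch only: the workspace URL, token, main class, JAR path, and cluster sizing are all placeholder assumptions:

```python
import requests

# Hypothetical workspace URL and access token.
HOST = "https://<workspace>.azuredatabricks.net"
TOKEN = "<personal-access-token>"

# Jobs API 2.1 payload for a JAR task; the class name, parameters,
# and cluster settings below are illustrative, not prescriptive.
job_spec = {
    "name": "migrated-etl-jar-task",
    "tasks": [
        {
            "task_key": "daily_etl",
            "spark_jar_task": {
                "main_class_name": "com.example.etl.Main",  # hypothetical class
                "parameters": ["--date", "2024-01-01"],
            },
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 4,
                "spark_conf": {"spark.sql.adaptive.enabled": "true"},
            },
            "libraries": [{"jar": "dbfs:/jars/etl-assembly.jar"}],  # hypothetical path
        }
    ],
}

resp = requests.post(
    f"{HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
)
resp.raise_for_status()
print(resp.json())  # response contains the new job_id
```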