Java Developer with Spark

Overview

Hybrid
$60 - $65
Contract - W2
Contract - 12 Month(s)

Skills

Amazon S3
Amazon Web Services
Analytics
Apache Hive
Apache Kafka
Apache Maven
Apache Spark
Build Tools
Cloud Computing
Collaboration
Collections
Concurrent Computing
Continuous Delivery
Continuous Integration
Data Engineering
Data Governance
Data Processing
Database
Databricks
Debugging
Design Review
DevOps
Distributed Computing
Docker
Electronic Health Record (EHR)
Extract, Transform, Load (ETL)
GitLab
Gradle
HDFS
Java
Jenkins
Kubernetes
Linux
Microsoft Azure
Multithreading
NoSQL
Performance Tuning
Python
Real-time
SQL
Scala
Shell Scripting
Streaming
Testing
Unix
Workflow

Job Details

Job Title: Java Developer (Apache Spark)
Location: Richmond, VA (Hybrid)
Employment Type: Contract

Overview

We are seeking an experienced Java Developer with strong proficiency in Apache Spark to design and develop scalable, high-performance data processing applications. You will collaborate with data engineers, solution architects, and DevOps teams to build reliable distributed systems that support enterprise analytics and real-time data platforms.

Key Responsibilities

Design, develop, and optimize batch and streaming data pipelines using Apache Spark (Core, SQL, or Streaming); a brief illustrative sketch follows this list.

Write clean, reusable, and efficient Java code for backend and data processing components.

Integrate Spark applications with modern data ecosystems such as HDFS, S3, Kafka, and relational or NoSQL databases.

Employ performance tuning techniques for Spark jobs and JVM-based applications.

Collaborate with data engineering teams to define data models, transformations, and ingestion workflows.

Implement automated CI/CD pipelines and testing strategies to enhance code quality and deployment reliability.

Participate in design reviews, architecture discussions, and production troubleshooting.
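
For a sense of the day-to-day work, a pipeline of the kind described above might look like the following minimal Java sketch. It assumes Spark 3.x with the spark-sql-kafka connector and an S3A-capable Hadoop client on the classpath; the bucket, broker, and topic names are placeholders, not details of this posting.

// Minimal Spark pipeline sketch: one batch pass over a data-lake path,
// plus a streaming read from Kafka. All paths and topics are hypothetical.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

import static org.apache.spark.sql.functions.col;

public class EventPipeline {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("event-pipeline")
                .getOrCreate();

        // Batch: read raw Parquet, keep active records, write partitioned output.
        Dataset<Row> raw = spark.read().parquet("s3a://example-bucket/events/raw/");
        raw.filter(col("status").equalTo("ACTIVE"))
           .write()
           .mode("overwrite")
           .partitionBy("event_date")
           .parquet("s3a://example-bucket/events/curated/");

        // Streaming: consume a Kafka topic and append to the lake with checkpointing.
        Dataset<Row> stream = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "broker-1:9092")
                .option("subscribe", "events")
                .load()
                .selectExpr("CAST(value AS STRING) AS payload", "timestamp");

        StreamingQuery query = stream.writeStream()
                .format("parquet")
                .option("path", "s3a://example-bucket/events/streamed/")
                .option("checkpointLocation", "s3a://example-bucket/checkpoints/events/")
                .start();

        query.awaitTermination();
    }
}

In practice the batch and streaming legs would typically run as separate jobs; they are combined here only to keep the sketch short.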

Required Skills and Experience

5+ years of experience developing in Java (Core Java, multithreading, collections, and concurrency).

2+ years of hands-on experience with Apache Spark (RDD, DataFrame, and Dataset APIs); see the short API sketch after this list.

Strong understanding of distributed computing and data partitioning principles.

Experience with data sources such as Kafka, Hive, SQL/NoSQL databases, and cloud data lakes.

Familiarity with build tools (Maven, Gradle) and CI/CD tools (Jenkins, GitLab CI, or similar).

Working knowledge of Linux/Unix environments and shell scripting.

Strong debugging and performance optimization skills.
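
As a rough illustration of the three Spark APIs named above, here is a self-contained Java sketch contrasting RDD, DataFrame, and typed Dataset usage. It runs in local mode with made-up data; nothing in it is specific to this role.

// Sketch contrasting Spark's RDD, DataFrame, and Dataset APIs from Java.
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import java.util.Arrays;
import java.util.List;

public class SparkApiTour {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("api-tour")
                .master("local[*]")   // local mode for demonstration only
                .getOrCreate();

        List<String> words = Arrays.asList("spark", "java", "spark", "kafka");

        // RDD: low-level functional transformations on raw JVM objects.
        JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());
        JavaRDD<String> rdd = jsc.parallelize(words);
        System.out.println("spark count = " + rdd.filter("spark"::equals).count());

        // DataFrame (Dataset<Row>): untyped columns, optimized by Catalyst.
        Dataset<Row> df = spark.createDataset(words, Encoders.STRING()).toDF("word");
        df.groupBy("word").count().show();

        // Dataset<T>: typed API with compile-time checks plus Catalyst optimization.
        Dataset<String> upper = spark.createDataset(words, Encoders.STRING())
                .map((MapFunction<String, String>) String::toUpperCase, Encoders.STRING());
        upper.show();

        spark.stop();
    }
}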

Nice to Have

Experience with Scala or Python for Spark.

Exposure to AWS (EMR, Glue, Lambda, S3) or Azure Databricks.

Understanding of containerization (Docker, Kubernetes).

Familiarity with modern data governance and ETL frameworks.

Education

Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
