Sr. PySpark Developer with Spark Streaming and AWS

Overview

Remote
$40 - $50
Contract - Independent
Contract - W2
Contract - 12 Month(s)

Skills

API
Agile
Amazon EKS
Amazon S3
Amazon Web Services
Apache Airflow
Apache HTTP Server
Apache NiFi
Apache Spark
Communication
Continuous Delivery
Continuous Improvement
Continuous Integration
Customer Facing
Data Analysis
Data Engineering
Data Extraction
Data Flow
Data Governance
Docker
Electronic Health Record (EHR)
Innovation
Kubernetes
Management
Performance Tuning
PostgreSQL
PySpark
Python
RESTful
Real-time
Streaming
Telecommunications
Transact-SQL
Workflow

Job Details

The ideal candidate will have deep expertise in building scalable data pipelines using PySpark and Spark Streaming, along with experience working on modern AWS-based data platforms. This role requires strong consulting and communication skills, a passion for innovation, and the ability to work independently in a fast-paced, client-facing environment. Telecommunications industry experience is highly preferred.

Key Responsibilities:
  • Design and develop scalable data pipelines using PySpark and Spark Streaming
  • Implement data workflows on AWS EMR, orchestrated via Apache Airflow
  • Deploy and manage streaming applications on Amazon EKS using Spark Streaming
  • Integrate and manage data flows using Apache NiFi
  • Work with Apache Iceberg tables stored in Amazon S3
  • Enable data access via AWS Aurora PostgreSQL for downstream API consumption
  • Write and optimize complex T-SQL queries for data extraction and transformation
  • Identify and implement improvements in pipeline performance, resource tuning, and automation
  • Drive innovation by streamlining workflows and increasing automation
  • Collaborate with cross-functional teams and clients to deliver high-quality data solutions
  • Participate in code reviews, architectural discussions, and performance tuning sessions
  • Independently manage priorities and deliverables in a dynamic, agile environment

Required Skills & Qualifications:
  • 5+ years of experience in data engineering with a strong focus on PySpark
  • Proficient in Spark Streaming and real-time data processing
  • Hands-on experience with AWS EMR, Apache Airflow, and Amazon EKS
  • Strong skills with Apache NiFi for data ingestion and transformation
  • Experience working with Apache Iceberg tables and S3-based data lakes
  • Solid understanding of AWS Aurora PostgreSQL and REST API integration
  • Proficient in Python and SQL, including T-SQL
  • Demonstrated expertise in performance tuning for distributed data pipelines
  • Previous experience in client-facing roles or consulting environments
  • Strong verbal and written communication skills
  • Self-starter with a passion for innovation, continuous improvement, and problem-solving

Preferred Qualifications:
  • AWS Certification (e.g., AWS Certified Data Analytics – Specialty)
  • Experience with CI/CD pipelines and containerization (Docker, Kubernetes)
  • Familiarity with data governance and security best practices
  • Telecommunications industry experience is a strong plus

To apply, please send your updated resume to:
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

About Bridge Flair LLC