Senior Data Engineer / Remote / PySpark

Remote in Remote, CA, US • Posted 30+ days ago • Updated 4 hours ago
Full Time
On-site
$160,000 - $190,000/yr

Job Details

Skills

  • Analytics
  • Customer Facing
  • Data Processing
  • Data Quality
  • FOCUS
  • Data Engineering
  • Training
  • Data Modeling
  • SEO
  • Customer Engagement
  • Use Cases
  • Workflow
  • Orchestration
  • PySpark
  • Apache Spark
  • Amazon Web Services
  • Electronic Health Record (EHR)
  • API
  • Machine Learning (ML)
  • Machine Learning Operations (ML Ops)
  • Management
  • Collaboration
  • Insurance
  • SAP BASIS

Summary

A well-known data aggregation company is seeking a Senior Data Engineer to join their engineering team. The company builds large-scale platforms that ingest, normalize, and connect data from thousands of sources, powering analytics, machine learning, and customer-facing products. The role is heavily focused on PySpark, AWS EMR, distributed data processing, API development, and data systems that support modern ML and LLM-driven workflows.

This is a chance to work on high-impact data systems at serious scale. The team is looking for a senior engineer who enjoys tackling messy data problems, shaping architecture, and collaborating closely with ML and product teams. The #1 draw of this role is ownership. You'll have real influence over pipeline design, data quality, and how aggregated data is used across SEO, search, and customer interaction use cases. Engineers here grow by solving non-trivial problems, learning modern MLOps practices, and working in a balanced, low-meeting environment that values deep focus.

Required Skills & Experience
7+ years of professional experience in data engineering
5+ years of hands-on experience with PySpark in production
Strong experience running distributed workloads on AWS EMR
Experience building and maintaining data pipelines at scale
Exposure to MLOps workflows (model training, deployment, or monitoring)
Experience developing and supporting APIs
Strong understanding of distributed systems and data modeling

Desired Skills & Experience
Experience in data aggregation, enrichment, or entity-level data systems
Familiarity with LLMs for SEO optimization or customer interaction use cases
Experience working with large-scale consumer or web data
Exposure to workflow orchestration tools (Airflow or similar)
Experience supporting ML or search-related platforms

What You Will Be Doing

Tech Breakdown
60% PySpark / Spark on AWS EMR
20% API development and backend services
20% ML / LLM data support and MLOps collaboration

Daily Responsibilities
75% Hands On
5% Management Duties
20% Team Collaboration

The Offer
Competitive salary and bonus eligibility
You will receive the following benefits:
Medical, Dental, and Vision Insurance
Vacation Time
Stock Options

Applicants must be currently authorized to work in the US on a full-time basis now and in the future.
  • Dice Id: 10105282
  • Position Id: 862238