Big Data Engineer (Spark/Scala)

Sunnyvale, CA, US • Posted 1 day ago • Updated 2 hours ago
Contract W2
12 Months
No Travel Required
On-site
$50/hr

Job Details

Skills

  • Big Data
  • Apache Spark
  • Artificial Intelligence
  • Machine Learning (ML)
  • Root Cause Analysis
  • Incident Management
  • Scala

Summary

Description
Responsible for developing and operating a big data platform using open source or other solutions to support critical applications such as analytics, reporting, and AI/ML apps. This includes optimizing performance and cost, automating operations, and identifying and resolving production errors and issues to ensure the best data platform experience.

Responsibilities
Develop and operate large-scale big data platforms using open source and other solutions.
Support critical applications including analytics, reporting, and AI/ML apps.
Optimize platform performance and cost efficiency.
Automate operational tasks for big data systems.
Identify and resolve production errors and issues to ensure platform reliability and user experience.

Minimum Qualifications
3+ years of professional software engineering experience with large-scale big data platforms, including strong programming skills in Java, Scala, Python, or Go.
Proven expertise in designing, building, and operating large-scale distributed data processing systems with a strong focus on Apache Spark.
Hands-on experience with table formats and data lake technologies such as Apache Iceberg, ensuring scalability, reliability, and optimized query performance.
Skilled at coding for distributed systems and developing resilient data pipelines.
Strong background in incident management, including troubleshooting, root cause analysis, and performance optimization in complex production environments.
Proficient with Unix/Linux systems and command-line tools for debugging and operational support.

Preferred Qualifications
Expertise in designing, building, and operating critical, large-scale distributed systems with a focus on low latency, fault tolerance, and high availability.
Experience contributing to open source projects is a plus.
Experience with multiple public cloud infrastructures, managing multi-tenant Kubernetes clusters at scale, and debugging Kubernetes/Spark issues.
Experience with workflow and data pipeline orchestration tools (e.g., Airflow, dbt).
Understanding of data modeling and data warehousing concepts.
Familiarity with the AI/ML stack, including GPUs, MLflow, or Large Language Models (LLMs).
A learning attitude and a drive to continuously improve oneself, the team, and the organization.
Solid understanding of software engineering best practices, including the full development lifecycle, secure coding, and experience building reusable frameworks or libraries.

  • Dice Id: 91099677
  • Position Id: 8991390