Databricks Subject Matter Expert (SME) with Real-Time Streaming & Spark Performance Optimization

Hybrid in Houston, TX, US • Posted 19 hours ago • Updated 19 hours ago
Contract W2
Contract Corp To Corp
Contract Independent
No Travel Required
Hybrid
Depends on Experience

Job Details

Skills

  • Databricks
  • Apache Spark
  • Spark Structured Streaming
  • Spark Performance Tuning
  • Spark Internals
  • Kafka
  • Delta Lake
  • Python
  • Scala
  • Real-Time Streaming
  • Distributed Systems
  • Latency Optimization

Summary

Position: Databricks Subject Matter Expert (SME) with Real-Time Streaming & Spark Performance Optimization
Location: Houston, TX (hybrid, 2 days onsite)

Role Overview:

We are seeking a highly experienced Databricks Subject Matter Expert (SME) with deep expertise in Apache Spark internals, real-time streaming architectures, and performance optimization. The ideal candidate will play a critical role in designing, troubleshooting, and optimizing large-scale real-time data processing pipelines on the Databricks platform.

This role requires strong hands-on experience diagnosing latency, throughput, and scalability challenges in high-volume streaming environments. The SME will work closely with data engineering, platform, and architecture teams to ensure efficient processing of real-time data while improving overall system performance and reliability.

Key Responsibilities:
  • Act as the Databricks and Spark performance expert, providing guidance on architecture, optimization, and troubleshooting of large-scale data pipelines.
  • Design, implement, and optimize real-time streaming solutions using Apache Spark on Databricks.
  • Analyze and improve Spark job performance, focusing on latency reduction, resource utilization, and throughput optimization.
  • Perform deep Spark internals analysis, including execution plans, DAG optimization, shuffle operations, memory management, and partitioning strategies.
  • Troubleshoot complex real-time data processing issues, including streaming delays, backpressure, and processing bottlenecks.
  • Conduct system-level performance tuning for Spark workloads running on Databricks clusters.
  • Optimize Spark Structured Streaming pipelines for high-volume, low-latency workloads.
  • Collaborate with platform and infrastructure teams to tune cluster configurations and resource allocation.
  • Establish best practices for Spark architecture, cluster configuration, and performance tuning.
  • Provide technical leadership and mentoring to data engineering teams on Spark optimization and streaming architecture.

Required Skills & Experience:
  • Extensive hands-on experience with Databricks in enterprise-scale environments.
  • Deep knowledge of Apache Spark internals, including execution engine, query optimization, memory management, and shuffle mechanics.
  • Strong experience with real-time streaming architectures using Spark Structured Streaming or similar frameworks.
  • Proven expertise in performance tuning and troubleshooting Spark workloads at the system level.
  • Experience resolving latency issues and optimizing throughput for real-time data processing pipelines.
  • Strong understanding of distributed computing concepts and large-scale data processing frameworks.
  • Experience analyzing Spark execution plans, DAGs, and cluster resource usage.
  • Proficiency in Python, Scala, or SQL for Spark-based data engineering workloads.
  • Experience working in large enterprise data platforms and complex distributed systems.

Preferred Qualifications:
  • Hands-on experience with streaming platforms such as Kafka or similar event-streaming technologies.
  • Experience with cloud-based Databricks deployments (AWS, Azure, or Google Cloud Platform).
  • Strong understanding of data pipeline architecture, data lakehouse concepts, and scalable streaming frameworks.
  • Experience working with large-scale real-time analytics platforms.

Certifications (Optional):
  • Databricks Champion Certification or other Databricks certifications are highly preferred.
 
  • Dice Id: 91138973
  • Position Id: 8910824

Company Info

About Kainos Innovative Solutions Inc

Kainos was established in early 2019 and is headquartered in Falls Church, Virginia. Despite our humble beginnings, we bring proven expertise and in-depth knowledge gained from key roles at multinational companies serving global customers in the IT industry over the last few decades.

With our expertise, we understand the complexity of today’s and next-generation technologies. This enables us to deliver innovative solutions to global challenges for government agencies and commercial clients: solutions that are scalable, optimal, and secure, delivered on time, customized to fit the budget, and aligned with business needs. Our areas of expertise include Digital Services; Application, Data & Infrastructure Services; Cyber Security Services; Professional Services; and Customer Support Services.

Our focus is to offer customized, cost-effective solutions and services. We faithfully strive to be a customer-centric organization.
