W2: Pleasanton, CA (Onsite) :: Databricks Data Engineer / Developer (U.S.C & G.C Only)

Overview

Hybrid
Depends on Experience
Full Time
100% Travel

Skills

Data
Databricks
Unity Catalog
data lake
Python
Scala
SQL
AWS
S3
Azure Data Lake
Google BigQuery
Git
CI/CD
MLflow

Job Details

Databricks Data Engineer / Developer (U.S.C & G.C Only)

6+ Months

Pleasanton, CA (Onsite 4 days a week)

 

Position Overview:
We are seeking a Data Engineer with deep Databricks expertise to play a pivotal role in building and optimizing large-scale data pipelines and analytics architectures. In this role, you will collaborate with teams across the organization to enable efficient data processing on the Databricks Lakehouse Platform hosted on Azure.

Essential Duties:
Design and implement ETL/ELT pipelines using Databricks, Delta Lake, and Apache Spark
Collaborate with data scientists, analysts, and business stakeholders to ensure data quality, clarity, and sound data modeling
Build and maintain data orchestration using Databricks features such as Jobs, Notebooks, and Workflows
Tune Spark jobs for performance, cost-efficiency, and reliability
Monitor data processing infrastructure for high availability and data quality
Implement and maintain CI/CD practices for Databricks notebooks and infrastructure as code (e.g., Terraform, Databricks CLI)
Document data processes, repositories, and operational procedures
Ensure compliance with data handling, privacy, and security regulations

Professional Requirements
BS or MS in Computer Science, Data Engineering, or a related discipline
5+ years of experience in data engineering or a related field
Proven proficiency in Databricks and Apache Spark using Python, Scala, or SQL
Working knowledge of Delta Lake, Unity Catalog, and modern data lake architectures
Hands-on experience with cloud platforms (Azure, AWS, Google Cloud Platform), particularly their data services (e.g., Amazon S3, Azure Data Lake Storage, Google BigQuery)
Familiarity with CI/CD pipelines, version control (Git), and orchestration tools (Airflow, Databricks Workflows)
Strong understanding of data storage principles, query performance optimization, and large-scale data operations

Desired Expertise
Hands-on experience with machine learning tooling in Databricks, such as MLflow and Feature Store
Experience with data governance tools such as Unity Catalog or Azure Purview
Experience integrating BI tools (Power BI, Tableau) with Databricks
Databricks certifications (Data Engineer Associate/Professional, Machine Learning, etc.)
