Sr. Data Engineer/Databricks SME (15+ Years relevant)

Remote • Posted 59 minutes ago • Updated 50 minutes ago

Contract Independent

Contract W2

No Travel Required

Remote

$100 - $110/hr

Job Details

Skills

Apache Kafka
Apache Spark
Databricks
Data Governance
Extract, Transform, Load (ETL)
Migration
Microsoft Azure
Python
SQL
Azure Purview
Apache Atlas
Azure Data Factory
Apache NiFi
Data Lake

Summary

Job Title: Sr. Data Engineer / Databricks SME
Location: Raleigh, NC preferred; remote candidates may be considered.

Description:
We are seeking a Senior Data Engineer to support our client with data ingestion, data deduplication and data tagging for migration of a large-scale data environment into Databricks.

The ideal candidate will also bring hands-on expertise in end-to-end data pipeline management, including data ingestion from diverse sources, de-duplication of large-scale datasets, and data tagging to support downstream analytics, governance, and machine learning workflows.

Roles and Responsibilities (including but not limited to):
Design, develop, and maintain scalable data ingestion pipelines to onboard structured, semi-structured, and unstructured data from batch and streaming sources (e.g., APIs, databases, flat files, message queues) into the Azure/Databricks environment.
Implement de-duplication strategies across large-scale datasets using deterministic and probabilistic matching techniques to ensure data integrity and reduce redundancy within the Data Lake.
Develop and enforce data tagging frameworks to classify, label, and annotate datasets with appropriate metadata (e.g., sensitivity, source, domain, lineage) to support data governance, discoverability, and compliance requirements.
Assist with operationalizing deployments and supporting cloud services for ETL operations. This includes standardizing and automating processes and workflows, creating documentation and knowledge articles, and assisting Operations staff who have limited cloud experience.
Deliver written and oral presentations to high-level CIO management on the status of current efforts.
Possesses skills and experience related to business management, systems engineering, operations research, and management engineering. Typically has specialization in a particular technology or business application. Keeps abreast of technological developments and industry trends.
Assist with deployment, configuration, and management of the Azure cloud environment.
Assist with migrating existing ETL jobs into the Azure/Databricks cloud environment.
Share optimizations and efficiencies with the larger team and management.
Automate solutions to repetitive problems and tasks.

Basic Qualifications:
Must be eligible for a Position of Public Trust, five years of U.S. residency, and no more than six months of international travel in the past five years (excluding travel for U.S.-based work).
Bachelor's degree and 13 years of experience. A degree from an accredited college or university in the applicable field of services is preferred. Four additional years of relevant experience are required in lieu of a college degree. If the degree is not in the applicable field, four additional years of related experience are required.
5+ years of demonstrated experience designing and implementing data ingestion pipelines using tools such as Azure Data Factory, Apache Kafka, Apache NiFi, Spark Structured Streaming, or equivalent technologies.
5+ years of experience applying de-duplication techniques at scale, including record linkage, fuzzy matching, and entity resolution across structured and unstructured datasets.
5+ years of hands-on experience with data tagging and metadata management, including the use of tagging schemas, data catalogs (e.g., Azure Purview, Apache Atlas), and automated classification tools to support data governance and lineage tracking.
5+ years of demonstrated experience working with unstructured data.
2+ years of experience in using Databricks or other Spark-based platforms.
Fluency in at least one scripting language (Python, Perl, Ruby, or equivalent).

Desired Skills:
Integration of Git in continuous deployment and experience with DevOps monitoring tools.
Experience with one or more of the following products and technologies: SAS, Python, C++, Hadoop, SQL Database/Coding, Teradata, Oracle, Amazon S3, Apache Spark, Machine Learning, Natural Language Processing, and visualization tools such as Tableau, Strategy, and Qlik.
Strong skills and experience in Cloud Operations support in Azure.

Dice Id: 10329198
Position Id: 8965797