Databricks SME

Remote • Posted 2 days ago • Updated 2 days ago
Contract W2
Contract Corp To Corp
Contract Independent
Remote
$60+

Job Details

Skills

  • Azure
  • Databricks
  • Data Engineering
  • Senior Data Engineer
  • Data Ingestion
  • ETL
  • ETL Migration
  • Data Pipelines
  • Scalable Pipelines
  • Batch Processing
  • Streaming Processing
  • Structured Data
  • Semi-Structured Data
  • Unstructured Data
  • Azure Data Factory
  • Apache Kafka
  • Apache NiFi
  • Spark Structured Streaming
  • Apache Spark
  • Spark
  • Data Lake
  • Azure Cloud
  • Cloud Migration
  • Cloud Operations
  • Data Deduplication
  • De-duplication
  • Record Linkage
  • Fuzzy Matching
  • Entity Resolution
  • Deterministic Matching
  • Probabilistic Matching
  • Data Integrity
  • Metadata Management
  • Data Tagging
  • Tagging Frameworks
  • Tagging Schemas
  • Data Governance
  • Data Lineage
  • Data Catalogs
  • Azure Purview
  • Apache Atlas
  • Automated Classification
  • Compliance
  • Discoverability
  • Workflow Automation
  • Process Automation
  • Operational Support
  • Deployment Automation
  • Documentation
  • Knowledge Articles
  • Python
  • Perl
  • Ruby
  • SQL
  • SQL Coding
  • Hadoop
  • Teradata
  • Oracle
  • Amazon S3
  • SAS
  • C++
  • Machine Learning
  • NLP
  • Natural Language Processing
  • Tableau
  • MicroStrategy
  • Qlik
  • DevOps
  • CI/CD
  • Continuous Deployment
  • Git
  • Monitoring Tools
  • Azure Administration
  • Databricks Migration
  • Data Governance Frameworks
  • Metadata Cataloging
  • Streaming Pipelines
  • API Integration
  • Flat Files
  • Message Queues
  • Executive Communication
  • CIO Presentations
  • Stakeholder Management
  • Systems Engineering
  • Operations Research
  • Management Engineering
  • Enterprise Data Platforms
  • Data Modernization
  • Cloud Enablement
  • Data Platform Optimization
  • Data Operations
  • Enterprise Data Lake
  • Data Classification
  • Lineage Tracking
  • Automation Solutions
  • Big Data
  • Distributed Data Systems
  • Public Trust Eligible
  • U.S. Citizenship
  • Permanent Residency

Summary

Job Title: Databricks SME

Location: Raleigh, NC/Remote

Duration: 12+ Months
Must have 15+ years of experience

Description:

We are seeking a Senior Data Engineer to support our client with data ingestion, data deduplication, and data tagging for migration of a large-scale data environment into Databricks.


Roles and Responsibilities (including but not limited to):

Design, develop, and maintain scalable data ingestion pipelines to onboard structured, semi-structured, and unstructured data from batch and streaming sources (e.g., APIs, databases, flat files, message queues) into the Azure/Databricks environment.

Implement de-duplication strategies across large-scale datasets using deterministic and probabilistic matching techniques to ensure data integrity and reduce redundancy within the Data Lake.

Develop and enforce data tagging frameworks to classify, label, and annotate datasets with appropriate metadata (e.g., sensitivity, source, domain, lineage) to support data governance, discoverability, and compliance requirements.

Assist with operationalizing deployments and supporting cloud services for ETL operations. This includes standardizing and automating processes and workflows, creating documentation and knowledge articles, and assisting Operations staff who have limited cloud experience.

Deliver written and oral presentations to high-level CIO management on the status of current efforts.

Possesses skills and experience related to business management, systems engineering, operations research, and management engineering. Typically has specialization in a particular technology or business application. Keeps abreast of technological developments and industry trends.

Assist with deployment, configuration, and management of Azure Cloud environment.

Assist with migration efforts of existing ETL jobs into Azure/Databricks cloud environment.

Ability to share optimizations and efficiencies with the larger team and management.

Ability to automate solutions to repetitive problems/tasks.


Basic Qualifications:

Must be eligible for a Position of Public Trust, including U.S. citizenship or permanent residency, five years of U.S. residency, and no more than six months of international travel in the past five years (excluding travel for U.S.-based work).

Bachelor's degree and 13 years of experience. A degree from an accredited College/University in the applicable field of services is preferred. Four additional years of relevant experience in lieu of a college degree is required. If Degree is not in the applicable field, then four additional years of related experience is required.

5+ years of demonstrated experience designing and implementing data ingestion pipelines using tools such as:

  • Azure Data Factory
  • Apache Kafka
  • Apache NiFi
  • Spark Structured Streaming
  • or equivalent technologies


5+ years of experience applying de-duplication techniques at scale across structured and unstructured datasets, including:

  • Record linkage
  • Fuzzy matching
  • Entity resolution


5+ years of hands-on experience with the following, in support of data governance and lineage tracking:

  • Data tagging
  • Metadata management
  • Tagging schemas
  • Data catalogs (e.g., Azure Purview, Apache Atlas)
  • Automated classification tools


5+ years of demonstrated experience working with unstructured data.

2+ years of experience using Databricks or other Spark-based platforms.

Fluency in at least one scripting language:

  • Python
  • Perl
  • Ruby
  • or equivalent


Desired Skills:

Experience integrating Git into continuous deployment, and experience with DevOps monitoring tools.

Experience with one or more of the following products and technologies:

  • SAS
  • Python
  • C++
  • Hadoop
  • SQL Database/Coding
  • Teradata
  • Oracle
  • Amazon S3
  • Apache Spark
  • Machine Learning
  • Natural Language Processing (NLP)

Visualization tools such as:

  • Tableau
  • MicroStrategy
  • Qlik

Strong skills and experience in Cloud Operations support in Azure.





About IDEXCEL, INC
Idexcel is an IT services organization, with a mission to bring great people and great organizations together. Our diverse client base represents a wide range of industries, including technology, telecom, insurance, healthcare, manufacturing, banking & financial services, food & commodities trading and federal organizations. Our teams of experienced recruiters directly work with client companies seeking exceptional people to help with their business initiatives. Idexcel, Inc. is an Equal Opportunity Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, age, disability, military status, national origin or any other characteristic protected under federal, state, or applicable local law.

  • Dice Id: eveva001
  • Position Id: IDXL_BH_767440