Databricks Data Engineer

Plano, TX, US • Posted 5 hours ago • Updated 5 hours ago
Full Time
No Travel Required
On-site
$120,000 - $140,000/yr
Job Details

Skills

  • Databricks
  • Data Engineer
  • DataStage

Summary

Role: “Databricks Data Engineer”

Location: Seattle, WA / Bellevue, WA / Everett, WA / Renton, WA / Richardson, TX / Plano, TX / Dallas, TX / St. Louis, MO / Charleston, SC / Arlington, VA (Onsite from Day 1)

Position: Full-time

  

Job Description:

 

Must Have Technical/Functional Skills

  • Familiarity with the IBM DataStage ETL/ELT data integration tool, sufficient to understand existing code.
  • Develop, test, deploy, optimize, and monitor large-scale ETL data processing workloads in Azure Databricks.
  • Lead the review of legacy DataStage code against the migrated Databricks code to ensure functionality has not deviated.
  • Strong programming skills in Python and PySpark.
  • Advanced proficiency writing SQL for analytics and ETL processes.
  • Proven experience building and optimizing complex data pipelines in Azure.
  • Hands-on experience with Azure Databricks: cluster management, job scheduling, workspace governance.
  • Strong working knowledge of core Azure services: Storage Account, Synapse, Key Vault, VMSS, Function Apps, Web Apps, Log Analytics Workspace, service principals, and managed identities.
  • Experience with container services (Azure Container Apps, container instances) and containerized data workloads.
  • Familiarity with Azure networking concepts and secure network integration for data platforms.
  • Experience creating Azure infrastructure using ARM templates.
  • Proficient with GitLab and Azure DevOps for CI/CD and source control workflows.
  • Strong analytical, problem-solving, and communication skills; proven ability to work cross-functionally.
  • Experience working in Agile teams and understanding of data governance frameworks.
  • Hands-on experience provisioning Databricks resources with Terraform; ability to author and maintain Terraform templates and modules.
  • Demonstrated experience implementing cluster autoscaling and autoscaling policies through Terraform.
  • Experience creating reusable Terraform modules and implementing infrastructure-as-code best practices (module structure, state management, remote backends).
  • Proven experience working on Databricks platform operations, including cluster configuration, job orchestration, and platform optimization.
  • Experience configuring high-availability Databricks deployments and operating across multiple availability zones/regions.
  • Familiarity with Metastore/Unity Catalog configuration and metadata governance in Databricks.
  • Hands-on experience building data pipelines and ingestion workflows into medallion-layer architectures (bronze/silver/gold).
  • Strong scripting skills (Python, Bash, or similar) and familiarity with CI/CD for Terraform and Databricks deployments.
  • Strong troubleshooting, performance tuning, and cost optimization skills.

 

Responsibilities

  • Design, develop, and maintain end-to-end data pipelines and ETL/ELT workflows using PySpark and Python.
  • Lead the review of legacy DataStage code against the migrated Databricks code to ensure functionality has not deviated.
  • Implement, optimize, and monitor large-scale data processing workloads in Azure Databricks, including cluster configuration, autoscaling, and governance.
  • Build and maintain data integration and orchestration solutions using Azure services to meet performance, availability, and security requirements.
  • Collaborate with data consumers, thread authors/owners, and stakeholders to gather business requirements, prioritize needs, and translate analytical objectives into technical designs.
  • Implement secure data access patterns using Azure Active Directory, Managed Identities, and service principals.
  • Author Infrastructure-as-Code for Azure resources (ARM templates) and deploy consistent, repeatable environments.
  • Configure and operate Azure components including Storage Account, Synapse, Key Vault, VMSS, Function Apps, Web Apps, Log Analytics Workspace, Azure Container Apps / container instances, and related services.
  • Collaborate with networking and security teams to design and implement Azure networking for data solutions.
  • Implement monitoring, alerting, and cost optimization for data workloads (Log Analytics, metrics, and dashboards).
  • Use GitLab and Azure DevOps for source control, CI/CD pipelines, and release management.
  • Follow Agile/Scrum practices and participate in sprint planning, standups, and retrospectives.
  • Ensure solutions meet data governance, lineage, and compliance requirements.
  • Provide operations and on-call support for production issues and deployments.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 10530321
  • Position Id: 8982238
Contact the job poster
Amit Singh

Recruiter @ Centraprise Corp