Sr Data Engineer (Databricks)

Overview

Remote
Depends on Experience
Accepts corp to corp applications
Contract - W2
Contract - Independent
Contract - 8 Month(s)

Skills

Implement ETL/ELT workflows for both structured and unstructured data
Automate deployments using CI/CD tools
Collaborate with cross-functional teams including data scientists, analysts, and stakeholders
Design and maintain data models, schemas, and database structures to support analytical and operational use cases
Evaluate and implement appropriate data storage solutions, including data lakes (Azure Data Lake Storage) and data warehouses
Implement data validation and quality checks to ensure accuracy and consistency
Data governance: metadata management, data lineage, and data cataloging
Data security measures
Python and R programming languages
Strong SQL querying and data manipulation skills
Experience with Azure cloud platform
DevOps, CI/CD pipelines, and version control systems
Agile, multicultural environments
Strong troubleshooting and debugging capabilities
Design and develop scalable data pipelines using Apache Spark on Databricks
Optimize Spark jobs for performance and cost-efficiency
Integrate Databricks solutions with cloud services (Azure Data Factory)
Ensure data quality, governance, and security using Unity Catalog or Delta Lake
Deep understanding of Apache Spark architecture, RDDs, DataFrames, and Spark SQL
Databricks notebooks, clusters, jobs, and Delta Lake
ML libraries (MLflow, Scikit-learn, TensorFlow)
Databricks Certified Associate Developer for Apache Spark
Azure Data Engineer Associate

Job Details

Hi,

Greetings from DIA SOFTWARE SOLUTIONS LLC!

We are reaching out about an exciting direct client opportunity with one of our clients. Please review the requirements and let me know if you are interested in this position.

Direct client req: Sr Data Engineer (Databricks), Remote, TX

PLEASE SEND RESUMES TO SKUMAR AT DIASOFTWARESOLUTIONS DOT COM!

Job Description:

  • The Worker is responsible for developing, maintaining, and optimizing big data solutions using the Databricks Unified Analytics Platform.
  • This role supports data engineering, machine learning, and analytics initiatives within an organization that relies on large-scale data processing.

Duties include:

  • Designing and developing scalable data pipelines
  • Implementing ETL/ELT workflows
  • Optimizing Spark jobs
  • Integrating with Azure Data Factory
  • Automating deployments
  • Collaborating with cross-functional teams
  • Ensuring data quality, governance, and security.

SKILLS MATRIX

Minimum Requirements: Candidates who do not meet or exceed the minimum stated requirements (skills/experience) will be displayed to customers but may not be chosen for this opportunity.

Years Experience Needed | Required/Preferred | Skills/Experience
8 | Required | Implement ETL/ELT workflows for both structured and unstructured data
8 | Required | Automate deployments using CI/CD tools
8 | Required | Collaborate with cross-functional teams including data scientists, analysts, and stakeholders
8 | Required | Design and maintain data models, schemas, and database structures to support analytical and operational use cases
8 | Required | Evaluate and implement appropriate data storage solutions, including data lakes (Azure Data Lake Storage) and data warehouses
8 | Required | Implement data validation and quality checks to ensure accuracy and consistency
8 | Required | Contribute to data governance initiatives, including metadata management, data lineage, and data cataloging
8 | Required | Implement data security measures, including encryption, access controls, and auditing; ensure compliance with regulations and best practices
8 | Required | Proficiency in Python and R programming languages
8 | Required | Strong SQL querying and data manipulation skills
8 | Required | Experience with Azure cloud platform
8 | Required | Experience with DevOps, CI/CD pipelines, and version control systems
8 | Required | Working in agile, multicultural environments
8 | Required | Strong troubleshooting and debugging capabilities
5 | Required | Design and develop scalable data pipelines using Apache Spark on Databricks
5 | Required | Optimize Spark jobs for performance and cost-efficiency
5 | Required | Integrate Databricks solutions with cloud services (Azure Data Factory)
5 | Required | Ensure data quality, governance, and security using Unity Catalog or Delta Lake
5 | Required | Deep understanding of Apache Spark architecture, RDDs, DataFrames, and Spark SQL
5 | Required | Hands-on experience with Databricks notebooks, clusters, jobs, and Delta Lake
1 | Preferred | Knowledge of ML libraries (MLflow, Scikit-learn, TensorFlow)
1 | Preferred | Databricks Certified Associate Developer for Apache Spark
1 | Preferred | Azure Data Engineer Associate

DIA SOFTWARE SOLUTIONS LLC.

Austin, TX 78727 | Direct:

DIA SOFTWARE SOLUTIONS is an Affirmative Action/Equal Opportunity Employer that supports workplace diversity. All employment decisions are made without regard to race, color, religion, sex, national origin, age, disability, veteran status, marital or family status, sexual orientation, gender identity, or genetic information. All Diasoft staff must be able to demonstrate the legal right to work in the United States. DIA SOFTWARE SOLUTIONS is an E-Verify employer.

