Databricks Engineer

  • Toronto, ON
  • Posted 2 days ago | Updated 2 days ago

Overview

On Site
Depends on Experience
Full Time
Accepts corp to corp applications

Skills

Communication
Data Engineering
Databricks
Apache Spark
Python
SQL
PySpark
Cloud Platforms
Agile
Git
Version Control

Job Details

Job Title: Data/Databricks Engineer
Client: Tech Mahindra
Location: Toronto, Canada
Experience: 8+ Years
Duration: Long term
Rate: Open DOE

Role Overview: We are seeking a highly skilled and experienced Senior Databricks/Data Engineer to join our dynamic team in Toronto. This is an onsite role that requires deep expertise in Databricks and its ecosystem, along with a strong background in building scalable and robust data pipelines. The ideal candidate will be a hands-on engineer with a passion for data, a problem-solving mindset, and the ability to drive technical solutions from concept to production.

Key Responsibilities:

  • Databricks Platform Expertise: Design, develop, and optimize data solutions primarily on the Databricks Unified Analytics Platform, leveraging its full capabilities, including Apache Spark, Delta Lake, Unity Catalog, and Databricks SQL.
  • Data Pipeline Development: Lead the end-to-end design, development, and maintenance of highly scalable and robust ETL/ELT data pipelines for batch and streaming data ingestion, transformation, and delivery.
  • Performance Optimization: Identify and implement performance tuning strategies for Databricks workloads, Spark jobs, and data models to ensure efficiency and cost-effectiveness.
  • Data Architecture: Contribute to the architectural design and evolution of the data lakehouse, ensuring that data quality, governance, security, and compliance best practices are integrated.
  • Coding and Scripting: Write clean, efficient, and well-documented code, primarily in Python (PySpark) and SQL, for data manipulation, transformation, and orchestration (see the illustrative sketch after this list). Experience with Scala is a plus.
  • Cloud Integration: Work extensively with cloud platforms (Azure preferred; AWS or Google Cloud Platform experience is a bonus) and integrate Databricks with various cloud services for data storage, compute, and orchestration (e.g., Azure Data Factory, Azure Data Lake Storage, Azure Synapse).
  • Collaboration: Work closely with data scientists, data analysts, business stakeholders, and other engineering teams to understand data requirements, provide technical solutions, and enable advanced analytics and machine learning initiatives.
  • CI/CD and DevOps: Implement and promote CI/CD practices and DevOps principles for automated deployment, testing, and monitoring of data pipelines and infrastructure on Databricks.
  • Troubleshooting and Support: Proactively monitor data pipeline performance, troubleshoot issues, and provide timely resolutions.
  • Mentorship: Mentor junior data engineers as needed and contribute to best practices and knowledge sharing within the team.
  • Documentation: Create comprehensive technical documentation for data pipelines, architecture, and processes.
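
For illustration only (not part of the client's requirements), the sketch below shows the kind of PySpark/Delta Lake batch transformation described in the responsibilities above. It assumes a Databricks runtime where a SparkSession named spark and Delta Lake are already available; the storage path, database, table, and column names are hypothetical.

  # Minimal, illustrative PySpark batch job: read raw JSON events, cleanse
  # them, and land them in a Delta table for downstream Databricks SQL use.
  # Assumes a Databricks notebook/job context where `spark` is predefined.
  from pyspark.sql import functions as F

  # Read raw events from a hypothetical landing zone in cloud storage.
  raw = spark.read.json("/mnt/landing/events/")

  # Basic cleansing: drop malformed rows and standardize the timestamp.
  cleaned = (
      raw.dropna(subset=["event_id", "event_ts"])
         .withColumn("event_ts", F.to_timestamp("event_ts"))
         .withColumn("ingest_date", F.current_date())
  )

  # Append to a Delta table, partitioned by ingest date (the `analytics`
  # database is assumed to exist).
  (
      cleaned.write.format("delta")
             .mode("append")
             .partitionBy("ingest_date")
             .saveAsTable("analytics.events_cleaned")
  )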

Required Skills and Qualifications:

  • Bachelor's or Master's degree in Computer Science, Data Engineering, Information Technology, or a related quantitative field.
  • 8+ years of extensive experience in Data Engineering, with a strong focus on Databricks and Apache Spark.
  • Exceptional proficiency in Python (PySpark) and SQL for data processing and analysis.
  • Demonstrable hands-on experience with Databricks features, including Delta Lake, Databricks SQL, Databricks Jobs, and Notebooks.
  • Solid understanding of data warehousing concepts, data modeling (dimensional, star/snowflake schemas), and ETL/ELT principles.
  • Experience with cloud platforms (Azure is highly preferred), including services like Azure Data Factory, Azure Data Lake Storage, Azure Synapse Analytics, and Azure DevOps.
  • Strong knowledge of big data technologies and distributed computing concepts.
  • Experience with version control systems (e.g., Git) and agile development methodologies.
  • Excellent problem-solving, analytical, and critical thinking skills.
  • Strong communication and interpersonal skills, with the ability to articulate complex technical concepts to both technical and non-technical audiences.
  • Ability to work independently and collaboratively in a fast-paced, onsite environment.

Desired Skills (Good to Have)

  • Databricks certifications (e.g., Databricks Certified Data Engineer Associate/Professional).
  • Experience with streaming data technologies (e.g., Kafka, Azure Event Hubs).
  • Familiarity with containerization (Docker, Kubernetes).
  • Exposure to MLflow and machine learning concepts.
  • Experience with other data visualization tools (e.g., Power BI, Tableau).