Databricks Architect

Remote • Posted 4 hours ago • Updated 4 hours ago
Full Time
Depends on Experience

Job Details

Skills

  • Data Architecture
  • Data Lake
  • Data Engineering
  • Databricks
  • SQL
  • PySpark

Summary

Job Title: Data Architect

Location: Remote

Duration: 6-month contract-to-hire

The Databricks Data Architect is a senior technical lead responsible for building and optimizing a robust data platform in a financial services environment. You will lead a team of 10+ data engineers and own the end-to-end architecture and implementation of the Databricks Lakehouse platform. You will collaborate closely with application development and analytics teams to design scalable data solutions that drive business insights. This position demands deep expertise in Databricks (Azure), hands-on experience with PySpark and Delta Lake, and strong leadership to ensure best practices in data engineering, performance tuning, and governance.

Key Responsibilities:

  • Lead Data Engineering Team: Lead, mentor, and manage a team of 10+ data engineers, providing technical guidance, code reviews, and career development to foster a high-performing team.
  • Databricks Platform Ownership: Own the Databricks platform architecture and implementation, ensuring the environment is secure, scalable, and optimized for the organization's data processing needs. Design and oversee the Lakehouse architecture leveraging Delta Lake and Apache Spark.
  • Unity Catalog Implementation: Implement and manage Databricks Unity Catalog for unified data governance. Ensure fine-grained access controls and data lineage tracking are in place to secure sensitive financial data and comply with industry regulations.
  • Cluster Provisioning & Policies: Provision and administer Databricks clusters (in Azure), including configuring cluster sizes, auto-scaling, and auto-termination settings. Set up and enforce cluster policies to standardize configurations, optimize resource usage, and control costs across different teams and projects.
  • Databricks SQL Optimization: Collaborate with analytics teams to develop and optimize Databricks SQL queries and dashboards. Tune SQL workloads and caching strategies for faster performance and ensure efficient use of the query engine.
  • Performance Tuning: Lead performance tuning initiatives for Spark jobs and ETL pipelines. Profile data processing code (PySpark/Scala) to identify bottlenecks and refactor for improved throughput and lower latency. Implement best practices for incremental data processing with Delta Lake, and ensure compute cost efficiency (e.g., by optimizing cluster utilization and job scheduling).
  • Data Solutions Collaboration: Work closely with application developers, data analysts, and data scientists to understand requirements and translate them into robust data pipelines and solutions. Ensure that data architectures support analytics, reporting, and machine learning use cases effectively.
  • DevOps Integration: Integrate Databricks workflows into the CI/CD pipeline using Azure DevOps and Git. Develop automated deployment processes for notebooks, jobs, and clusters (infrastructure-as-code) to promote consistent releases. Manage source control for Databricks code (using Git integration) and collaborate with DevOps engineers to implement continuous integration and delivery for data projects.
  • Data Governance & Security: Collaborate with security and compliance teams to uphold data governance standards. Implement data masking, encryption, and audit logging as needed, leveraging Unity Catalog and Azure security features to protect sensitive financial data.
  • Platform Innovation: Stay up to date with the latest Databricks features and industry best practices. Proactively recommend and implement improvements (such as new performance optimization techniques or cost-saving configurations) to continuously enhance the platform's reliability and efficiency.

Minimum Qualifications:

  • Education & Experience: Bachelor's degree in Computer Science, Information Systems, or a related field. 7+ years of experience in data engineering, data architecture, or related roles, with a track record of designing and deploying data pipelines and platforms at scale.
  • Databricks & Spark Expertise: Significant hands-on experience with Databricks (preferably Azure Databricks) and the Apache Spark ecosystem. Proficient in building data pipelines using PySpark/Scala and managing data in Delta Lake format.
  • Cloud Platform: Strong experience working with cloud data platforms (Azure preferred, or AWS/Google Cloud Platform). Familiarity with Azure data services (such as Azure Data Lake Storage and Azure Blob Storage) and managing resources in an Azure environment.
  • SQL and Data Warehousing: Advanced SQL skills with the ability to write and optimize complex queries. Solid understanding of data warehousing concepts and performance tuning for SQL engines.
  • Job Optimization & Performance: Proven ability to optimize ETL jobs and Spark processes for performance and cost efficiency. Experience tuning cluster configurations, parallelism, and caching to improve job runtimes and resource utilization.
  • Unity Catalog & Security: Demonstrated experience implementing data security and governance measures. Comfortable configuring Unity Catalog or similar data catalog tools to manage schemas, tables, and fine-grained access controls. Able to ensure compliance with data security standards and manage user/group access to data assets.
  • Leadership Skills: Experience leading and mentoring engineering teams. Excellent project leadership abilities to coordinate multiple projects and priorities. Strong communication skills to effectively collaborate with cross-functional teams and present architectural plans or results to stakeholders.
  • Problem-Solving: Excellent analytical and problem-solving skills, with the ability to troubleshoot complex data pipeline issues quickly. Attention to detail in maintaining data quality and reliability across the platform.

Equal Opportunity Statement:

  • We are committed to diversity and inclusivity.

If interested in applying kindly send your resume to

  • Dice Id: 90891546
  • Position Id: 9000522

Company Info

About NewVision Software & Consultancy Pvt. Ltd

Our Approach
How we transform a leader’s digital investment to drive optimum value
Our team works with diverse functions and leaders across the enterprise hierarchy. We leverage diverse digital competencies to foster three categories of opportunities – growth, scalability, and optimal performance, with the end goal of delivering value against their digital investments.

Contact the job poster
Ajinkya Gunjal

Recruiter @ NewVision Software & Consultancy Pvt. Ltd