Hello,
SpiceOrb is looking for an Azure Databricks Developer / Lead Azure Databricks Developer for the role below.
Job Title: Senior Azure Databricks Developer - Azure Databricks/Python/Spark Streaming
Location: Pleasanton, CA (3 Days/Week Onsite)
Duration: 12+ Months Contract
Department: Data Engineering/Cloud Analytics
About the Role:
We are looking for a highly skilled Senior Azure Databricks (ADB) Developer to join our Data Engineering team. This role involves developing large-scale batch and streaming data pipelines on Azure Cloud. The ideal candidate will have strong expertise in Python, Databricks Notebooks, Apache Spark (including Structured Streaming), and real-time integration with Kafka. You will work with both relational databases like DB2 and NoSQL systems such as MongoDB, focusing on performance optimization and scalable architecture.
Key Responsibilities:
Design and Develop: Create real-time and batch data pipelines using Azure Databricks, Apache Spark, and Structured Streaming.
Data Processing: Write efficient ETL scripts and automate workflows using Python.
Data Integration: Integrate with various data sources and destinations, including DB2, MongoDB, and other enterprise-grade data systems.
Performance Optimization: Tune Spark jobs for optimal performance and cost-effective compute usage on Azure.
Collaboration: Work with platform and architecture teams to ensure secure, scalable, and maintainable cloud data infrastructure.
CI/CD Support: Implement CI/CD for Databricks pipelines and notebooks using tools like GitHub and Azure DevOps.
Stakeholder Communication: Interface with product owners, data scientists, and business analysts to translate data requirements into production-ready pipelines.
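For illustration, the heart of the streaming work described above is windowed aggregation over event time. In production this would be Spark Structured Streaming reading from Kafka; the underlying idea can be sketched in plain Python (the event shape and the 60-second window are invented for the example):

```python
# Conceptual sketch of a tumbling-window count -- the kind of aggregation
# Spark Structured Streaming performs over a Kafka stream. Event fields
# and the 60-second window size are illustrative assumptions.
from collections import defaultdict

WINDOW_SECONDS = 60

def window_start(event_time):
    """Bucket an epoch timestamp into its tumbling-window start."""
    return event_time - (event_time % WINDOW_SECONDS)

def aggregate(events):
    """Count events per (window, key), as GROUP BY window(...), key would."""
    counts = defaultdict(int)
    for e in events:
        counts[(window_start(e["ts"]), e["key"])] += 1
    return dict(counts)

events = [
    {"ts": 0, "key": "a"},
    {"ts": 30, "key": "a"},
    {"ts": 61, "key": "a"},
    {"ts": 62, "key": "b"},
]
result = aggregate(events)
```

In Spark the same grouping is expressed declaratively (e.g. `groupBy(window(col("ts"), "60 seconds"), col("key")).count()`), with the engine handling state, late data, and checkpointing.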
Required Skills:
10+ years of experience in data engineering
Python Proficiency:
Data Manipulation: Using libraries like Pandas and NumPy for data manipulation and analysis.
Data Processing: Writing efficient ETL scripts.
Automation: Automating repetitive tasks and workflows.
Debugging: Strong debugging skills to troubleshoot and optimize code.
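As a small illustration of the ETL scripting this section calls for, here is a self-contained extract-transform-load sketch using only the standard library (the data and field names are made up for the example; a real pipeline would use Pandas and a real sink):

```python
# Illustrative only: a tiny extract-transform-load script of the kind
# the role describes. The CSV data and field names are invented.
import csv
import io
import json

RAW = """id,amount,currency
1,10.50,usd
2,3.20,USD
3,7.00,eur
"""

def extract(text):
    """Extract: parse CSV rows into dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: convert amounts to floats, normalize currency codes."""
    return [
        {"id": int(r["id"]),
         "amount": float(r["amount"]),
         "currency": r["currency"].upper()}
        for r in rows
    ]

def load(rows):
    """Load: serialize to JSON lines (a stand-in for a real destination)."""
    return "\n".join(json.dumps(r) for r in rows)

records = transform(extract(RAW))
output = load(records)
```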
Database Management:
SQL: Advanced SQL skills for querying and managing relational databases.
NoSQL: Experience with NoSQL databases like MongoDB or Cassandra.
Data Warehousing: Knowledge of data warehousing solutions like Google BigQuery or Snowflake.
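An example of the "advanced SQL" this section refers to is a window function, which warehouses like BigQuery and Snowflake rely on heavily. The sketch below uses an in-memory SQLite database as a stand-in, with an invented orders table:

```python
# A self-contained advanced-SQL example: RANK() over a partition,
# ranking orders per customer by amount. SQLite stands in here for an
# enterprise database; the table and column names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer TEXT, amount REAL);
INSERT INTO orders VALUES
  ('alice', 10.0), ('alice', 25.0), ('bob', 5.0);
""")

rows = conn.execute("""
SELECT customer,
       amount,
       RANK() OVER (PARTITION BY customer ORDER BY amount DESC) AS rnk
FROM orders
ORDER BY customer, rnk
""").fetchall()
```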
Big Data Technologies:
Kafka: Knowledge of data streaming platforms like Apache Kafka.
Version Control:
Git: Using version control systems for collaborative development.
Data Modeling:
Schema Design: Designing efficient and scalable database schemas.
Data Governance: Ensuring data quality, security, and compliance.
Database Platforms:
DB2: Understanding of DB2 architecture, SQL queries, and database management.
MongoDB: Knowledge of MongoDB schema design, indexing, and query optimization.
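The MongoDB indexing mentioned above comes down to trading a full collection scan for a direct lookup. A conceptual sketch in plain Python (the documents and the "sku" field are invented for illustration):

```python
# Conceptual sketch of MongoDB-style indexing: an index maps a field
# value to the ids of matching documents, replacing a collection scan.
# The sample documents and the "sku" field are illustrative assumptions.
from collections import defaultdict

docs = [
    {"_id": 1, "sku": "A1", "qty": 5},
    {"_id": 2, "sku": "B2", "qty": 3},
    {"_id": 3, "sku": "A1", "qty": 7},
]

def build_index(documents, field):
    """Roughly what createIndex({field: 1}) maintains: value -> doc ids."""
    index = defaultdict(list)
    for d in documents:
        index[d[field]].append(d["_id"])
    return dict(index)

sku_index = build_index(docs, "sku")

# An indexed query touches only the matching ids, not every document.
matches = [d for d in docs if d["_id"] in sku_index["A1"]]
```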
Programming Skills:
Proficiency in languages such as Java, Python, or JavaScript to write scripts for data extraction and transformation. Experience with BSON (Binary JSON) for data conversion.
Cloud Services:
Experience with cloud platforms like AWS or Azure for deploying and managing databases.
Preferred Skills:
Experience with Java or Scala in Spark streaming.
Familiarity with Azure services like Data Lake, Data Factory, Synapse, and Event Hubs.
Background in building data platforms in regulated or large-scale enterprise environments.