Azure Data Engineer (Databricks & PySpark)
- Job Title: Azure Data Engineer
- Location: Hybrid in Houston, TX (remote option from Dallas, with travel)
- Client: Insight Global / Halliburton
- Experience: 5+ Years
Job Summary
We are seeking an Azure Data Engineer to join a high-impact project modernizing business data into a cloud-native architecture. This role focuses on moving digitized business data to a modern Azure architecture to deliver BI insights, advanced analytics, and AI capabilities. You will build complex, scalable end-to-end pipelines using Azure Data Factory (ADF) and Databricks. The ideal candidate has deep experience in data ingestion and in data curation with PySpark, plus a background in Oil & Gas (O&G), specifically with time-series data and real-time use cases.
Key Responsibilities & Required Skills
Data Pipeline Engineering & Orchestration
- ADF Implementation: Design and develop end-to-end scalable pipelines in Azure Data Factory for seamless data ingestion into Azure Data Lake Storage (ADLS).
- Scalable ETL: Build and implement robust ETL processes to move data across environments while ensuring data integrity and high availability.
- Real-time Use Cases: Optimize pipelines to handle time-series data and support real-time analytics requirements.
Databricks & PySpark Development
- Data Curation: Build curated datasets within Databricks using PySpark and SQL, applying complex pivot logic and aggregations.
- Advanced Analytics: Enhance data management by extracting data from multiple disparate sources and transforming it for downstream AI and BI consumption.
- Coding Mastery: Write complex SQL queries from scratch and develop reusable Python/PySpark scripts for data transformation.
Technical Environment & Standards
- Cloud Modernization: Support the transition of legacy digital data to a modern cloud structure within the complete Azure ecosystem.
- Collaboration: Work closely with BI and AI teams to ensure curated data meets the requirements for advanced analytics and insights.
- Emerging Tech (Nice to Have): Leverage or integrate with Microsoft Fabric to further streamline the data engineering workspace.
Mandatory Technical Skills
- Azure Data Factory: Experience with complex, scalable ingestion pipelines.
- Databricks: Expert-level PySpark and SQL, with experience building curated datasets.
- SQL: Proven ability to write high-performance queries from scratch.
Thanks,
Aditya Jain | New York Technology Partners
Email: Direct: EXT: 482