Data Engineer

Hybrid in Cranberry Township, PA, US • Posted 30+ days ago • Updated 14 days ago
Full Time
Hybrid
$90,000/yr

Job Details

Summary

This is a full-time, direct-hire Data Engineer role focused on Microsoft Fabric and Azure data engineering, based about 20 minutes north of Downtown Pittsburgh with a hybrid schedule and local residency required.

Position overview

We are seeking a Data Engineer to design, build, and maintain modern enterprise data platforms using Microsoft Fabric and Azure services in support of advanced analytics, real-time reporting, and AI/ML use cases. This is a direct-hire opportunity located approximately 20 minutes north of Downtown Pittsburgh, offering a hybrid work schedule (on-site several days per week). Candidates must currently reside in the greater Pittsburgh area or be willing to relocate prior to starting; C2C arrangements, third-party agencies, and visa sponsorship are not available for this role.

Key responsibilities

  • Design, develop, and maintain scalable, production-grade data pipelines and integrations using Microsoft Fabric, Azure Data Factory, Fabric Data Factory, Azure Databricks, Azure Event Hubs, OneLake, Fabric Lakehouse, and Fabric Data Warehouse.

  • Build analytics-ready datasets to support pricing, supply chain, POS sales, customer behavior analytics, executive dashboards, and AI/ML workloads.

  • Implement dual-engine data pipelines leveraging Azure Data Factory for structured batch workloads and Azure Event Hubs / Kafka for real-time event ingestion.

  • Support multiple ingestion patterns including batch ETL/ELT, CDC/database mirroring, streaming ingestion, API-based integrations, and SaaS connectors.

  • Develop near real-time analytics solutions using Eventstream and Real-Time Intelligence capabilities in Microsoft Fabric.

  • Design and optimize PySpark workloads in Azure Databricks and Fabric Spark to process high-volume historical datasets, XML/JSON log files, streaming transactional events, and operational telemetry data.

  • Build scalable transformation logic that supports both streaming and batch architectures.

  • Model and transform enterprise data using ANSI SQL, T-SQL, dbt, and Lakehouse design principles.

  • Design star and snowflake schemas, semantic models, and curated analytical datasets to enable governed self-service analytics across the organization.

  • Maintain and optimize Azure Data Lake Storage Gen2 environments, including Delta Lake formats, ACID-compliant patterns, schema evolution, partitioning, and performance tuning.

  • Support enterprise Lakehouse architecture leveraging Microsoft Fabric OneLake.

  • Partner closely with Analytics and Business stakeholders to deliver Power BI dashboards, executive scorecards, KPI reporting, and self-service analytics solutions, including semantic models, Direct Lake datasets, row-level security, and data governance standards.

  • Enable Copilot-driven analytics and AI-assisted reporting capabilities on top of governed datasets.

  • Deploy and manage cloud infrastructure using Terraform, Azure Resource Manager (ARM), and Infrastructure-as-Code practices.

  • Automate CI/CD workflows for data pipelines and analytics assets using Azure DevOps, Git, and Docker.

  • Orchestrate and schedule enterprise workflows with Azure Data Factory, Fabric Pipelines, Managed Apache Airflow, and Control-M (where applicable).

  • Implement robust data observability, including automated monitoring and alerting for batch failures, streaming interruptions, data quality issues, schema drift, and pipeline latency.

  • Build checksum and reconciliation frameworks between source systems and analytics platforms to support enterprise data governance and operational resiliency initiatives.

Required qualifications

  • Local to the Pittsburgh region and able to work a hybrid schedule on-site approximately 20 minutes north of Downtown Pittsburgh; relocation assistance, C2C arrangements, third-party agencies, and visa sponsorship are not available for this role.

  • 3–5 years of hands-on experience in data engineering, cloud analytics, or enterprise data platforms with a strong focus on Azure services.

  • Proven experience with:

    • Microsoft Azure and Microsoft Fabric

    • Azure Data Lake Storage Gen2 (ADLS Gen2)

    • Azure Databricks and Fabric Spark (including Spark Structured Streaming)

    • Azure Data Factory and Fabric Data Factory

    • Azure Event Hubs and Kafka for real-time ingestion

    • Azure Synapse Analytics and/or Fabric Data Warehouse

  • Strong proficiency in:

    • Python and PySpark

    • ANSI SQL and T-SQL

    • Batch and streaming data processing (Spark Structured Streaming, Azure Stream Analytics, event-driven architectures)

  • Hands-on experience with:

    • dbt (data build tool)

    • Delta Lake and Lakehouse architectures

    • Data warehousing concepts and dimensional modeling (star and snowflake schemas)

    • Semantic layer design and enterprise data modeling

  • Demonstrated ability to build and support production data pipelines, troubleshoot performance issues, and optimize large-scale data processing workloads.

Preferred qualifications

  • Bachelor’s or Master’s degree in Data Science, Computer Science, Information Systems, Engineering, Statistics, Mathematics, or a related technical field.

  • Experience delivering and supporting Power BI analytics, including semantic models, Direct Lake datasets, and row-level security.

  • Hands-on work with Fabric Real-Time Intelligence, OneLake, REST APIs, XML/JSON processing, and event-driven architectures.

  • Exposure to AI/ML workloads and tools such as Azure OpenAI, Copilot integrations, and predictive analytics solutions.

  • Experience supporting large-scale enterprise analytics environments with complex operational datasets and strict SLAs.

Work authorization and engagement terms

  • Full-time, salaried, direct-hire position with the client organization (not contract or contract-to-hire).

  • Candidates must be currently authorized to work in the United States on a permanent basis; we are unable to provide visa sponsorship or to work with C2C arrangements or third-party agencies for this role.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 10115369
  • Position Id: 478000

Company Info

About Vaco by Highspring

Vaco is the Talent Solutions division of Highspring, a leading global professional services organization. With expertise in Accounting and Finance, Technology and Digital, and Human Resources and Operations, Vaco provides Contract Staffing and Direct Hire solutions. Vaco’s parent company, Highspring, helps clients with two additional integrated service offerings: Consulting and Managed Services. With more than 10,000 employees across more than 45 offices worldwide, Highspring gives partners the agility to thrive, address challenges, and seize opportunities in a rapidly changing world. Get to know us at vaco.com. 
