Lead Data Engineer

Remote • Posted 2 hours ago • Updated 2 hours ago
Full Time

Job Details

Skills

  • Coaching
  • Attention To Detail
  • Automated Testing
  • Business Intelligence
  • Business Process
  • Collaboration
  • Accountability
  • Adaptability
  • Agile
  • Analytics
  • Artificial Intelligence
  • Auditing
  • Communication
  • Computer Networking
  • Computer Science
  • Continuous Delivery
  • Continuous Integration
  • Data Quality
  • Data Science
  • Data Security
  • Data Structure
  • Databricks
  • Data Engineering
  • Data Flow
  • Data Governance
  • Data Modeling
  • Data Processing
  • Encryption
  • Facilitation
  • Forecasting
  • GitHub
  • IT Management
  • Information Technology
  • Design Review
  • DevOps
  • Development Testing
  • Documentation
  • Embedded Systems
  • Storage
  • Streaming
  • Root Cause Analysis
  • SQL
  • Scrum
  • Semantics
  • Python
  • RBAC
  • Regulatory Compliance
  • Optimization
  • Performance Tuning
  • Mentorship
  • Microsoft Azure
  • Microsoft Power BI
  • Offshoring
  • Operational Efficiency
  • KPI
  • Knowledge Sharing
  • PySpark
  • Release Management
  • Reporting
  • Leadership
  • Machine Learning (ML)
  • Management
  • Process Improvement
  • Productivity
  • Unstructured Data
  • Workflow
  • SLA
  • Stakeholder Management
  • Systems Analysis
  • Team Building
  • Unity
  • Use Cases

Summary

VENDORS OR C2C (CORP-TO-CORP) CANDIDATES WILL NOT BE CONSIDERED
Job Description:
As a Lead Data Engineer, you will own the design, development, and operationalization of data pipelines, ingestion frameworks, and Lakehouse engineering patterns across the enterprise data platform. You will serve as the primary hands-on technical lead for data engineering delivery, working closely with data, IT, and business teams and implementation partners to build production-grade pipelines that power analytics, BI, and AI/ML use cases. While this role is primarily focused on pipeline engineering, platform build-out, and technical leadership, you will also collaborate closely with the Databricks Architect on platform standards, the Data Modeler on Gold layer design, and the AI Engineer on feature engineering and model-ready data structures. This role is ideal for someone who takes pride in building reliable, scalable, and well-governed data systems, thrives on solving complex ingestion and transformation challenges, and can connect engineering decisions to measurable business outcomes including project profitability, operational efficiency, safety and quality insights, and forecasting.

Key Responsibilities
Pipeline Engineering & Platform
  • Design, build, and maintain scalable batch and streaming data pipelines using Auto Loader, Delta Live Tables (DLT), and Structured Streaming on the Databricks Lakehouse, handling structured, semi-structured, and unstructured data.
  • Implement and enforce Medallion architecture patterns (Bronze, Silver, Gold) across ingestion, transformation, and serving layers, in alignment with standards set by the Databricks Architect.
  • Build and optimize multi-language notebooks and modular pipeline code (SQL, Python, PySpark) with a strong emphasis on performance, reusability, maintainability, and testability.
  • Collaborate with the Data Modeler to design and deliver the Gold layer, ensuring data structures are optimized for BI reporting, analytics, and AI/ML consumption.
  • Collaborate with the AI Engineer to build feature engineering pipelines and model-ready datasets that integrate with the Databricks Feature Store and downstream inference workflows.
  • Partner with source system owners, application designers, and data architects to understand upstream data flows and ensure ingestion patterns reliably reflect key business processes.

Delivery & Operations
  • Lead end-to-end delivery of data engineering workloads, from requirements and source system analysis through pipeline development, testing, deployment, and production operations.
  • Serve as the first point of contact for technical challenges related to pipeline development and production workloads, including triage, root cause analysis, and remediation.
  • Provide technical leadership across hybrid teams including internal engineers, consulting partners, and offshore resources, ensuring clear accountability and consistent engineering outcomes.
  • Establish and champion data engineering standards (CI/CD, automated testing, observability, monitoring and alerting, and incident runbooks) to ensure production-grade delivery and operations.
  • Design and review pipelines and jobs for robustness, recoverability, and SLA adherence, proactively identifying and resolving performance and reliability issues before they impact downstream consumers.
  • Deliver work within Agile/Scrum practices and software release cycles, managing workstreams across tight deadlines with a high degree of organization and follow-through.

Data Quality & Governance
  • Implement data quality frameworks (completeness, validity, consistency, timeliness, and accuracy) embedded directly into pipeline logic and enforced across the Medallion layers.
  • Apply and maintain data security controls including row-level security (RLS), column masking, and data encryption across data stores and pipeline layers, in alignment with Unity Catalog and enterprise security standards.
  • Support data lineage, auditing, and access governance practices in Unity Catalog, working with the Databricks Architect and security stakeholders to ensure compliance with regulatory and contractual requirements.
  • Identify, resolve, and escalate data quality issues, including root cause analysis and implementation of preventative controls to avoid recurrence.

Stakeholder Management & Communication
  • Collaborate with business leaders, data owners, and domain SMEs to understand source systems, data flows, and priority use cases, translating findings into clear engineering requirements and delivery plans.
  • Participate in elaboration and design review sessions, facilitating technical discussions with cross-functional teams and communicating pipeline design and data structure decisions in clear, accessible language.
  • Provide regular updates on engineering progress, risks, and blockers to technical leadership with clear options, mitigations, and recommendations.
  • Lead technical code reviews for SQL and PySpark, and drive end-to-end process improvements that strengthen data engineering practices, governance, and delivery efficiency.

Coaching, Standards & Reuse
  • Define and evangelize pragmatic data engineering standards including coding patterns, pipeline design, data quality practices, cost optimization, and workspace conventions.
  • Mentor engineers through design and code reviews, knowledge-sharing sessions, and pairing on complex ingestion and transformation challenges.
  • Develop and maintain reusable assets (pipeline templates, ingestion frameworks, DLT patterns, data quality rulesets, and runbooks) to accelerate delivery and ensure consistency across the platform.
  • Demonstrate proactive ownership: surface risks early, drive decisions, and remove blockers to maintain delivery momentum and stakeholder confidence.

Qualifications:
  • Bachelor's degree in Computer Science, Information Technology, Data Science, Engineering, or equivalent practical experience.
  • 5 years of hands-on experience in data engineering roles, with 2 years in a technical lead or senior individual contributor capacity accountable for end-to-end pipeline delivery and engineering standards.
  • Proven experience designing and building production-grade data pipelines on Databricks, including Auto Loader, Delta Live Tables (DLT), and Structured Streaming across structured, semi-structured, and unstructured data.
  • Hands-on expertise implementing Databricks on Microsoft Azure (required), including familiarity with Azure security, networking, identity, and storage patterns.
  • Proficiency in Python (including PySpark) and SQL, with strong command of performance tuning and best practices for large-scale distributed data processing.
  • Demonstrated experience implementing Medallion architecture patterns and working with Delta Lake, Unity Catalog, and related data governance capabilities (RBAC, lineage, auditing).
  • 2 years of experience applying systematic approaches to data quality (including completeness, validity, consistency, timeliness, and accuracy), embedded directly into pipeline and platform design.
  • Demonstrated experience using AI-enabled platforms and LLM tools (e.g., Copilot, ChatGPT, Gemini) for data engineering productivity and modernization use cases.
  • Good knowledge of CI/CD practices and deployment processes for data engineering workloads, including automated testing and release management using tools such as Azure DevOps or GitHub Actions.
  • 3 years of experience delivering work with Agile/Scrum practices and software release cycles, managing client deliverables on tight deadlines. Prior experience on projects of similar size and scope, including working independently and as part of hybrid teams with internal, partner, and offshore resources.
  • Demonstrated ability to attribute business value and outcomes to specific pipeline deliverables and technical KPIs.

Nice to have
  • One or more Databricks certifications (e.g., Databricks Certified Data Engineer Professional).
  • Experience with Power BI semantic models and the ability to collaborate effectively with BI and data modeling teams on pipeline outputs that feed reporting layers.
  • Familiarity with ML pipeline patterns, feature stores, or AI/ML data preparation workflows on Databricks.
  • Background in construction, engineering, or infrastructure industry data and business processes.

Skills and Competencies

Technical leadership: Ability to set engineering direction, make trade-off decisions, and guide engineers and partners toward production-ready pipeline solutions.

Stakeholder management: Ability to manage expectations, influence decisions, and build trusted relationships with both technical and non-technical stakeholders across data, IT, business, and security teams.

Communication: Excellent written and verbal communication skills; able to explain complex pipeline design, data flows, and technical trade-offs in clear, outcome-oriented language.

Organization & execution: Exceptional organizational and time management skills; able to manage multiple pipeline workstreams simultaneously and drive them to completion on tight deadlines.

Attention to detail & quality: High standards for pipeline robustness, data quality, reliability, security, and documentation.

Team building & mentoring: Strong leadership and team-building capabilities; experience mentoring engineers, establishing data engineering best practices, and raising the technical bar of the broader data organization.

Adaptability: Ability to adapt to changing priorities, work under pressure, and thrive in a fast-paced, evolving data platform environment.

Ownership mindset: Bias toward action, proactive identification of issues, and a strong sense of accountability for outcomes rather than just deliverables.
  • Dice Id: ylcon
  • Position Id: 9003009

Company Info

About Y & L Consulting Inc.

Y&L Consulting is a provider of the highest-caliber local and international IT professionals, positioned to help you maximize profitability and growth.

As a global corporation with local presence and sensitivity, Y&L offers IT services and support primarily to medium and large companies. Knowledgeable and proficient in all primary technologies, we customize programs to fit your current business needs while maintaining the flexibility to ramp up or down as your situations change. By avoiding stringent service provider contracts and rigid methodologies, you enjoy flexible capabilities, local relationships, international resources, and significant cost savings.

IT Consulting San Antonio Expertise

  • Design, architecture, development, and maintenance of custom applications using ATG, MS, .NET, Java/J2EE, and other popular technologies
  • Implementation, upgrade, and maintenance for packaged application solutions: SAP, PeopleSoft, Oracle Apps, Cognos, and Actuate
  • Quality Assurance and Testing Services (Mercury partner)
  • Programmers and Database Administrators
In addition, we have a number of highly-qualified Project Managers, Business Analysts, and Technical Writers on our bench.

Contact the job poster
Michael Tennell

Recruiter @ Y & L Consulting Inc.