Lead Data Engineer (AI/ML)

Hybrid in Carolina Beach, NC, US • Posted 22 hours ago • Updated 22 hours ago
Contract W2
24 Months
No Travel Required
On-site
$55 - $60/hr

Job Details

Skills

  • Continuous Integration
  • Apache Spark
  • Artificial Intelligence
  • Decision-making
  • Continuous Improvement
  • Cloud Computing
  • Collaboration
  • Computer Science
  • Data Science
  • Continuous Delivery
  • Data Architecture
  • Data Engineering
  • Data Validation
  • Advanced Analytics
  • Agile
  • Microsoft Azure
  • Python
  • Real-time
  • PySpark
  • Retail
  • Machine Learning Operations (ML Ops)
  • Machine Learning (ML)
  • Health Care
  • Information Systems
  • Software Modernization
  • Semantics
  • Mentorship
  • Migration
  • Data Warehouse
  • Databricks
  • DevOps
  • Generative Artificial Intelligence (AI)
  • Software Design
  • Star Schema
  • Streaming
  • Neo4j
  • Scalability
  • Good Clinical Practice
  • Google Cloud Platform
  • Graph Databases
  • Scrum
  • ELT
  • Extract, Transform, Load
  • Financial Services
  • Vertex
  • Large Language Models (LLMs)
  • Snowflake Schema
  • Amazon Web Services
  • Analytics
  • Workflow
  • Data Modeling
  • Banking
  • Database
  • Google Cloud
  • Sprint
  • Training
  • SQL
  • ADF
  • Web Services

Summary

Job Title: Lead Data Engineer (AI/ML)

Location: Charlotte, NC (Hybrid/Onsite)
Contract: W2
Duration: 12+ Months


Position Summary

We are seeking a highly experienced Lead Data Engineer with 12+ years of experience in enterprise data engineering and AI/ML data platforms. The ideal candidate will lead the design, development, and implementation of scalable data architectures, cloud-native data platforms, and AI/ML data pipelines supporting advanced analytics, Generative AI, and Machine Learning initiatives.

This role requires strong expertise in Python, PySpark, Spark, Snowflake, Databricks, Azure/AWS/Google Cloud Platform, AI/ML frameworks, data lakes, ETL/ELT, MLOps, and modern cloud technologies. The candidate will collaborate with Data Scientists, ML Engineers, Architects, DevOps, and business stakeholders to deliver enterprise-grade AI-enabled data solutions.


Key Responsibilities

  • Lead the design and implementation of enterprise-scale data engineering and AI/ML data platforms.
  • Architect scalable batch and real-time data pipelines supporting analytics and machine learning workloads.
  • Build and optimize cloud-native data lakes, data warehouses, and Lakehouse architectures.
  • Design and implement ETL/ELT pipelines using modern cloud technologies.
  • Develop feature engineering pipelines supporting ML model training and inference.
  • Build scalable data pipelines for Large Language Models (LLMs), Generative AI, and AI-powered applications.
  • Design data ingestion frameworks for structured, semi-structured, and unstructured datasets.
  • Implement data validation, profiling, governance, lineage, and quality monitoring solutions.
  • Optimize Spark, SQL, and distributed processing workloads for performance and scalability.
  • Lead cloud migration and application modernization initiatives.
  • Collaborate with Data Scientists and ML Engineers to productionize AI/ML models.
  • Build and maintain MLOps pipelines for automated model deployment, monitoring, and retraining.
  • Integrate AI-powered solutions using OpenAI, Azure OpenAI, AWS Bedrock, or Vertex AI.
  • Implement CI/CD pipelines for data engineering and machine learning workflows.
  • Mentor junior engineers and establish engineering best practices.
  • Participate in architecture reviews, code reviews, and technical decision-making.

Required Technical Skills

Programming Languages

  • Python
  • SQL
  • PySpark
  • Scala
  • Java

Big Data Technologies

  • Apache Spark
  • PySpark
  • Hadoop
  • Hive
  • Kafka
  • Delta Lake
  • Apache Airflow

Cloud Platforms

  • Microsoft Azure
  • Amazon Web Services (AWS)
  • Google Cloud Platform (GCP)

Data Engineering

  • Snowflake
  • Databricks
  • Azure Data Factory (ADF)
  • Azure Synapse Analytics
  • AWS Glue
  • Amazon Redshift
  • Google BigQuery
  • dbt
  • Informatica
  • Matillion

AI / Machine Learning

  • Machine Learning Pipelines
  • Feature Engineering
  • Model Training Pipelines
  • Model Deployment
  • Model Monitoring
  • MLOps
  • MLflow
  • Kubeflow
  • SageMaker
  • Azure ML
  • Vertex AI

Generative AI

  • OpenAI APIs
  • Azure OpenAI
  • AWS Bedrock
  • LangChain
  • LlamaIndex
  • Vector Databases (Pinecone, FAISS, ChromaDB)
  • RAG (Retrieval-Augmented Generation)
  • Prompt Engineering
  • AI Agents
  • MCP (Model Context Protocol)

Streaming Technologies

  • Kafka
  • Spark Streaming
  • Azure Event Hub
  • AWS Kinesis

Databases

  • SQL Server
  • PostgreSQL
  • Oracle
  • MongoDB
  • Cassandra
  • NoSQL
  • Cosmos DB

DevOps / CI/CD

  • Azure DevOps
  • GitHub Actions
  • Jenkins
  • GitLab CI/CD
  • Docker
  • Kubernetes
  • Terraform

Data Governance

  • Collibra
  • Alation
  • Microsoft Purview
  • Apache Atlas
  • Data Lineage
  • Metadata Management
  • Data Catalog

BI & Analytics

  • Power BI
  • Tableau
  • Looker

Leadership Responsibilities

  • Lead and mentor a team of Data Engineers and ML Engineers.
  • Drive enterprise AI and Data Engineering strategy.
  • Define data architecture standards and engineering best practices.
  • Conduct architecture and code reviews.
  • Collaborate with enterprise architects, business leaders, and product owners.
  • Lead Agile ceremonies, sprint planning, and technical estimations.
  • Drive continuous improvement initiatives across data engineering and AI platforms.
  • Ensure security, scalability, reliability, and governance of enterprise data assets.

Required Qualifications

  • Bachelor's or Master's degree in Computer Science, Data Science, Information Systems, Engineering, or a related field.
  • 12+ years of Data Engineering experience.
  • 8+ years of Python and SQL development.
  • 6+ years of PySpark and Spark development.
  • 5+ years of Snowflake or Databricks experience.
  • 5+ years of cloud platform experience (Azure, AWS, or Google Cloud Platform).
  • 4+ years of AI/ML data engineering experience.
  • Strong experience developing enterprise ETL/ELT pipelines.
  • Experience implementing Lakehouse architectures.
  • Hands-on experience with MLOps platforms and AI model deployment.
  • Strong understanding of Data Modeling (Star Schema, Snowflake Schema, Data Vault).
  • Experience supporting enterprise AI initiatives.
  • Strong Agile/Scrum experience.

Preferred Qualifications

  • Experience with Large Language Models (LLMs).
  • Experience building RAG-based applications.
  • Experience with AI Agents and autonomous workflows.
  • Knowledge of Agentic AI architectures.
  • Experience with graph databases (Neo4j).
  • Experience with vector search and semantic retrieval.
  • SnowPro, Databricks, Azure, AWS, Google Cloud Platform, or AI/ML certifications.
  • Financial Services, Banking, Healthcare, or Retail domain experience.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions, and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 10477291
  • Position Id: 9014056
Contact the job poster
Rahul Chourasia

Recruiter @ Raas Infotek LLC