Data Lead (RAG-based applications)

Overview

Remote
Depends on Experience
Full Time
No Travel Required
Unable to Provide Sponsorship

Skills

RAG
ETL

Job Details

Full-time | Remote (US-based) - Direct Client
Work Authorization: s and  

 

The Data Lead will be responsible for designing and maintaining our data infrastructure, including ETL pipelines, vector databases, and retrieval systems for RAG-based applications. You will guide data quality, governance, and performance optimization efforts, ensuring our platform delivers accurate, scalable, and cost-efficient data-driven experiences.

 

What you'll do:

       Data Engineering: Design ETL pipelines using strong SQL and Python, including data normalization and cleaning.

       Vector Databases & Retrieval: Work hands-on with Pinecone, Weaviate, Milvus, or pgvector, applying knowledge of index strategies (HNSW, IVF, PQ).

       RAG (Retrieval-Augmented Generation): Design retrieval strategies (chunking, embedding selection, reranking).

       Embedding Models: Choose and evaluate embedding models for domain-specific tasks.

       Data Modeling & Knowledge Graphs (nice-to-have): Link structured and unstructured data.

       Data Quality & Governance: Set standards for metadata, access controls, lineage, and freshness.

       Performance Optimization: Benchmark and tune latency, recall/precision, and cost/performance trade-offs.

 

About you:

       6+ years in data engineering, data platform, or ML data roles.

       Strong SQL and Python skills for ETL and data workflows.

       Experience with vector databases (Pinecone, Weaviate, Milvus, pgvector).

       Proven ability to design retrieval pipelines for RAG.

       Deep understanding of embedding models and their evaluation.

       Familiarity with data quality and governance frameworks.

       Ability to optimize systems for latency, accuracy, and cost-efficiency.
