Role: AI Data Engineer
Location: Rockville, MD (3 days onsite / 2 days remote)
Duration: 6 months (high likelihood of extension)
Interview Process: 2 rounds, then offer
Note: Candidates local to Rockville or Tysons are preferred. Candidates willing to relocate to either area will also be considered.
Job Description
We are looking for an AI Data Engineer to design and implement data pipelines and retrieval systems for a generative AI platform. This role focuses on ingesting, transforming, and indexing domain-specific data to enable accurate and context-aware AI responses.
You will work closely with AI/agent developers and platform engineering teams to continuously enhance retrieval quality and knowledge coverage.
Key Responsibilities
Data Engineering & ETL
Design and build scalable ETL pipelines for structured and unstructured data ingestion
Develop robust workflows for document parsing, transformation, and large-scale data loading
Implement data validation and quality checks to ensure accuracy and completeness
Leverage AWS services such as S3, Lambda, Step Functions, OpenSearch, and Bedrock
RAG Pipeline Development & Search Optimization
Architect and optimize Retrieval-Augmented Generation (RAG) pipelines
Define document chunking strategies and generate vector embeddings
Improve retrieval quality by tuning ranking, filtering, and hybrid search approaches
Evaluate performance using benchmarks and retrieval accuracy metrics
Experiment with embedding models and retrieval techniques to enhance response quality
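As an illustration of the chunking work described above, here is a minimal sketch of one common strategy: fixed-size word windows with overlap. The function name chunk_text and the default sizes are hypothetical and chosen only for this example; they do not describe the platform's actual approach, which may well chunk by tokens, sentences, or document structure instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word windows of chunk_size, each sharing `overlap` words
    with the previous window. Sizes here are illustrative assumptions."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a window has reached the end of the text, so no
        # trailing chunk consists only of already-covered overlap words.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = "one two three four five six seven eight nine ten"
    for chunk in chunk_text(sample, chunk_size=4, overlap=1):
        print(chunk)
```

Overlap is a trade-off: it helps keep context that straddles a chunk boundary retrievable, at the cost of a larger index.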
Quality Engineering & Testing
Design test strategies for validating data pipelines and ingestion accuracy
Develop automated regression tests to monitor retrieval performance
Build evaluation frameworks to measure precision, recall, and relevance
Promote test-driven development (TDD) practices
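To give a sense of the evaluation work described above, the following is a minimal sketch of precision and recall at k for a single query, written so it could sit inside an automated regression test. The function name, the document IDs, and the assumption that retrieved IDs are unique are all hypothetical choices for this example, not details of the actual evaluation framework.

```python
def precision_recall_at_k(
    retrieved: list[str], relevant: set[str], k: int
) -> tuple[float, float]:
    """Return (precision@k, recall@k) for one query.

    retrieved: ranked document IDs from the retriever (assumed unique).
    relevant:  ground-truth relevant document IDs for the query.
    """
    if k <= 0:
        raise ValueError("k must be positive")

    top_k = retrieved[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)

    precision = hits / k  # fraction of the top-k slots filled with relevant docs
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical IDs for illustration only.
    ranked = ["d1", "d2", "d3", "d4"]
    truth = {"d1", "d3", "d5"}
    p, r = precision_recall_at_k(ranked, truth, k=3)
    print(f"precision@3={p:.2f} recall@3={r:.2f}")
```

In a regression suite, a test might assert that these metrics, averaged over a fixed query set, do not drop below a previously recorded baseline.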
Generative AI & Innovation
Stay current with advancements in RAG, embeddings, and retrieval systems
Explore new approaches such as hybrid search, reranking, and contextual retrieval
Collaborate with AI developers to ensure high-quality, contextually relevant outputs
Security & Compliance
Follow secure coding practices, especially when handling sensitive data and PII
Ensure adherence to organizational security policies and compliance requirements
Participate in threat modeling and secure system design discussions