Overview
On Site
120k - 150k
Full Time
Skills
SaaS
Law
Recruiting
Extract, Transform, Load (ETL)
Generative Artificial Intelligence (AI)
Workflow
Management
Unstructured Data
Data Science
Industry-specific Use Cases
Data Engineering
SQL
Python
Amazon Web Services
Amazon S3
Amazon RDS
LangChain
LlamaIndex
Vector Databases
Data Architecture
Normalization
Legal
Professional Services
Orchestration
Prompt Engineering
Artificial Intelligence
Analytics
Collaboration
Job Details
GenAI Data Engineer
Location: Fully Remote (U.S.-based, New England preferred)
Industry: LegalTech / SaaS
Level: Mid-Level (3-6 years of experience)
About Us
We're a fast-growing LegalTech and analytics SaaS company used by the nation's top law firms, law schools, and legal recruiters to track and analyze legal talent flows. Our platform helps drive smarter hiring, retention, and market insights in the legal industry using modern data infrastructure and emerging AI capabilities.
As we expand our product offering with more GenAI features, we're looking for a mid-level Data Engineer with hands-on experience in LLMs, RAG architecture, and vector databases to help us scale.
What You'll Do
- Build, optimize, and maintain scalable data pipelines and ETL processes
- Implement RAG-based GenAI workflows using tools like LangChain and the OpenAI APIs
- Integrate and manage vector databases (e.g., Pinecone, Weaviate, FAISS)
- Work with structured and unstructured data to support analytics and AI-driven search
- Collaborate cross-functionally with backend engineers, product, and data science
- Support legal industry-specific use cases like entity resolution, summarization, and document classification
What We're Looking For
- 3-6 years of experience in data engineering or data infrastructure roles
- Proficiency in SQL and Python (especially for AI/data pipelines)
- Hands-on experience with AWS services (e.g., S3, Lambda, RDS, ECS)
- Experience with LLM tools and frameworks (e.g., LangChain, LlamaIndex, OpenAI APIs)
- Comfortable working with vector databases and retrieval-based AI
- Strong understanding of scalable data architecture and data normalization
Nice to Have
- Experience with RAG pipelines in production
- Familiarity with legal data or professional services industries
- Exposure to orchestration tools like Airflow, Prefect, or similar
- Experience with embeddings, chunking strategies, and prompt engineering
Why Join Us
- Make an impact on the future of LegalTech and AI-powered analytics
- Collaborate with a small, passionate, high-performance team
- Own meaningful projects that ship fast and evolve quickly
- 100% remote with flexible hours (New England-based candidates preferred)