Overview
Remote
Depends on Experience
Contract - W2
Contract - Independent
Contract - 6 Month(s)
Skills
Data Science / Machine Learning
Generative AI / NLP / LLM
Job Details
- We are seeking an experienced Data Scientist with deep expertise in Generative AI and a strong background in the insurance domain to join our advanced analytics team. The ideal candidate will focus on building and optimizing classification models and information extraction systems that enable automation and intelligence across insurance workflows such as claims processing, underwriting, and document analysis.
- You will collaborate closely with data engineers responsible for orchestration, data pipelines, and storage infrastructure, allowing you to focus on model innovation, experimentation, and accuracy.
Key Responsibilities:
- Design, build, and deploy machine learning classifiers and information extraction models tailored for insurance use cases (e.g., claims triage, policy text extraction, risk categorization).
- Develop and fine-tune Generative AI and LLM-based models (e.g., using LangChain, LlamaIndex, or custom embeddings) to extract insights from unstructured data such as claim notes, policy documents, and correspondence.
- Collaborate with ML engineers to integrate models into production workflows while maintaining scalability and reliability.
- Evaluate and improve model precision, recall, and interpretability using rigorous statistical and ML methodologies.
- Partner with domain experts to identify patterns, define taxonomies, and establish ground truth datasets for supervised and semi-supervised training.
- Build and maintain evaluation frameworks for model performance tracking and continuous improvement.
- Work alongside orchestration and data storage teams to ensure clean, accessible data pipelines that support rapid experimentation.
- Stay current on advancements in GenAI, document understanding, and insurance analytics to guide internal innovation.
Required Skills & Experience:
- 6+ years of professional experience in Data Science / Machine Learning, with 2+ years in Generative AI / NLP / LLM projects.
- Strong experience developing classification and entity extraction models (NER, text classification, OCR-based extraction).
- Proven track record working with insurance datasets such as claims, underwriting, or policy text.
- Proficiency in Python and ML/NLP libraries such as PyTorch, TensorFlow, Hugging Face, spaCy, and scikit-learn.
- Familiarity with vector databases (Pinecone, FAISS, Chroma), LLM frameworks (LangChain, LlamaIndex), and retrieval-augmented generation (RAG).
- Solid grasp of data modeling and feature engineering for structured and unstructured data.
- Experience with model evaluation, MLOps collaboration, and versioning tools (MLflow, DVC, etc.).
- Excellent communication and documentation skills, especially translating model results into business impact for insurance stakeholders.