Principal Data Scientist

Overview

Remote
Depends on Experience
Accepts corp to corp applications
Contract - W2

Skills

data science
machine learning
GenAI
Medical Domain
GraphRAG
Multi-model agents

Job Details

  • Job Role - Principal Data Scientist
    Location - Minneapolis, MN (Remote)

    Job Details:

                Lead end-to-end training and fine-tuning of Large Language Models (LLMs), including both open-source (e.g., Qwen, LLaMA, Mistral) and closed-source (e.g., OpenAI, Gemini, Anthropic) ecosystems.
                Architect and implement GraphRAG pipelines, including knowledge graph representation and retrieval for enhanced contextual grounding.
                Build and scale distributed training environments using NCCL and InfiniBand for multi-GPU and multi-node training.
                Apply reinforcement learning techniques (e.g., RLHF, RLAIF) to align model behavior with human preferences and domain-specific goals.
    Qualifications
                PhD or Master’s degree in Computer Science, Machine Learning, or related field.
                8+ years of experience in applied AI/ML, with a strong track record of delivering production-grade models.
                Deep expertise in:
                    LLM training and fine-tuning (e.g., GPT, LLaMA, Mistral, Qwen)
                    Graph-based retrieval systems (GraphRAG, knowledge graphs)
                    Embedding models (e.g., BGE, E5, SimCSE)
                    Semantic search and vector databases (e.g., FAISS, Weaviate, Milvus)
                    Document segmentation and preprocessing (OCR, layout parsing)
                    Distributed training frameworks (NCCL, Horovod, DeepSpeed)
                    High-performance networking (InfiniBand, RDMA)
                    Model fusion and ensemble techniques (stacking, boosting, gating)
                    Optimization algorithms (Bayesian, Particle Swarm, Genetic Algorithms)
                    Symbolic AI and rule-based systems
                    Meta-learning and Mixture of Experts architectures
                    Reinforcement learning (e.g., RLHF, PPO, DPO)
    Bonus Skills
                Experience with healthcare data and medical coding systems (e.g., CPT, ICD-10-CM, ICD-10-PCS).
                Familiarity with regulatory and compliance frameworks in AI deployment.
