Databricks Data Scientist-Clinical

Indianapolis, IN, US • Posted 13 hours ago • Updated 13 hours ago

Contract W2

Contract Corp To Corp

6 Months

No Travel Required

Able to Sponsor

On-site

$50 - $60/hr

Job Details

Skills

Data Modeling
Clinical Trials
Data Processing
Business Intelligence
Clinical Data Analysis
HIPAA
GxP
Generative Artificial Intelligence (AI)
Natural Language Processing
Machine Learning (ML)
Health Care
Databricks
Data Science
Data Quality
Data Governance
Regulatory Compliance
Release Management

Summary

Location: Indianapolis, IN

Duration: 6 months

Key Responsibilities

Data Science & Machine Learning

• Design, develop, and deploy machine learning models using Databricks (MLflow, Spark ML, Python) for pharma and life sciences use cases

• Implement end-to-end ML pipelines covering data ingestion, feature engineering, model training, deployment, and monitoring

• Build predictive models for patient identification, HCP segmentation, market access analytics, pharmacovigilance, and safety signal detection

• Apply NLP and generative AI techniques (LLMs, RAG pipelines) to extract insights from medical literature, clinical notes, and regulatory documents

• Conduct A/B testing, model validation, and statistical analysis to evaluate model performance and business impact

• Collaborate with data engineers to ensure reliable, high-quality, production-ready datasets in the Lakehouse

Databricks & Lakehouse Architecture

• Leverage Databricks Lakehouse (Delta Lake, Unity Catalog) for scalable, governed, and high-performance analytics

• Design and optimize Spark jobs for performance and cost efficiency across large-scale pharma datasets

• Apply best practices for data governance, data lineage, and security within Unity Catalog

• Build and maintain Bronze / Silver / Gold Medallion architecture for clinical, claims, and commercial data

• Implement Delta Live Tables (DLT) pipelines with data quality checks for real-time and batch processing

• Configure and manage Databricks Workflows, Repos, and cluster policies for production ML workloads

Genie (AI/BI & Natural Language Analytics)

• Configure and enable Databricks Genie for self-service analytics across business and scientific teams

• Design semantic layers and curated Gold datasets optimized for natural language queries via Genie

• Define certified questions, trusted assets, and business glossary terms to improve Genie response quality

• Partner with business stakeholders to translate complex pharma questions into Genie-enabled insights

• Monitor and iterate on Genie Spaces based on user feedback, query accuracy, and adoption metrics

• Enable non-technical users across Medical Affairs, Commercial, and R&D to self-serve data insights

Real-World & Clinical Data Analysis

• Analyze real-world data (RWD), electronic health records (EHR), claims data, and clinical trial datasets to generate actionable insights

• Build scalable data pipelines for pharma-specific sources including IQVIA, Symphony Health, Komodo, and specialty pharmacy data

• Apply survival analysis, mixed models, and Bayesian methods for epidemiology and health economics (HEOR) studies

• Ensure all models and data processes comply with HIPAA, GxP, and 21 CFR Part 11 regulations

Business Enablement & Stakeholder Collaboration

• Work closely with product owners, analysts, and business leaders to identify and prioritize high-value data science use cases

• Communicate complex analytical results and model outputs in a clear, business-friendly manner to non-technical audiences

• Produce analytical documentation: model cards, design specs, performance reports, and executive summaries

• Lead sprint ceremonies as analytics owner: architecture reviews, estimation sessions, and release planning

Required Qualifications

• Experience: 4+ years of professional experience in data science or advanced analytics, preferably in pharma, biotech, or life sciences

• Education: Bachelor's or Master's degree in Data Science, Computer Science, Statistics, Engineering, or a related field

• Databricks: Hands-on experience with Databricks and Apache Spark for large-scale data processing and ML workloads

• Python: Strong programming skills in Python (PySpark, Pandas, NumPy, scikit-learn) for data science and ML development

• MLflow: Experience building and deploying ML models in production using MLflow for experiment tracking and model lifecycle management

• SQL: Solid understanding of SQL and data modeling for analytical and reporting workloads on large datasets

• Delta Lake: Experience with Delta Lake, Unity Catalog, and Medallion architecture (Bronze / Silver / Gold) for Lakehouse analytics

• Genie / AI-BI: Familiarity with Databricks Genie or AI/BI tools for natural language querying and self-service analytics

• Healthcare Data: Experience working with clinical, claims, or real-world healthcare data (EHR, RWD, specialty pharmacy)

• Compliance: Familiarity with HIPAA compliance and handling of sensitive patient data in regulated environments

• Communication: Strong communication skills — ability to translate complex models and analysis into clear, actionable business insights

Dice Id: 91122878
Position Id: 8997089
