Senior Data Scientist (with PhD)

Overview

Hybrid

$75 - $85

Accepts corp to corp applications

Contract - Independent

Contract - W2

Contract - 12 Month(s)

No Travel Required

Skills

Data Scientist

NLP

GenAI

Generative AI

symbolic AI

causal AI

Python

Amazon

AWS

Python Scikit

Deep Learning

Machine Learning

TensorFlow

OpenAI

GPT

OpenAI APIs

Llama

Large language Models

LLM

natural language processing

SQL

CNN

RNN

LSTM

Claude

Cohere

LoRA

LangChain

RAG

PEFT

Jupyter Notebook

AWS Sagemaker

Domino Datalab

SciPy

NumPy

PySpark

Scala

topic modeling

bag of words

text classification

TF/IDF

Sentiment analysis

Python NLTK

LLM Agents

Solr

ElasticSearch

AWS OpenSearch

Zeno

OWL

RDF

SparQL

Keras

Caffe

PyTorch

Theano

H2O

Docker

Kubernetes

API

Amazon RDS

Amazon Redshift

Amazon SageMaker

Amazon Web Services

Apache Hadoop

Apache Hive

Artificial Intelligence

Cloud Computing

Data Science

Data Cleansing

Data Visualization

Data Analysis

Computer Science

Electronic Health Record (EHR)

Generative Artificial Intelligence (AI)

Large Language Models (LLMs)

Unstructured Data

Natural Language

Object-Oriented Programming

Git

GitHub

GitLab

Image Processing

Jenkins

Jupyter

Kibana

Knowledge Management

Linux

Machine Learning (ML)

Mathematics

Microservices

NLTK

PostgreSQL

Oracle

RDBMS

Regression Analysis

Operations Research

Analytics

Apache Solr

Apache Spark

Automated Testing

BERT

Business Analytics

Clustering

Communication

Named-Entity Recognition (NER)

Normalization

Computational Linguistics

Remote Desktop Services

Database

IBM Lotus Domino

Modeling

MySQL

Ontologies

Resource Description Framework

Scripting

Shell Scripting

Statistics

Tableau

scikit-learn

tf-idf

PhD

Job Details

Senior Data Scientist
Washington, D.C. (Hybrid)
Long-term contract role

Note:

Must have a PhD degree
Must have experience with GenAI and NLP

Minimum Qualifications:

Work or educational background in one or more of the following areas: machine learning, computational linguistics, deep learning, artificial intelligence, data science and/or data analytics, generative AI, symbolic AI, causal AI, operations research, computer science, mathematics, business analytics, or knowledge management.
8-12 years of demonstrated experience programming with R/Python, Linux, and Spark in an AWS cloud environment, or knowledge and algorithmic design experience in Python (3+ years)
Proficient with AWS SageMaker, Jupyter Notebook, scikit-learn, and deep learning and machine learning tools such as TensorFlow
Experience with image processing models such as COCO, CLIP, ResNet, or comparable models
Demonstrated experience with machine learning techniques, including natural language processing and large language models (GPTv4-o1, o3, OpenAI APIs, Llama, Claude, etc.).
Experience developing AI agents and proficiency with agentic programming
Proficient in natural language processing (NLP) and natural language generation (NLG), including prior projects in any of the following categories: topic modeling of text, sentiment analysis of text, part-of-speech tagging, Named Entity Recognition (NER), bag of words, text extraction
Experience building and working with any of these components: vector DB, BERT, RoBERTa (or comparable tools), spaCy, LLM, and GenAI tools. Experience with LoRA, LangChain, RAG, LLM fine-tuning, PEFT, and knowledge graphs.
Strong skills in developing GraphRAG, Chain of Thought (CoT), Tree of Thought (ToT), reinforcement learning, and AI development architectures with Human-in-the-Loop (HITL)
Demonstrated experience with SQL and any relational database technologies, such as Oracle, PostgreSQL, MySQL, RDS, Redshift, Hadoop EMR, Hive, etc.
Demonstrated experience processing structured and unstructured data sources, data cleansing, data normalization, and prep for analysis
Demonstrated experience with code repositories and build/deployment pipelines, specifically Jenkins and/or Git/GitHub/GitLab.
Demonstrated experience using Tableau, Kibana, QuickSight, or other similar data visualization tools.
Very comfortable working with ambiguity (e.g., imperfect data, loosely defined concepts, ideas, or goals)

Qualifications & Requirements

Education: MS in Computer Science, Statistics, Math, Engineering, or related field, PhD preferred.
3+ years of relevant experience in building large-scale machine learning or deep learning models and/or systems
1+ year of experience specifically with deep learning (e.g., CNN, RNN, LSTM)
1+ year of experience building NLP and NLG tools.
Experience with a wide range of LLMs (Llama, Claude, OpenAI, Cohere, etc.), LoRA, LangChain, RAG, LLM Fine Tuning, and PEFT is preferred.
Demonstrated skills with Jupyter Notebook, AWS Sagemaker, Domino Datalab, or comparable environments
Passion for solving complex data problems and generating cross-functional solutions in a fast-paced environment
Knowledge in Python and SQL, object-oriented programming, and service-oriented architectures
Strong scripting skills with Shell script and SQL
Strong coding skills and experience with Python (including SciPy, NumPy, and/or PySpark) and/or Scala.
Knowledge and implementation experience with NLP techniques (topic modeling, bag of words, text classification, TF-IDF, sentiment analysis) and NLP technologies such as Python NLTK, spaCy, or comparable technologies
Knowledge and implementation experience with statistical and machine learning models (regression, classification, clustering, graph models, etc.)

Preferred Qualifications

Hands-on experience building models with deep learning frameworks like TensorFlow, Keras, Caffe, PyTorch, Theano, H2O, or similar
Experience with LLM agents and agentic programming
Experience with search architecture (for instance: Solr, ElasticSearch, AWS OpenSearch)
Experience building and querying ontologies with technologies such as Zeno, OWL, RDF, SPARQL, or comparable is preferred
Knowledge of and experience with microservices, service mesh, API development, and test automation are preferred
Demonstrated experience using Docker, Kubernetes, and/or other similar container frameworks is preferred

Additional Job Qualifications:

Ability to translate business ideas into analytics models that have a major business impact.
Demonstrated experience working with multiple stakeholders.
Demonstrated communication skills, e.g., explaining complex technical issues to more junior data scientists, in graphical, verbal, or written formats.
Demonstrated experience developing tested, reusable and reproducible work.

