Data Scientist

Overview

On Site
$80,000 - $165,000
Full Time
No Travel Required

Skills

R
Python
Generative Artificial Intelligence (AI)
GraphQL
Artificial Intelligence

Job Details

Title: Data Scientist
Location: Alpharetta, GA; Austin, TX; Bridgewater, NJ; Charlotte, NC; Denver, CO; Houston, TX; New York, NY; Phoenix, AZ; Raleigh, NC; Richardson, TX; Sunnyvale, CA; Tempe, AZ
Position type: Full-time

Job Description:
Strong proficiency in Python programming.
Experience deploying Gen AI applications with an agent framework such as LangGraph, AutoGen, or CrewAI.
Experience deploying the Gen AI stack/services provided by platforms such as AWS, Google Cloud Platform, Azure, or IBM Watson.
Experience in Generative AI, including working with multiple Large Language Models and implementing advanced RAG-based solutions.
Experience processing and ingesting unstructured data from PDFs, HTML, image files, audio (audio-to-text), etc.
Experience with Hybrid Retrieval, Semantic Chunking, Metadata Filtering, and Vector Search Optimization.
Experience with Advanced LLM concepts like Prompt Compression, Fine-Tuning, Caching, etc.
Experience working with multimodal models integrating image, text, or audio into GenAI workflows.
Experience working with Vector Databases (such as FAISS, Pinecone, Weaviate, or Azure AI Search).
Experience with model evaluation tools such as DeepEval, FMeval, RAGAS, and Bedrock model evaluation, including human-in-the-loop feedback loops.
Experience with LLMOps practices including prompt versioning, caching, observability, cost tracking, and production deployment of LLMs.
Strong understanding of AI governance (e.g., GDPR, explainability), data privacy, model safety (e.g., hallucination, toxicity, bias), and enterprise-grade compliance requirements.
Experience with data gathering, data quality, system/microservices architecture, and coding best practices.
Experience with Lean/Agile development methodologies.

Preferred Data Scientist Qualifications:
4 years of hands-on experience with more than one programming language (Python, R, Scala, Java, SQL).
Hands-on experience with CI/CD pipelines and DevOps tools like Jenkins, GitHub Actions, or Terraform.
Proficiency in NoSQL and SQL databases (PostgreSQL, MongoDB, CosmosDB, DynamoDB).
Deep Learning experience with CNNs, RNNs, LSTMs, and the latest research trends.
Experience in Python AI/ML frameworks such as TensorFlow, PyTorch, or LangChain.
Strong understanding of, and experience with, LLM fine-tuning and local deployment of open-source models.
Proficiency in building RESTful APIs using FastAPI, Flask, or Django.
Experience with model evaluation tools such as DeepEval, FMeval, RAGAS, and Bedrock model evaluation.
Experience with perception (e.g., computer vision), time series data (e.g., text analysis).
Big Data experience strongly preferred (HDFS, Hive, Spark, Scala).
Experience with data visualization tools such as Tableau and query languages such as SQL and Hive.
Good applied statistics skills, such as distributions, statistical testing, regression, etc.
Exposure to Front-End/Full-Stack Integration (React/Angular, TypeScript, REST APIs, GraphQL, Event-Driven, etc.).
