Overview
Remote
Depends on Experience
Accepts corp to corp applications
Contract - W2
Contract - Independent
Contract - 12 Month(s)
No Travel Required
Skills
Analytical Skill
Artificial Intelligence
Cloud Computing
Collaboration
Communication
Computer Science
Conflict Resolution
Continuous Delivery
Continuous Integration
Data Analysis
Data Lake
Data Processing
Data Science
Data Wrangling
Databricks
EDA
Evaluation
Generative Artificial Intelligence (AI)
Knowledge Sharing
Large Language Models (LLMs)
Training
Unstructured Data
Use Cases
Vector Databases
Workflow
PySpark
Python
Statistical Models
Statistics
TensorFlow
Testing
scikit-learn
Natural Language Processing
Problem Solving
Prompt Engineering
PyTorch
Machine Learning (ML)
Machine Learning Operations (ML Ops)
Microsoft Azure
Optimization
Job Details
Job Title: Data Scientist
Location: Atlanta, GA (Remote)
Job Type: Long-Term Contract
About the Role
We are seeking a highly motivated and skilled Data Scientist with strong expertise in data science fundamentals, machine learning (ML), and large language models (LLMs). The ideal candidate will have hands-on experience working with Databricks and Azure ecosystems, including PySpark for data processing and LLM tuning within Databricks. This role involves building and optimizing data science solutions that leverage cloud-based technologies to deliver business value.
Key Responsibilities
- Design, develop, and deploy data science and ML solutions on Databricks (Azure environment).
- Work on the end-to-end ML lifecycle, from data preparation and feature engineering to model training, evaluation, and deployment.
- Apply LLM fine-tuning and optimization techniques within Databricks for domain-specific use cases.
- Utilize PySpark for distributed data processing, cleaning, and transformation.
- Collaborate with data engineers, cloud architects, and business stakeholders to ensure seamless integration of ML models into production workflows.
- Conduct exploratory data analysis (EDA), statistical modeling, and hypothesis testing to extract insights from structured and unstructured data.
- Stay current with the latest advancements in AI/ML, LLMs, and Databricks capabilities, and apply them to deliver innovative solutions.
- Document methodologies, experiments, and best practices for knowledge sharing.
Required Skills & Qualifications
- Bachelor's/Master's degree in Computer Science, Data Science, Statistics, AI/ML, or a related field.
- Proven experience as a Data Scientist with exposure to ML and NLP projects.
- Strong hands-on experience with Databricks on Azure (MLflow, Delta Lake, Databricks ML).
- Proficiency in PySpark for large-scale data processing.
- Experience in training, fine-tuning, and deploying LLMs within a Databricks environment.
- Strong programming skills in Python and familiarity with ML frameworks (TensorFlow, PyTorch, scikit-learn, Hugging Face).
- Solid understanding of data science workflows: data wrangling, feature engineering, model development, and evaluation.
- Working knowledge of Azure cloud services (Azure Data Lake, Azure Synapse, Azure ML).
- Strong problem-solving, analytical thinking, and communication skills.
Good-to-Have Skills
- Experience with MLOps practices and tools (CI/CD for ML, MLflow).
- Knowledge of vector databases and LLM deployment pipelines.
- Familiarity with prompt engineering and RAG (Retrieval-Augmented Generation) techniques.
- Exposure to generative AI projects on cloud platforms.