Sr. Data Scientist

Overview

On Site
$100,000 - $120,000
Full Time
No Travel Required

Skills

AWS
SageMaker
Google
Vertex AI
TensorFlow
PyTorch
Pandas
NumPy
Scikit-learn
Data Scientist

Job Details

Role :: Data Scientist

Location :: Tampa, FL / Dallas, TX - Onsite

Type :: Full-time

Job Description

Must Have Technical/Functional Skills

  • Programming & Libraries: Expert-level proficiency in Python and its core data science libraries (Pandas, NumPy, Scikit-learn). Strong proficiency in SQL for complex data extraction and manipulation.
  • Machine Learning Frameworks: Hands-on experience with modern deep learning frameworks such as TensorFlow or PyTorch.
  • Statistical Modeling: Deep understanding of statistical concepts and a wide range of machine learning algorithms, with proven experience in time-series forecasting and anomaly detection.
  • Big Data Technologies: Demonstrable experience working with large datasets using distributed computing frameworks, specifically Apache Spark.
  • Database Systems: Experience querying and working with data from multiple relational database systems (e.g., PostgreSQL, Oracle, MS SQL Server).
  • Cloud Platforms: Experience building and deploying data science solutions on a major cloud platform (AWS, Google Cloud Platform, or Azure). Familiarity with their native ML services (e.g., AWS SageMaker, Google Vertex AI) is a strong plus.
  • MLOps Tooling: Practical experience with MLOps principles and tools for model versioning, tracking, and deployment (e.g., MLflow, Docker).
  • Communication and Storytelling: Excellent verbal and written communication skills, with a proven ability to explain complex technical concepts to a non-technical audience through visual storytelling.

Roles & Responsibilities

Druid Data Modeling & Schema Design:

  • Design and implement efficient data schemas, dimensions, and metrics within Apache Druid for various analytical use cases (e.g., clickstream, IoT, application monitoring).
  • Determine optimal partitioning, indexing (bitmap indexes), and rollup strategies to ensure sub-second query performance and efficient storage.

Data Ingestion Pipeline Development:

  • Develop and manage real-time data ingestion pipelines into Druid from streaming sources like Apache Kafka, Amazon Kinesis, or other message queues.
  • Implement batch data ingestion processes from data lakes (e.g., HDFS, Amazon S3, Azure Blob, Google Cloud Storage) or other databases.
  • Ensure data quality, consistency, and exactly-once processing during ingestion.

Query Optimization & Performance Tuning:

  • Write and optimize complex SQL queries (Druid SQL) for high-performance analytical workloads, including aggregations, filters, and time-series analysis.
  • Analyze query plans and identify performance bottlenecks, implementing solutions such as segment optimization, query rewriting, or cluster configuration adjustments.





About Stanley David and Associates