Senior Level Data Scientist with Agentic AI

Remote • Posted 3 hours ago • Updated 3 hours ago
Contract W2
6 Months
Remote
$50 - $60/hr

Job Details

Skills

  • Python
  • Data Science
  • Agentic AI
  • Databricks
  • Snowflake
  • AWS
  • AI/OCR

Summary

Role: Senior Level Data Scientist
Duration: Contract through end of year + extensions.
Location: Remote

This person will help build the Agentic AI solution that organizes all of this data into a tabular format in Databricks, so the team can build models on it and produce quality data to share with Meta and Google for their ad campaigns.

Overview: At a high level, the client is automating workflows in several areas of its Media Operations. As part of its media campaigns, the client pays Google and Meta to bring traffic to its site and needs a way to track how effective that traffic is. Google and Meta offer options to send them feedback on users (i.e. how "useful" was this user). This is easier for an ecommerce company to measure: a user spent $x on its products after being directed from Google/Meta and thus was or was not useful. The client has a much less clear-cut way to provide this feedback. It essentially needs to determine whether the user who was directed to the site has the particular disease for which the advertisement was shown (was it relevant to this specific user?), measuring the prevalence of the disease among those users to determine whether the campaign is effective. The audience quality data comes from a third party and arrives as PDFs, emails, and other unstructured sources that are very messy. The team is developing an AI-automated solution to process and ingest that data into a tabular format in Databricks (part 1), and is then building classic data science models to determine what users are doing on the site at an aggregate level, including a model that compares on-site behavior to audience quality (part 2).

  • Manager and their team of Engineers are making great progress on this workstream - AI stakeholders are happy with what they've accomplished, but they are trying to accelerate with additional resources.
  • Exploring onshore, nearshore, and offshore options - they are cost-conscious, but also looking for the right resources. Timelines are a consideration: the ideal onboarding date is by end of June, but they are willing to extend that for the right person.
  • Would like to see candidates from each geography and will onboard the best talent, but looking for mid-level resources regardless of location. They still want to see candidates who are over the provided rate.
  • Their tech stack is: Databricks for housing and processing data; AWS for foundation models and AI orchestration (Lambdas); Python; Jira; Asana; and Cursor or Claude Code for code development/enhancement. Snowflake is used in other parts of the org but not for this project.
  • Data Engineering experience is not required, though it may be good to have given the nature of the agents they are building. The data engineering side will build out the actual schemas and the fact and dimension tables in Databricks, and will help them understand the partitioning and indexing.
  • They should have worked with large datasets (billions of rows, roughly 100s of GBs or several TBs of data).
  • Immediate goal is to build an automated process, using AI-native tools, that can process the messy data from unstructured formats and load it into Databricks tables. Future state would be to go directly from PDF: extract the necessary data and load it into a table.
  • Want to deploy on-site and in real time - this will involve integration with the engineering team and the in-house tech stack. This role will not be responsible for any platform components or for building APIs to deploy the models into production. The goal is to get to a deployable model, so the role leans more toward Data/AI Science in that sense, but they do not want code that lives only in notebooks - it should be production-grade code.
  • Background will be a blend of agentic AI experience and traditional data science.
  • AI agent will need to:
    • Extract key fields from the unstructured sources (PDF, email, attachment, etc.), such as audience quality metrics, disease prevalence rates, and campaign performance data, and convert them into a structured schema (Databricks tables).
    • Standardize and clean data - normalize formats across document types, resolve naming inconsistencies, handle missing data/missing fields.
    • Enrich data - join extracted data with campaign data, Google/Meta data, and audience demographic data from a 3rd party.
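
For illustration only, the three agent steps above (extract, standardize, enrich) might be sketched in Python roughly as follows. Everything named here is an assumption rather than part of the client's actual design: the field labels, the alias map, the `AudienceRecord` shape, and the campaign lookup are hypothetical, and a plain regex stands in for the LLM extraction call (e.g. a Bedrock foundation model) so that the sketch runs on its own.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical alias map: resolves naming inconsistencies across documents.
FIELD_ALIASES = {
    "campaign": "campaign_id",
    "campaign id": "campaign_id",
    "prevalence": "disease_prevalence",
    "prevalence rate": "disease_prevalence",
    "disease prevalence": "disease_prevalence",
}


@dataclass
class AudienceRecord:
    """One row of the (assumed) structured schema."""
    campaign_id: Optional[str] = None
    disease_prevalence: Optional[float] = None  # stored as a fraction
    platform: Optional[str] = None  # filled in during enrichment


def extract_fields(text: str) -> dict:
    """Stand-in for the LLM step: collect 'label: value' or 'label = value' lines."""
    pattern = r"^[ \t]*([A-Za-z ]+?)[ \t]*[:=][ \t]*(.+?)[ \t]*$"
    pairs = re.findall(pattern, text, flags=re.MULTILINE)
    return {label.lower(): value for label, value in pairs}


def parse_rate(value: str) -> Optional[float]:
    """Normalize '12.5%', '12.5 %' or '0.125' to a fraction; None if unparseable."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(%?)\s*", value)
    if match is None:
        return None
    number = float(match.group(1))
    return number / 100 if match.group(2) else number


def standardize(raw: dict) -> AudienceRecord:
    """Map aliased labels to schema fields; missing fields stay None."""
    record = AudienceRecord()
    for label, value in raw.items():
        field_name = FIELD_ALIASES.get(label)
        if field_name == "campaign_id":
            record.campaign_id = value.strip().upper()
        elif field_name == "disease_prevalence":
            record.disease_prevalence = parse_rate(value)
    return record


def enrich(record: AudienceRecord, campaigns: dict) -> AudienceRecord:
    """Join the record with campaign metadata keyed by campaign_id."""
    info = campaigns.get(record.campaign_id, {})
    record.platform = info.get("platform")
    return record


if __name__ == "__main__":
    sample = "Hi team,\nCampaign ID: ab-123\nPrevalence Rate = 12.5 %\n"
    lookup = {"AB-123": {"platform": "Meta"}}  # hypothetical campaign table
    print(enrich(standardize(extract_fields(sample)), lookup))
```

In a production-grade version, the resulting records would be written to Databricks tables rather than printed, and the regex step would be replaced by the model-based extraction the posting describes.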

Must Have:

  • Python
  • Agentic AI experience
  • LLMs (Bedrock foundation models)
  • Data science modeling experience
  • Databricks or Snowflake
  • Experience working with large datasets (100s of GBs to TBs, billions of rows)
  • AWS (Lambdas for orchestration)

Nice to Have:

  • Document AI/OCR familiarity
  • Experience in data engineering - unstructured data processing, data cleaning, standardization, joining/enriching datasets
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 90941404
  • Position Id: 8978834
Contact the job poster
Vijay Pothiwal

Tekfortune Inc. Recruiter @ Tekfortune Inc.