Mid Level Data Scientist

Remote • Posted 3 hours ago • Updated 3 hours ago

Contract W2

6 Months

Remote

$60 - $65/hr

Tech Tandem Inc

Job Details

Skills

Python
AgenticAI
Agentic AI
AI Agent
Agentic-AI
LLM
LLMs
Bedrock Foundation Models
Data Science Modelling
Data Modelling
AWS
Lambdas

Summary

Job Role: Senior/Mid-Level Data Scientist

Location: 100% Remote Role

If you're interested, please send me a copy of your resume and the following details as soon as possible.

Overview: At a high level, they are automating workflows in various areas of their Media Operations. As part of media campaigns, they pay Google and Meta to bring traffic to their site and need a way to track the effectiveness of that traffic. Google and Meta offer options to send them feedback on users (i.e., how "useful" a given user was). This is easier for an ecommerce company to measure: a user spent $x on its products after being directed from Google/Meta and thus was or was not useful. RVO has a much less clear-cut means to provide this feedback. They essentially need to determine whether the user who was directed to them has the particular disease for which the advertisement was shown (was it relevant to this specific user?), and they have to measure the prevalence of the disease to determine whether the campaign is effective.

Their audience quality data comes from a third party and arrives in PDFs, emails, and other unstructured sources that are very messy. They are developing an AI-automated solution to process and ingest that data into a tabular format in Databricks (part 1). They are then building classic data science models to determine what users are doing on their site at an aggregate level, including a model that compares on-site behavior to audience quality (part 2).
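As a rough illustration of what the part 2 comparison could look like at its very simplest, the sketch below aggregates on-site sessions per campaign and correlates the resulting engagement rate with a third-party audience quality score. Everything here is an assumption made for illustration: the `engaged` flag, the campaign IDs, the quality scores, and the choice of a plain Pearson correlation are hypothetical and are not described in the posting.

```python
from collections import defaultdict
from math import sqrt


def engagement_rate_by_campaign(sessions):
    """Aggregate (campaign_id, engaged) pairs into an engagement rate per campaign."""
    totals = defaultdict(lambda: [0, 0])  # campaign_id -> [engaged count, session count]
    for campaign_id, engaged in sessions:
        totals[campaign_id][0] += int(engaged)
        totals[campaign_id][1] += 1
    return {cid: hits / count for cid, (hits, count) in totals.items()}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical on-site sessions and third-party quality scores.
sessions = [("a", True), ("a", False), ("b", True), ("b", True), ("c", False), ("c", False)]
audience_quality = {"a": 0.5, "b": 0.9, "c": 0.1}

rates = engagement_rate_by_campaign(sessions)
campaigns = sorted(rates)
r = pearson([rates[c] for c in campaigns], [audience_quality[c] for c in campaigns])
```

A real model would likely need far richer behavioral features and would run on Databricks rather than in plain Python; this only shows the aggregate-then-compare shape of the problem.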

Their tech stack is: Databricks for housing and processing data; AWS for foundation models and AI orchestration (Lambdas); Python; Jira; Asana; and Cursor or Claude Code for code development/enhancement. Snowflake is used in other parts of the org but is not being used for this project.

Data engineering experience is not required, though it may be good to have given the nature of the agents they are building. The data engineering side will build out the actual schemas and the fact and dimension tables in Databricks, and will help them understand the partitioning and indexing.

Candidates should have worked with large datasets (billions of rows; roughly hundreds of GBs to several TBs of data).

The immediate goal is to build an automated process, using AI-native tools, that can process the messy data from unstructured formats and load it into Databricks tables. The future state would be one where they could go directly from a PDF, extract the necessary data, and load it into a table.

They want to deploy on-site and in real time, which will involve integration with the engineering team and the in-house tech stack. This role will not be responsible for any platform components or for building APIs to deploy the models into production. The goal is to get to a deployable model, so the work is more Data/AI Science in that sense, but they do not want code that lives only in notebooks; it should be production-grade code.

The background will be a blend of agentic AI experience and traditional data science.
The AI agent will need to:
Extract key fields from the unstructured sources (PDF, email, attachment, etc.), such as audience quality metrics, disease prevalence rates, and campaign performance data, and convert them into a structured schema (Databricks tables).
Standardize and clean data - normalize formats across document types, resolve naming inconsistencies, and handle missing data/missing fields.
Enrich data - join extracted data with campaign data, Google/Meta data, and audience demographic data from the 3rd party.
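To make the standardize-and-clean step above concrete, here is a minimal sketch of how already-extracted key/value pairs might be normalized into one fixed schema. The alias table, the `AudienceRecord` fields, and the helper names are all hypothetical; the posting does not specify the real schema, and the extraction itself (LLM or OCR) is assumed to have happened upstream.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from messy source labels to canonical column names.
FIELD_ALIASES = {
    "campaign": "campaign_id",
    "campaign id": "campaign_id",
    "prevalence": "disease_prevalence",
    "disease prevalence": "disease_prevalence",
    "disease prevalence rate": "disease_prevalence",
    "audience quality": "audience_quality",
    "aq score": "audience_quality",
}


@dataclass
class AudienceRecord:
    campaign_id: Optional[str] = None
    disease_prevalence: Optional[float] = None  # stored as a fraction, e.g. 0.125
    audience_quality: Optional[float] = None


def normalize_key(raw: str) -> Optional[str]:
    """Lowercase, collapse punctuation to spaces, then look up the canonical name."""
    key = re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()
    return FIELD_ALIASES.get(key)


def parse_number(raw: str) -> Optional[float]:
    """Parse '12.5%' or '0.125' into a float; percentages become fractions."""
    match = re.fullmatch(r"(-?\d+(?:\.\d+)?)\s*(%?)", raw.strip())
    if not match:
        return None  # unparseable values are treated as missing
    value = float(match.group(1))
    return value / 100 if match.group(2) else value


def to_record(extracted: dict) -> AudienceRecord:
    """Map one document's extracted fields onto the fixed schema."""
    record = AudienceRecord()
    for raw_key, raw_value in extracted.items():
        column = normalize_key(raw_key)
        if column is None or raw_value is None:
            continue  # unknown label or missing value: leave the field as None
        text = str(raw_value).strip()
        if column == "campaign_id":
            record.campaign_id = text or None
        else:
            setattr(record, column, parse_number(text))
    return record


record = to_record(
    {"Campaign ID": " CMP-001 ", "Disease Prevalence (%)": "12.5%", "AQ-Score": "n/a"}
)
```

Keeping missing or unparseable values as `None`, rather than guessing, leaves the decision about imputation to the downstream modeling step.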

Must Have:

Python
Agentic AI experience
LLMs (Bedrock foundation models)
Data science modeling experience
Databricks or Snowflake
Experience working with large datasets (100s of GBs to TBs, billions of rows)
AWS (Lambdas for orchestration)

Nice to Have:

Document AI/OCR familiarity
Experience in data engineering - unstructured data processing, data cleaning, standardization, joining/enriching datasets

Dice Id: 91172969
Position Id: 8979896

Company Info

About Tech Tandem Inc

Founded in Austin, Texas, Tech Tandem helps corporations hire qualified candidates for their many recruitment needs. With global resources in Bangalore, India, we provide comprehensive staffing solutions worldwide.

Our Mission

Tech Tandem is a leading provider of IT staffing and strategic talent solutions. Our mission is to help organizations build high-performing teams by providing top-quality talent, industry expertise, and comprehensive support services.

Our Vision

To be the preferred strategic partner for our clients in building high-performing IT teams that drive business success through innovative and tailored staffing solutions.

Contact the job poster

Abhishek Singh

Recruiter @ Tech Tandem Inc
