Job Description: Data Scientist - Data Annotation AI Specialist
Location: Canada (Remote)
Duration: 3-month contract
The client's AI group is seeking a Data Annotation AI Specialist to join a team dedicated to building and supporting generative AI, machine learning, deep learning, and data science solutions across the organization. The position could be based out of our Chicago or NY offices. This role will lead the evaluation, selection, and onboarding of a data annotation platform and establish best-in-class annotation workflows for our NLP and CV initiatives, bridging product, data science, MLOps, and compliance to ensure high-quality labeled datasets that accelerate model development for tasks such as text classification, entity extraction, unstructured data extraction, document summarization, and prompt/response curation.
What We Offer:
This will be a high-impact role with significant visibility, where the candidate will work on some flagship Fitch products.
The candidate will have an excellent opportunity to work in the cutting-edge fields of AI, NLP, computer vision, and MLOps/LLMOps.
Fitch promotes an excellent work culture and is known for providing a good work-life balance.
We'll Count on You To:
Platform Evaluation and Onboarding:
o Assess and compare data annotation platforms (e.g., Labelbox, Prodigy, Snorkel, Scale AI, SuperAnnotate, LightTag, custom open-source stacks) against business and technical requirements.
o Lead proof-of-concept trials; define evaluation criteria (quality, throughput, cost, security, privacy, compliance, UI/UX, workflow features, integrations, auditability).
o Drive vendor due diligence, security reviews, and coordinate procurement/contracting with Legal, Security, and Procurement.
o Plan and execute platform deployment, integrations (SSO, data lakes, MLOps pipelines), and role-based access controls.
Workflow and Taxonomy Design:
o Collaborate with NLP and CV scientists and product owners to define labeling taxonomies, guidelines, and rubrics for tasks such as NER, data extraction, intent classification, topic modeling, toxicity/BI risk tagging, and document QA.
o Establish annotation protocols, inter-annotator agreement (IAA) measures, and quality gates; design multi-pass review processes and adjudication steps.
o Develop gold standards and calibration sets; maintain versioning and change management of label schemas.
Quality Management:
o Implement QA metrics and dashboards (precision/recall on labeled subsets, IAA, disagreement analysis, drift detection, sampling strategies).
o Design active learning and human-in-the-loop strategies to continually improve data quality and labeling efficiency.
o Conduct audits, bias checks, and error analyses; enforce data governance and documentation (data sheets, model cards inputs).
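For context on the IAA measurement mentioned above, here is a minimal sketch of Cohen's kappa for two annotators (function name and sample labels are illustrative, not from the posting):

```python
def cohen_kappa(a: list, b: list) -> float:
    """Cohen's kappa for two annotators' labels on the same items:
    chance-corrected agreement, (p_o - p_e) / (1 - p_e)."""
    assert len(a) == len(b) and a, "both annotators must label the same items"
    n = len(a)
    # Observed agreement: fraction of items where the annotators match.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Expected chance agreement from each annotator's label distribution.
    labels = set(a) | set(b)
    p_e = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if p_e == 1.0:
        return 1.0  # degenerate case: both annotators use a single label
    return (p_o - p_e) / (1 - p_e)

ann_1 = ["pos", "pos", "neg", "neg", "pos"]
ann_2 = ["pos", "neg", "neg", "neg", "pos"]
print(round(cohen_kappa(ann_1, ann_2), 3))  # ≈ 0.615
```

In practice a platform or library (e.g., scikit-learn's `cohen_kappa_score`) would compute this; the sketch only shows what the metric measures.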
Operations and Scale:
o Build and manage a hybrid workforce model (in-house annotators, expert reviewers, external vendors) including training, SLAs, throughput planning, and budget tracking.
o Create training materials and onboarding programs for annotators, SMEs, and reviewers; run calibration sessions and periodic reviews.
o Optimize throughput and cost with workflow automation, pre-labeling, heuristics, and annotation tooling features.
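The pre-labeling and heuristics mentioned above can be sketched as a simple rule-based pre-labeler that proposes spans for annotators to confirm or correct (patterns and label names here are illustrative assumptions):

```python
import re

def pre_label(text: str, patterns: dict) -> list:
    """Propose candidate spans via regex heuristics so annotators
    review suggestions instead of labeling from scratch."""
    spans = []
    for label, pattern in patterns.items():
        for m in re.finditer(pattern, text):
            spans.append({"start": m.start(), "end": m.end(), "label": label})
    # Sort by position so suggestions appear in reading order.
    return sorted(spans, key=lambda s: s["start"])

# Hypothetical heuristics for a document-extraction task.
patterns = {
    "DATE": r"\b\d{4}-\d{2}-\d{2}\b",
    "MONEY": r"\$\d+(?:\.\d{2})?",
}
text = "Invoice dated 2024-05-01 for $1200.50 is due."
print(pre_label(text, patterns))
```

Suggestions like these are typically loaded into the annotation tool as pre-annotations, cutting per-item labeling time while keeping a human in the loop.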
Integration and MLOps:
o Integrate the annotation platform with data pipelines, model training loops, experiment tracking, and storage (e.g., Databricks, Snowflake, AWS/Google Cloud Platform/Azure, MLflow).
o Implement programmatic interfaces (APIs/SDKs) for data ingestion/export, schema management, and reproducibility.
o Collaborate on dataset curation, splitting strategies, and governance (PII handling, encryption, retention policies).
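The programmatic export step above can be sketched as a converter from a platform export to training-ready examples. The export schema here (JSON lines with `text` and `annotations` fields) is a hypothetical example, not any specific vendor's format:

```python
import json

def to_training_examples(export_jsonl: str) -> list:
    """Convert a hypothetical annotation export (JSON lines of
    {"text": ..., "annotations": [{"start", "end", "label"}]})
    into spaCy-style (text, {"entities": [...]}) training tuples."""
    examples = []
    for line in export_jsonl.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        entities = [(a["start"], a["end"], a["label"])
                    for a in record["annotations"]]
        examples.append((record["text"], {"entities": entities}))
    return examples

export = '{"text": "Acme Corp filed on 2024-05-01.", "annotations": [{"start": 0, "end": 9, "label": "ORG"}]}'
print(to_training_examples(export))
```

A real pipeline would pull this export via the platform's API/SDK and version the result alongside the label schema for reproducibility.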
What You Need to Have:
7+ years of experience in data annotation, data operations, or applied NLP/CV/ML, with direct responsibility for building and managing labeling programs.
Hands-on experience with annotation platforms and workflows for NLP tasks; familiarity with enterprise deployment considerations (SSO, RBAC, audit, SOC2).
Strong understanding of NLP and CV techniques: tokenization, embeddings, NER, text classification, sentiment, summarization, prompt engineering, and evaluation.
Proficiency in Python and data tooling (Pandas, spaCy, Hugging Face, NLTK); experience using APIs/SDKs to automate annotation and active learning loops.
Experience defining label taxonomies, guidelines, and measuring IAA; practical knowledge of QA methodologies and error/bias analysis.
Familiarity with cloud platforms (AWS/Google Cloud Platform/Azure), data governance, and secure data handling.
Excellent communication skills; ability to collaborate with data scientists, product managers, engineers, SMEs, and vendors.
What Would Make You Stand Out:
Experience with large language model (LLM) data curation, RLHF/RLAIF pipelines, and prompt/response quality evaluation.
Background in financial services, risk analytics, or regulated industries with strong compliance requirements.
Prior experience building hybrid annotation teams and managing external vendors.
Knowledge of annotation for multilingual NLP and document-heavy workflows (PDF parsing, OCR).