Data Scientist - Data Annotation AI Specialist

Overview

Accepts corp-to-corp applications

Skills

Generative Artificial Intelligence (AI)
Deep Learning
Data Science
Unstructured Data
Extraction
Computer Vision
Artificial Intelligence
Open Source
Privacy
UI
User Experience
Due Diligence
Legal
Procurement
Access Control
Taxonomy
Data Extraction
Modeling
Business Intelligence
Change Management
Quality Management
Quality Assurance
Dashboard
Data Quality
Documentation
Budget
Onboarding
Machine Learning Operations (ML Ops)
Training
Storage
Databricks
Snowflake
Interfaces
Encryption
Machine Learning (ML)
SSO
RBAC
Auditing
SOC 2
Named-Entity Recognition (NER)
Text Classification
Prompt Engineering
Python
Pandas
NLTK
Active Learning
Test Methods
Cloud Computing
Amazon Web Services
Google Cloud Platform
Microsoft Azure
Data Governance
Communication
Collaboration
Large Language Models (LLMs)
Evaluation
Financial Services
Analytics
Regulatory Compliance
Management
Multilingual
Natural Language Processing
Workflow
PDF
Optical Character Recognition

Job Details

Job Description: Data Scientist - Data Annotation AI Specialist

Location: Canada (Remote)

Duration: 3-month contract


Position Title: Data Annotation AI Specialist
The client's AI group is seeking a Data Annotation AI Specialist to join a team dedicated to building and supporting Generative AI, machine learning, deep learning, and data science solutions across the organization. The position could be based out of our Chicago or NY offices. We are seeking a Data Annotation AI Specialist to lead the evaluation, selection, and onboarding of a data annotation platform, and to establish best-in-class annotation workflows for our NLP and CV initiatives. This role bridges product, data science, MLOps, and compliance to ensure high-quality labeled datasets that accelerate model development for tasks such as text classification, entity extraction, unstructured data extraction, document summarization, and prompt/response curation.
What We Offer:
This is a high-impact role with significant visibility; the candidate will work on flagship Fitch products
The candidate will have an excellent opportunity to work in the cutting-edge fields of AI, NLP, computer vision, and MLOps/LLMOps
Fitch promotes an excellent work culture and is known for providing a good work-life balance
We'll Count on You To:
Platform Evaluation and Onboarding:
o Assess and compare data annotation platforms (e.g., Labelbox, Prodigy, Snorkel, Scale AI, SuperAnnotate, LightTag, custom open-source stacks) against business and technical requirements.
o Lead proof-of-concept trials; define evaluation criteria (quality, throughput, cost, security, privacy, compliance, UI/UX, workflow features, integrations, auditability).
o Drive vendor due diligence, security reviews, and coordinate procurement/contracting with Legal, Security, and Procurement.
o Plan and execute platform deployment, integrations (SSO, data lakes, MLOps pipelines), and role-based access controls.
Workflow and Taxonomy Design:
o Collaborate with NLP and CV scientists and product owners to define labeling taxonomies, guidelines, and rubrics for tasks such as NER, data extraction, intent classification, topic modeling, toxicity/BI risk tagging, and document QA.
o Establish annotation protocols, inter-annotator agreement measures (IAA), and quality gates; design multi-pass review processes and adjudication steps.
o Develop gold standards and calibration sets; maintain versioning and change management of label schemas.
Quality Management:
o Implement QA metrics and dashboards (precision/recall on labeled subsets, IAA, disagreement analysis, drift detection, sampling strategies).
o Design active learning and human-in-the-loop strategies to continually improve data quality and labeling efficiency.
o Conduct audits, bias checks, and error analyses; enforce data governance and documentation (data sheets, model cards inputs).
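The inter-annotator agreement (IAA) metrics mentioned above are commonly measured with Cohen's kappa. A minimal sketch in plain Python, assuming two annotators label the same items (no annotation platform or specific library assumed):

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same set of items.

    Agreement is corrected for the level expected by chance, given each
    annotator's marginal label distribution.
    """
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: fraction of items labeled identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b.get(k, 0) for k in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e) if p_e != 1 else 1.0

# Illustrative labels from two annotators on six items:
a = ["POS", "POS", "NEG", "NEG", "POS", "NEG"]
b = ["POS", "NEG", "NEG", "NEG", "POS", "POS"]
print(round(cohen_kappa(a, b), 3))  # → 0.333
```

In practice a kappa this low would trigger the adjudication and guideline-calibration steps described above; values are often tracked per label class to locate ambiguous parts of the taxonomy.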
Operations and Scale:
o Build and manage a hybrid workforce model (in-house annotators, expert reviewers, external vendors) including training, SLAs, throughput planning, and budget tracking.
o Create training materials and onboarding programs for annotators, SMEs, and reviewers; run calibration sessions and periodic refreshers.
o Optimize throughput and cost with workflow automation, pre-labeling, heuristics, and annotation tooling features.
Integration and MLOps:
o Integrate the annotation platform with data pipelines, model training loops, experiment tracking, and storage (e.g., Databricks, Snowflake, AWS/Google Cloud Platform/Azure, MLflow).
o Implement programmatic interfaces (APIs/SDKs) for data ingestion/export, schema management, and reproducibility.
o Collaborate on dataset curation, splitting strategies, and governance (PII handling, encryption, retention policies).
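A programmatic ingestion/export interface of the kind described above often reduces to a JSON Lines round trip of span-annotated records. The sketch below uses an illustrative record schema; the field names and example text are assumptions, not any specific platform's format:

```python
import json

# Hypothetical minimal schema for exporting NER annotations
# (field names are illustrative, not a real platform's API).
records = [
    {
        "id": "doc-001",
        "text": "Fitch Ratings upgraded Acme Corp on Tuesday.",
        "spans": [{"start": 23, "end": 32, "label": "ORG"}],
        "annotator": "reviewer-1",
        "schema_version": "v2",
    }
]

def to_jsonl(recs):
    # One JSON record per line: easy to stream into training pipelines.
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in recs)

def from_jsonl(blob):
    # Re-ingest and validate that every span's offsets fall inside its text.
    recs = [json.loads(line) for line in blob.splitlines() if line.strip()]
    for r in recs:
        for s in r["spans"]:
            assert 0 <= s["start"] < s["end"] <= len(r["text"])
    return recs

blob = to_jsonl(records)
assert from_jsonl(blob) == records
assert records[0]["text"][23:32] == "Acme Corp"
```

Carrying a `schema_version` field per record is one way to support the label-schema versioning and reproducibility requirements mentioned earlier.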
What You Need to Have:
7+ years of experience in data annotation, data operations, or applied NLP/CV/ML, with direct responsibility for building and managing labeling programs.
Hands-on experience with annotation platforms and workflows for NLP tasks; familiarity with enterprise deployment considerations (SSO, RBAC, audit, SOC2).
Strong understanding of NLP and CV techniques: tokenization, embeddings, NER, text classification, sentiment, summarization, prompt engineering, and evaluation.
Proficiency in Python and data tooling (Pandas, spaCy, Hugging Face, NLTK); experience using APIs/SDKs to automate annotation and active learning loops.
Experience defining label taxonomies, guidelines, and measuring IAA; practical knowledge of QA methodologies and error/bias analysis.
Familiarity with cloud platforms (AWS/Google Cloud Platform/Azure), data governance, and secure data handling.
Excellent communication skills; ability to collaborate with data scientists, product managers, engineers, SMEs, and vendors.
What Would Make You Stand Out:
Experience with large language model (LLM) data curation, RLHF/RLAIF pipelines, and prompt/response quality evaluation.
Background in financial services, risk analytics, or regulated industries with strong compliance requirements.
Prior experience building hybrid annotation teams and managing external vendors.
Knowledge of annotation for multilingual NLP and document-heavy workflows (PDF parsing, OCR).
