Data Science Taxonomist
Location: Remote
Duration: 6-month contract with possibility of extension
We are seeking a highly analytical and detail-oriented Data Science Taxonomist to design, develop, and maintain structured taxonomies and ontologies that power our data products and AI systems. In this role, you will work at the intersection of data science, information architecture, and domain-specific categorization to enable smarter data discovery, classification, and model training.
Key Responsibilities:
• Design and maintain taxonomies, ontologies, and metadata schemas that organize structured and unstructured data across domains.
• Collaborate with data scientists, ML engineers, and domain experts to build labeling systems that enhance AI/ML model training, especially in supervised and semi-supervised learning workflows.
• Curate and normalize datasets using consistent classification logic to improve data quality, interpretability, and retrieval.
• Lead efforts to align taxonomy strategies with business and product goals, including search optimization, knowledge graphs, and recommender systems.
• Apply natural language processing (NLP) and semantic analysis techniques to categorize content and refine metadata models.
• Evaluate and integrate 3rd-party ontologies and taxonomies where appropriate.
• Support human-in-the-loop labeling workflows, guidelines, and quality control processes.
Required Qualifications:
• Bachelor's or Master's degree in Library Science, Information Architecture, Linguistics, Data Science, or a related field.
• 3+ years of experience in taxonomy/ontology development, data classification, or metadata design.
• Working knowledge of Python, SQL, and data tools (e.g., Pandas, Jupyter, NLP libraries).
• Familiarity with taxonomy management tools (e.g., PoolParty, TopBraid, Synaptica) and version control systems (Git).
• Experience working with machine learning datasets, tagging schemes, or annotation workflows.
• Strong attention to detail and logical consistency.
• Excellent communication skills and the ability to collaborate with both technical and non-technical teams.
Preferred Qualifications:
• Experience with knowledge graphs, embedding models, or semantic search.
• Background in taxonomy for e-commerce, finance, healthcare, or media.
• Familiarity with labeling tools like Label Studio, Prodigy, or Snorkel.