Data Scientist Generative AI and Document Extraction

Overview

Hybrid
$60 - $70
Contract - W2
Contract - 12 Month(s)
75% Travel

Skills

Data Science
Deep Learning
Data Analysis
Artificial Intelligence
Algorithms
Agile
Data Cleansing
Collaboration
Computer Science
Communication
Document Processing
Content Management
Data Extraction
Data Processing
Document Management
Extraction
Ensemble
Generative Artificial Intelligence (AI)
Machine Learning (ML)
Management
Modeling
Named-Entity Recognition (NER)
Natural Language Processing
Open Source
Optical Character Recognition
Performance Metrics
Organized
Software Design
Presentations
Python
Research
Research and Development
Scalability
Software Development
Unsupervised Learning
Workflow
Team Leadership

Job Details

Job Title: Data Scientist Generative AI and Document Extraction

Hybrid role: 3 days onsite
Note: We are looking for someone very strong in Gen AI, with a Python/AI background.

Locations: Addison TX / Charlotte NC

To make this efficient and productive for both parties, please see the three key aspects we are looking for in a candidate:

  1. Working experience deploying models into production. Has contributed significantly to, or been a major part of, a team that deployed an ML, NLP, or GenAI model into production from start to finish. Can clearly describe the problem, the customer/user, the requirements, the data used, the model development, and the key model-related considerations for putting it into production (real-life use).
  2. More ML/NLP practitioner experience than academic or R&D knowledge.
  3. Experience working in a team lead role (even for a small team). Can take an ML assignment and run with it as a team lead. Experience supporting a tech manager in model development activities.


Resumes should be limited to two pages max. In summary, we are looking for a mature, technically proficient ML/NLP practitioner at the junior team lead level.

Key Responsibilities

  • Develop and implement AI models focused on data extraction, document processing, and content management within a large-scale, data-driven environment.
  • Leverage Generative AI and deep learning techniques to extract, interpret, and structure data from unstructured documents.
  • Collaborate with cross-functional teams to optimize data processing workflows and integrate AI-driven document management solutions.
  • Design, develop, and deploy machine learning models and algorithms in Python to improve accuracy, efficiency, and scalability.
  • Lead research on the latest advancements in Generative AI, ML, and NLP to enhance document understanding and content retrieval systems.
  • Work closely with stakeholders to identify pain points in document management and propose innovative, AI-powered solutions.

REQUIRED SKILLS
7+ years of experience as a Data Scientist or in a related role
Bachelor's degree in Computer Science or a related technical field
Deep understanding of, and some exposure to, new open-source Gen AI models
At least 5 years of programming experience in software development and Agile processes
At least 5 years of Python (or equivalent) programming experience working with ML/NLP models
Experience in setting up supervised and unsupervised learning ML/NLP models, including data cleaning, data analytics, feature creation, model selection, ensemble methods, performance metrics, and visualization
Experience in ML/NLP development pipelines for large data sets, both structured and unstructured
At least 2 years of experience in designing and developing enterprise-scale ML/NLP solutions in one or more of: Named Entity Recognition, Document Classification, Document Summarization, Topic Modelling, Dialog Systems, Sentiment Analysis, OCR text processing
Knowledge and hands-on experience working with OCR products

DESIRED SKILLS
Master's degree in Computer Science, Data Science, or a related technical field
Experience collaborating with a diverse set of partners and stakeholders from various lines of business
Highly motivated, proactive self-starter; strong sense of ownership and ability to create and execute plans without daily oversight
Critical thinker; able to analyze problems, identify issues, and provide solutions
Ability to navigate enterprise data assets across multiple functions
Highly organized; effectively prioritizes and balances multiple efforts in a fast-paced environment
Excellent communication and presentation skills
Strong analytical and problem-solving abilities

About TIA Infotek Inc