Model Data Engineer

  • Morrisville, NC
  • Posted 2 days ago | Updated 7 hours ago

Overview

On Site
Full Time

Skills

Web Dynpro
Computer Hardware
Smartphones
Tablet
Storage
High Performance Computing
Microsoft Exchange
Lenovo
Innovation
Research and Development
Generative Artificial Intelligence (AI)
Extract
Transform
Load
Data Management
Training
Data Analysis
Data Quality
Data Integrity
Reasoning
Artificial Intelligence
Computer Science
Data Science
Python
SQL
Data Structure
Relational Databases
PostgreSQL
MySQL
NoSQL
Database
MongoDB
Apache Cassandra
Drawing
Machine Learning (ML)
Evaluation
Large Language Models (LLMs)
Transformer
Communication
Collaboration
Conflict Resolution
Problem Solving
SAP BASIS

Job Details

General Information

Req #: WD00090042

Career area: Hardware Engineering

Country/Region: United States of America

State: North Carolina

City: Morrisville

Date: Monday, October 20, 2025

Working time: Full-time

Additional Locations:
* United States of America - North Carolina - Morrisville

Why Work at Lenovo

We are Lenovo. We do what we say. We own what we do. We WOW our customers.

Lenovo is a US$69 billion revenue global technology powerhouse, ranked #196 in the Fortune Global 500, and serving millions of customers every day in 180 markets. Focused on a bold vision to deliver Smarter Technology for All, Lenovo has built on its success as the world's largest PC company with a full-stack portfolio of AI-enabled, AI-ready, and AI-optimized devices (PCs, workstations, smartphones, tablets), infrastructure (server, storage, edge, high performance computing and software defined infrastructure), software, solutions, and services. Lenovo's continued investment in world-changing innovation is building a more equitable, trustworthy, and smarter future for everyone, everywhere. Lenovo is listed on the Hong Kong stock exchange under Lenovo Group Limited (HKSE: 992) (ADR: LNVGY).

This transformation, together with Lenovo's world-changing innovation, is building a more inclusive, trustworthy, and smarter future for everyone, everywhere. To find out more, visit the Lenovo website and read about the latest news via our StoryHub.

Description and Requirements

We are seeking a highly motivated and skilled Model Data Engineer to join our rapidly growing AI team in Morrisville, NC. You will play a critical role in the training of large language models (LLMs), large vision models (LVMs), and large multimodal models (LMMs), including fine-tuning and reinforcement learning. This is a challenging yet rewarding opportunity to contribute to cutting-edge research and development in generative AI. You'll be working with a collaborative team to push the boundaries of what's possible with AI models and deploy them into innovative products.

Responsibilities:
  • Data pipeline: Develop and implement data pipelines that build high-quality training data for large language models. Build data management and data version control (DVC) systems.
  • Data types: Implement, distill, and filter supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning (RL) data across domains such as general chat, instruction following, and agentic tool calling.
  • Data Quality & Evaluation: Perform thorough data analysis to assess data quality, identify anomalies, and ensure data integrity. Utilize machine learning tools and techniques to evaluate dataset performance and identify areas for improvement.
  • Model Evaluation: Help build a model evaluation framework and conduct evaluations of LLMs, multimodal LLMs, reasoning models, and similar systems.
  • Domain Model Evaluation and Data: Create domain-specific datasets and evaluate models' domain capabilities.
  • Work closely with researchers and cross-functional teams to build state-of-the-art (SOTA) models.
  • Stay up-to-date with the latest advancements in large language models.
  • Collaborate with other engineers and researchers to design, implement, and deploy AI-powered products.
  • Contribute to the development of internal tools.

Qualifications:
  • Master's degree in Computer Science, Machine Learning, Data Science, or a related field with 2+ years of relevant work experience, or 4+ years of relevant work experience.
  • Strong programming skills in Python and SQL.
  • Solid understanding of data structures, relational databases (e.g., PostgreSQL, MySQL), and NoSQL databases (e.g., MongoDB, Cassandra).
  • Proven experience in designing and conducting experiments, analyzing data, and drawing meaningful conclusions.
  • Solid understanding of machine learning concepts and techniques, including data preprocessing, feature engineering, and model evaluation.
  • Experience with model evaluation frameworks such as OpenCompass.
  • Familiarity with large language models, transformer architectures, and related concepts.
  • Excellent communication, collaboration, and problem-solving skills.


We are an Equal Opportunity Employer and do not discriminate against any employee or applicant for employment because of race, color, sex, age, religion, sexual orientation, gender identity, national origin, status as a veteran, basis of disability, or any federal, state, or local protected class.
