"Data Engineer" AND "NLP"

Overview

Hybrid

Depends on Experience

Contract - W2

Contract - 12 Month(s)

Skills

Amazon Web Services

Apache Spark

Cloud Computing

Computer Science

Continuous Delivery

Continuous Integration

Data Engineering

Data Processing

Data Quality

Data Science

Data Storage

Docker

Git

Good Clinical Practice

Google Cloud Platform

Kubernetes

Machine Learning (ML)

Management

Microsoft Azure

NLTK

Natural Language Processing

Python

Unstructured Data

Version Control

Workflow

Job Details

Greetings from ABAL Technologies Inc

Role: Data Engineer with NLP

Location: North Carolina - Hybrid

Job Summary:

We are looking for a Data Engineer with experience in Natural Language Processing (NLP) to join our team. The ideal candidate will design, build, and maintain scalable data pipelines and infrastructure to support NLP-based applications. You'll work closely with data scientists, machine learning engineers, and product teams to extract insights from unstructured text data.

Key Responsibilities:

Build and maintain efficient data pipelines for text data processing
Collect, clean, and organize structured and unstructured data (e.g., documents, social media, logs)
Implement and optimize NLP pipelines using libraries like spaCy, NLTK, or Hugging Face Transformers
Work with large-scale datasets and ensure data quality and integrity
Collaborate with ML and data science teams to deploy NLP models into production
Optimize performance of data workflows and NLP model inference
Monitor and troubleshoot data pipelines and infrastructure

Required Skills:

Strong experience in Python and SQL
Knowledge of NLP libraries (e.g., spaCy, NLTK, Hugging Face, gensim)
Experience with data processing frameworks (e.g., Apache Spark, Airflow, or similar)
Understanding of text preprocessing techniques (tokenization, stemming, lemmatization, etc.)
Familiarity with cloud platforms (AWS, Google Cloud Platform, or Azure) and data storage solutions
Knowledge of version control (Git) and CI/CD practices

Preferred Qualifications:

Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field
Experience deploying NLP models in production environments
Familiarity with containerization tools (Docker, Kubernetes)
Exposure to MLflow or other model management tools

