Overview
On Site
Accepts corp-to-corp (C2C) applications
Contract - W2
Contract - Independent
Skills
Artificial Intelligence, Unstructured Data, Collaboration, Machine Learning (ML), Workflow, Data Extraction, Data Integrity, Management, Data Warehouse, Real-time Data Processing, Data Engineering, Python, Pandas, PySpark, Large Language Models (LLMs), LangChain, Data Architecture, Data Modeling, Extract/Transform/Load (ETL), Orchestration, Apache Airflow, Cloud Computing, Amazon Web Services, Microsoft Azure, Google Cloud Platform, SQL, NoSQL, Database, PostgreSQL, MongoDB, Docker, Continuous Integration, Continuous Delivery, Version Control, Git
Job Details
Role: Data Engineer with LLM
Location: Denver, CO (Onsite)
Duration: 12+ Months
About the Role:
We are seeking a highly skilled Data Engineer with expertise in Python and experience working with Large Language Models (LLMs) to join our dynamic team in Denver, Colorado. As part of our AI & Data Engineering group, you will be responsible for designing, building, and maintaining scalable data pipelines and integrating advanced LLMs into our data infrastructure and applications.
Key Responsibilities:
- Design, develop, and maintain robust data pipelines for processing large volumes of structured and unstructured data.
- Collaborate with data scientists, ML engineers, and product teams to integrate LLMs into enterprise applications and workflows.
- Optimize and automate data extraction, transformation, and loading (ETL) processes (a minimal pipeline sketch follows this list).
- Develop APIs and services for LLM-based models and applications.
- Ensure data integrity, governance, and quality throughout the data lifecycle.
- Monitor and improve the performance of data infrastructure and LLM-integrated systems.
- Support and manage data lakes, data warehouses, and real-time data processing platforms.
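To give a concrete flavor of the pipeline and orchestration work described above, here is a minimal ETL-style sketch using the Apache Airflow 2.x TaskFlow API (assuming Airflow 2.4+ for the schedule argument). The DAG name, schedule, and the stubbed extract/transform/load logic are illustrative placeholders, not details of this role.

```python
# Minimal ETL sketch with the Airflow 2.x TaskFlow API.
# All names, schedules, and data are placeholders.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False, tags=["etl"])
def document_ingest_pipeline():
    @task
    def extract() -> list[dict]:
        # Pull raw documents from a source system (stubbed here).
        return [{"id": 1, "body": "  Raw unstructured text ...  "}]

    @task
    def transform(records: list[dict]) -> list[dict]:
        # Clean and normalize the unstructured text before loading.
        return [{"id": r["id"], "body": r["body"].strip().lower()} for r in records]

    @task
    def load(rows: list[dict]) -> None:
        # Write the cleaned rows to the warehouse (stubbed here).
        print(f"loading {len(rows)} rows")

    load(transform(extract()))


document_ingest_pipeline()
```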
Required Qualifications:
- 9+ years of professional experience in data engineering.
- Strong proficiency in Python and related libraries (e.g., Pandas, PySpark, FastAPI).
- Hands-on experience with Large Language Models (e.g., OpenAI, Hugging Face, LangChain); a brief service sketch follows this list.
- Deep understanding of data architecture, data modeling, and ETL best practices.
- Experience with orchestration tools like Apache Airflow or similar.
- Familiarity with cloud platforms such as AWS, Azure, or Google Cloud Platform.
- Experience working with SQL and NoSQL databases (e.g., PostgreSQL, MongoDB).
- Knowledge of containerization (Docker), CI/CD pipelines, and version control systems like Git.
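As a rough illustration of the Python, FastAPI, and LLM tooling listed above, the sketch below wraps a single LLM call in a small FastAPI service, assuming the OpenAI Python SDK (v1.x) with an API key in the environment. The endpoint path, model name, and prompt are placeholder assumptions, not requirements of this position.

```python
# Minimal FastAPI service exposing one LLM-backed endpoint.
# Assumes the OpenAI Python SDK v1.x and OPENAI_API_KEY in the environment.
from fastapi import FastAPI
from pydantic import BaseModel
from openai import OpenAI

app = FastAPI()
client = OpenAI()  # reads OPENAI_API_KEY from the environment


class SummarizeRequest(BaseModel):
    text: str


@app.post("/summarize")
def summarize(req: SummarizeRequest) -> dict:
    # Ask the model (placeholder name) to summarize the submitted text.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Summarize the user's text in two sentences."},
            {"role": "user", "content": req.text},
        ],
    )
    return {"summary": response.choices[0].message.content}
```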