Overview
Location: Remote
Compensation: Depends on Experience
Employment Type: Contract - W2
Skills
Data Engineer
Generative AI (GenAI)
Artificial Intelligence (AI)
Machine Learning Operations (MLOps)
Amazon Web Services (AWS)
Google Cloud Platform (GCP)
Apache Spark
DevOps
Git
Python
SQL
Job Details
Position: Data Engineer
Openings: 3
Location: Remote
Duration: Contract through end of year, with likely extensions
Interview Process:
- 1 round with team leads: 30-minute coding exercise + 30 minutes of technical questions
- 1 round with the hiring manager: technical and behavioral questions, no coding
Requirements BEFORE submittal to client:
- Completed Feenyx assessment (a link will be sent once the assessment has been created)
- Video call
- Please include a managerial reference in the submittal and we will assist in reaching out
Start Date: ASAP; this is an urgent need
Must Haves:
- 3+ years of experience writing SQL within database systems such as BigQuery or Teradata
- 1+ years of experience with Generative AI: familiarity with GenAI concepts, prompt engineering, LLMs (preferably Gemini and Copilot), and related frameworks.
- 3+ years of hands-on experience building modern data pipelines within a major cloud platform (Google Cloud Platform, AWS, Azure), preferably Google Cloud Platform.
- 3+ years of experience with Python or another comparable scripting language in a data engineering setting
- 3+ years of experience with data orchestration and pipeline engineering services such as Airflow or Spark, preferably Cloud Composer / Dataproc
- 3+ years of experience deploying to, and managing CI/CD within, cloud database platforms.
- Demonstrates a keen awareness of the member experience while seeking opportunities for automation and innovation to optimize business operations.
Plusses:
- Experience designing and building data engineering solutions in cloud environments, with knowledge of Google Cloud Platform services such as Gemini, Cloud Storage, Cloud Composer, and BigQuery strongly preferred.
- Experience with Git, CI/CD pipelines, and other DevOps principles and best practices
- Experience with agentic frameworks and RAG approaches.
- Knowledge of vector databases.
- Familiarity with MLOps practices and AI model deployment
- Strong collaboration and communication skills within and across teams
Responsibilities:
Designs, builds, and maintains the data infrastructure that supports the organization's data-related initiatives, with GenAI and automation at its core. Collaborates with cross-functional teams, including BI developers, data scientists, analysts, and software engineers, to ensure efficient and reliable processing, storage, and retrieval of data. Develops scalable data pipelines, automates and optimizes data workflows, and ensures the quality and integrity of the data.
- Designs scalable and efficient data pipelines to extract, transform, and load data from various sources into data warehouses or data lakes.
- Integrates GenAI models into existing data systems and workflows, ensuring seamless data flow through automation.
- Designs and optimizes data architectures and processes to integrate AI workflows, ensuring models receive the high-quality data necessary for optimal performance, including handling large, diverse datasets.
- Identifies use cases for GenAI and opportunities to streamline data engineering processes, improve efficiency, and enhance the quality of deliverables.
- Documents data engineering processes, workflows, and systems for reference and knowledge-sharing purposes.
- Implements data quality checks and validation processes to ensure the accuracy, completeness, and consistency of the data.
- Provides guidance and mentorship to junior data engineers to help them develop their technical skills and grow in their roles.