Overview
Location: Remote
Compensation: Depends on Experience
Employment Type: Contract - W2
Skills
Data Engineer
Generative AI (GenAI)
Artificial Intelligence (AI)
Machine Learning Operations (MLOps)
Amazon Web Services (AWS)
Google Cloud Platform (GCP)
Apache Spark
DevOps
Git
Python
SQL
Job Details
Position: Data Engineer
Openings: 3
Location: Remote
Duration: Contract through end of year, with likely extensions
Interview Process:
- 1 round with team leads: 30-minute coding exercise + 30 minutes of technical questions
- 1 round with the hiring manager: technical and behavioral questions, no coding
Requirements BEFORE submittal to client:
- Completed Feenyx assessment (a link will be sent once the assessment has been created)
- Video call
- Please include a managerial reference in the submittal and we will assist in reaching out
Start Date: ASAP; this is an urgent need
Must Haves:
- 3+ years of experience writing SQL within database systems such as BigQuery or Teradata
- 1+ years of experience with Generative AI: familiarity with GenAI concepts, prompt engineering, LLMs (preferably Gemini and Copilot), and related frameworks.
- 3+ years of hands-on experience building modern data pipelines within a major cloud platform (Google Cloud Platform, AWS, Azure), preferably Google Cloud Platform.
- 3+ years of experience with Python or another comparable scripting language in a data engineering setting
- 3+ years of experience with data orchestration and pipeline engineering services such as Airflow or Spark, preferably Cloud Composer / Dataproc
- 3+ years of experience deploying to, and managing CI/CD within, cloud database platforms.
- Demonstrates a keen awareness of the member experience while seeking opportunities for automation and innovation to optimize business operations.
Plusses:
- Experience designing and building data engineering solutions in cloud environments, with knowledge of Google Cloud Platform services such as Gemini, Cloud Storage, Cloud Composer, and BigQuery strongly preferred.
- Experience with Git, CI/CD pipelines, and other DevOps principles and best practices
- Experience with agentic frameworks and RAG approaches.
- Knowledge of vector databases.
- Familiarity with MLOps practices and AI model deployment
- Strong collaboration and communication skills within and across teams
Responsibilities:
Designs, builds, and maintains the data infrastructure that supports the organization's data-related initiatives, with GenAI and automation at its core. Collaborates with cross-functional teams, including BI developers, data scientists, analysts, and software engineers, to ensure efficient and reliable processing, storage, and retrieval of data. Develops scalable data pipelines, automates and optimizes data workflows, and ensures the quality and integrity of the data.
- Designs scalable and efficient data pipelines to extract, transform, and load data from various sources into data warehouses or data lakes.
- Integrates GenAI models into existing data systems and workflows, ensuring seamless data flow through automation.
- Designs and optimizes data architectures and processes to integrate AI workflows, ensuring models receive the high-quality data necessary for optimal performance, including handling large, diverse datasets.
- Identifies use cases for GenAI and opportunities to streamline data engineering processes, improve efficiency, and enhance the quality of deliverables.
- Documents data engineering processes, workflows, and systems for reference and knowledge-sharing purposes.
- Implements data quality checks and validation processes to ensure the accuracy, completeness, and consistency of the data.
- Provides guidance and mentorship to junior data engineers to help them develop their technical skills and grow in their roles.