Senior Data & AI Engineer

Overview

On Site
Depends on Experience
Contract - W2
Contract - Independent

Skills

Amazon Web Services
Apache Avro
Apache HTTP Server
Apache Parquet
Artificial Intelligence
Collaboration
Continuous Delivery
Continuous Integration
Data Engineering
Data Governance
Database
Databricks
File Formats
Generative Artificial Intelligence (AI)
LangChain
LangSmith
Machine Learning (ML)
Management
Orchestration
Performance Tuning
Prompt Engineering
Python
Reasoning
Regulatory Compliance
SQL
Semantic Search
Unity Catalog
Use Cases
Vector Databases
Workflow

Job Details

Role: Senior Data & AI Engineer, Databricks Lakehouse and Agentic AI

Location: Pleasanton, US (Preference for candidates with AWS experience)

Job Summary:

We are seeking an experienced Senior Data & AI Engineer with deep expertise in Databricks Lakehouse architecture on AWS, agentic AI systems, vector search, and LLM orchestration frameworks such as LangGraph and LangChain. The ideal candidate will have hands-on experience implementing scalable Lakehouse solutions, optimizing table formats such as Apache Iceberg and Delta Lake, and designing intelligent workflows using multi-agent AI architectures.

Key Responsibilities:

  • Architect and implement scalable data lakehouse solutions using Databricks on AWS, including Delta Lake, Unity Catalog, and Lakehouse AI features.
  • Develop and orchestrate multi-agent AI applications leveraging LangGraph, LangChain, and other LLM toolchains to support business workflows and automation.
  • Design and implement vector database search solutions for retrieval-augmented generation (RAG), semantic search, and contextual memory.
  • Work with multiple table and file formats, including Apache Iceberg, Delta Lake, and Parquet, ensuring performance optimization, schema evolution, and ACID compliance.
  • Integrate LLM orchestration pipelines into Lakehouse workflows, enabling intelligent data retrieval, transformation, and reasoning.
  • Collaborate with data scientists, ML engineers, and platform teams to build end-to-end AI-powered data pipelines.
  • Build reusable components, agent workflows, and vector indexing strategies for enterprise AI use cases.

Required Qualifications:

  • 5+ years of experience in data engineering or AI/ML engineering.
  • Hands-on expertise with Databricks on AWS, including workspace management, notebooks, MLflow, and Unity Catalog.
  • Experience with Lakehouse AI, Vector Search, and Agentic AI (LangGraph, CrewAI, OpenAgents, or similar frameworks).
  • Proficiency in Python and SQL, with experience developing data transformation logic and vector embedding workflows.
  • Strong knowledge of table and file formats such as Apache Iceberg, Delta Lake, Parquet, and Avro.
  • Experience deploying and optimizing vector databases such as FAISS, Chroma, Weaviate, or native Databricks vector search.
  • Familiarity with LLMOps practices and tooling (e.g., LangSmith, PromptLayer, Weights & Biases).
  • Understanding of LLM APIs (OpenAI, Hugging Face), prompt engineering, and memory management in agentic systems.

Preferred Qualifications:

  • Databricks certifications (Data Engineer, Machine Learning Associate, Generative AI Engineer).
  • Experience with LangChain Expression Language (LCEL) and LangGraph graphs for agent workflows.
  • Familiarity with Databricks Model Serving, Feature Store, and production-grade LLM integrations.
  • Exposure to event-driven architectures, CI/CD for ML, and secure data governance using Unity Catalog.

Best regards,

Janani Suresh

Senior HR Recruiter

Purple Drive Technologies LLC

Certified Minority Business Enterprise (MBE)
