Job Details
Position: Machine Learning Architect (Must be local to MO)
Location: St. Louis, MO (local or nearby)
Duration: 6 Months
Priority: High
Primary Skills: Databricks, MLflow, MosaicML, LLMs
Description:
As a Machine Learning Architect, you'll design and implement pipelines that connect large language models (LLMs) with structured and unstructured enterprise data. You'll work at the intersection of data engineering, machine learning, and software development, enabling intelligent retrieval, summarization, and reasoning capabilities over company knowledge.
This position is ideal for someone eager to apply practical ML and LLM techniques in production, leveraging Databricks, Python, and modern vector database frameworks.
What You'll Do
- Develop Retrieval & Embedding Pipelines:
Build and deploy pipelines that transform enterprise documents (Confluence pages, OneDrive files, internal reports) into structured and vectorized data for semantic retrieval.
Use tools like Databricks MLflow, MosaicML, and LangChain to orchestrate workflows.
- Integrate LLMs with Knowledge Bases:
Design and implement Retrieval-Augmented Generation (RAG) systems to ground LLM outputs in enterprise data.
Collaborate with AI agents on Databricks to provide contextualized responses from internal knowledge stores.
- Experiment & Optimize Models:
Evaluate different embedding models, fine-tuning strategies, and retrieval mechanisms for efficiency, scalability, and accuracy.
Contribute to prompt engineering, model benchmarking, and performance tracking.
- Collaborate Across Disciplines:
Work closely with Data Engineers on ingestion and cleaning pipelines, and with Software Engineers on API integration and front-end consumption of ML services.
- Operationalize ML Solutions:
Use MLflow to track experiments, automate deployment pipelines, and ensure reproducibility across environments.
Contribute to testing, documentation, and continuous improvement of ML infrastructure.
What You Bring
- Solid foundation in machine learning, natural language processing, or applied AI.
- Proficiency in Python and familiarity with frameworks such as PyTorch, TensorFlow, or Hugging Face Transformers.
- Experience with Databricks, MLflow, or MosaicML.
- Familiarity with LangChain, LlamaIndex, or similar RAG frameworks.
- Understanding of vector databases and similarity-search libraries (e.g., Chroma, Milvus, Pinecone, FAISS).
- Experience with API integration and data retrieval from enterprise systems (e.g., Confluence, SharePoint, OneDrive).
- Ability to collaborate in a cross-functional engineering team and communicate complex technical concepts clearly.
Bonus Skills
- Experience fine-tuning or evaluating LLMs (e.g., Llama, MPT, Falcon, or Databricks-hosted models).
- Knowledge of OCR pipelines for document ingestion and Databricks Unity Catalog for managing structured data.
- Background in cloud infrastructure, containerization (Docker), or CI/CD for ML systems.
- Prior work with embedding search optimization, semantic caching, or enterprise AI governance.
What You'll Gain
- Hands-on experience with production-grade AI infrastructure on Databricks.
- Exposure to enterprise-scale RAG systems and AI agent orchestration.
- Mentorship from experts in machine learning, data engineering, and software architecture.
- Opportunity to shape the foundation of a next-generation knowledge intelligence platform.