Job Description:
You'll work in a collaborative environment using cutting-edge technologies including Databricks, AWS, Collibra, Data Mesh architecture, and PySpark to build scalable, production-ready AI systems.
This is a foundational role: you'll establish our MLOps practices, GenAI frameworks, and production AI capabilities from the ground up in a highly regulated environment.
What You'll Be Doing
Consulting & Enablement (50%)
Your primary responsibility is advising economists and business teams on modeling approaches suited to their use cases
Recommend approaches for diverse scenarios: RAG/knowledge bases, anomaly detection, document understanding, audit analysis
Bridge the gap between econometric models (R, Stata) and production ML pipelines
Review and provide feedback on AI/ML architectural proposals
Train data engineers and business users on AI/ML best practices
Model Development (25%)
Build production-ready AI systems for document processing (PDF, XLSX, DOCX, CSV, etc.)
Develop and deploy 1-2 RAG/knowledge base systems in the first year
Create reusable GenAI frameworks and patterns for the organization
Implement solutions using AWS AI services (Bedrock, SageMaker, Textract, etc.) and Databricks
Ensure models meet explainability requirements for regulated environments
MLOps & Support (25%)
Establish MLOps framework and model deployment patterns
Troubleshoot model performance issues (accuracy, latency, cost)
Act as an escalation point for AI/ML technical issues
Train users by providing models, documentation, and consulting
Monitor and maintain production models
Stay current on AI/ML techniques and federal regulatory requirements
Help other Support Team members advance their knowledge of data science and modeling
Minimum Qualifications
Education: Master's degree in Data Science, Statistics, Computer Science, Mathematics, or related quantitative field
Experience: 4+ years in data science, ML engineering, or AI development roles
Production ML: Proven track record building and deploying ML/AI models in production environments
Programming: Strong Python proficiency; experience with SQL and at least one statistical language (R, Stata, MATLAB, sparklyr)
ML Frameworks: Hands-on experience with modern ML frameworks (scikit-learn, TensorFlow, PyTorch, Hugging Face)
Generative AI: Practical experience with LLMs, RAG architectures, and prompt engineering
Document AI: Experience processing and extracting insights from unstructured documents at scale
Cloud Platforms: Working knowledge of AWS AI/ML services (SageMaker, Bedrock preferred)
Communication: Ability to explain complex AI concepts to non-technical stakeholders and translate business problems into technical solutions
Tooling: Experience working with our tech stack (Databricks, AWS AI/ML tools, Starburst preferred)
Requirements
Must-Have Skills:
1) 4+ years in data science, ML engineering, or AI development roles
2) Strong Python proficiency
3) Experience with SQL and at least one statistical language (R, Stata, MATLAB, sparklyr)
4) Hands-on experience with modern ML frameworks (scikit-learn, TensorFlow, PyTorch, Hugging Face)
5) Practical experience with LLMs, RAG architectures, and prompt engineering
6) Experience processing and extracting insights from unstructured documents at scale
7) Experience working with our tech stack (Databricks, AWS AI/ML tools, Starburst preferred)
8) Proven track record building and deploying ML/AI models in production environments
9) Working knowledge of AWS AI/ML services (SageMaker, Bedrock preferred)
10) Ability to explain complex AI concepts to non-technical stakeholders and translate business problems into technical solutions
11) Master's degree in Data Science, Statistics, Computer Science, Mathematics, or related quantitative field