No third-party resumes
Position Description: Responsible for developing techniques or analytics to transform raw data into meaningful information using statistical analysis, machine learning, and visualization software. The individual is responsible for the following tasks:
Collect, process, and analyze structured and unstructured data applying data mining, data modeling, natural language processing, and machine learning techniques.
Develop predictive models and algorithms to solve problems.
Design data pipelines and workflows to automate data processing.
Collaborate with multiple teams to understand requirements and objectives.
Develop data visualizations, dashboards, and reports.
Education: This position requires a Master’s Degree or Ph.D. from an accredited college or university with a major in computer science, statistics, mathematics, economics, or a related field. Three (3) years of equivalent experience in a related field may be substituted for the Bachelor’s degree.
General Experience: The proposed candidate must have a minimum of three (3) years of experience in data science.
Specialized Experience: The candidate should have experience as a data scientist or in a similar role, with a strong understanding of statistical analysis and machine learning techniques. The candidate should be proficient in SQL, Python, R, or a related programming language and have experience with machine learning libraries and frameworks. They should possess an understanding of statistical concepts and methodologies, such as regression analysis, clustering, and classification.
AI Governance & Safety: Implement guardrails to mitigate hallucinations, bias, and security vulnerabilities (e.g., prompt injection) in public-sector AI applications.
Model Evaluation: Develop rigorous benchmarking suites to evaluate LLM performance on domain-specific tasks using both automated metrics and human-in-the-loop feedback.
Fine-Tuning & Optimization: Identify opportunities for Parameter-Efficient Fine-Tuning (PEFT) or LoRA to adapt foundation models to specific state government datasets.
Generative AI Ecosystem: At least 2 years of hands-on experience working with models such as GPT-4, Claude, Llama, and Gemini, and their respective APIs.
Tooling & Frameworks: Expert-level proficiency with AI orchestration libraries such as LangChain, LlamaIndex, or Haystack.
Vector Databases: Experience with vector storage and indexing solutions such as Pinecone, Weaviate, Milvus, or pgvector.
Data Curation for AI: Experience in preparing high-quality synthetic data or curated instruction-tuning datasets for model alignment.
Public Sector/Compliance Awareness: (Preferred) Experience working within regulated environments, ensuring data privacy (PII/PHI) and ethical AI standards are met.