Job Details
GEN AI Engineer
Remote
Long Term Contract
Job Description

Key Responsibilities:
Perform automated scoring and qualitative evaluations of LLM-generated responses across multiple use cases.
Develop and maintain scoring guidelines and rubrics to ensure consistency and objectivity.
Collaborate with data scientists, product managers, and engineering teams to align scoring with project goals.
Assist in the creation and labeling of high-quality evaluation datasets for prompt tuning or model fine-tuning.
Utilize NLP-based metrics and tools (e.g., ROUGE, BLEU, cosine similarity) for automated scoring support (see the sketch after this list).
Document scoring patterns, common model errors, and improvement opportunities.
Contribute to prompt experimentation and help compare the effectiveness of different prompt strategies.
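For context, here is a minimal Python sketch of the kind of automated scoring described above. It assumes commonly used open-source packages (rouge-score, NLTK, scikit-learn), which are illustrative choices, not requirements named in this posting:

from rouge_score import rouge_scorer
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def score_response(reference: str, candidate: str) -> dict:
    """Score an LLM response against a reference answer with common NLP metrics."""
    # ROUGE-L F-measure: longest-common-subsequence overlap
    rouge = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    rouge_l = rouge.score(reference, candidate)["rougeL"].fmeasure

    # Sentence-level BLEU with smoothing: n-gram precision
    bleu = sentence_bleu(
        [reference.split()],
        candidate.split(),
        smoothing_function=SmoothingFunction().method1,
    )

    # Cosine similarity between TF-IDF vectors of the two texts
    tfidf = TfidfVectorizer().fit_transform([reference, candidate])
    cosine = cosine_similarity(tfidf[0], tfidf[1])[0][0]

    return {"rouge_l": rouge_l, "bleu": bleu, "cosine": cosine}

print(score_response(
    "The capital of France is Paris.",
    "Paris is the capital city of France.",
))

In practice, automated scores like these would be combined with the rubric-based and qualitative evaluations listed in the responsibilities above.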
Qualifications:
Prior experience with LLMs (e.g., GPT, Claude, LLaMA) or AI/NLP projects is highly preferred.
Strong analytical skills and attention to detail, especially in assessing language quality.
Familiarity with prompt engineering, generative AI, or conversational AI tools is a plus.
Hands-on experience with Python, Jupyter, or evaluation libraries (optional but desirable).
Experience working with evaluation frameworks or annotation tools (Label Studio, Prodigy, etc.) is a bonus.
Excellent written and verbal communication skills.