AI Tester / AI Quality Assurance Specialist
Location: Boston, MA
Professional Summary
Detail-oriented AI Tester with experience evaluating Large Language Models (LLMs), generative AI applications, and conversational AI systems. Skilled in prompt engineering, AI response evaluation, bug reporting, and quality assurance (QA) testing. Proficient in using ChatGPT, Claude, Gemini, Jira, Postman, Label Studio, Excel, and Google Sheets to assess model performance, identify defects, and improve AI accuracy, safety, and user experience.
Key Responsibilities
· Conduct functional and quality testing of AI-powered applications, chatbots, and LLMs.
· Design and execute test cases to evaluate AI responses for accuracy, relevance, consistency, and safety.
· Perform prompt engineering and prompt variation testing to identify model limitations and edge cases.
· Document defects, hallucinations, bias issues, and performance gaps using Jira.
· Validate API responses and integrations using Postman.
· Review and annotate datasets using Label Studio to support AI model training and evaluation.
· Analyze testing results and generate quality reports using Excel and Google Sheets.
· Collaborate with product, engineering, and AI teams to improve model performance and user experience.
· Conduct regression testing after model updates and feature releases.
· Ensure compliance with AI safety, quality, and content guidelines.
Tools & Technologies
· AI Platforms: ChatGPT, Claude, Gemini, Microsoft Copilot
· QA & Bug Tracking: Jira
· API Testing: Postman
· Data Annotation: Label Studio
· Data Analysis & Reporting: Microsoft Excel, Google Sheets
· Testing Types: Manual Testing, Functional Testing, Regression Testing, User Acceptance Testing (UAT)