Overview
Remote
Full Time
Skills
Knowledge Base
Scalability
Auditing
Workflow
Amazon S3
Remote Desktop Services
Amazon RDS
Amazon SQS
Regression Analysis
Reporting
Dashboard
Testing
Evaluation
Storage
Orchestration
Vector Databases
Semantics
KPI
Automated Testing
Python
Java
LangChain
LlamaIndex
Continuous Integration
Continuous Delivery
Leadership Development
Soft Skills
Google Cloud
Google Cloud Platform
Microsoft Azure
Amazon Web Services
LinkedIn
English
Job Details
We are looking for a skilled LLM Test Automation Engineer to design and implement an advanced testing framework for a Retrieval-Augmented Generation (RAG) Knowledge Base deployed in AWS.
This role is crucial in ensuring accuracy, relevance, safety, and scalability by leveraging cutting-edge metrics and techniques like RAGAS and LLM-as-a-judge.
This position offers a remote setup with the flexibility to work from any location in Georgia, whether it's your home, well-equipped offices in Tbilisi and Batumi, or a coworking space in Kutaisi.
RESPONSIBILITIES
- Audit the current LLM and RAG test automation framework, identify gaps, and propose improved architecture
- Implement automated testing pipelines incorporating RAGAS metrics and LLM-as-a-judge methods for quality assessment
- Integrate workflows with AWS services (S3, Lambda, CloudWatch, OpenSearch, RDS, SQS) for seamless operations
- Define clear evaluation rubrics, metric thresholds, regression alerts, and reporting dashboards to improve monitoring
- Ensure reproducible, scalable testing solutions within CI/CD pipelines
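As a rough illustration of the "metric thresholds" and "regression alerts" responsibilities above, here is a minimal Python sketch of a gate that a CI job could run after an evaluation. It assumes the evaluation step has already produced a dictionary of metric scores between 0 and 1. The `MetricRule` class, the `find_regressions` function, and the threshold values are hypothetical and are not taken from the posting. The metric names follow RAGAS naming, but any metric could be plugged in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricRule:
    minimum: float   # absolute floor the score must reach
    max_drop: float  # largest tolerated drop versus the baseline run


# Hypothetical thresholds; real values would come from the agreed KPIs.
RULES = {
    "faithfulness": MetricRule(minimum=0.85, max_drop=0.03),
    "answer_relevancy": MetricRule(minimum=0.80, max_drop=0.05),
}


def find_regressions(current, baseline, rules=RULES):
    """Return a readable alert for every rule the current run violates."""
    alerts = []
    for name, rule in rules.items():
        score = current.get(name)
        if score is None:
            alerts.append(f"{name}: missing from current run")
            continue
        # Check the absolute floor first.
        if score < rule.minimum:
            alerts.append(f"{name}: {score:.2f} below minimum {rule.minimum:.2f}")
        # Then check the drop relative to the previous (baseline) run, if any.
        previous = baseline.get(name)
        if previous is not None and previous - score > rule.max_drop:
            alerts.append(
                f"{name}: dropped {previous - score:.2f} from baseline {previous:.2f}"
            )
    return alerts
```

In a pipeline, a non-empty result from `find_regressions` could fail the build or be forwarded to whatever alerting channel the team uses.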
REQUIREMENTS
- 3+ years of experience in LLM and RAG evaluation, including RAGAS and prompt-based judging automation
- Strong competency in AWS services (compute, storage, orchestration, monitoring)
- Background in vector databases and semantic similarity metrics
- Ability to establish KPIs and translate them into automated testing logic
- Proficiency in Python or Java
NICE TO HAVE
- Familiarity with LangChain, LlamaIndex, or similar RAG frameworks
- Understanding of CI/CD processes
WE OFFER
- We connect like-minded people
  - Delivering innovative solutions to industry leaders, making a global impact
  - Enjoyable working environment, whether it is the vibrant office or the comfort of your own home
  - Opportunity to work abroad for up to two months per year
  - Relocation opportunities within our offices in 55+ countries
  - Corporate and social events
- We invest in your growth
  - Leadership development, career advising, soft skills and well-being programs
  - Certifications, including Google Cloud Platform, Azure and AWS
  - Unlimited access to LinkedIn Learning and Get Abstract
  - Free English classes with certified teachers
- We cover it all
  - Participation in the Employee Stock Purchase Plan
  - Monetary bonuses for engaging in the referral program
  - Comprehensive medical & family care package
  - Five trust days per year (sick leave without a medical certificate)
  - Benefits package (sports activities, a variety of stores and services)