Skip to Content

Docs Guides SDK Glossary Changelog Contribute SDK Reference

CTRL K

CTRL K

- Terms
Changelog
SDK Reference

Welcome
Core Concepts
Product Tour
Agent Skill
- Organizations & Team
Define
Knowledge
Behaviors
Generate
Playground
Improve
Insights
- Test Runs
- Test Execution
- Tasks
- Test Reviews
Connect
MCP
Models
Integrations
API Tokens
Acknowledgments

On This Page

The workflow
Platform structure

Questions, feedbacks? Contact us!

DocsCore Concepts

Core Concepts

Rhesis evaluates whether your AI application meets the quality bar you care about. This page has two parts: the workflow (what you do and in what order) and the platform structure (what each object is and how they connect).

The workflow

A normal workflow in Rhesis looks like this:

Connect your endpoint — link Rhesis to the system under test. See Connect your application.
Define behaviors — decide what good looks like for your application.
Set up metrics — assign judges to each behavior so responses can be scored automatically.
Create tests — write, import, or generate inputs; tag each test with a behavior.
Run a test set — execute a batch of tests against your endpoint.
Review results — inspect individual failures, then track trends across runs.

New to Rhesis? The Getting Started guide walks through environment setup and your first run.

Platform structure

Setup

Configure before you run

What to test

Test: One prompt or conversation goal, tagged with a behavior.

Test↗

One prompt or conversation goal, tagged with a behavior.

Learn more →Test set: Batch of tests for a run. A test can belong to many sets.

Test set↗

Batch of tests for a run. A test can belong to many sets.

How to measure

Behavior: What good looks like. Defaults: Reliability, Robustness, Compliance.

Behavior↗

What good looks like. Defaults: Reliability, Robustness, Compliance.

Learn more →Metric: Judge assigned to behaviors. Pass/fail, score, and reasoning.

Metric↗

Judge assigned to behaviors. Pass/fail, score, and reasoning.

↓

Your application

System under test

Endpoint: How Rhesis reaches your app. Sends input, receives output for evaluation.

Endpoint↗

How Rhesis reaches your app. Sends input, receives output for evaluation.

Connection options

REST API: HTTP endpoint with URL and UI mapping.

REST API↗

HTTP endpoint with URL and UI mapping.

Learn more →SDK Connector: @endpoint on your function, invoked over WebSocket.

SDK Connector↗

@endpoint on your function, invoked over WebSocket.

↓

Results

What you get after a run

Test run: One execution of a test set against an endpoint. Full snapshot.

Test run↗

One execution of a test set against an endpoint. Full snapshot.

Learn more →Results overview: Dashboard across runs: trends, pass rates by behavior.

Results overview↗

Dashboard across runs: trends, pass rates by behavior.

Solid flow: test set → endpoint → test run. Metrics score each test result after the response returns.

Ready to start? Create a project, connect an endpoint, and generate tests to run your first evaluation.

Default Chatbot (Rosalind)Product Tour

Features

Generation
Metrics
Conversations
Observability

Guides

Quick start guide
SDK connector
CI/CD integration
Testing user journeys

Glossary

LLM as a Judge
Test Generation
Trace
Agent

Company

About us
Careers
Contact us

Copyright ©2026 Rhesis AI GmbH • Made in Potsdam, Germany.

Imprint Privacy Terms