Platform

The Rhesis platform provides comprehensive tools for testing and evaluating AI applications at scale.

Why Rhesis?

Testing AI applications is fundamentally different from traditional software testing. Rhesis is purpose-built for this challenge:

  • AI-Native Testing: Generate tests using AI, evaluate responses with LLMs
  • Scale: Test thousands of scenarios automatically
  • Insight: Track quality trends, catch regressions, compare models
  • Collaboration: Multi-user workflows with roles and permissions
  • Integration: Works with any AI model or framework

New to Rhesis? Start with Core Concepts to understand how everything fits together, or explore the platform locally by following the Getting Started guide.

Where to Start

If you're just getting started, follow this path:

  1. Create a Project - Organize your testing work
  2. Configure Endpoints - Connect to your AI application
  3. Generate Tests - Create test cases with AI assistance
  4. Define Metrics - Set up evaluation criteria
  5. Run and Analyze - Execute tests and review results

Already familiar? Jump to any feature below.

Core Features

Organizations & Team

Manage organization settings, invite team members, and configure contact information and preferences.

Projects

Organize testing work into projects with environment management, visual icons, and comprehensive project settings.

Endpoints

Configure API endpoints that your tests execute against, with support for REST and WebSocket protocols.

Tests

Create and manage test cases manually or generate them using AI with document context and iterative feedback.

Test Sets

Organize tests into collections and execute them against endpoints with parallel or sequential execution modes.

Test Runs

View execution results for individual test runs with filtering, comparison, and detailed metric analysis.

Test Results

Dashboard for analyzing test result trends, metrics performance, and historical data with advanced filtering.

Metrics

Define and manage LLM-based evaluation criteria with behaviors, scoring types, and model-driven grading.

Integrations

Connect with your existing development workflow and external services.

Advanced Capabilities

Once you’re up and running, explore these advanced features:

Test Runs

Deep analysis with comparison and filtering

Test Results

Aggregate analytics and trend visualization

Test Sets

Organize and execute test collections

API Tokens

Programmatic access via Python SDK
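
As an illustration of programmatic access, here is a minimal sketch of using an API token to fetch recent test runs and summarize their results. The client class, method names, attributes, and environment variable below are illustrative assumptions, not the SDK's confirmed interface; see the Development Guide for the actual Python SDK API.

```python
import os

# Hypothetical sketch: RhesisClient, list_test_runs, and get_results are
# assumed names for illustration, not the confirmed SDK interface.
from rhesis_sdk import RhesisClient  # assumed import path

# API token created in the platform, exposed via an environment variable
# (variable name is an assumption).
api_key = os.environ["RHESIS_API_KEY"]

client = RhesisClient(api_key=api_key)

# List recent test runs for a project and print a simple pass/fail summary.
for run in client.list_test_runs(project="my-project", limit=5):
    results = client.get_results(run.id)
    passed = sum(1 for r in results if r.passed)
    print(f"{run.id}: {passed}/{len(results)} tests passed")
```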

Organizations

Team management and access control


Need Help? Check out our Development Guide for SDK and API documentation, or visit Getting Started for initial setup.