Integrations

Rhesis plugs into your LLM stack across four layers, each addressing a different concern. This page is the single map: pick the layer you care about and follow the link into the matching guide.

  • LLM providers: the model that runs your test generation and LLM-as-Judge evaluation. Start at Models.
  • Tracing: streaming OpenTelemetry spans from your application to Rhesis. Start at Tracing · Auto-instrumentation.
  • Test execution: letting Rhesis invoke entry points in your application remotely to run test cases. Start at Connector.
  • REST API: programmatic access to test sets, runs, and platform resources. Start at api.rhesis.ai/docs.

Beyond those four layers, Rhesis also bundles evaluation frameworks (DeepEval, Ragas, Garak) and MCP servers as additional sources of metrics, tests, and tool context.

Tracing your application

Your application emits spans through the Rhesis SDK; spans are batched and sent to Rhesis over HTTP using OpenTelemetry conventions. The integration mechanism depends on the framework (the first two paths are sketched after this list):

  • LangChain and LangGraph are auto-instrumented end-to-end with a single auto_instrument() call — no per-function decorators required.
  • Other Python frameworks (CrewAI, OpenAI Agents SDK, LlamaIndex, AutoGen, …) are traced with the @observe.* decorators, applied to the functions, tools, or agents you want to capture.
  • Any OpenTelemetry-compatible exporter can target the Rhesis ingestion endpoint directly. See Tracing setup for the exact endpoint and headers.
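
To make the first two paths concrete, here is a minimal sketch. The import path (rhesis.sdk.telemetry) and the decorator name (observe.tool) are assumptions based on the naming above, not verified against the SDK; check the Auto-instrumentation and Decorators guides for the exact API.

```python
# Minimal sketch: the import path is an assumption based on the naming
# in this page; see the Auto-instrumentation and Decorators guides.
from rhesis.sdk.telemetry import auto_instrument, observe  # assumed path

# Path 1: LangChain / LangGraph. A single call instruments the framework
# end-to-end; chains, tools, and LLM calls all become spans.
auto_instrument()

# Path 2: other Python frameworks. Decorate the functions, tools, or
# agents you want to capture.
@observe.tool
def search_docs(query: str) -> str:
    # The function runs as usual; the decorator records a span around it.
    return f"results for {query!r}"
```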

See: Tracing overview · Auto-instrumentation · Decorators · Multi-agent tracing


Test execution: the connector

For Rhesis to run test cases against your application, it needs a way to call your code from outside your environment. Once the connector is started (by calling client.connect() in sync scripts, or implicitly when the connector runs inside an existing event loop, such as a web server or an async app), the Rhesis SDK maintains a persistent outbound WebSocket connection to Rhesis. Rhesis can then invoke registered entry points whenever a test run fires. Because the connection is outbound from your app, it works through firewalls and from local laptops without exposing a public URL.

Register an entry point with the @endpoint decorator. When a test run starts, Rhesis sends each test case’s input down the WebSocket; your application runs the function locally and sends the output back up the same connection. The same call path serves single-turn test cases and multi-turn conversations.
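
A sketch of that flow is below. The client class name and its constructor arguments are illustrative, not taken from the SDK; @endpoint and client.connect() follow the names used above, and the Connector guide has the exact API.

```python
# Illustrative sketch: RhesisClient and its constructor are hypothetical;
# @endpoint and connect() follow the names used in this guide.
from rhesis.sdk import RhesisClient, endpoint  # assumed import path

client = RhesisClient(api_key="rh-...")  # credentials: see API tokens

@endpoint(name="chat")  # registers an entry point Rhesis can invoke
def chat(user_input: str) -> str:
    # Replace with your application logic. It runs locally; only the
    # output travels back up the outbound WebSocket.
    return f"echo: {user_input}"

client.connect()  # sync scripts: opens the WebSocket and blocks
```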

See: Connector · Parameter binding · Connector examples

LLM providers

Choose any provider for the LLMs that drive test synthesis and LLM-as-Judge evaluation. Provider routing is powered by LiteLLM, giving you a single interface to 100+ models: cloud (OpenAI, Anthropic, Google Gemini, Mistral, Cohere, Groq, Together AI) or local/self-hosted (Ollama, vLLM, LiteLLM proxy).
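
Because routing goes through LiteLLM, every backend is addressed with the same call shape and a provider-prefixed model string. The snippet below uses LiteLLM directly just to show that interface; in Rhesis you pick the model under Models instead of calling LiteLLM yourself.

```python
# LiteLLM's unified interface, shown standalone. Assumes the relevant
# provider keys (e.g. OPENAI_API_KEY) are set in the environment.
import litellm

for model in (
    "openai/gpt-4o",                         # cloud
    "anthropic/claude-3-5-sonnet-20241022",  # cloud
    "ollama/llama3",                         # local, via Ollama
):
    # The call shape is identical for every provider; only the
    # provider-prefixed model string changes.
    resp = litellm.completion(
        model=model,
        messages=[{"role": "user", "content": "Say hello."}],
    )
    print(model, "->", resp.choices[0].message.content)
```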

See: Models · API tokens

The following LLM backends can be configured under Models. For API access, see API tokens.

Anthropic
Azure AI Studio
Azure OpenAI
Cohere
Google
Groq
LiteLLM Proxy
Meta
Mistral
Ollama
OpenAI
Perplexity
Polyphemus
Replicate
Together AI

Evaluation frameworks

In addition to Rhesis-native metrics, you can use metrics from DeepEval and Ragas, and import Garak probes as test sets for adversarial scanning.
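
To give a feel for what these framework metrics look like, the snippet below runs DeepEval's answer-relevancy metric standalone; how such metrics plug into Rhesis is covered in the Frameworks guide.

```python
# DeepEval's own API, run standalone for illustration. It defaults to an
# OpenAI judge, so OPENAI_API_KEY must be set in the environment.
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase

metric = AnswerRelevancyMetric(threshold=0.7)
case = LLMTestCase(
    input="What are your shipping times?",
    actual_output="Orders usually ship within two business days.",
)
metric.measure(case)  # runs the LLM-as-Judge evaluation
print(metric.score, metric.reason)
```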

See: Frameworks · Metrics · Adversarial testing


MCP servers

Connect MCP servers so Rhesis can use external tools during workflows — for example pulling context from Notion, GitHub, Jira, or Confluence during test generation.

See: MCP


REST API

Direct API access for custom integrations and CI/CD pipelines: manage test sets, trigger test runs, fetch results, and inspect traces programmatically. Language-agnostic — call from Python, TypeScript, Go, shell scripts, or anywhere else.
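
As a hedged sketch of what a CI step might look like (the route and field names below are illustrative, not taken from the OpenAPI spec):

```python
# Hypothetical sketch: the route and response fields are illustrative;
# consult api.rhesis.ai/docs for the real paths and schemas.
import os
import requests

BASE = "https://api.rhesis.ai"
HEADERS = {"Authorization": f"Bearer {os.environ['RHESIS_API_KEY']}"}

# List test sets (hypothetical route).
resp = requests.get(f"{BASE}/test_sets", headers=HEADERS, timeout=30)
resp.raise_for_status()
for test_set in resp.json():
    print(test_set["id"], test_set.get("name"))
```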

See: OpenAPI spec · API tokens

Tip: New to Rhesis? Start with Getting started.