Agents

The SDK agents framework lets you build conversational workflows that can reason, call tools, and return structured results. Use it when a workflow needs multiple tool calls, user-facing progress events, or Model Context Protocol (MCP) access to the Rhesis API.

Agents are available from rhesis.sdk.agents. The framework includes BaseAgent, BaseTool, MCPTool, ArchitectAgent, ExploreEndpointTool, and shared result schemas.

Core concepts

| Concept | Purpose |
| --- | --- |
| BaseAgent | Runs the ReAct loop, formats tool schemas, tracks execution history, and emits lifecycle events. |
| BaseTool | Async tool interface for SDK-side tools. Implement name, description, parameters_schema, and execute. |
| MCPTool | Adapts an MCP server into the same tool interface used by agents. |
| AgentEventHandler | Receives lifecycle events for logging, WebSockets, telemetry, or UI updates. |
| ToolCall / ToolResult | Typed schemas for tool input and tool output. |

BaseAgent accepts these runtime controls:

| Parameter | Default | Description |
| --- | --- | --- |
| max_iterations | 10 | Maximum ReAct loop iterations before the agent stops. |
| max_tool_executions | max_iterations * 3 | Maximum number of tool calls allowed in a run. |
| timeout_seconds | None | Optional wall-clock timeout for the run. |
| history_window | 20 | Number of execution steps included in the next prompt. |
| event_handlers | [] | Async handlers for lifecycle, streaming, and Architect-specific events. |
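
The sketch below illustrates how these limits relate to each other. RunLimits, stop_reason, and visible_history are hypothetical names used only for illustration; they are not part of the SDK, and the order in which BaseAgent actually checks its limits is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunLimits:
    """Hypothetical stand-in mirroring the documented BaseAgent controls."""

    max_iterations: int = 10
    max_tool_executions: Optional[int] = None  # None -> derive from max_iterations
    timeout_seconds: Optional[float] = None
    history_window: int = 20

    def __post_init__(self) -> None:
        # Documented default: max_iterations * 3.
        if self.max_tool_executions is None:
            self.max_tool_executions = self.max_iterations * 3

    def stop_reason(self, iterations: int, tool_calls: int, elapsed: float) -> Optional[str]:
        """Return the limit that has been reached, or None if the run may continue."""
        if iterations >= self.max_iterations:
            return "max_iterations"
        if tool_calls >= self.max_tool_executions:
            return "max_tool_executions"
        if self.timeout_seconds is not None and elapsed >= self.timeout_seconds:
            return "timeout_seconds"
        return None

    def visible_history(self, steps: list) -> list:
        """Keep only the most recent steps that fit in the prompt window."""
        return steps[-self.history_window:] if self.history_window > 0 else []


limits = RunLimits(max_iterations=8)
print(limits.max_tool_executions)  # 24
print(limits.stop_reason(iterations=3, tool_calls=24, elapsed=1.0))  # max_tool_executions
print(len(limits.visible_history(list(range(50)))))  # 20
```

Lowering max_iterations without setting max_tool_executions also lowers the tool-call budget, because the default is derived from it.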

Connect to Rhesis tools over MCP

Use get_rhesis_tools() to connect an agent to the Rhesis backend MCP endpoint. It reads RHESIS_BASE_URL and RHESIS_API_KEY from SDK configuration unless you pass explicit values.

rhesis_tools.py
from rhesis.sdk.agents import BaseAgent, get_rhesis_tools

agent = BaseAgent(
    model="vertex_ai/gemini-2.0-flash",
    tools=get_rhesis_tools(),
    max_iterations=8,
)

result = agent.run("List my endpoints and summarize what each one tests.")
print(result.final_answer)

MCPTool reconnects on the next call when the underlying transport or session is lost. This makes it safe to reuse across repeated script or notebook calls that create new event loops. If the transport drops during a tool call, that call fails and the caller is responsible for retrying.
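
Because retrying is left to the caller, a small wrapper can help. The sketch below is one possible approach, not SDK code: call_with_retry and FlakyTool are hypothetical names, and it assumes a dropped transport surfaces as a ConnectionError. If the SDK instead reports the failure through ToolResult.success, check that field rather than catching an exception.

```python
import asyncio


async def call_with_retry(execute, *, attempts=3, delay_seconds=0.5, **arguments):
    """Await execute(**arguments), retrying when it raises ConnectionError."""
    for attempt in range(1, attempts + 1):
        try:
            return await execute(**arguments)
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: surface the last failure
            await asyncio.sleep(delay_seconds * attempt)  # linear backoff


class FlakyTool:
    """Hypothetical stand-in for a tool whose transport drops once."""

    def __init__(self):
        self.calls = 0

    async def execute(self, **arguments):
        self.calls += 1
        if self.calls == 1:
            raise ConnectionError("transport dropped mid-call")
        return {"ok": True, "arguments": arguments}


tool = FlakyTool()
result = asyncio.run(call_with_retry(tool.execute, delay_seconds=0.01, query="endpoints"))
print(tool.calls, result["ok"])
```

Retry only calls that are safe to repeat. A mutating tool call that failed mid-flight may already have taken effect on the server.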

Create a custom tool

Subclass BaseTool when your agent needs SDK-side behavior that is not exposed by MCP.

custom_tool.py
from rhesis.sdk.agents import BaseAgent, BaseTool
from rhesis.sdk.agents.schemas import ToolResult

class EchoTool(BaseTool):
    @property
    def name(self) -> str:
        return "echo"

    @property
    def description(self) -> str:
        return "Echoes a short message back to the caller."

    @property
    def parameters_schema(self) -> dict:
        return {
            "type": "object",
            "properties": {
                "message": {
                    "type": "string",
                    "description": "Message to echo back.",
                },
            },
            "required": ["message"],
        }

    async def execute(self, message: str, **kwargs) -> ToolResult:
        return ToolResult(tool_name=self.name, success=True, content=message)

agent = BaseAgent(tools=[EchoTool()])
result = agent.run("Call echo with the message hello.")
print(result.final_answer)
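
Since parameters_schema is a JSON-Schema-style dict, a tool can check incoming arguments before doing any work. The helper below is a minimal sketch with a hypothetical name, check_arguments. It covers only required keys and top-level types, and it is not a full JSON Schema validator. Whether the agent already validates arguments before calling execute is not stated here, so treat this as an optional defensive check.

```python
def check_arguments(schema: dict, arguments: dict) -> list:
    """Return a list of problems; an empty list means the arguments look valid."""
    json_types = {
        "string": str,
        "integer": int,
        "number": (int, float),
        "boolean": bool,
        "object": dict,
        "array": list,
    }
    problems = []
    # Every key listed under "required" must be present.
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    # Known properties must match their declared top-level type.
    for name, value in arguments.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            continue  # undeclared keys are left to the tool to accept or ignore
        expected = json_types.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"{name} should be of type {spec['type']}")
    return problems


# Same schema as EchoTool.parameters_schema above.
echo_schema = {
    "type": "object",
    "properties": {
        "message": {"type": "string", "description": "Message to echo back."},
    },
    "required": ["message"],
}

print(check_arguments(echo_schema, {"message": "hello"}))  # []
print(check_arguments(echo_schema, {}))  # ['missing required argument: message']
print(check_arguments(echo_schema, {"message": 5}))  # ['message should be of type string']
```

Inside execute, a tool could return ToolResult(tool_name=self.name, success=False, content=...) carrying these messages instead of raising.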

Handle agent events

Every AgentEventHandler method is an async no-op by default, so override only the events your integration needs.

agent_events.py
from rhesis.sdk.agents import AgentEventHandler, BaseAgent, get_rhesis_tools

class ConsoleEvents(AgentEventHandler):
    async def on_tool_start(self, *, tool_name, arguments, reasoning=None, **kwargs):
        print(f"Starting {tool_name}: {arguments}")

    async def on_tool_end(self, *, tool_name, result, **kwargs):
        print(f"Finished {tool_name}: success={result.success}")

agent = BaseAgent(
    tools=get_rhesis_tools(),
    event_handlers=[ConsoleEvents()],
)

agent.run("Show the first five test sets.")

Common events include:

| Event | When it fires |
| --- | --- |
| on_agent_start / on_agent_end | Run lifecycle starts or completes. |
| on_iteration_start / on_iteration_end | Each ReAct iteration starts or completes. |
| on_llm_start / on_llm_end | The model call starts or returns a parsed action. |
| on_tool_start / on_tool_end | A tool call starts or completes. |
| on_stream_start / on_text_chunk / on_stream_end | Final response streaming starts, emits chunks, and completes. |
| on_mode_change / on_plan_update | Architect-specific mode and plan updates. |

Tool execution includes wall-clock timing in ToolResult.duration_ms, which is useful for backend telemetry and frontend streaming indicators.
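
As a sketch of the telemetry use case, the handler below aggregates duration_ms per tool. ToolTimings is a hypothetical name. To keep the example runnable without the SDK, it does not subclass AgentEventHandler, and SimpleNamespace objects stand in for ToolResult. In real use the class would presumably subclass AgentEventHandler and be passed through event_handlers.

```python
import asyncio
from collections import defaultdict
from types import SimpleNamespace


class ToolTimings:
    """Collects per-tool durations reported on tool completion."""

    def __init__(self):
        self.durations_ms = defaultdict(list)

    async def on_tool_end(self, *, tool_name, result, **kwargs):
        # Read duration_ms defensively in case a result carries no timing.
        duration = getattr(result, "duration_ms", None)
        if duration is not None:
            self.durations_ms[tool_name].append(duration)

    def summary(self):
        """Return {tool_name: (call_count, mean_duration_ms)}."""
        return {
            name: (len(values), sum(values) / len(values))
            for name, values in self.durations_ms.items()
        }


async def demo():
    timings = ToolTimings()
    # Stand-ins for ToolResult instances delivered by the agent.
    await timings.on_tool_end(
        tool_name="echo", result=SimpleNamespace(success=True, duration_ms=12.0)
    )
    await timings.on_tool_end(
        tool_name="echo", result=SimpleNamespace(success=True, duration_ms=18.0)
    )
    return timings.summary()


summary = asyncio.run(demo())
print(summary)  # {'echo': (2, 15.0)}
```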

ArchitectAgent

ArchitectAgent extends BaseAgent and adds stateful conversation features for test-suite design and execution:

  • discovery/planning/creating/executing modes
  • plan persistence and progress tracking
  • confirmation guardrails for mutating tools
  • async task waiting state (await_task)
  • event hooks for streaming UIs

Constructor parameters

| Parameter | Required | Description |
| --- | --- | --- |
| model | No | Model slug or BaseLLM instance |
| tools | No | List of BaseTool and/or MCPTool instances |
| config | No | ArchitectConfig for limits and guardrails |
| max_iterations | No | Maximum ReAct iterations for a turn |
| max_tool_executions | No | Safety cap for total tool calls in a turn |
| timeout_seconds | No | Optional wall-clock timeout per turn |
| history_window | No | Number of recent messages/tool steps injected into prompts |
| event_handlers | No | List of AgentEventHandler implementations |

Async usage with attachments

chat_async() accepts optional attachments with mention and file payloads.

architect_agent_async.py
import asyncio
from rhesis.sdk.agents import ArchitectAgent, get_rhesis_tools

agent = ArchitectAgent(tools=get_rhesis_tools())

attachments = {
    "mentions": [{"type": "endpoint", "id": "endpoint-uuid", "display": "Support Bot"}],
    "files": [{"filename": "requirements.md", "content": "# Test requirements..."}],
}

async def run():
    text = await agent.chat_async(
        "Use this endpoint and file to propose a test plan.",
        attachments=attachments,
    )
    print(text)

asyncio.run(run())
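
When attachments come from files on disk, a small helper can assemble the payload. The sketch below uses a hypothetical function, build_attachments, and reproduces only the keys shown in the example above (type, id, display, filename, content). It assumes UTF-8 text files; whether chat_async() accepts other keys or binary content is not covered here.

```python
import tempfile
from pathlib import Path


def build_attachments(mentions=(), file_paths=()):
    """Assemble an attachments dict from (type, id, display) tuples and file paths."""
    return {
        "mentions": [
            {"type": kind, "id": entity_id, "display": display}
            for kind, entity_id, display in mentions
        ],
        "files": [
            {
                "filename": Path(path).name,
                "content": Path(path).read_text(encoding="utf-8"),
            }
            for path in file_paths
        ],
    }


# Demo with a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    requirements = Path(folder) / "requirements.md"
    requirements.write_text("# Test requirements\n", encoding="utf-8")
    attachments = build_attachments(
        mentions=[("endpoint", "endpoint-uuid", "Support Bot")],
        file_paths=[requirements],
    )

print(attachments["files"][0]["filename"])  # requirements.md
```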

Explore endpoints with Penelope

ExploreEndpointTool delegates endpoint probing to Penelope. Use it when an agent needs to learn what an endpoint does before designing tests.

The tool supports two modes:

| Mode | How to configure | Use case |
| --- | --- | --- |
| Bound | Pass endpoint_id or a loaded Endpoint. | Scripts and notebooks exploring one endpoint. |
| Unbound | Pass target_factory. | Backend workers where the agent resolves endpoint_id at call time. |

Named strategies are:

| Strategy | Purpose |
| --- | --- |
| domain_probing | Discover the endpoint domain and purpose. |
| capability_mapping | Enumerate supported behaviors and interaction patterns. |
| boundary_discovery | Find refusal patterns, limitations, and edge cases. |
| comprehensive | Run domain probing first, then capability mapping and boundary discovery. |

explore_endpoint.py
import asyncio
from rhesis.sdk.agents import ExploreEndpointTool
from rhesis.sdk.entities import Endpoint

endpoint = Endpoint(id="00000000-0000-0000-0000-000000000000")
endpoint.pull()

tool = ExploreEndpointTool(endpoint=endpoint, max_turns=5)

async def main():
    result = await tool.execute(strategy="domain_probing")
    print(result.content)

asyncio.run(main())
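
To keep the findings of each strategy separate, a caller can run several strategies one after another. The sketch below is an illustration under stated assumptions: explore_in_order and FakeExploreTool are hypothetical names, and the stand-in tool exists only so the example runs without a live endpoint. It relies on the execute(strategy=...) call and the result.content attribute used in the example above.

```python
import asyncio
from types import SimpleNamespace

NAMED_STRATEGIES = (
    "domain_probing",
    "capability_mapping",
    "boundary_discovery",
    "comprehensive",
)


async def explore_in_order(tool, strategies):
    """Run each named strategy in turn and return {strategy: content}."""
    unknown = [name for name in strategies if name not in NAMED_STRATEGIES]
    if unknown:
        raise ValueError(f"unknown strategies: {unknown}")
    findings = {}
    for name in strategies:
        # Sequential on purpose: one exploration at a time against the endpoint.
        result = await tool.execute(strategy=name)
        findings[name] = result.content
    return findings


class FakeExploreTool:
    """Hypothetical stand-in for ExploreEndpointTool."""

    async def execute(self, strategy, **kwargs):
        return SimpleNamespace(success=True, content=f"findings for {strategy}")


findings = asyncio.run(
    explore_in_order(FakeExploreTool(), ["domain_probing", "boundary_discovery"])
)
print(findings)
```

Rejecting unknown names before the first call avoids spending exploration turns on a run that would fail partway through.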