Getting Started

Get Penelope installed and run your first test in minutes.

Installation

Penelope is part of the Rhesis monorepo and uses uv for dependency management.

Prerequisites

  • Python 3.10+
  • uv package manager
  • RHESIS_API_KEY for accessing Rhesis endpoints (get your key from the Rhesis App)

Install Penelope

Clone the Repository

Terminal
git clone https://github.com/rhesis-ai/rhesis.git
cd rhesis/penelope

Install Dependencies

Terminal
uv sync

This automatically installs Penelope and the local SDK from ../sdk.

Set Up Authentication

Terminal
export RHESIS_API_KEY="rh-your-api-key"

Get your API key in the Rhesis App → Settings. See the SDK Authentication Guide for details.
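A missing key is easier to diagnose if your script checks for it before creating any Penelope objects. The following is a minimal sketch using only the standard library; the helper name require_api_key is hypothetical and not part of Penelope or the Rhesis SDK.

```python
import os


def require_api_key(env=None, var_name="RHESIS_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of Penelope or the SDK.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it before running Penelope."
        )
    return key
```

Calling require_api_key() at the top of a test script fails fast with a readable message instead of an authentication error later in the run.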

Verify Installation

Terminal
cd examples
uv run python basic_example.py --help

Your First Test

Now let’s run a simple test against a conversational AI system.

basic_test.py
from rhesis.penelope import EndpointTarget, PenelopeAgent

# Initialize Penelope with defaults (Vertex AI / gemini-2.0-flash, 10 max iterations)
agent = PenelopeAgent(
    enable_transparency=True,  # Show reasoning at each step
    verbose=True,
)

# Create target
target = EndpointTarget(endpoint_id="your-endpoint-id")

# Execute a test - Penelope plans its own approach
result = agent.execute_test(
    target=target,
    goal="Verify chatbot can answer 3 questions about insurance policies while maintaining context",
)

print(f"Goal achieved: {result.goal_achieved}")
print(f"Turns used: {result.turns_used}")

Run It

Terminal
export RHESIS_API_KEY="rh-your-api-key"
cd rhesis/penelope/examples
uv run python basic_example.py --endpoint-id 

Two Testing Approaches

Simple: Goal Only

Let Penelope plan the testing approach:

simple_test.py
result = agent.execute_test(
  target=target,
  goal="Test if the chatbot maintains context across 5 turns",
)

Best for: Exploratory testing, straightforward capability tests, basic quality checks

Detailed: Goal + Instructions

Provide specific testing methodology:

detailed_test.py
result = agent.execute_test(
  target=target,
  goal="Verify GDPR compliance in data handling",
  instructions="""
  1. Ask what data is being collected
  2. Try to provide personal data
  3. Check if explicit consent is requested
  4. Verify data minimization principles
  """,
  context={
      "regulation": "GDPR",
      "focus": "consent management"
  }
)

Best for: Security testing, compliance verification, complex multi-phase scenarios
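The two approaches can be mixed in a single run. The sketch below loops over a list of test specs and passes instructions and context only when a spec provides them. It assumes nothing beyond the execute_test keyword arguments and the goal_achieved attribute shown above; run_test_specs and StubAgent are hypothetical names, and the stub stands in for PenelopeAgent so the example runs without network access.

```python
from types import SimpleNamespace


def run_test_specs(agent, target, specs):
    """Run each spec through agent.execute_test and map goal -> goal_achieved."""
    outcomes = {}
    for spec in specs:
        # Forward instructions/context only when the spec defines them,
        # so goal-only specs use the simple form.
        extra = {k: spec[k] for k in ("instructions", "context") if k in spec}
        result = agent.execute_test(target=target, goal=spec["goal"], **extra)
        outcomes[spec["goal"]] = result.goal_achieved
    return outcomes


class StubAgent:
    """Hypothetical stand-in for PenelopeAgent; records calls, always succeeds."""

    def __init__(self):
        self.calls = []

    def execute_test(self, target, goal, instructions=None, context=None):
        self.calls.append(
            {"goal": goal, "instructions": instructions, "context": context}
        )
        return SimpleNamespace(goal_achieved=True)


specs = [
    {"goal": "Test if the chatbot maintains context across 5 turns"},
    {
        "goal": "Verify GDPR compliance in data handling",
        "instructions": "1. Ask what data is being collected",
        "context": {"regulation": "GDPR"},
    },
]

stub = StubAgent()
outcomes = run_test_specs(stub, target=None, specs=specs)
```

With a real agent and target in place of the stub, the same function would return one boolean per goal.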

Accessing Results

results.py
# Test outcome
print(f"Status: {result.status}")  # success, failure, error, timeout
print(f"Goal achieved: {result.goal_achieved}")
print(f"Duration: {result.duration_seconds}s")

# Key findings
for finding in result.findings:
    print(f"- {finding}")

# Full conversation history
for turn in result.history:
    print(f"Turn {turn.turn_number}: {turn.action}")
    print(f"Reasoning: {turn.reasoning}")
    print(f"Result: {turn.action_output}")
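For logs or CI output, the same fields can be condensed into a short report. This is a sketch that reads only the attributes shown above (status, goal_achieved, findings, history); summarize_result is a hypothetical helper, and the demo object is a stand-in, not a real Penelope result.

```python
from types import SimpleNamespace


def summarize_result(result):
    """Build a short plain-text report from a Penelope-style result object."""
    lines = [
        f"Status: {result.status}",
        f"Goal achieved: {result.goal_achieved}",
        f"Turns recorded: {len(result.history)}",
    ]
    for finding in result.findings:
        lines.append(f"- {finding}")
    return "\n".join(lines)


# Stand-in for illustration only; a real result comes from agent.execute_test(...)
demo = SimpleNamespace(
    status="success",
    goal_achieved=True,
    findings=["Context kept across turns"],
    history=[SimpleNamespace(turn_number=1)],
)
report = summarize_result(demo)
print(report)
```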

Custom Model Configuration

Override the default Vertex AI model:

custom_model.py
from rhesis.penelope import PenelopeAgent
from rhesis.sdk.models import AnthropicLLM

agent = PenelopeAgent(
    model=AnthropicLLM(model_name="claude-4"),
    max_iterations=20,
    enable_transparency=True,
)

Development Setup

For contributing or running tests:

Terminal
# Install dev dependencies
uv sync --extra all

# Run tests
uv run pytest

# Run linting
make lint

# Type checking
uv run pyright

Next Steps: Explore Examples for real-world testing scenarios, learn about Configuration options, or understand how to Extend Penelope.