Getting Started

Install Penelope and run your first test in minutes.

Installation

Penelope is part of the Rhesis monorepo and uses uv for dependency management.

Prerequisites

Python 3.10+
uv package manager
RHESIS_API_KEY for accessing Rhesis endpoints (get your key from Rhesis App → Settings)
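
The first two prerequisites can be checked from Python before cloning anything. This is a minimal sketch, not part of Penelope: the helper name missing_prerequisites is hypothetical, and it assumes only that uv is discoverable on PATH under the name "uv".

```python
import shutil
import sys


def missing_prerequisites(version=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all good.

    Hypothetical helper for illustration, not shipped with Penelope.
    """
    version = sys.version_info[:2] if version is None else version
    problems = []
    # Penelope's stated minimum is Python 3.10
    if tuple(version) < (3, 10):
        problems.append(f"Python 3.10+ required, found {version[0]}.{version[1]}")
    # Assumption: the uv executable is installed under the name "uv"
    if which("uv") is None:
        problems.append("uv not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

The version and which parameters exist only so the check can be exercised without changing the real environment.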

Install Penelope

Clone the Repository

Terminal
git clone https://github.com/rhesis-ai/rhesis.git
cd rhesis/penelope

Install Dependencies

Terminal
uv sync

This automatically installs Penelope and the local SDK from ../sdk.

Set Up Authentication

Terminal
export RHESIS_API_KEY="rh-your-api-key"

Get your API key from Rhesis App → Settings. See the SDK Authentication Guide for details.
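
A script can fail early with a clear message when the key is missing, rather than surfacing an authentication error mid-test. The sketch below assumes only that the key lives in the RHESIS_API_KEY environment variable, as exported above. The function require_api_key is a hypothetical helper, not an SDK call.

```python
import os


def require_api_key(env=None):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration, not part of the Rhesis SDK.
    """
    env = os.environ if env is None else env
    key = env.get("RHESIS_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "RHESIS_API_KEY is not set; export it before running Penelope."
        )
    return key
```

Calling require_api_key() at the top of a test script stops the run before any agent or target is created.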

Verify Installation

Terminal
cd examples
uv run python basic_example.py --help

Your First Test

Now let’s run a simple test against a conversational AI system.

basic_test.py
from rhesis.penelope import EndpointTarget, PenelopeAgent

# Initialize Penelope with defaults (Vertex AI / gemini-2.0-flash, 10 max iterations)
agent = PenelopeAgent(
    enable_transparency=True,  # Show reasoning at each step
    verbose=True,
)

# Create target
target = EndpointTarget(endpoint_id="your-endpoint-id")

# Execute a test - Penelope plans its own approach
result = agent.execute_test(
    target=target,
    goal="Verify chatbot can answer 3 questions about insurance policies while maintaining context",
)

print(f"Goal achieved: {result.goal_achieved}")
print(f"Turns used: {result.turns_used}")

Run It

Terminal
export RHESIS_API_KEY="rh-your-api-key"
cd rhesis/penelope/examples
uv run python basic_example.py --endpoint-id <your-endpoint-id>

Two Testing Approaches

Simple: Goal Only

Let Penelope plan the testing approach:

simple_test.py
result = agent.execute_test(
    target=target,
    goal="Test if the chatbot maintains context across 5 turns",
)

Best for: Exploratory testing, straightforward capability tests, basic quality checks

Detailed: Goal + Instructions

Provide specific testing methodology:

detailed_test.py
result = agent.execute_test(
    target=target,
    goal="Verify GDPR compliance in data handling",
    instructions="""
    1. Ask what data is being collected
    2. Try to provide personal data
    3. Check if explicit consent is requested
    4. Verify data minimization principles
    """,
    context={
        "regulation": "GDPR",
        "focus": "consent management",
    },
)

Best for: Security testing, compliance verification, complex multi-phase scenarios
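
When test steps come from a list, for example a shared checklist, the numbered instructions string shown above can be generated rather than typed by hand. This is a minimal sketch: build_instructions is a hypothetical helper, and it assumes the instructions parameter accepts plain numbered text like the example above.

```python
def build_instructions(steps):
    """Join test steps into a numbered, newline-separated instruction string.

    Hypothetical helper for illustration; blank entries are skipped.
    """
    cleaned = [s.strip() for s in steps if s and s.strip()]
    return "\n".join(f"{i}. {step}" for i, step in enumerate(cleaned, start=1))


steps = [
    "Ask what data is being collected",
    "Try to provide personal data",
    "Check if explicit consent is requested",
    "Verify data minimization principles",
]
instructions = build_instructions(steps)
print(instructions)
```

The resulting string can be passed as instructions=instructions in the execute_test call shown above.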

Accessing Results

results.py
# Test outcome
print(f"Status: {result.status}")  # success, failure, error, timeout
print(f"Goal achieved: {result.goal_achieved}")
print(f"Duration: {result.duration_seconds}s")

# Key findings
for finding in result.findings:
    print(f"- {finding}")

# Full conversation history
for turn in result.history:
    print(f"Turn {turn.turn_number}: {turn.action}")
    print(f"Reasoning: {turn.reasoning}")
    print(f"Result: {turn.action_output}")
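
For logs or CI output, the fields printed above can be condensed into one line. The sketch below runs without Penelope installed. FakeResult and FakeTurn are hypothetical stand-ins that mirror only the attributes used in results.py, and the action name "send_message" is made up. The real status value may be an enum rather than a plain string.

```python
from dataclasses import dataclass, field


@dataclass
class FakeTurn:
    # Hypothetical stand-in mirroring the turn attributes printed above
    turn_number: int
    action: str
    reasoning: str = ""
    action_output: str = ""


@dataclass
class FakeResult:
    # Hypothetical stand-in mirroring the result attributes printed above
    status: str
    goal_achieved: bool
    duration_seconds: float
    findings: list = field(default_factory=list)
    history: list = field(default_factory=list)


def summarize(result):
    """Build a one-line summary from the attributes shown in results.py."""
    outcome = "PASS" if result.goal_achieved else "FAIL"
    return (
        f"{outcome} status={result.status} "
        f"turns={len(result.history)} "
        f"findings={len(result.findings)} "
        f"duration={result.duration_seconds:.1f}s"
    )


demo = FakeResult(
    status="success",
    goal_achieved=True,
    duration_seconds=12.34,
    findings=["Context kept across turns"],
    history=[FakeTurn(1, "send_message"), FakeTurn(2, "send_message")],
)
print(summarize(demo))
```

Because summarize only reads attributes, it should work with a real result object as long as those attribute names match the ones shown above.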

Custom Model Configuration

Override the default Vertex AI model:

custom_model.py
from rhesis.sdk.models import AnthropicLLM

agent = PenelopeAgent(
    model=AnthropicLLM(model_name="claude-4"),
    max_iterations=20,
    enable_transparency=True,
)

Development Setup

For contributing or running tests:

Terminal
# Install dev dependencies
uv sync --extra all

# Run tests
uv run pytest

# Run linting
make lint

# Type checking
uv run pyright

Next Steps: Explore Examples for real-world testing scenarios, learn about Configuration options, or understand how to Extend Penelope.