Multi-Turn Test

Goal-based conversation tests that evaluate your AI system across multiple turns, powered by Penelope.

Also known as: multi turn, conversational test

Overview

Multi-turn tests evaluate conversational AI systems through goal-oriented dialogues. Powered by Penelope, these tests adapt their strategy based on your AI's responses, testing complex scenarios that require multiple exchanges.

How It Works

  1. Goal Definition: Define what the test should achieve
  2. Adaptive Conversation: Penelope conducts a natural dialogue
  3. Context Tracking: Maintains conversation state across turns
  4. Goal Assessment: Evaluates if the objective was met
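The four steps above can be sketched as a plain loop. This is a minimal illustration, not Penelope's implementation: `run_multi_turn`, `toy_bot`, `toy_tester`, and `booking_confirmed` are hypothetical names, and the tester here follows fixed rules instead of adapting with a model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (tester message, system reply)


@dataclass
class ConversationResult:
    goal_achieved: bool
    turns_used: int
    history: List[Turn] = field(default_factory=list)


def run_multi_turn(
    target: Callable[[List[Turn], str], str],
    next_message: Callable[[List[Turn]], str],
    goal_met: Callable[[List[Turn]], bool],
    max_turns: int = 10,
) -> ConversationResult:
    """Drive a conversation until the goal check passes or turns run out."""
    history: List[Turn] = []
    for turn in range(1, max_turns + 1):
        message = next_message(history)    # step 2: pick the next message from prior replies
        reply = target(history, message)   # send it to the system under test
        history.append((message, reply))   # step 3: keep state across turns
        if goal_met(history):              # step 4: assess the goal
            return ConversationResult(True, turn, history)
    return ConversationResult(False, max_turns, history)


# Hypothetical system under test: asks for a city, then nights, then confirms.
def toy_bot(history: List[Turn], message: str) -> str:
    seen = (" ".join(m for m, _ in history) + " " + message).lower()
    if "paris" not in seen:
        return "Which city?"
    if "nights" not in seen:
        return "How many nights?"
    return "Booking confirmed."


# Hypothetical tester: answers whatever the bot asked last.
def toy_tester(history: List[Turn]) -> str:
    if not history:
        return "I want to book a hotel room."  # step 1: the goal drives the opening message
    last_reply = history[-1][1].lower()
    if "city" in last_reply:
        return "Paris, please."
    if "nights" in last_reply:
        return "3 nights."
    return "Can you confirm?"


def booking_confirmed(history: List[Turn]) -> bool:
    return "confirmed" in history[-1][1].lower()


result = run_multi_turn(toy_bot, toy_tester, booking_confirmed, max_turns=10)
print(result.goal_achieved, result.turns_used)
```

Passing the target, the message strategy, and the goal check as separate callables keeps the loop itself independent of any particular system or goal.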

Use Cases

Customer Support:

  • Test problem resolution workflows
  • Verify information gathering
  • Check escalation handling

E-commerce:

  • Evaluate product discovery
  • Test personalization
  • Verify upsell appropriateness

Technical Assistance:

  • Multi-step troubleshooting
  • Iterative refinement
  • Context-dependent responses

Example with Penelope

```python
from rhesis.penelope import PenelopeAgent, EndpointTarget

# Initialize Penelope
agent = PenelopeAgent()

# Create target (your AI endpoint)
target = EndpointTarget(endpoint_id="my-chatbot")

# Execute a multi-turn test
result = agent.execute_test(
    target=target,
    goal="Book a hotel room for 2 adults in Paris for 3 nights",
    max_iterations=10,
)

print(f"Goal achieved: {result.goal_achieved}")
print(f"Turns used: {result.turns_used}")
```
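When several goals need to be checked against the same endpoint, the call shown in the example can be wrapped in a small batch helper. The sketch below is an assumption-laden illustration: `run_goals` and `pass_rate` are hypothetical helpers, and `StubAgent` is a stand-in that only mimics the `execute_test(target=..., goal=..., max_iterations=...)` call and the `goal_achieved` / `turns_used` fields from the example, so the block runs without the library or a live endpoint.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Tuple


@dataclass
class StubResult:
    goal_achieved: bool
    turns_used: int


class StubAgent:
    """Hypothetical stand-in: pretends each word of the goal costs one turn."""

    def execute_test(self, target: Any, goal: str, max_iterations: int) -> StubResult:
        needed = len(goal.split())
        return StubResult(
            goal_achieved=needed <= max_iterations,
            turns_used=min(needed, max_iterations),
        )


def run_goals(
    agent: Any, target: Any, goals: Iterable[str], max_iterations: int = 10
) -> List[Tuple[str, bool, int]]:
    """Run one multi-turn test per goal and collect (goal, achieved, turns)."""
    rows = []
    for goal in goals:
        result = agent.execute_test(
            target=target, goal=goal, max_iterations=max_iterations
        )
        rows.append((goal, result.goal_achieved, result.turns_used))
    return rows


def pass_rate(rows: List[Tuple[str, bool, int]]) -> float:
    """Fraction of goals achieved; 0.0 for an empty run."""
    if not rows:
        return 0.0
    return sum(1 for _, achieved, _ in rows if achieved) / len(rows)


rows = run_goals(
    StubAgent(),
    target=None,  # a real run would pass the endpoint target here
    goals=[
        "Book a hotel room",
        "Cancel my booking and request a full refund today",
    ],
    max_iterations=5,
)
for goal, achieved, turns in rows:
    print(f"{goal!r}: achieved={achieved}, turns={turns}")
print(f"Pass rate: {pass_rate(rows):.0%}")
```

With a real agent, the same helper would accept the `agent` and `target` objects from the example in place of the stub.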

Key Differences from Single-Turn

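| Aspect         | Single-Turn  | Multi-Turn              |
| -------------- | ------------ | ----------------------- |
| Conversation   | One exchange | Multiple exchanges      |
| Context        | None         | Maintained across turns |
| Complexity     | Simple       | Complex scenarios       |
| Execution Time | Fast         | Slower                  |
| Use Case       | Quick checks | Workflow testing        |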

Best Practices

  • Clear goals: Define specific, measurable objectives
  • Reasonable scope: Limit turns to 5-15 for most tests
  • Edge cases: Test conversation recovery and clarification
  • Combine with single-turn: Use both types for comprehensive coverage
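The first two practices can be checked mechanically before a test runs. The following is a rough sketch under stated assumptions: `MultiTurnSpec` and `review_spec` are hypothetical names, the word-count check is a crude proxy for goal specificity, and the 5-15 range is taken from the guideline above, not from any library default.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MultiTurnSpec:
    goal: str
    max_iterations: int


def review_spec(
    spec: MultiTurnSpec,
    min_turns: int = 5,
    max_turns: int = 15,
    min_goal_words: int = 4,
) -> List[str]:
    """Return warnings for a spec that strays from the guidelines."""
    warnings = []
    # Crude heuristic: very short goals are usually too vague to assess.
    if len(spec.goal.split()) < min_goal_words:
        warnings.append("goal may be too vague to assess")
    # Scope check against the suggested turn range.
    if not (min_turns <= spec.max_iterations <= max_turns):
        warnings.append(
            f"max_iterations {spec.max_iterations} is outside {min_turns}-{max_turns}"
        )
    return warnings


good = MultiTurnSpec("Book a hotel room for 2 adults in Paris for 3 nights", 10)
vague = MultiTurnSpec("Help me", 40)

print(review_spec(good))
print(review_spec(vague))
```

A check like this only flags likely problems; whether a goal is truly measurable still needs a human read.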
