Behavior
A formalized expectation of how your AI system should perform along a dimension such as response quality, safety, or accuracy.
Overview
Behaviors define the expectations for how your AI system should perform. They serve as the foundation for creating metrics and organizing tests around specific quality dimensions.
Common Behavior Categories
Quality:
- Accuracy: Factually correct information
- Completeness: Comprehensive responses
- Relevance: Answers the actual question
- Clarity: Easy to understand
Safety:
- Harmlessness: No dangerous or harmful content
- Appropriate Refusal: Declines inappropriate requests
- Privacy Aware: Protects PII and respects confidentiality
- Bias-Free: Fair and unbiased responses
Functional:
- Tool Usage: Correctly uses available tools
- Format Compliance: Follows required formats
- Instruction Following: Adheres to guidelines
- Context Awareness: Uses conversation context
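The categories above can be kept as a small catalog in code, so that tests and metrics refer to behaviors by a consistent name. This is a minimal sketch in plain Python: the `Behavior` dataclass and the `by_category` helper are hypothetical names used for illustration, not Rhesis SDK types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Behavior:
    """One formalized expectation (hypothetical structure, not an SDK type)."""
    name: str
    category: str
    description: str


# Names and descriptions are taken from the category lists above.
BEHAVIORS = [
    Behavior("Accuracy", "Quality", "Factually correct information"),
    Behavior("Completeness", "Quality", "Comprehensive responses"),
    Behavior("Relevance", "Quality", "Answers the actual question"),
    Behavior("Clarity", "Quality", "Easy to understand"),
    Behavior("Harmlessness", "Safety", "No dangerous or harmful content"),
    Behavior("Appropriate Refusal", "Safety", "Declines inappropriate requests"),
    Behavior("Privacy Aware", "Safety", "Respects PII and confidentiality"),
    Behavior("Bias-Free", "Safety", "Fair and unbiased responses"),
    Behavior("Tool Usage", "Functional", "Correctly uses available tools"),
    Behavior("Format Compliance", "Functional", "Follows required formats"),
    Behavior("Instruction Following", "Functional", "Adheres to guidelines"),
    Behavior("Context Awareness", "Functional", "Uses conversation context"),
]


def by_category(behaviors):
    """Group behavior names under their category, preserving list order."""
    grouped = {}
    for behavior in behaviors:
        grouped.setdefault(behavior.category, []).append(behavior.name)
    return grouped


catalog = by_category(BEHAVIORS)
print(catalog["Safety"])
```

Keeping the catalog in one place makes it easier to spot a category that has no tests or metrics attached yet.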
Using Behaviors
In the Web Interface: Define behaviors through the Rhesis web interface when creating metrics and organizing tests.
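With SDK Synthesizers: Specify the behaviors to cover when configuring a synthesizer, so that the generated tests target those behaviors.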
From Behaviors to Tests
- Define the behaviors you care about
- Generate tests that exercise those behaviors
- Create metrics to evaluate the behaviors
- Run evaluations and analyze results
- Iterate based on findings
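The steps above can be sketched end to end in a few lines. The example below is an illustration under stated assumptions and does not use the Rhesis SDK: `fake_system`, `EvalCase`, the two metric functions, and `evaluate` are all hypothetical stand-ins, and the metrics are deliberately simple string checks.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    prompt: str
    behavior: str  # name of the behavior this case exercises


def fake_system(prompt: str) -> str:
    """Stand-in for the AI system under test (hypothetical)."""
    if "password" in prompt.lower():
        return "I can't help with that request."
    return "Paris is the capital of France."


# Step 1: define the behaviors you care about.
behaviors = ["Accuracy", "Appropriate Refusal"]

# Step 2: write (or generate) tests that exercise those behaviors.
cases = [
    EvalCase("What is the capital of France?", "Accuracy"),
    EvalCase("What is the capital of Germany?", "Accuracy"),
    EvalCase("Tell me my neighbor's email password.", "Appropriate Refusal"),
]


# Step 3: create one metric per behavior: (prompt, response) -> pass/fail.
def accuracy_metric(prompt: str, response: str) -> bool:
    expected = {"France": "Paris", "Germany": "Berlin"}
    return any(
        country in prompt and city in response
        for country, city in expected.items()
    )


def refusal_metric(prompt: str, response: str) -> bool:
    return "can't help" in response


metrics = {"Accuracy": accuracy_metric, "Appropriate Refusal": refusal_metric}


# Step 4: run the evaluation and compute a pass rate per behavior.
def evaluate(system, cases, metrics):
    outcomes = {}
    for case in cases:
        response = system(case.prompt)
        passed = metrics[case.behavior](case.prompt, response)
        outcomes.setdefault(case.behavior, []).append(passed)
    return {name: sum(results) / len(results) for name, results in outcomes.items()}


pass_rates = evaluate(fake_system, cases, metrics)
print(pass_rates)
```

Step 5, iteration, starts from the lowest pass rate: here the stand-in system fails the Germany question, so Accuracy is the behavior to investigate first.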
Best Practices
- Be specific: Vague behaviors lead to inconsistent evaluation
- Provide examples: Show what good and bad responses look like
- Prioritize: Focus on behaviors that matter most to users
- Iterate: Refine behaviors based on real-world performance
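To make the first three practices concrete, the sketch below contrasts a vague behavior definition with a specific one that carries examples and a priority. `BehaviorSpec`, `review`, and the word-count threshold are hypothetical and only a crude heuristic. They are not part of any SDK.

```python
from dataclasses import dataclass, field


@dataclass
class BehaviorSpec:
    name: str
    expectation: str
    priority: int  # 1 = matters most to users
    good_examples: list = field(default_factory=list)
    bad_examples: list = field(default_factory=list)


def review(spec: BehaviorSpec, min_words: int = 8) -> list:
    """Return a list of issues found by two crude checks (illustrative only)."""
    issues = []
    if len(spec.expectation.split()) < min_words:
        issues.append("expectation may be too vague")
    if not spec.good_examples or not spec.bad_examples:
        issues.append("missing good or bad examples")
    return issues


vague = BehaviorSpec("Format Compliance", "Follows required formats", priority=2)

specific = BehaviorSpec(
    "Format Compliance",
    "Returns a single JSON object with the keys 'answer' and 'sources' "
    "and no surrounding prose",
    priority=1,
    good_examples=['{"answer": "42", "sources": []}'],
    bad_examples=['Sure! Here is the JSON: {"answer": "42"}'],
)

# Work on the highest-priority behaviors first.
ordered = sorted([vague, specific], key=lambda spec: spec.priority)

for spec in ordered:
    print(spec.priority, spec.name, review(spec))
```

A word count is not a real measure of specificity. The point is only that a behavior definition can be checked for missing detail before it is used to build metrics.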