Test Run
A snapshot capturing the complete result of executing a test set against an endpoint, including individual test results, execution metadata, and pass/fail status.
Also known as: test execution, run
Overview
A test run is the complete record of executing a test set against an endpoint at a specific point in time. It captures all results, metrics, and metadata for that execution.
What's Included
Test Results:
- Individual test outcomes (pass/fail)
- Metric scores and reasoning
- AI responses for each test
- Execution time per test
Run Metadata:
- Timestamp of execution
- Endpoint configuration
- Model version
- Environment details
Analytics:
- Overall pass rate
- Metric performance breakdown
- Performance benchmarks
- Comparison to previous runs
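The pieces listed above can be pictured as a single record. The following is a minimal sketch, not the platform's actual schema: the class names `ResultRecord` and `RunRecord`, all field names, and the placeholder endpoint and model version are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ResultRecord:
    """One test's outcome within a run (hypothetical field names)."""
    test_id: str
    passed: bool
    metric_scores: dict[str, float]  # metric name -> score
    response: str                    # the AI response that was evaluated
    duration_s: float                # execution time for this test


@dataclass
class RunRecord:
    """Snapshot of one execution of a test set against an endpoint."""
    executed_at: datetime
    endpoint: str
    model_version: str
    results: list[ResultRecord] = field(default_factory=list)

    def pass_rate(self) -> float:
        """Fraction of tests that passed; 0.0 for an empty run."""
        if not self.results:
            return 0.0
        return sum(r.passed for r in self.results) / len(self.results)

    def mean_metric_scores(self) -> dict[str, float]:
        """Average score per metric across all tests in the run."""
        scores: dict[str, list[float]] = {}
        for result in self.results:
            for name, score in result.metric_scores.items():
                scores.setdefault(name, []).append(score)
        return {name: sum(vals) / len(vals) for name, vals in scores.items()}


run = RunRecord(
    executed_at=datetime.now(timezone.utc),
    endpoint="https://example.invalid/chat",  # placeholder endpoint
    model_version="model-v1",                 # placeholder version label
    results=[
        ResultRecord("t1", True, {"accuracy": 1.0}, "Paris", 0.4),
        ResultRecord("t2", False, {"accuracy": 0.5}, "Lyon", 0.6),
    ],
)
print(run.pass_rate())           # 0.5
print(run.mean_metric_scores())  # {'accuracy': 0.75}
```

Keeping the per-test results inside the run record means the analytics (pass rate, per-metric averages) can always be recomputed from the snapshot itself.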
Use Cases
CI/CD Integration: Execute test sets from your CI/CD pipeline using the SDK, then compare runs over time to identify regressions, track improvement trends, and validate fixes.
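As a rough illustration of comparing two runs in a pipeline, the sketch below works on plain dictionaries that map a test identifier to its pass/fail outcome. It does not use the SDK; the function names and the sample test identifiers are hypothetical, and how the outcomes are fetched from real runs is left out.

```python
def find_regressions(baseline: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """Tests that passed in the baseline run but fail in the current run."""
    return sorted(
        test_id
        for test_id, passed in current.items()
        if not passed and baseline.get(test_id, False)
    )


def find_fixes(baseline: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """Tests that failed in the baseline run but pass in the current run."""
    return sorted(
        test_id
        for test_id, passed in current.items()
        if passed and baseline.get(test_id) is False
    )


def ci_exit_code(baseline: dict[str, bool], current: dict[str, bool]) -> int:
    """Return 1 (fail the pipeline) if any regression exists, else 0."""
    return 1 if find_regressions(baseline, current) else 0


# Hypothetical outcomes from a previous run and the run under review.
baseline = {"greeting": True, "refund-policy": True, "pricing": False}
current = {"greeting": True, "refund-policy": False, "pricing": True}

print(find_regressions(baseline, current))  # ['refund-policy']
print(find_fixes(baseline, current))        # ['pricing']
print(ci_exit_code(baseline, current))      # 1
```

A pipeline step could pass the value of `ci_exit_code` to `sys.exit` so that a regression blocks the build while new fixes are only reported.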
Debugging:
- Review failed tests
- Analyze AI responses
- Understand metric scores
- Reproduce issues
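A debugging pass over a run might start by listing only the failed tests, each with its weakest metric and the response that was scored. The sketch below assumes each result is a dictionary with the hypothetical keys `test_id`, `passed`, `metric_scores`, and `response`; real result objects may be shaped differently.

```python
def summarize_failures(results: list[dict]) -> list[str]:
    """Return one line per failed test, naming its lowest-scoring metric."""
    lines = []
    for result in results:
        if result["passed"]:
            continue
        scores = result["metric_scores"]
        if scores:
            metric, score = min(scores.items(), key=lambda item: item[1])
            detail = f"lowest metric {metric}={score:.2f}"
        else:
            detail = "no metric scores recorded"
        lines.append(f'{result["test_id"]}: {detail}; response={result["response"]!r}')
    return lines


# Hypothetical results taken from one run.
results = [
    {"test_id": "t1", "passed": True, "metric_scores": {"accuracy": 0.9}, "response": "Paris"},
    {"test_id": "t2", "passed": False, "metric_scores": {"accuracy": 0.2, "tone": 0.8}, "response": "N/A"},
]

for line in summarize_failures(results):
    print(line)  # t2: lowest metric accuracy=0.20; response='N/A'
```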
Best Practices
- Run regularly: Establish a baseline with consistent testing
- Tag runs: Use metadata to identify versions or features
- Review failures: Investigate why tests fail, not just that they failed
- Track trends: Look for patterns across multiple runs
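Tagging and trend tracking can be combined in a simple way: label each run, then look at how the pass rate moves from one run to the next. The sketch below assumes runs are supplied in chronological order as `(tag, passed, total)` tuples; the tuple layout, the function name, and the version tags are illustrative assumptions.

```python
from typing import Optional


def pass_rate_trend(
    runs: list[tuple[str, int, int]],
) -> list[tuple[str, float, Optional[float]]]:
    """For each run, return (tag, pass rate, change versus the previous run)."""
    trend = []
    previous_rate = None
    for tag, passed, total in runs:
        rate = passed / total if total else 0.0
        delta = None if previous_rate is None else rate - previous_rate
        trend.append((tag, rate, delta))
        previous_rate = rate
    return trend


# Hypothetical runs tagged by release version, oldest first.
history = [("v1.0", 2, 4), ("v1.1", 3, 4), ("v1.2", 1, 4)]

for tag, rate, delta in pass_rate_trend(history):
    change = "n/a" if delta is None else f"{delta:+.2f}"
    print(f"{tag}: pass rate {rate:.2f} (change {change})")
```

A sustained negative change across several tagged runs is usually a stronger signal than a single failing run.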