Test Result Statistics API
`/test_results/stats` aggregates `test_metrics` for dashboards: multiple data modes, rich filters, and compact payloads.
Overview
Analytics over test results: per-metric and overall pass/fail derived from stored metrics, with filters for runs, entities, time ranges, and more.
Key Features
- Modes to trim payloads (`summary`, `metrics`, etc., as implemented)
- Multi-valued filter parameters where the API supports them
Authentication
All requests require Bearer token authentication:
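A minimal sketch of attaching the Bearer token with the standard library (the host is a placeholder; only the `/test_results/stats` path comes from this document):

```python
from urllib.request import Request

# Hypothetical base URL; substitute your deployment's host.
BASE_URL = "https://api.example.com/test_results/stats"

def build_request(token: str, query: str = "") -> Request:
    """Attach the Bearer token that every request requires."""
    url = BASE_URL + (("?" + query) if query else "")
    return Request(url, headers={"Authorization": f"Bearer {token}"})

req = build_request("my-token", "mode=summary")
print(req.get_header("Authorization"))  # Bearer my-token
```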
Quick Start
Basic Usage
Basic Filtering
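As a sketch of the two starting points above, the query string can be built with `urllib.parse.urlencode` (the host is a placeholder; the `mode` parameter name is an assumption, while the filter names come from the tables below):

```python
from urllib.parse import urlencode

base = "https://api.example.com/test_results/stats"  # placeholder host

# Basic usage: all statistics for the default time window.
basic = base

# Basic filtering: one test run, summary mode only.
filtered = base + "?" + urlencode({"mode": "summary", "test_run_id": "uuid"})
print(filtered)
```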
Data Modes
The endpoint supports configurable data modes for performance optimization:
Performance-Optimized Modes
| Mode | Data Size | Response Time | Use Case |
|---|---|---|---|
| `summary` | ~5% | ~50ms | Dashboard widgets, KPI tracking |
| `metrics` | ~20% | ~100ms | Metric-focused charts |
| `behavior` | ~15% | ~100ms | Behavior performance analysis |
| `category` | ~15% | ~100ms | Category comparison |
| `topic` | ~15% | ~100ms | Topic performance insights |
| `overall` | ~10% | ~75ms | Executive dashboards |
| `timeline` | ~40% | ~150ms | Trend analysis |
| `test_runs` | ~30% | ~125ms | Test run comparison |
| `all` | 100% | ~200-500ms | Comprehensive analytics |
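A small helper can validate the mode before building the query (the mode names come from the table above; the `mode` parameter name and the validation approach are assumptions):

```python
from urllib.parse import urlencode

# Mode names taken from the table above.
VALID_MODES = {"summary", "metrics", "behavior", "category",
               "topic", "overall", "timeline", "test_runs", "all"}

def stats_query(mode: str = "all", **filters) -> str:
    """Build the query string for /test_results/stats; raises on unknown modes."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode}")
    return urlencode({"mode": mode, **filters}, doseq=True)

print(stats_query("summary"))              # mode=summary
print(stats_query("timeline", months=12))  # mode=timeline&months=12
```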
Mode Examples
Summary Mode (Ultra-lightweight)
Returns only:
- `overall_pass_rates`
- `metadata`
Metrics Mode (Individual metric analysis)
Returns only:
- `metric_pass_rates` (Answer Fluency, Answer Relevancy, etc.)
- `metadata`
Timeline Mode (Trend analysis)
Returns only:
- `timeline` (monthly pass/fail trends)
- `metadata`
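As a rough sketch, the three trimmed modes above might return payloads shaped like the following. The top-level section names come from this document; every inner field and value is an illustrative assumption:

```python
# Hypothetical payloads: top-level section names are documented above,
# inner fields and values are illustrative assumptions only.
summary_resp = {
    "overall_pass_rates": {"pass_rate": 0.92},  # assumed inner shape
    "metadata": {"mode": "summary"},
}
metrics_resp = {
    "metric_pass_rates": {"Answer Fluency": 0.95, "Answer Relevancy": 0.88},
    "metadata": {"mode": "metrics"},
}
timeline_resp = {
    "timeline": [{"month": "2024-01", "passed": 100, "failed": 8}],
    "metadata": {"mode": "timeline"},
}
# Every mode returns its own section plus `metadata`.
for resp in (summary_resp, metrics_resp, timeline_resp):
    assert "metadata" in resp
```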
Filtering System
The endpoint supports comprehensive filtering across multiple dimensions:
Test-Level Filters
| Parameter | Type | Description | Example |
|---|---|---|---|
| `test_set_ids` | UUID[] | Filter by test sets | `?test_set_ids=uuid1&test_set_ids=uuid2` |
| `behavior_ids` | UUID[] | Filter by behaviors | `?behavior_ids=uuid1&behavior_ids=uuid2` |
| `category_ids` | UUID[] | Filter by categories | `?category_ids=uuid1&category_ids=uuid2` |
| `topic_ids` | UUID[] | Filter by topics | `?topic_ids=uuid1&topic_ids=uuid2` |
| `status_ids` | UUID[] | Filter by test statuses | `?status_ids=uuid1&status_ids=uuid2` |
| `test_ids` | UUID[] | Filter specific tests | `?test_ids=uuid1&test_ids=uuid2` |
| `test_type_ids` | UUID[] | Filter by test types | `?test_type_ids=uuid1&test_type_ids=uuid2` |
Test Run Filters
| Parameter | Type | Description | Example |
|---|---|---|---|
| `test_run_id` | UUID | Single test run (legacy) | `?test_run_id=uuid` |
| `test_run_ids` | UUID[] | Multiple test runs | `?test_run_ids=uuid1&test_run_ids=uuid2` |
User-Related Filters
| Parameter | Type | Description | Example |
|---|---|---|---|
| `user_ids` | UUID[] | Filter by test creators | `?user_ids=uuid1&user_ids=uuid2` |
| `assignee_ids` | UUID[] | Filter by assignees | `?assignee_ids=uuid1&assignee_ids=uuid2` |
| `owner_ids` | UUID[] | Filter by test owners | `?owner_ids=uuid1&owner_ids=uuid2` |
Other Filters
| Parameter | Type | Description | Example |
|---|---|---|---|
| `prompt_ids` | UUID[] | Filter by prompts | `?prompt_ids=uuid1&prompt_ids=uuid2` |
| `priority_min` | int | Minimum priority (inclusive) | `?priority_min=1` |
| `priority_max` | int | Maximum priority (inclusive) | `?priority_max=5` |
| `tags` | string[] | Filter by tags (AND logic) | `?tags=urgent&tags=regression` |
Date Range Filters
| Parameter | Type | Description | Example |
|---|---|---|---|
| `start_date` | ISO date | Start date (overrides `months`) | `?start_date=2024-01-01` |
| `end_date` | ISO date | End date (overrides `months`) | `?end_date=2024-12-31` |
| `months` | int | Historical months (default: 6) | `?months=12` |
Multiple Value Support
All list-based filters support multiple values using repeated parameters:
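In Python, `urlencode(..., doseq=True)` emits exactly this repeated-parameter form (the filter names come from the tables above):

```python
from urllib.parse import urlencode

# doseq=True expands list values into repeated parameters,
# which is the form the list-based filters expect.
qs = urlencode({"behavior_ids": ["uuid1", "uuid2"],
                "tags": ["urgent", "regression"]}, doseq=True)
print(qs)
# behavior_ids=uuid1&behavior_ids=uuid2&tags=urgent&tags=regression
```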
Multiple Test Runs
Multiple Behaviors/Categories/Topics
Multiple Users/Teams
Multiple Tags
Response Structure
Summary Mode Response
Metrics Mode Response
Complete Mode Sections
| Mode | Response Sections |
|---|---|
| `all` | All sections below |
| `summary` | `overall_pass_rates` + `metadata` |
| `metrics` | `metric_pass_rates` + `metadata` |
| `behavior` | `behavior_pass_rates` + `metadata` |
| `category` | `category_pass_rates` + `metadata` |
| `topic` | `topic_pass_rates` + `metadata` |
| `overall` | `overall_pass_rates` + `metadata` |
| `timeline` | `timeline` + `metadata` |
| `test_runs` | `test_run_summary` + `metadata` |
Performance Optimization
Choose the Right Mode
For Dashboard Widgets
For Specific Analysis
For Time-based Charts
Use Targeted Filters
Reduce Dataset Size
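The three optimization levers above (mode choice, targeted filters, bounded time range) can be sketched as query strings; the parameter names come from the filter tables, while the `mode` name and host-free form are assumptions:

```python
from urllib.parse import urlencode

# Dashboard widget: lightest mode, short window.
widget = urlencode({"mode": "summary", "months": 3})

# Specific analysis: only the metric section, narrowed to one test set.
chart = urlencode({"mode": "metrics", "test_set_ids": ["uuid1"]}, doseq=True)

# Time-based chart: timeline mode over an explicit, bounded date range.
trend = urlencode({"mode": "timeline",
                   "start_date": "2024-01-01", "end_date": "2024-06-30"})
print(widget, chart, trend, sep="\n")
```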
Complete Examples
Dashboard Implementation
Executive Dashboard (Ultra-fast)
Team Performance Dashboard
Analytics Use Cases
Metric Performance Analysis
Behavior Comparison
Category Trends
Advanced Filtering
Urgent Healthcare Tests
Team Regression Analysis
Multi-dimensional Analysis
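A multi-dimensional query combines several filter families in one request; a sketch using the documented filter names (the UUID values are placeholders):

```python
from urllib.parse import urlencode

# Combine behavior, category, tag, priority, and time-range filters.
params = {
    "mode": "all",                     # full payload for deep analysis
    "behavior_ids": ["uuid1"],
    "category_ids": ["uuid2"],
    "tags": ["urgent", "regression"],  # AND logic: both tags must match
    "priority_min": 3,
    "months": 12,
}
qs = urlencode(params, doseq=True)
print(qs)
```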
Best Practices
Performance Best Practices
- Use Specific Modes: Always use the most specific mode for your use case
- Apply Filters: Reduce dataset size with targeted filters
- Limit Time Ranges: Use shorter time periods when possible
- Cache Results: Cache responses for repeated queries
Query Optimization
Good Examples:
Avoid These:
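The contrast can be illustrated with two query strings (parameter names from the tables above; the `mode` name is an assumption):

```python
from urllib.parse import urlencode

# Good: specific mode, narrow filter, short window -> small, fast payload.
good = urlencode({"mode": "summary", "test_run_ids": ["uuid1"], "months": 3},
                 doseq=True)

# Avoid: mode=all with no filters over a long window -> largest, slowest payload.
avoid = urlencode({"mode": "all", "months": 24})
print(good)
print(avoid)
```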
Error Handling
The API provides clear validation errors:
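A sketch of surfacing those validation errors to users. The exact error shape is an assumption here, modeled on the common HTTP 422 `detail`-list convention; check an actual error response for your deployment:

```python
# Assumed error shape (common 422 validation convention, not confirmed
# by this document): a `detail` list of {loc, msg, type} entries.
error_body = {"detail": [{"loc": ["query", "mode"],
                          "msg": "unexpected value", "type": "value_error"}]}

def validation_messages(body: dict) -> list[str]:
    """Flatten a validation-error body into human-readable strings."""
    return [f"{'.'.join(map(str, e['loc']))}: {e['msg']}"
            for e in body.get("detail", [])]

print(validation_messages(error_body))  # ['query.mode: unexpected value']
```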
Integration Tips
React/JavaScript Usage:
Python Usage:
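A minimal stdlib client sketch; the path and Bearer scheme come from this document, while the host, error handling, and method name are placeholders:

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

class StatsClient:
    """Minimal sketch of a /test_results/stats client (host is a placeholder)."""

    def __init__(self, base_url: str, token: str):
        self.base_url = base_url.rstrip("/") + "/test_results/stats"
        self.token = token

    def stats(self, mode: str = "summary", **filters) -> dict:
        """Fetch statistics for the given mode and filters."""
        qs = urlencode({"mode": mode, **filters}, doseq=True)
        req = Request(f"{self.base_url}?{qs}",
                      headers={"Authorization": f"Bearer {self.token}"})
        with urlopen(req) as resp:  # network call; add retries/timeouts as needed
            return json.load(resp)
```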
This endpoint provides enterprise-level analytics capabilities with the flexibility and performance needed for production dashboards and detailed analysis workflows.
Related Documentation
- Test Result Status - How individual test statuses are determined
- Test Run Status - How test run statuses are determined
- Background Tasks - Test execution flow
- Email Notifications - Email system