Test Result Statistics API
This guide covers the test result statistics endpoint, which serves analytics dashboards and performance monitoring. The endpoint offers configurable data modes that limit the response to the sections you need, plus filters across test, test run, user, and date dimensions.
Table of Contents
- Overview
- Quick Start
- Data Modes
- Filtering System
- Multiple Value Support
- Response Structure
- Performance Optimization
- Complete Examples
- Best Practices
Overview
The test result statistics endpoint (/test_results/stats) aggregates test results for performance analysis. It reads each result's test_metrics JSONB data to determine pass/fail status per metric and for the test overall.
Key Features
- Configurable Data Modes: Retrieve only the data you need
- Comprehensive Filtering: Filter by any combination of test attributes
- Multiple Value Support: Query multiple entities simultaneously
- Performance Optimized: Reduce payload size and response times
- React Chart Ready: Structured data for charting libraries
Authentication
All requests require Bearer token authentication:
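As a minimal sketch, the token can be attached with Python's standard library as shown here. The base URL, the token value, and the helper name `build_stats_request` are placeholders for illustration; only the `/test_results/stats` path and the Bearer scheme come from this guide.

```python
import urllib.request

# Hypothetical values: replace with your deployment's base URL and token.
BASE_URL = "https://api.example.com"
API_TOKEN = "your-api-token"


def build_stats_request(base_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request for the stats endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/test_results/stats",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


request = build_stats_request(BASE_URL, API_TOKEN)
```

Passing the resulting object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without a server.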
Quick Start
Basic Usage
Basic Filtering
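A basic filtered query might be assembled as follows. This is a sketch: the host and the UUID are placeholders, and `stats_url` is a hypothetical helper. The `months` and `test_set_ids` parameters are the ones listed in the Filtering System tables.

```python
from urllib.parse import urlencode

STATS_PATH = "/test_results/stats"


def stats_url(base_url: str, **filters) -> str:
    """Return the stats URL with the given filters encoded as a query string."""
    # doseq=True turns list values into repeated parameters.
    query = urlencode(filters, doseq=True)
    return f"{base_url}{STATS_PATH}?{query}" if query else f"{base_url}{STATS_PATH}"


# Placeholder host and UUID, for illustration only.
url = stats_url(
    "https://api.example.com",
    months=3,
    test_set_ids=["11111111-1111-1111-1111-111111111111"],
)
print(url)
```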
Data Modes
The endpoint supports configurable data modes for performance optimization:
Performance-Optimized Modes
| Mode | Data Size (relative to all) | Response Time | Use Case |
|---|---|---|---|
| summary | ~5% | ~50ms | Dashboard widgets, KPI tracking |
| metrics | ~20% | ~100ms | Metric-focused charts |
| behavior | ~15% | ~100ms | Behavior performance analysis |
| category | ~15% | ~100ms | Category comparison |
| topic | ~15% | ~100ms | Topic performance insights |
| overall | ~10% | ~75ms | Executive dashboards |
| timeline | ~40% | ~150ms | Trend analysis |
| test_runs | ~30% | ~125ms | Test run comparison |
| all | 100% | ~200-500ms | Comprehensive analytics |
Mode Examples
Summary Mode (Ultra-lightweight)
Returns only:
- overall_pass_rates
- metadata
Metrics Mode (Individual metric analysis)
Returns only:
- metric_pass_rates (Answer Fluency, Answer Relevancy, etc.)
- metadata
Timeline Mode (Trend analysis)
Returns only:
- timeline (monthly pass/fail trends)
- metadata
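The sketch below validates a mode name against the nine modes in the table before building the query. It assumes the mode is selected with a query parameter named `mode`; that parameter name is not confirmed anywhere in this section, so check it against the live API. `mode_query` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Mode names taken from the Performance-Optimized Modes table.
VALID_MODES = frozenset(
    {"summary", "metrics", "behavior", "category", "topic",
     "overall", "timeline", "test_runs", "all"}
)


def mode_query(mode: str) -> str:
    """Return a query string selecting the given data mode.

    ASSUMPTION: the query parameter is called "mode".
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(VALID_MODES)}")
    return urlencode({"mode": mode})


print(mode_query("timeline"))
```

Rejecting unknown names on the client keeps a typo from turning into a wasted round trip.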
Filtering System
The endpoint supports comprehensive filtering across multiple dimensions:
Test-Level Filters
| Parameter | Type | Description | Example |
|---|---|---|---|
| test_set_ids | UUID[] | Filter by test sets | ?test_set_ids=uuid1&test_set_ids=uuid2 |
| behavior_ids | UUID[] | Filter by behaviors | ?behavior_ids=uuid1&behavior_ids=uuid2 |
| category_ids | UUID[] | Filter by categories | ?category_ids=uuid1&category_ids=uuid2 |
| topic_ids | UUID[] | Filter by topics | ?topic_ids=uuid1&topic_ids=uuid2 |
| status_ids | UUID[] | Filter by test statuses | ?status_ids=uuid1&status_ids=uuid2 |
| test_ids | UUID[] | Filter specific tests | ?test_ids=uuid1&test_ids=uuid2 |
| test_type_ids | UUID[] | Filter by test types | ?test_type_ids=uuid1&test_type_ids=uuid2 |
Test Run Filters
| Parameter | Type | Description | Example |
|---|---|---|---|
| test_run_id | UUID | Single test run (legacy) | ?test_run_id=uuid |
| test_run_ids | UUID[] | Multiple test runs | ?test_run_ids=uuid1&test_run_ids=uuid2 |
User-Related Filters
| Parameter | Type | Description | Example |
|---|---|---|---|
| user_ids | UUID[] | Filter by test creators | ?user_ids=uuid1&user_ids=uuid2 |
| assignee_ids | UUID[] | Filter by assignees | ?assignee_ids=uuid1&assignee_ids=uuid2 |
| owner_ids | UUID[] | Filter by test owners | ?owner_ids=uuid1&owner_ids=uuid2 |
Other Filters
| Parameter | Type | Description | Example |
|---|---|---|---|
| prompt_ids | UUID[] | Filter by prompts | ?prompt_ids=uuid1&prompt_ids=uuid2 |
| priority_min | int | Minimum priority (inclusive) | ?priority_min=1 |
| priority_max | int | Maximum priority (inclusive) | ?priority_max=5 |
| tags | string[] | Filter by tags (AND logic) | ?tags=urgent&tags=regression |
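To make the semantics in this table concrete, the sketch below mirrors them on the client side: both priority bounds are inclusive, and a test matches a tag filter only if it carries every requested tag (AND logic). The `ResultRecord` type and the `matches` function are hypothetical and exist only for illustration; they are not the server's implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ResultRecord:
    """Hypothetical stand-in for a test with a priority and a set of tags."""
    name: str
    priority: int
    tags: frozenset = frozenset()


def matches(
    record: ResultRecord,
    priority_min: Optional[int] = None,
    priority_max: Optional[int] = None,
    tags: Iterable[str] = (),
) -> bool:
    """Apply inclusive priority bounds and AND-combined tags to one record."""
    if priority_min is not None and record.priority < priority_min:
        return False
    if priority_max is not None and record.priority > priority_max:
        return False
    # AND logic: every requested tag must be present on the record.
    return set(tags) <= record.tags


record = ResultRecord("login flow", 3, frozenset({"urgent", "regression", "ui"}))
print(matches(record, priority_min=1, priority_max=5, tags=["urgent", "regression"]))
print(matches(record, tags=["urgent", "security"]))
```

The second call fails because the record lacks `security`, which is the difference between AND and OR matching.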
Date Range Filters
| Parameter | Type | Description | Example |
|---|---|---|---|
| start_date | ISO date | Start date (overrides months) | ?start_date=2024-01-01 |
| end_date | ISO date | End date (overrides months) | ?end_date=2024-12-31 |
| months | int | Historical months (default: 6) | ?months=12 |
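Because an explicit date overrides `months`, a client can avoid sending both. The sketch below builds the date portion of the query that way; `date_params` is a hypothetical helper, and sending the default of 6 explicitly is a choice made here for clarity rather than something the API requires.

```python
from datetime import date
from typing import Dict, Optional, Union

DEFAULT_MONTHS = 6  # default stated in the Date Range Filters table


def date_params(
    start_date: Optional[date] = None,
    end_date: Optional[date] = None,
    months: Optional[int] = None,
) -> Dict[str, Union[str, int]]:
    """Return date-related query parameters, dropping months when a date is given."""
    if start_date is not None or end_date is not None:
        params: Dict[str, Union[str, int]] = {}
        if start_date is not None:
            params["start_date"] = start_date.isoformat()
        if end_date is not None:
            params["end_date"] = end_date.isoformat()
        return params
    return {"months": DEFAULT_MONTHS if months is None else months}


print(date_params())
print(date_params(start_date=date(2024, 1, 1), end_date=date(2024, 12, 31), months=12))
```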
Multiple Value Support
All list-based filters support multiple values using repeated parameters:
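In Python, `urllib.parse.urlencode` produces this repeated-parameter form when called with `doseq=True`. The sketch below encodes two list filters and parses them back; the `uuid1`/`uuid2` values are the same placeholders used in the filter tables, and `encode_filters` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode


def encode_filters(filters: dict) -> str:
    """Encode filters so that each list element becomes its own key=value pair."""
    # Without doseq=True, a list would be encoded as a single string value.
    return urlencode(filters, doseq=True)


query = encode_filters(
    {"test_run_ids": ["uuid1", "uuid2"], "tags": ["urgent", "regression"]}
)
print(query)

# parse_qs groups repeated keys back into lists.
print(parse_qs(query))
```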
Multiple Test Runs
Multiple Behaviors/Categories/Topics
Multiple Users/Teams
Multiple Tags
Response Structure
Summary Mode Response
Metrics Mode Response
Complete Mode Sections
| Mode | Response Sections |
|---|---|
| all | All sections below |
| summary | overall_pass_rates + metadata |
| metrics | metric_pass_rates + metadata |
| behavior | behavior_pass_rates + metadata |
| category | category_pass_rates + metadata |
| topic | topic_pass_rates + metadata |
| overall | overall_pass_rates + metadata |
| timeline | timeline + metadata |
| test_runs | test_run_summary + metadata |
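The mapping in this table can be kept as data and used to sanity-check a response. The sketch below assumes each section appears as a top-level key of the JSON body, which this section does not confirm; `missing_sections` is a hypothetical helper.

```python
# Sections per mode, transcribed from the table above.
SECTIONS_BY_MODE = {
    "summary": {"overall_pass_rates", "metadata"},
    "metrics": {"metric_pass_rates", "metadata"},
    "behavior": {"behavior_pass_rates", "metadata"},
    "category": {"category_pass_rates", "metadata"},
    "topic": {"topic_pass_rates", "metadata"},
    "overall": {"overall_pass_rates", "metadata"},
    "timeline": {"timeline", "metadata"},
    "test_runs": {"test_run_summary", "metadata"},
}
# "all" returns every section listed for the other modes.
SECTIONS_BY_MODE["all"] = set().union(*SECTIONS_BY_MODE.values())


def missing_sections(mode: str, response: dict) -> list:
    """List expected sections absent from a response.

    ASSUMPTION: sections are top-level keys of the decoded JSON body.
    """
    return sorted(SECTIONS_BY_MODE[mode] - set(response))


print(missing_sections("timeline", {"metadata": {}}))
```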
Performance Optimization
Choose the Right Mode
For Dashboard Widgets
For Specific Analysis
For Time-based Charts
Use Targeted Filters
Reduce Dataset Size
Complete Examples
Dashboard Implementation
Executive Dashboard (Ultra-fast)
Team Performance Dashboard
Analytics Use Cases
Metric Performance Analysis
Behavior Comparison
Category Trends
Advanced Filtering
Urgent Healthcare Tests
Team Regression Analysis
Multi-dimensional Analysis
Best Practices
Performance Best Practices
- Use Specific Modes: Always use the most specific mode for your use case
- Apply Filters: Reduce dataset size with targeted filters
- Limit Time Ranges: Use shorter time periods when possible
- Cache Results: Cache responses for repeated queries
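One way to act on the caching advice is a small time-based cache keyed on the query parameters. The sketch below is an assumption-laden illustration, not part of the API: `StatsCache`, its 60-second default, and the injected `fetch` callable are all hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class StatsCache:
    """Cache responses per parameter set for a fixed time-to-live."""

    def __init__(
        self,
        fetch: Callable[[dict], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Tuple, Tuple[float, Any]] = {}

    @staticmethod
    def _key(params: dict) -> Tuple:
        # Sort by parameter name so key order does not create duplicate entries;
        # lists become tuples so the key is hashable.
        return tuple(
            sorted(
                (name, tuple(value) if isinstance(value, (list, tuple)) else value)
                for name, value in params.items()
            )
        )

    def get(self, params: dict) -> Any:
        key = self._key(params)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh
        value = self._fetch(params)
        self._entries[key] = (now, value)
        return value
```

A short TTL suits dashboards that poll frequently; queries over closed historical date ranges could tolerate a much longer one.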
Query Optimization
Good Examples:
Avoid These:
Error Handling
The API provides clear validation errors:
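The server's error payload is not reproduced in this section, so the sketch below takes a complementary approach: it checks a few constraints implied by the filter tables before a request is sent. `validate_params` and its messages are hypothetical and do not mirror the API's own error format.

```python
from datetime import date
from typing import List, Optional


def _parse_iso(name: str, params: dict, errors: List[str]) -> Optional[date]:
    """Parse an ISO date parameter, recording an error if it is malformed."""
    raw = params.get(name)
    if raw is None:
        return None
    try:
        return date.fromisoformat(raw)
    except (TypeError, ValueError):
        errors.append(f"{name} must be an ISO date (YYYY-MM-DD), got {raw!r}")
        return None


def validate_params(params: dict) -> List[str]:
    """Return a list of problems found in the query parameters (empty if none)."""
    errors: List[str] = []

    low, high = params.get("priority_min"), params.get("priority_max")
    if low is not None and high is not None and low > high:
        errors.append("priority_min must not exceed priority_max")

    start = _parse_iso("start_date", params, errors)
    end = _parse_iso("end_date", params, errors)
    if start is not None and end is not None and start > end:
        errors.append("start_date must not be after end_date")

    months = params.get("months")
    if months is not None and months < 1:
        errors.append("months must be a positive integer")

    return errors


print(validate_params({"priority_min": 5, "priority_max": 1}))
```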
Integration Tips
React/JavaScript Usage:
Python Usage:
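A minimal client sketch using only the standard library is shown below. The function name `fetch_stats`, the base URL, and the token are placeholders, and the `opener` argument is an illustrative design choice that lets the function be exercised without a network connection.

```python
import json
import urllib.request
from urllib.parse import urlencode


def fetch_stats(base_url: str, token: str, opener=urllib.request.urlopen, **params):
    """Call /test_results/stats with Bearer auth and return the decoded JSON body.

    List-valued parameters are sent as repeated query parameters.
    """
    query = urlencode(params, doseq=True)
    url = f"{base_url}/test_results/stats" + (f"?{query}" if query else "")
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with opener(request) as response:
        return json.load(response)


# Example call (placeholder host and token; requires a reachable server):
# stats = fetch_stats("https://api.example.com", "your-api-token",
#                     months=3, tags=["urgent", "regression"])
```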
Together, the data modes and filters let this endpoint serve both lightweight production dashboards and detailed analysis workflows.
Related Documentation
- Test Result Status - How individual test statuses are determined
- Test Run Status - How test run statuses are determined
- Background Tasks - Test execution flow
- Email Notifications - Email system