Telemetry System
This page provides technical details about Rhesis’s telemetry architecture for developers and contributors.
Privacy First: All telemetry is optional, privacy-focused, and transparent. User and organization IDs are hashed using SHA-256, and sensitive data like passwords and API keys are automatically filtered out.
Overview
Rhesis includes a privacy-focused telemetry system to collect and analyze usage patterns from both cloud-hosted and self-hosted instances. This system uses OpenTelemetry (OTEL) for distributed tracing and metrics collection.
Key Principles:
- Enabled by default (opt-out) for self-hosted deployments
- Always enabled for cloud deployments (user consent via Terms & Conditions)
- Privacy-first: All user/org IDs are hashed (SHA-256)
- Automatic filtering: Sensitive data (passwords, tokens, PII) is never collected
- Transparent: All collected data types are documented
Architecture Overview
The telemetry system consists of three main components working together:
Components
- OpenTelemetry Collector - Receives telemetry from user instances, filters sensitive data, and forwards to the processor
- Telemetry Processor - gRPC service that processes traces and stores structured analytics data
- Analytics Database - Separate PostgreSQL database for analytics, isolated from operational data
What Data We Collect
The telemetry system collects usage patterns to help us understand how Rhesis is used and identify areas for improvement.
User Activity
- Login/logout events
- Session duration
- Deployment type (cloud or self-hosted)
- Hashed user and organization IDs (SHA-256, irreversible)
Endpoint Usage
- API endpoint paths and HTTP methods
- Response status codes
- Request duration (performance metrics)
- Timestamp of requests
Feature Usage
- Feature interactions (created, viewed, updated, deleted)
- Feature names (e.g., “test-run”, “test-set”, “endpoint”)
- Usage timestamps
- Deployment context
What We DON’T Collect
Privacy Protection: The following data is automatically filtered and NEVER stored:
- ❌ Passwords or password hashes
- ❌ API keys or tokens
- ❌ Authentication credentials
- ❌ Personally Identifiable Information (PII)
- ❌ Test content or user-generated data
- ❌ Email addresses or usernames
- ❌ IP addresses or device identifiers
- ❌ Any sensitive business data
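
The filtering idea above can be illustrated with a minimal sketch. The real filtering happens in the OpenTelemetry Collector, so the Python below only mimics the principle; the function name `filter_sensitive` and the deny-list of key fragments are assumptions for illustration, not Rhesis code.

```python
import re

# Hypothetical deny-list of attribute-name fragments; the real collector rules may differ.
SENSITIVE_PATTERN = re.compile(
    r"(password|passwd|secret|token|api[_-]?key|authorization|email|ip[_-]?address)",
    re.IGNORECASE,
)


def filter_sensitive(attributes: dict) -> dict:
    """Return a copy of the span attributes with sensitive-looking keys dropped."""
    return {
        key: value
        for key, value in attributes.items()
        if not SENSITIVE_PATTERN.search(key)
    }


if __name__ == "__main__":
    span_attributes = {
        "http.method": "POST",
        "http.status_code": 200,
        "user.password": "hunter2",
        "auth.api_key": "example-key",
    }
    # Only the two http.* attributes survive the filter.
    print(filter_sensitive(span_attributes))
```

Matching on key names rather than values keeps the check cheap, but it relies on instrumentation using recognizable attribute names.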
ID Hashing
All user and organization IDs are one-way hashed before storage:
Properties:
- ✅ One-way: Cannot recover original IDs
- ✅ Consistent: Same ID always produces the same hash for analytics
- ✅ Anonymous: No PII stored
- ✅ Collision-resistant: 2^64 possible hash values
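
As a rough sketch of the hashing step, the snippet below applies SHA-256 and truncates the hex digest. The function name `hash_id` and the 16-character truncation are assumptions: 16 hex characters carry 64 bits, which is consistent with the 2^64 figure above, but the actual implementation may differ.

```python
import hashlib


def hash_id(raw_id: str, length: int = 16) -> str:
    """One-way hash an ID with SHA-256 and keep the first `length` hex characters.

    length=16 is an assumption: 16 hex characters carry 64 bits of the digest.
    """
    digest = hashlib.sha256(raw_id.encode("utf-8")).hexdigest()
    return digest[:length]


if __name__ == "__main__":
    first = hash_id("user-123")
    second = hash_id("user-123")
    # Consistent: the same input always yields the same hash.
    print(first == second)
    print(len(first))
```

Because the hash is unsalted and deterministic, the same ID can be correlated across events for analytics without the original value being stored.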
Privacy & Security
Opt-In/Opt-Out
- Telemetry respects user preferences
- Can be disabled entirely for self-hosted instances
- Collection is controlled by explicit configuration (OTEL_RHESIS_TELEMETRY_ENABLED)
Data Isolation
- Analytics database is completely separate from operational data
- Different access controls and backup policies
- Can be managed independently
- No impact on application performance
Security Measures
- Sensitive attributes filtered at the collector level
- Batch processing with retry logic for reliability
- Memory limits to prevent resource exhaustion
- Health checks and monitoring built-in
Ports & Endpoints
OpenTelemetry Collector
- 4317: OTLP gRPC receiver (primary)
- 4318: OTLP HTTP receiver (web apps)
- 8888: Collector metrics (Prometheus)
- 13133: Health check endpoint
- 55679: Debug zpages
Telemetry Processor
- 4317: gRPC server for receiving traces from collector
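
To check that a collector is reachable, a small probe against the health check port can help. This is a sketch under assumptions: the host `localhost` and the root path `/` are not specified on this page, and the helper names are hypothetical.

```python
import urllib.error
import urllib.request

# Port numbers come from the list above; the host is an assumption.
COLLECTOR_PORTS = {
    "otlp_grpc": 4317,
    "otlp_http": 4318,
    "metrics": 8888,
    "health": 13133,
    "zpages": 55679,
}


def health_url(host: str = "localhost") -> str:
    """Build the collector health check URL (root path assumed)."""
    return f"http://{host}:{COLLECTOR_PORTS['health']}/"


def collector_is_healthy(host: str = "localhost", timeout: float = 2.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(host), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False


if __name__ == "__main__":
    print(collector_is_healthy())
```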
Database Schema
The analytics database uses three tables with consistent structure:
user_activity
Tracks user engagement events
| Column | Type | Description |
|---|---|---|
| id | UUID | Primary key |
| user_id | VARCHAR(32) | Hashed user ID |
| organization_id | VARCHAR(32) | Hashed org ID |
| event_type | VARCHAR(50) | Event type (login, logout) |
| timestamp | TIMESTAMP | Event time |
| session_id | VARCHAR(255) | Session identifier |
| deployment_type | VARCHAR(50) | cloud / self-hosted |
| event_metadata | JSONB | Additional context |
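
For readers who work with these rows in application code, the user_activity columns could be mirrored roughly as follows. The class name `UserActivity` and the length checks are illustrative assumptions, not the processor's actual model.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class UserActivity:
    """In-memory mirror of one user_activity row (illustrative only)."""

    user_id: str            # hashed user ID, VARCHAR(32)
    organization_id: str    # hashed org ID, VARCHAR(32)
    event_type: str         # e.g. "login" or "logout", VARCHAR(50)
    session_id: str         # VARCHAR(255)
    deployment_type: str    # "cloud" or "self-hosted"
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    event_metadata: dict[str, Any] = field(default_factory=dict)  # JSONB
    id: uuid.UUID = field(default_factory=uuid.uuid4)

    def __post_init__(self) -> None:
        # Enforce the column widths listed in the table above.
        for name in ("user_id", "organization_id"):
            if len(getattr(self, name)) > 32:
                raise ValueError(f"{name} exceeds VARCHAR(32)")
        if len(self.event_type) > 50:
            raise ValueError("event_type exceeds VARCHAR(50)")
```

The endpoint_usage and feature_usage tables share the ID, timestamp, deployment, and metadata columns, so the same pattern would carry over.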
endpoint_usage
Tracks API usage and performance
| Column | Type | Description |
|---|---|---|
| id | UUID | Primary key |
| endpoint | VARCHAR(255) | API endpoint path |
| method | VARCHAR(10) | HTTP method |
| user_id | VARCHAR(32) | Hashed user ID |
| organization_id | VARCHAR(32) | Hashed org ID |
| status_code | INTEGER | HTTP status |
| duration_ms | DOUBLE PRECISION | Request duration (ms) |
| timestamp | TIMESTAMP | Request time |
| deployment_type | VARCHAR(50) | cloud / self-hosted |
| event_metadata | JSONB | Additional context |
feature_usage
Tracks feature-specific interactions
| Column | Type | Description |
|---|---|---|
| id | UUID | Primary key |
| feature_name | VARCHAR(100) | Feature identifier |
| user_id | VARCHAR(32) | Hashed user ID |
| organization_id | VARCHAR(32) | Hashed org ID |
| action | VARCHAR(100) | Action type |
| timestamp | TIMESTAMP | Action time |
| deployment_type | VARCHAR(50) | cloud / self-hosted |
| event_metadata | JSONB | Additional context |
Configuration
OpenTelemetry Collector
Configured via apps/otel-collector/otel-collector-config.yaml:
Key Features:
- OTLP gRPC and HTTP receivers with CORS support
- Batch processing for efficiency
- Memory limits to prevent OOM
- Automatic sensitive data filtering
- Resource metadata enrichment
- Event categorization for analytics
Environment Variables:
Telemetry Processor
Required Environment Variables:
Environment Variables Reference
Backend/Services Configuration
Configure these in your .env or .env.docker file:
Variable Details:
- OTEL_RHESIS_TELEMETRY_ENABLED: Master switch for telemetry. Defaults to true (enabled) for self-hosted deployments. Set to false to opt out.
- OTEL_DEPLOYMENT_TYPE: Tags data by deployment type for analytics.
- OTEL_EXPORTER_OTLP_ENDPOINT: Where the backend sends telemetry (should point to the OTel Collector).
- OTEL_PROCESSOR_ENDPOINT: Internal endpoint for collector-to-processor communication.
- OTEL_API_KEY: Optional authentication key for securing telemetry endpoints.
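
A service could read these variables along the following lines. This is a sketch, not the backend's actual loader: `TelemetrySettings`, `load_settings`, and the set of accepted truthy strings are assumptions. Only the default of true for OTEL_RHESIS_TELEMETRY_ENABLED comes from the description above; the other variables are left unset when absent.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class TelemetrySettings:
    enabled: bool
    deployment_type: Optional[str]
    exporter_endpoint: Optional[str]
    processor_endpoint: Optional[str]
    api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> TelemetrySettings:
    """Read the telemetry variables from an environment-like mapping."""
    return TelemetrySettings(
        # Telemetry is opt-out, so a missing variable means enabled.
        enabled=_as_bool(env.get("OTEL_RHESIS_TELEMETRY_ENABLED", "true")),
        deployment_type=env.get("OTEL_DEPLOYMENT_TYPE"),
        exporter_endpoint=env.get("OTEL_EXPORTER_OTLP_ENDPOINT"),
        processor_endpoint=env.get("OTEL_PROCESSOR_ENDPOINT"),
        api_key=env.get("OTEL_API_KEY"),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing the environment as a mapping keeps the loader easy to test, for example `load_settings({"OTEL_RHESIS_TELEMETRY_ENABLED": "false"})` to simulate an opted-out instance.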
Analytics Database Configuration
Separate database for telemetry storage (recommended to isolate from operational data):