Skip to Content
DevelopmentTelemetry System

Telemetry System

This page provides technical details about Rhesis’s telemetry architecture for developers and contributors.

Privacy First: All telemetry is optional, privacy-focused, and transparent. User and organization IDs are hashed using SHA-256, and sensitive data like passwords and API keys are automatically filtered out.

Overview

Rhesis includes a privacy-focused telemetry system to collect and analyze usage patterns from both cloud-hosted and self-hosted instances. This system uses OpenTelemetry (OTEL) for distributed tracing and metrics collection.

Key Principles:

  • Enabled by default (opt-out) for self-hosted deployments
  • Always enabled for cloud deployments (user consent via Terms & Conditions)
  • Privacy-first: All user/org IDs are hashed (SHA-256)
  • Automatic filtering: Sensitive data (passwords, tokens, PII) is never collected
  • Transparent: All collected data types are documented

Architecture Overview

The telemetry system consists of three main components working together:

Components

  1. OpenTelemetry Collector - Receives telemetry from user instances, filters sensitive data, and forwards to the processor
  2. Telemetry Processor - gRPC service that processes traces and stores structured analytics data
  3. Analytics Database - Separate PostgreSQL database for analytics, isolated from operational data

What Data We Collect

The telemetry system collects usage patterns to help us understand how Rhesis is used and identify areas for improvement.

User Activity

  • Login/logout events
  • Session duration
  • Deployment type (cloud or self-hosted)
  • Hashed user and organization IDs (SHA-256, irreversible)

Endpoint Usage

  • API endpoint paths and HTTP methods
  • Response status codes
  • Request duration (performance metrics)
  • Timestamp of requests

Feature Usage

  • Feature interactions (created, viewed, updated, deleted)
  • Feature names (e.g., “test-run”, “test-set”, “endpoint”)
  • Usage timestamps
  • Deployment context

What We DON’T Collect

Privacy Protection: The following data is automatically filtered and NEVER stored:

  • ❌ Passwords or password hashes
  • ❌ API keys or tokens
  • ❌ Authentication credentials
  • ❌ Personal Identifiable Information (PII)
  • ❌ Test content or user-generated data
  • ❌ Email addresses or usernames
  • ❌ IP addresses or device identifiers
  • ❌ Any sensitive business data

ID Hashing

All user and organization IDs are one-way hashed before storage:

Example Hashing
# SHA-256 hash truncated to 16 characters
hash = hashlib.sha256(id_str.encode()).hexdigest()[:16]
# Original ID: "user-123-456-789"
# Stored hash: "a1b2c3d4e5f6g7h8" (cannot be reversed)

Properties:

  • One-way: Cannot recover original IDs
  • Consistent: Same ID always produces the same hash for analytics
  • Anonymous: No PII stored
  • Collision-resistant: 2^64 unique values

Privacy & Security

Opt-In/Opt-Out

  • Telemetry respects user preferences
  • Can be disabled entirely for self-hosted instances
  • No data collection without explicit configuration

Data Isolation

  • Analytics database is completely separate from operational data
  • Different access controls and backup policies
  • Can be managed independently
  • No impact on application performance

Security Measures

  • Sensitive attributes filtered at the collector level
  • Batch processing with retry logic for reliability
  • Memory limits to prevent resource exhaustion
  • Health checks and monitoring built-in

Ports & Endpoints

OpenTelemetry Collector

  • 4317: OTLP gRPC receiver (primary)
  • 4318: OTLP HTTP receiver (web apps)
  • 8888: Collector metrics (Prometheus)
  • 13133: Health check endpoint
  • 55679: Debug zpages

Telemetry Processor

  • 4317: gRPC server for receiving traces from collector

Database Schema

The analytics database uses three tables with consistent structure:

user_activity

Tracks user engagement events

ColumnTypeDescription
idUUIDPrimary key
user_idVARCHAR(32)Hashed user ID
organization_idVARCHAR(32)Hashed org ID
event_typeVARCHAR(50)Event type (login, logout)
timestampTIMESTAMPEvent time
session_idVARCHAR(255)Session identifier
deployment_typeVARCHAR(50)cloud / self-hosted
event_metadataJSONBAdditional context

endpoint_usage

Tracks API usage and performance

ColumnTypeDescription
idUUIDPrimary key
endpointVARCHAR(255)API endpoint path
methodVARCHAR(10)HTTP method
user_idVARCHAR(32)Hashed user ID
organization_idVARCHAR(32)Hashed org ID
status_codeINTEGERHTTP status
duration_msDOUBLE PRECISIONRequest duration
timestampTIMESTAMPRequest time
deployment_typeVARCHAR(50)cloud / self-hosted
event_metadataJSONBAdditional context

feature_usage

Tracks feature-specific interactions

ColumnTypeDescription
idUUIDPrimary key
feature_nameVARCHAR(100)Feature identifier
user_idVARCHAR(32)Hashed user ID
organization_idVARCHAR(32)Hashed org ID
actionVARCHAR(100)Action type
timestampTIMESTAMPAction time
deployment_typeVARCHAR(50)cloud / self-hosted
event_metadataJSONBAdditional context

Configuration

OpenTelemetry Collector

Configured via apps/otel-collector/otel-collector-config.yaml:

Key Features:

  • OTLP gRPC and HTTP receivers with CORS support
  • Batch processing for efficiency
  • Memory limits to prevent OOM
  • Automatic sensitive data filtering
  • Resource metadata enrichment
  • Event categorization for analytics

Environment Variables:

.env
OTEL_PROCESSOR_ENDPOINT=telemetry-processor:4317

Telemetry Processor

Required Environment Variables:

.env
# Analytics Database (separate from main database)
ANALYTICS_DB_USER=your-db-username
ANALYTICS_DB_PASS=your-secure-password
ANALYTICS_DB_HOST=postgres-analytics
ANALYTICS_DB_PORT=5432
ANALYTICS_DB_NAME=rhesis_analytics

# Or use connection string
ANALYTICS_DATABASE_URL=postgresql://user:pass@host:port/dbname

# Service Configuration
PORT=4317
LOG_LEVEL=INFO

Environment Variables Reference

Backend/Services Configuration

Configure these in your .env or .env.docker file:

.env
# Telemetry Control
OTEL_RHESIS_TELEMETRY_ENABLED=true   # Set to 'false' to disable (enabled by default for self-hosted)
OTEL_DEPLOYMENT_TYPE=self-hosted     # Options: 'self-hosted' | 'cloud'

# OpenTelemetry Configuration
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318/  # OTel Collector endpoint
OTEL_SERVICE_NAME=rhesis                                  # Service identifier
OTEL_PROCESSOR_ENDPOINT=telemetry-processor:4317          # Processor endpoint (internal)
OTEL_API_KEY=your-api-key                                 # Optional: API key for authentication

Variable Details:

  • OTEL_RHESIS_TELEMETRY_ENABLED: Master switch for telemetry. Defaults to true (enabled) for self-hosted deployments. Set to false to opt-out
  • OTEL_DEPLOYMENT_TYPE: Used to tag data by deployment type for analytics
  • OTEL_EXPORTER_OTLP_ENDPOINT: Where backend sends telemetry (should point to OTel Collector)
  • OTEL_PROCESSOR_ENDPOINT: Internal endpoint for collector-to-processor communication
  • OTEL_API_KEY: Optional authentication key for securing telemetry endpoints

Analytics Database Configuration

Separate database for telemetry storage (recommended to isolate from operational data):

.env
# Analytics Database (separate from main DB)
ANALYTICS_DB_USER=analytics-user
ANALYTICS_DB_PASS=secure-password
ANALYTICS_DB_HOST=postgres-analytics
ANALYTICS_DB_PORT=5432
ANALYTICS_DB_NAME=rhesis-analytics

# Or use full connection string
ANALYTICS_DATABASE_URL=postgresql://user:pass@host:port/rhesis-analytics