Data Structures

Schemas, database design, and data formats for the tracing system.

Span Structure

OTLP Span Format

Spans sent from the SDK to the backend use a flattened JSON format based on OTLP:

Span Payload
{
  "trace_id": "a1b2c3d4e5f6...",
  "span_id": "1234567890abcdef",
  "parent_span_id": null,
  "project_id": "my-project",
  "environment": "development",
  "span_name": "ai.llm.invoke",
  "span_kind": "CLIENT",
  "start_time": "2024-01-01T00:00:00.000000Z",
  "end_time": "2024-01-01T00:00:01.500000Z",
  "status_code": "OK",
  "status_message": null,
  "attributes": {
    "ai.model.name": "gpt-4",
    "ai.model.provider": "openai",
    "ai.llm.tokens.input": 10,
    "ai.llm.tokens.output": 25,
    "rhesis.test.run_id": "uuid",
    "rhesis.test.id": "uuid"
  },
  "events": [
    {
      "name": "ai.prompt",
      "timestamp": "2024-01-01T00:00:00.100000Z",
      "attributes": {
        "ai.prompt.role": "user",
        "ai.prompt.content": "Hello, world!"
      }
    },
    {
      "name": "ai.completion",
      "timestamp": "2024-01-01T00:00:01.400000Z",
      "attributes": {
        "ai.completion.content": "Hi there! How can I help?"
      }
    }
  ],
  "links": [],
  "resource": {
    "service.name": "my-service",
    "service.namespace": "rhesis",
    "deployment.environment": "development"
  }
}
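
As a rough illustration, a payload of this shape could be assembled as follows. This is a minimal sketch, not the SDK's actual code: `build_span` and `iso_utc` are hypothetical helper names, and the use of `secrets.token_hex` for ID generation is an assumption chosen because it yields the 32- and 16-character hex strings shown above.

```python
import json
import secrets
from datetime import datetime, timezone


def iso_utc(dt):
    """Format an aware datetime as UTC with microseconds and a trailing Z."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")


def build_span(span_name, start, end, attributes=None,
               trace_id=None, parent_span_id=None,
               project_id="my-project", environment="development"):
    """Hypothetical helper: assemble one span dict with the fields shown above."""
    return {
        "trace_id": trace_id or secrets.token_hex(16),  # 32 hex characters
        "span_id": secrets.token_hex(8),                # 16 hex characters
        "parent_span_id": parent_span_id,
        "project_id": project_id,
        "environment": environment,
        "span_name": span_name,
        "span_kind": "CLIENT",
        "start_time": iso_utc(start),
        "end_time": iso_utc(end),
        "status_code": "OK",
        "status_message": None,
        "attributes": dict(attributes or {}),
        "events": [],
        "links": [],
        "resource": {"service.name": "my-service"},
    }


span = build_span(
    "ai.llm.invoke",
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 0, 0, 1, 500000, tzinfo=timezone.utc),
    attributes={"ai.model.name": "gpt-4", "ai.model.provider": "openai"},
)
print(json.dumps(span, indent=2))
```

Child spans would reuse the parent's `trace_id` and pass the parent's `span_id` as `parent_span_id`.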

Test Execution Context

Context attributes added to spans during test execution:

Test Context Attributes
test_execution_context = {
    "rhesis.test.run_id": "uuid",              # Which test run
    "rhesis.test.id": "uuid",                  # Which test definition
    "rhesis.test.configuration_id": "uuid",    # Which configuration
    # test_result_id is linked after creation
}
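
One way these keys could end up on a span is a plain merge into its `attributes`. The sketch below assumes that approach; `with_test_context` is a hypothetical name, and the real SDK may attach the context differently.

```python
def with_test_context(span, test_execution_context):
    """Return a copy of the span whose attributes include the rhesis.test.* keys."""
    merged = dict(span)
    merged["attributes"] = {
        **span.get("attributes", {}),
        **test_execution_context,
    }
    return merged


span = {"span_name": "ai.llm.invoke", "attributes": {"ai.model.name": "gpt-4"}}
context = {
    "rhesis.test.run_id": "uuid",
    "rhesis.test.id": "uuid",
    "rhesis.test.configuration_id": "uuid",
}
tagged = with_test_context(span, context)
```

The original span is left untouched, so the same span data can be reused outside a test run.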

Database Schema

traces Table

SQL Schema
CREATE TABLE traces (
    -- Identity
    id UUID PRIMARY KEY,
    trace_id VARCHAR(32) NOT NULL,      -- OTEL trace ID
    span_id VARCHAR(16) NOT NULL,       -- OTEL span ID
    parent_span_id VARCHAR(16),
    
    -- Span data
    span_name VARCHAR(255) NOT NULL,
    start_time TIMESTAMP WITH TIME ZONE NOT NULL,
    end_time TIMESTAMP WITH TIME ZONE NOT NULL,
    duration_ms FLOAT NOT NULL,
    status_code VARCHAR(50) NOT NULL,
    
    -- Multi-tenancy
    organization_id UUID NOT NULL REFERENCES organization(id),
    project_id UUID NOT NULL REFERENCES project(id),
    
    -- Test execution (FKs for linking)
    test_run_id UUID REFERENCES test_run(id) ON DELETE SET NULL,
    test_result_id UUID REFERENCES test_result(id) ON DELETE SET NULL,
    test_id UUID REFERENCES test(id) ON DELETE SET NULL,
    
    -- JSONB columns (flexible schema)
    attributes JSONB NOT NULL DEFAULT '{}',
    events JSONB NOT NULL DEFAULT '[]',
    enriched_data JSONB,                -- Cached enrichment
    
    -- Timestamps
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

Column Details

| Column | Type | Description |
| --- | --- | --- |
| id | UUID | Primary key (internal) |
| trace_id | VARCHAR(32) | OpenTelemetry trace ID (groups spans) |
| span_id | VARCHAR(16) | OpenTelemetry span ID (unique per span) |
| parent_span_id | VARCHAR(16) | Parent span ID for hierarchy (NULL for root spans) |
| span_name | VARCHAR(255) | Operation name (e.g. ai.llm.invoke) |
| start_time | TIMESTAMP WITH TIME ZONE | Span start time |
| end_time | TIMESTAMP WITH TIME ZONE | Span end time |
| duration_ms | FLOAT | Calculated duration in milliseconds |
| status_code | VARCHAR(50) | OK, ERROR, or UNSET |
| organization_id | UUID | Multi-tenancy isolation |
| project_id | UUID | Project isolation |
| test_run_id | UUID | Linked test run |
| test_result_id | UUID | Linked test result |
| test_id | UUID | Linked test definition |
| attributes | JSONB | Span attributes |
| events | JSONB | Span events (prompts, completions) |
| enriched_data | JSONB | Cached enrichment results |
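
The schema stores `duration_ms` as a calculated value. How the backend derives it is not stated here; the sketch below assumes it is simply `end_time` minus `start_time`, expressed in milliseconds. `parse_ts` and `duration_ms` are hypothetical helper names.

```python
from datetime import datetime


def parse_ts(value):
    """Parse a payload timestamp such as 2024-01-01T00:00:00.000000Z."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def duration_ms(start_time, end_time):
    """Elapsed time between two payload timestamps, in milliseconds."""
    delta = parse_ts(end_time) - parse_ts(start_time)
    return delta.total_seconds() * 1000.0


# Timestamps from the sample span payload
print(duration_ms("2024-01-01T00:00:00.000000Z", "2024-01-01T00:00:01.500000Z"))  # 1500.0
```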

Indexes

Critical Indexes
-- Get all spans for a trace (primary query)
CREATE INDEX idx_trace_trace_id ON traces(trace_id, start_time DESC);

-- Test execution queries
CREATE INDEX idx_trace_test_run ON traces(test_run_id, start_time DESC);
CREATE INDEX idx_trace_test_result ON traces(test_result_id);

-- JSONB attribute queries
CREATE INDEX idx_trace_attributes ON traces USING GIN(attributes jsonb_path_ops);

-- Organization/project filtering
CREATE INDEX idx_trace_org_project ON traces(organization_id, project_id, created_at DESC);

-- Status code filtering
CREATE INDEX idx_trace_status ON traces(status_code, created_at DESC);

Index Usage

| Query | Index Used |
| --- | --- |
| Get spans by trace_id | idx_trace_trace_id |
| Get traces for test run | idx_trace_test_run |
| Query by attribute (model, provider) | idx_trace_attributes |
| Filter by organization | idx_trace_org_project |
| Find error traces | idx_trace_status |

Enrichment Data

The enriched_data JSONB column caches computed values:

Enrichment Structure
{
  "costs": {
    "total_cost_usd": 0.023,
    "total_cost_eur": 0.021,
    "breakdown": [
      {
        "span_id": "1234567890abcdef",
        "model": "gpt-4",
        "tokens_input": 150,
        "tokens_output": 80,
        "cost_usd": 0.023,
        "cost_eur": 0.021
      }
    ]
  },
  "anomalies": [
    {
      "type": "high_latency",
      "span_id": "1234567890abcdef",
      "threshold_ms": 1000,
      "actual_ms": 2340,
      "severity": "warning"
    }
  ],
  "metadata": {
    "models_used": [
      "gpt-4"
    ],
    "total_tokens_input": 150,
    "total_tokens_output": 80,
    "total_tokens": 230,
    "span_count": 5,
    "llm_call_count": 2,
    "tool_call_count": 1
  },
  "enriched_at": "2025-01-01T10:00:00Z"
}
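
An entry in `anomalies` like the one above could come from a simple threshold check. The following is a sketch under assumptions: the comparison (strictly greater than the threshold), the default threshold, and the fixed "warning" severity are illustrative choices, and `detect_high_latency` is a hypothetical name.

```python
def detect_high_latency(span_id, actual_ms, threshold_ms=1000):
    """Return a high_latency anomaly record, or None if the span is within the threshold."""
    if actual_ms <= threshold_ms:
        return None
    return {
        "type": "high_latency",
        "span_id": span_id,
        "threshold_ms": threshold_ms,
        "actual_ms": actual_ms,
        "severity": "warning",  # assumed fixed severity for this sketch
    }


anomaly = detect_high_latency("1234567890abcdef", 2340)
```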

Enrichment Fields

| Field | Description |
| --- | --- |
| costs.total_cost_usd | Total cost in USD |
| costs.total_cost_eur | Total cost in EUR |
| costs.breakdown | Per-span cost breakdown |
| anomalies | Detected anomalies |
| metadata.models_used | Unique models in trace |
| metadata.total_tokens | Sum of all tokens |
| metadata.span_count | Number of spans |
| enriched_at | Enrichment timestamp |
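
The `metadata` block can be thought of as an aggregation over a trace's spans. The sketch below shows one plausible way to compute it from span attributes. `summarize_trace` is a hypothetical name, and classifying calls by the `ai.llm.` and `ai.tool.` span-name prefixes is an assumption (only `ai.llm.invoke` appears in this document).

```python
def summarize_trace(spans):
    """Aggregate per-trace metadata from a list of span dicts."""
    models = []
    tokens_in = tokens_out = llm_calls = tool_calls = 0
    for span in spans:
        attrs = span.get("attributes", {})
        model = attrs.get("ai.model.name")
        if model and model not in models:
            models.append(model)  # keep unique models in first-seen order
        tokens_in += attrs.get("ai.llm.tokens.input", 0)
        tokens_out += attrs.get("ai.llm.tokens.output", 0)
        name = span.get("span_name", "")
        if name.startswith("ai.llm."):     # assumed naming convention
            llm_calls += 1
        elif name.startswith("ai.tool."):  # assumed naming convention
            tool_calls += 1
    return {
        "models_used": models,
        "total_tokens_input": tokens_in,
        "total_tokens_output": tokens_out,
        "total_tokens": tokens_in + tokens_out,
        "span_count": len(spans),
        "llm_call_count": llm_calls,
        "tool_call_count": tool_calls,
    }


summary = summarize_trace([
    {"span_name": "ai.llm.invoke",
     "attributes": {"ai.model.name": "gpt-4",
                    "ai.llm.tokens.input": 10,
                    "ai.llm.tokens.output": 25}},
    {"span_name": "ai.tool.invoke", "attributes": {}},
])
```

Cost fields are left out on purpose, since they depend on per-model pricing that this document does not specify.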

Common Query Patterns

Get Trace by ID

Query
SELECT * FROM traces
WHERE trace_id = 'abc123...'
ORDER BY start_time ASC;
-- Uses: idx_trace_trace_id

Get Traces for Test Run

Query
SELECT trace_id, MIN(start_time) AS trace_start
FROM traces
WHERE test_run_id = 'uuid'
GROUP BY trace_id
ORDER BY trace_start DESC;
-- Uses: idx_trace_test_run

Get LLM Calls with Specific Model

Query
SELECT * FROM traces
WHERE attributes @> '{"ai.model.name": "gpt-4"}'
AND project_id = 'uuid'
ORDER BY created_at DESC;
-- Uses: idx_trace_attributes (GIN index)
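
To make the `@>` containment test concrete, here is a simplified Python approximation of what it checks: every key and value on the right-hand side must also appear on the left. This is an illustration only; `jsonb_contains` is a hypothetical name, and PostgreSQL's real semantics have extra rules (for example around top-level scalars and type distinctions) that this sketch ignores.

```python
def jsonb_contains(left, right):
    """Simplified approximation of PostgreSQL's JSONB `left @> right`."""
    if isinstance(right, dict):
        # Every key on the right must exist on the left with a contained value.
        return isinstance(left, dict) and all(
            key in left and jsonb_contains(left[key], value)
            for key, value in right.items()
        )
    if isinstance(right, list):
        # Every element on the right must be contained in some element on the left.
        return isinstance(left, list) and all(
            any(jsonb_contains(candidate, item) for candidate in left)
            for item in right
        )
    return left == right


attributes = {
    "ai.model.name": "gpt-4",
    "ai.model.provider": "openai",
    "ai.llm.tokens.input": 10,
}
matches = jsonb_contains(attributes, {"ai.model.name": "gpt-4"})
```

Extra keys on the left do not prevent a match, which is why the query above finds every span that used the model regardless of its other attributes.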

Get Error Traces

Query
SELECT trace_id, span_name, status_code, MAX(created_at) AS last_seen
FROM traces
WHERE status_code = 'ERROR'
AND project_id = 'uuid'
GROUP BY trace_id, span_name, status_code
ORDER BY last_seen DESC
LIMIT 100;
-- Uses: idx_trace_status

Get High-Cost Traces

Query
SELECT trace_id, 
       enriched_data->'costs'->>'total_cost_usd' as cost_usd,
       enriched_data->'metadata'->>'models_used' as models
FROM traces
WHERE enriched_data IS NOT NULL
AND (enriched_data->'costs'->>'total_cost_usd')::float > 0.10
AND project_id = 'uuid'
ORDER BY (enriched_data->'costs'->>'total_cost_usd')::float DESC
LIMIT 50;

HTTP Request Format

Ingestion Endpoint

Endpoint: POST /telemetry/traces

Headers:

Headers
Authorization: Bearer <api_key>
Content-Type: application/json

Payload:

Request Body
{
  "spans": [
      { ... span 1 ... },
      { ... span 2 ... }
  ]
}
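
A request matching the endpoint, headers, and body above could be built with the standard library as sketched below. The base URL is a placeholder, `build_ingest_request` is a hypothetical name, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_ingest_request(base_url, api_key, spans):
    """Build (but do not send) a POST /telemetry/traces request."""
    body = json.dumps({"spans": spans}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/telemetry/traces",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder host and key, for illustration only
request = build_ingest_request(
    "https://backend.example.invalid",
    "my-api-key",
    [{"span_name": "ai.llm.invoke"}],
)
# Sending would be: urllib.request.urlopen(request)
```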

Response Codes

| Status | Meaning | Action |
| --- | --- | --- |
| 200 | Success | Spans ingested |
| 401 | Unauthorized | Check API key |
| 422 | Validation Error | Fix span names/attributes |
| 500 | Server Error | Retry with backoff |
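
The actions in the table translate into client logic along these lines: return on 200, fail fast on 401 and 422, and retry 500 with exponential backoff. This is a sketch rather than the SDK's behavior; `send_with_retry`, the attempt count, and the delay values are all assumptions, and `send` stands in for whatever callable performs the HTTP request and returns its status code.

```python
import time

RETRYABLE = {500}  # per the table; other 5xx codes are not covered here


def send_with_retry(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns 200, backing off exponentially on 500."""
    for attempt in range(max_attempts):
        status = send()
        if status == 200:
            return status
        if status not in RETRYABLE:
            # 401 and 422 need a fix on the client side; retrying will not help.
            raise ValueError(f"non-retryable status {status}")
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise RuntimeError(f"gave up after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.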

Validation Errors

Common 422 errors:

Validation Error
{
  "detail": [
    {
      "loc": [
        "spans",
        0,
        "span_name"
      ],
      "msg": "span_name cannot use framework concept 'agent'. Use primitive operations: llm, tool, retrieval, embedding",
      "type": "value_error"
    }
  ]
}
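
On the client side, a 422 body of this shape can be flattened into readable messages by joining each `loc` path. A small sketch, with `format_validation_errors` as a hypothetical name:

```python
def format_validation_errors(response_body):
    """Turn a 422 response body into 'path: message' strings."""
    lines = []
    for error in response_body.get("detail", []):
        path = ""
        for part in error.get("loc", []):
            if isinstance(part, int):
                path += f"[{part}]"      # list index, e.g. spans[0]
            elif path:
                path += f".{part}"       # nested field
            else:
                path = str(part)         # first path segment
        lines.append(f"{path}: {error.get('msg', '')}")
    return lines


body = {
    "detail": [
        {
            "loc": ["spans", 0, "span_name"],
            "msg": "span_name cannot use framework concept 'agent'. "
                   "Use primitive operations: llm, tool, retrieval, embedding",
            "type": "value_error",
        }
    ]
}
messages = format_validation_errors(body)
```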

Why PostgreSQL + JSONB?

| Aspect | Benefit |
| --- | --- |
| Single Database | Simplifies operations, existing expertise |
| JSONB Flexibility | Schema can evolve without migrations |
| GIN Indexes | Fast attribute queries |
| ACID Compliance | Reliable linking operations |
| Familiar SQL | Easy debugging and ad-hoc queries |

Future Scaling

If trace volume exceeds PostgreSQL capacity:

  1. Partition by time - Monthly partitions for retention
  2. TimescaleDB - Hypertable for time-series optimization
  3. ClickHouse - Columnar store for analytics
  4. Archive strategy - Move old traces to cold storage
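
For the first option, monthly partitions could be generated with a small helper like the one below. This is a sketch under assumptions: it presumes the `traces` table has been recreated with `PARTITION BY RANGE (start_time)`, and the `traces_YYYY_MM` naming scheme and the `monthly_partition_ddl` function are illustrative, not part of the current schema.

```python
from datetime import date


def monthly_partition_ddl(year, month, parent="traces"):
    """Return the DDL for one monthly range partition of the parent table."""
    start = date(year, month, 1)
    # The upper bound is exclusive: the first day of the following month.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"{parent}_{start:%Y_%m}"
    return (
        f"CREATE TABLE {name} PARTITION OF {parent} "
        f"FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )


print(monthly_partition_ddl(2024, 12))
```

Note that PostgreSQL requires a partitioned table's primary key to include the partition column, so the `id`-only primary key in the current schema would need to change before partitioning on `start_time`.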