
Models

Models are core components of the SDK that enable you to create and manage test sets for Gen AI applications. Models also play a crucial role in the evaluation process, serving as LLM judges to assess and validate outputs.

Using the get_model function

The easiest way to use models is through the get_model function. When called without any arguments, this function returns the default Rhesis model.

default_model.py
from rhesis.sdk.models import get_model

model = get_model()

To use a different provider, you can pass the provider name as an argument. This will use the default model for that provider.

provider_model.py
from rhesis.sdk.models import get_model

# Use default Gemini model
model = get_model("gemini")

Supported providers (highlights):

  • rhesis - Rhesis-hosted default models
  • openai, anthropic, gemini, vertex_ai - major hosted providers
  • azure_ai - Azure AI Studio deployments via LiteLLM
  • azure - Azure OpenAI deployments via LiteLLM
  • litellm_proxy - OpenAI-compatible LiteLLM Proxy endpoint
  • openrouter, mistral, cohere, groq, perplexity, replicate, together_ai
  • ollama, huggingface, lmformatenforcer - local/self-hosted options

Use get_model(...) for the most stable cross-provider API.

To use a specific model, provide its name in the format provider/model_name:

specific_model.py
from rhesis.sdk.models import get_model

model = get_model("gemini/gemini-2.0-flash")

The above code is equivalent to:

specific_model_alt.py
from rhesis.sdk.models import get_model

model = get_model(provider="gemini", model_name="gemini-2.0-flash")

Provider-specific connection options

Some providers accept extra connection parameters:

  • litellm_proxy - required: model_name. Optional: api_base, api_key. api_base defaults to the LITELLM_PROXY_BASE_URL env var or http://0.0.0.0:4000.
  • azure_ai - required: model_name, api_base, api_key. api_base/api_key can be supplied via the AZURE_AI_API_BASE/AZURE_AI_API_KEY env vars.
  • azure - required: model_name, api_base, api_key. Optional: api_version. api_base/api_key/api_version can be supplied via the AZURE_API_BASE/AZURE_API_KEY/AZURE_API_VERSION env vars.

litellm_proxy.py
from rhesis.sdk.models import get_model

model = get_model(
    provider="litellm_proxy",
    model_name="gpt-4o-mini",
    api_base="http://localhost:4000",
    api_key="proxy-key",  # optional
)

azure_ai.py
from rhesis.sdk.models import get_model

model = get_model(
    provider="azure_ai",
    model_name="command-r-plus",
    api_base="https://your-endpoint.inference.ai.azure.com/",
    api_key="your-azure-ai-key",
)

azure_openai.py
from rhesis.sdk.models import get_model

model = get_model(
    provider="azure",
    model_name="my-gpt4o-deployment",
    api_base="https://your-resource.openai.azure.com/",
    api_key="your-azure-openai-key",
    api_version="2024-08-01-preview",
)

Direct import

Alternatively, you can access models by importing the model class directly. When you provide a model name as an argument, that specific model will be used. If no model name is provided, the default model for that provider will be used.

direct_import.py
from rhesis.sdk.models import AzureOpenAILLM, GeminiLLM

# Use specific Gemini model
gemini_model = GeminiLLM("gemini-2.0-flash")

# Use Azure OpenAI deployment
azure_model = AzureOpenAILLM(
    model_name="my-gpt4o-deployment",
    api_base="https://your-resource.openai.azure.com/",
    api_key="your-azure-openai-key",
)

Generate content with models

All models share a consistent interface. The primary function is generate, which accepts:

  • prompt: The text prompt for generation
  • schema: (optional) A Pydantic schema defining the structure of the generated text

Language model generation APIs (v0.6.9+)

Language-model providers in the SDK are async-first. You can use:

  • generate(...) - synchronous convenience wrapper; for scripts and notebooks without explicit async code
  • a_generate(...) - native async call; for async services, workers, and concurrent pipelines
  • generate_batch(...) - multi-prompt batch call; for high-throughput generation over datasets or evaluations

generate(...) internally bridges to a_generate(...). If your application is already async, prefer a_generate(...) directly.

Generate text using prompt only:

generate_text.py
from rhesis.sdk.models import get_model

# Use default Rhesis model
model = get_model()
output = model.generate(prompt="What is the capital of France?")
# Output: "The capital of France is Paris."

Generate structured output using schemas:

generate_structured.py
from pydantic import BaseModel
from rhesis.sdk.models import get_model

class City(BaseModel):
    name: str
    population: int

class CityResponse(BaseModel):
    biggest_cities: list[City]

# Use default Rhesis model
model = get_model()
output = model.generate(
    prompt="List the 5 biggest cities in Germany.",
    schema=CityResponse
)

Batch Processing

For improved performance when processing multiple prompts, use the generate_batch method. This method leverages parallel processing to efficiently handle multiple requests simultaneously.

Basic batch generation:

batch_basic.py
from rhesis.sdk.models import get_model

model = get_model("openai/gpt-4o-mini")

# Process multiple prompts in a single batch call
prompts = [
    "What is the capital of France?",
    "What is the capital of Germany?",
    "What is the capital of Spain?",
]

results = model.generate_batch(prompts=prompts)
# Returns: ["Paris is the capital of France.", "Berlin is...", "Madrid is..."]

Batch generation with structured output:

batch_structured.py
from pydantic import BaseModel
from rhesis.sdk.models import get_model

class CityInfo(BaseModel):
    name: str
    country: str
    population: int

model = get_model("openai/gpt-4o-mini")

prompts = [
    "Provide info about Paris",
    "Provide info about Tokyo",
    "Provide info about New York",
]

results = model.generate_batch(
    prompts=prompts,
    schema=CityInfo
)
# Returns list of validated dicts matching CityInfo schema
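Since each item in the results list is a plain dict shaped like the schema, you can validate the items back into typed CityInfo objects for safer downstream use. A minimal sketch (the raw_results values here are hypothetical stand-ins for real model output):

```python
from pydantic import BaseModel

class CityInfo(BaseModel):
    name: str
    country: str
    population: int

# Hypothetical raw results, shaped like the dicts generate_batch returns
raw_results = [
    {"name": "Paris", "country": "France", "population": 2_100_000},
    {"name": "Tokyo", "country": "Japan", "population": 13_900_000},
]

# Validate each dict into a typed CityInfo instance
cities = [CityInfo(**item) for item in raw_results]
largest = max(cities, key=lambda c: c.population)
print(largest.name)  # Tokyo
```

Validation raises a pydantic ValidationError early if the model returned a malformed item, rather than failing later in your pipeline.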

Generate multiple completions per prompt:

batch_multiple.py
from rhesis.sdk.models import get_model

model = get_model("openai/gpt-4o-mini")

prompts = ["Generate a creative product name for a coffee brand"]

# Generate 3 different responses for each prompt
results = model.generate_batch(
    prompts=prompts,
    n=3
)
# Returns 3 different product name suggestions

When to use batch processing:

  • Processing large test sets or datasets
  • Running evaluations across multiple inputs
  • Generating multiple test variations
  • Any scenario requiring parallel LLM calls

Batch processing is available for LiteLLM-based providers and Rhesis language models.

Combining async generation with batch calls:

async_generation.py
import asyncio
from rhesis.sdk.models import get_model

model = get_model("rhesis")

async def main():
    single = await model.a_generate("Summarize why unit tests matter.")
    # generate_batch is synchronous; use asyncio.to_thread to avoid blocking the event loop
    batch = await asyncio.to_thread(
        model.generate_batch,
        prompts=[
            "Give one CI best practice.",
            "Give one code review best practice.",
        ],
    )
    print(single)
    print(batch)

asyncio.run(main())

Using models with synthesizers and metrics

Models become most useful when combined with synthesizers and metrics:

models_with_tools.py
from rhesis.sdk.models import get_model
from rhesis.sdk.synthesizers import PromptSynthesizer
from rhesis.sdk.metrics import RhesisPromptMetricNumeric

# With synthesizers
model = get_model("gemini")
synthesizer = PromptSynthesizer(
    prompt="Generate tests for the car selling chatbot",
    model=model,
)

# With metrics
metric = RhesisPromptMetricNumeric(
    name="answer_quality_evaluator",
    evaluation_prompt="Evaluate the answer for accuracy, completeness, clarity, and relevance.",
    model="gemini",
)

Saving models to the platform

You can save an LLM configuration to the Rhesis platform as a Model entity. This allows you to:

  • Store model configurations centrally for team sharing
  • Set default models for test generation and evaluation
  • Retrieve configurations across different scripts

save_model.py
from rhesis.sdk.models import get_model

# Create an LLM instance
llm = get_model("openai", "gpt-4", api_key="sk-...")

# Save to platform as a Model entity
model = llm.push(name="GPT-4 Production")

# Set as default for generation or evaluation
model.set_default_generation()
model.set_default_evaluation()

You can also retrieve saved configurations and convert them back to LLM instances:

load_model.py
from rhesis.sdk.entities import Models

# Pull saved model from platform
model = Models.pull(name="GPT-4 Production")

# Convert to LLM instance
llm = model.get_model_instance()

# Use for generation
response = llm.generate("Hello, how are you?")

Embedders

Embedders generate vector representations (embeddings) of text, which are useful for semantic search, similarity comparison, and clustering tasks.

Using the get_model function

The easiest way to use embedders is also through the get_model function. When called with an embedding model name, it automatically detects the model type and returns an embedder.

default_embedder.py
from rhesis.sdk.models import get_model

embedder = get_model("openai/text-embedding-3-small")

To use a specific model, provide the model name:

specific_embedder.py
from rhesis.sdk.models import get_model

# Use a specific embedding model (auto-detected from name)
embedder = get_model("openai/text-embedding-3-large")

The above code is equivalent to:

specific_embedder_alt.py
from rhesis.sdk.models import get_model

embedder = get_model(provider="openai", model_name="text-embedding-3-large", model_type="embedding")

Direct import

You can also import the embedder class directly:

direct_embedder_import.py
from rhesis.sdk.models import OpenAIEmbedder

# Use default model (text-embedding-3-small)
embedder = OpenAIEmbedder()

# Use specific model with custom dimensions
embedder = OpenAIEmbedder(model_name="text-embedding-3-large", dimensions=1024)

Generate embeddings

All embedders share a consistent interface with two main methods:

Generate embedding for a single text:

generate_embedding.py
from rhesis.sdk.models import get_model

embedder = get_model("openai/text-embedding-3-small")
embedding = embedder.generate("What is machine learning?")
# Returns: [0.0123, -0.0456, 0.0789, ...]  (list of floats)
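These vectors can then be compared for semantic similarity, e.g. with cosine similarity. A minimal sketch in plain Python (the vectors below are made-up stand-ins for real embedder output):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Stand-in vectors; in practice use embedder.generate(...) output
v1 = [0.1, 0.3, 0.5]
v2 = [0.1, 0.3, 0.5]
v3 = [0.5, -0.2, 0.0]

print(cosine_similarity(v1, v2))  # ≈ 1.0 (identical direction)
print(cosine_similarity(v1, v3))  # lower score for dissimilar vectors
```

Texts with similar meaning produce vectors with higher cosine similarity, which is the basis for semantic search and clustering.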

Generate embeddings for multiple texts:

generate_embeddings_batch.py
from rhesis.sdk.models import get_model

embedder = get_model("openai/text-embedding-3-small")
texts = [
    "What is machine learning?",
    "How does deep learning work?",
    "Explain neural networks",
]

embeddings = embedder.generate_batch(texts)
# Returns: list of embedding vectors, one per input text

Configuring embedding dimensions

Some embedding models (like OpenAI’s text-embedding-3 family) support configurable output dimensions. Smaller dimensions reduce storage and computation costs while maintaining most of the semantic information.

embedder_dimensions.py
from rhesis.sdk.models import get_model

# Set dimensions at initialization
embedder = get_model("openai/text-embedding-3-small", dimensions=256)
embedding = embedder.generate("Hello world")
# Returns embedding with 256 dimensions

# Or override per call
embedding = embedder.generate("Hello world", dimensions=512)
# Returns embedding with 512 dimensions

See the Model entity documentation for more details on managing model configurations.