# any-llm

> Documentation for any-llm

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/any_llm.md

## AnyLLM

::: any_llm.AnyLLM

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/api-reference.md

# API Reference

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/authentication.md

# Authentication

any-llm-gateway offers two authentication methods, each designed for different use cases. Understanding when to use each approach will help you secure your gateway effectively.

## Authentication Methods Overview

| Method | Best For | Key Management | Usage Tracking |
|--------|----------|----------------|----------------|
| **Master Key** | Internal services, admin operations, trusted environments | Single key with full access | Requires manual user specification |
| **Virtual API Keys** | External apps, per-user access, customer-facing services | Multiple scoped keys | Automatic per-key tracking |

### Supported Headers

The gateway accepts authentication via two headers:

- **`X-AnyLLM-Key`** (preferred): The gateway's native authentication header
- **`Authorization`**: Standard HTTP authorization header for OpenAI client compatibility

Both headers use the `Bearer <key>` format. When both headers are present, `X-AnyLLM-Key` takes precedence.

Using the `Authorization` header allows you to use the gateway with OpenAI-compatible clients without modification:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="your-master-key-or-virtual-key",  # Sent as Authorization: Bearer ...
)
```

## Master Key

The master key is the root credential for your gateway installation. It has unrestricted access to all gateway operations and should be treated with the same security as your production database credentials.

### Generating a Master Key

Generate a cryptographically secure master key (minimum 32 characters recommended):

```bash
python -c "import secrets; print(secrets.token_urlsafe(32))"
```

**Example output:**

```
Zx8Q_vKm3nR7wP2sT9yU5iO1eA6hD4fG0bN8cL3jM5k
```

Set the generated key in your configuration:

**Using environment variables:**

```bash
export GATEWAY_MASTER_KEY="Zx8Q_vKm3nR7wP2sT9yU5iO1eA6hD4fG0bN8cL3jM5k"
```

**Using config.yml:**

```yaml
master_key: "Zx8Q_vKm3nR7wP2sT9yU5iO1eA6hD4fG0bN8cL3jM5k"
```

### Creating a User

```bash
curl -X POST http://localhost:8000/v1/users \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user-123", "alias": "Alice"}'
```
With optional metadata:

```bash
curl -X POST http://localhost:8000/v1/users \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user-123",
    "alias": "Alice",
    "metadata": {
      "department": "Engineering",
      "team": "ML",
      "email": "alice@example.com"
    }
  }'
```
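The same call from Python, using the `requests` library (a minimal sketch; the endpoint and headers mirror the curl examples above):

```python
import requests

GATEWAY_URL = "http://localhost:8000"
MASTER_KEY = "your-master-key"  # the value of GATEWAY_MASTER_KEY

response = requests.post(
    f"{GATEWAY_URL}/v1/users",
    headers={"X-AnyLLM-Key": f"Bearer {MASTER_KEY}"},
    json={
        "user_id": "user-123",
        "alias": "Alice",
        "metadata": {"department": "Engineering", "team": "ML"},
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())
```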
### Making Requests with Master Key

When using the master key, you **must** specify which user is making the request using the `user` field:

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai:gpt-4o-mini",
    "messages": [{"role": "user", "content": "Write a haiku on Jupiter"}],
    "user": "user-123"
  }'
```

The `user` field tells the gateway which user's budget and spend tracking to update. Without this field, the request will be rejected.

## Virtual API Keys

Virtual API keys provide scoped access for making completion requests without exposing the master key. Each virtual key can have expiration dates, metadata, and associated users for automatic usage tracking.

### Creating a Virtual API Key

Create a virtual key with a descriptive name:

```bash
curl -X POST http://localhost:8000/v1/keys \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"key_name": "mobile-app"}'
```

> **Important:** Save the `key` value immediately—it's only shown once and cannot be retrieved later.
Example response:

```json
{
  "id": "abc-123",
  "key": "gw-...",
  "key_name": "mobile-app",
  "created_at": "2025-10-20T10:00:00",
  "expires_at": null,
  "is_active": true,
  "metadata": {}
}
```
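The same request from Python (a sketch using `requests`; the returned `key` is captured right away because it cannot be retrieved later):

```python
import requests

response = requests.post(
    "http://localhost:8000/v1/keys",
    headers={"X-AnyLLM-Key": "Bearer your-master-key"},
    json={"key_name": "mobile-app"},
    timeout=30,
)
response.raise_for_status()
virtual_key = response.json()["key"]  # shown only once -- store it securely now
```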
#### Key with Expiration

Create a key that automatically expires on a specific date:

```bash
curl -X POST http://localhost:8000/v1/keys \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "key_name": "trial-access",
    "expires_at": "2025-12-31T23:59:59Z"
  }'
```

### Using Virtual API Keys

Making requests with a virtual key is simpler than using the master key—no `user` field is required:

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "X-AnyLLM-Key: Bearer gw-..." \
  -H "Content-Type: application/json" \
  -d '{"model": "openai:gpt-5-mini", "messages": [{"role": "user", "content": "Write a haiku on Saturn"}]}'
```

The gateway automatically tracks usage based on the virtual key used.

### Managing Virtual Keys

**List all keys:**

```bash
curl http://localhost:8000/v1/keys \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}"
```

**Deactivate a key:**

```bash
curl -X PATCH http://localhost:8000/v1/keys/{key_id} \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"is_active": false}'
```

**Delete a key:**

```bash
curl -X DELETE http://localhost:8000/v1/keys/{key_id} \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}"
```

> See [API Reference](api-reference.md) for complete key management operations.

Note: The actual key values are never returned in list or get operations for security reasons.

## Next Steps

Now that you understand authentication, explore these related topics:

- **[Budget Management](budget-management.md)** - Set spending limits for users and enforce budgets
- **[Configuration](configuration.md)** - Learn about provider setup and pricing configuration
- **[API Reference](api-reference.md)** - Explore all available endpoints for managing keys and users
- **[Quick Start](quickstart.md)** - Complete walkthrough of setting up your first gateway

For questions or issues, refer to the [troubleshooting guide](troubleshooting.md) or check the project's issue tracker.

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/types/batch.md

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/batch.md

# Batch

!!! warning "Experimental API"
    The Batch API is experimental and subject to breaking changes in future versions. Use with caution in production environments.

The Batch API allows you to process multiple requests asynchronously at a lower cost.

## File Path Interface

The `any-llm` batch API requires you to pass a **path to a local JSONL file** containing your batch requests. The provider implementation automatically handles uploading and file management as needed.

Different providers handle batch processing differently:

- **OpenAI**: Requires uploading a file first, then creating a batch with the file ID
- **Anthropic** (future): Expects file content passed directly in the request
- **Other providers**: May have their own unique requirements

By accepting a local file path, `any-llm` abstracts these provider differences and handles the implementation details automatically.

::: any_llm.api.create_batch
::: any_llm.api.acreate_batch
::: any_llm.api.retrieve_batch
::: any_llm.api.aretrieve_batch
::: any_llm.api.cancel_batch
::: any_llm.api.acancel_batch
::: any_llm.api.list_batches
::: any_llm.api.alist_batches

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/budget-management.md

# Budget Management

Budgets provide shared spending limits that can be assigned to multiple users.
This allows you to create budget tiers (like "Free", "Pro", "Enterprise") and enforce spending limits across groups of users. ## Creating a Budget ```bash # Create a budget with a $10.00 spending limit and monthly resets (30 days = 2592000 seconds) curl -X POST http://localhost:8000/v1/budgets \ -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \ -H "Content-Type: application/json" \ -d '{ "max_budget": 10.0, "budget_duration_sec": 2592000 }' ```
Sample response:

```json
{
  "budget_id": "abc-123",
  "max_budget": 10.0,
  "budget_duration_sec": 2592000,
  "created_at": "2025-10-22T10:00:00Z",
  "updated_at": "2025-10-22T10:00:00Z"
}
```
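From Python, the same call might look like this (a sketch using `requests`; the returned `budget_id` is what you assign to users in the next section):

```python
import requests

DAYS_30 = 30 * 24 * 60 * 60  # 2592000 seconds

response = requests.post(
    "http://localhost:8000/v1/budgets",
    headers={"X-AnyLLM-Key": "Bearer your-master-key"},
    json={"max_budget": 10.0, "budget_duration_sec": DAYS_30},
    timeout=30,
)
response.raise_for_status()
budget_id = response.json()["budget_id"]
```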
## Assigning Budgets to Users

When creating or updating a user, specify the `budget_id`:

> **Warning:** If a user has no budget assigned, their spending is unlimited.

```bash
# Create a user with a budget
curl -X POST http://localhost:8000/v1/users \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user-456",
    "alias": "Bob",
    "budget_id": "abc-123"
  }'

# Update an existing user's budget
curl -X PATCH http://localhost:8000/v1/users/user-123 \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"budget_id": "abc-123"}'
```

## Per-User Budget Resets

Budget resets are **per-user**, not global. Each user tracks their own budget period based on when they were assigned the budget.

**Example:**

1. Create a budget with `budget_duration_sec: 604800` (1 week)
2. Assign User A to the budget on Monday
3. Assign User B to the budget on Tuesday
4. User A's budget resets every Monday
5. User B's budget resets every Tuesday

This allows you to create budget tiers (like "Free", "Pro", "Enterprise") without worrying about all users resetting at the same time.

## Automatic Reset Behavior

Budget resets happen automatically using a "lazy reset" approach:

- When a user makes a request, the system checks if their `next_budget_reset_at` has passed
- If yes, the user's `spend` is reset to $0.00 and a new reset date is calculated
- A log entry is created in `budget_reset_logs` for audit purposes
- The request then proceeds normally

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/types/completion.md

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/completion.md

## Completion

::: any_llm.api.completion
::: any_llm.api.acompletion

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/configuration.md

# Configuration

The any-llm-gateway requires configuration to connect to your database, authenticate requests, and route to LLM providers. This guide covers the two main configuration approaches and how to set up model pricing for cost tracking.

You can configure the gateway using either a YAML configuration file or environment variables:

- **Config File (Recommended)**: Best for development and when managing multiple providers with complex settings. Easier to version control and share across teams.
- **Environment Variables**: Best for production deployments, containerized environments, or when following 12-factor app principles.

Both methods can also be combined—environment variables will override config file values.
## Option 1: Config File

Create a `config.yml` file with your database connection, master key, and provider credentials:

> **Generating a secure master key:**
> ```bash
> python -c "import secrets; print(secrets.token_urlsafe(32))"
> ```

```yaml
# Database connection
database_url: "postgresql://gateway:gateway@localhost:5432/gateway_db"

# Master key for admin access
master_key: "your-secure-master-key"

# LLM provider credentials
providers:
  openai:
    api_key: "${OPENAI_API_KEY}"
  gemini:
    api_key: "${GEMINI_API_KEY}"
  vertexai:
    credentials: "/path/to/service_account.json"
    project: "your-gcp-project-id"
    location: "us-central1"

# Model pricing for cost tracking (optional)
pricing:
  openai:gpt-4:
    input_price_per_million: 0.15
    output_price_per_million: 0.6
```

Start the gateway with your config file:

```bash
any-llm-gateway serve --config config.yml
```

## Option 2: Environment Variables

Configure the gateway entirely through environment variables—useful for containerized deployments:

```bash
# Required settings
export DATABASE_URL="postgresql://gateway:gateway@localhost:5432/gateway_db"
export GATEWAY_MASTER_KEY="your-secure-master-key"
export GATEWAY_HOST="0.0.0.0"
export GATEWAY_PORT=8000

any-llm-gateway serve
```

> **Note**: Model pricing cannot be set via environment variables. Use the config file or the [Pricing API](#dynamic-pricing-via-api) instead.

## Model Pricing Configuration

Configure model pricing to automatically track costs. Pricing can be set via the config file or dynamically via the API.

### Config File Pricing

Add pricing for models in your config file using the format `provider:model-name`:

```yaml
pricing:
  openai:gpt-3.5-turbo:
    input_price_per_million: 0.5
    output_price_per_million: 1.5
```

### Dynamic Pricing via API

You can also set or update pricing dynamically using the API:

```bash
curl -X POST http://localhost:8000/v1/pricing \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai:gpt-4",
    "input_price_per_million": 30.0,
    "output_price_per_million": 60.0
  }'
```

This is useful for:

- Updating pricing without restarting the gateway
- Managing pricing in production environments
- Adjusting rates as provider pricing changes

**Important notes:**

- Database pricing takes precedence; config values only set initial pricing
- If pricing for a model already exists in the database, the config values are ignored (with a warning logged)

## Provider Client Args

You can pass additional arguments to provider clients via the `client_args` configuration. These arguments are passed directly to the provider's client initialization, enabling custom headers, timeouts, and other provider-specific options.

```yaml
providers:
  openai:
    api_key: "${OPENAI_API_KEY}"
    client_args:
      custom_headers:
        X-Custom-Header: "custom-value"
      timeout: 60
```

Common use cases:

- **Custom headers**: Pass additional headers to the provider (e.g., for proxy authentication or request tracing)
- **Timeouts**: Configure connection and request timeouts
- **Provider-specific options**: Pass any additional arguments supported by the provider's client

The available `client_args` options depend on the provider. See the [any-llm provider documentation](https://mozilla-ai.github.io/any-llm/providers/) for provider-specific options.
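For reference, the cost tracking implied by per-million pricing is simple arithmetic. The sketch below is not the gateway's internal code, just the formula that the `input_price_per_million` and `output_price_per_million` fields above describe:

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """USD cost implied by per-million-token pricing."""
    return (
        input_tokens / 1_000_000 * input_price_per_million
        + output_tokens / 1_000_000 * output_price_per_million
    )

# 1,200 input and 400 output tokens at $30/$60 per million tokens:
# 0.0012 * 30 + 0.0004 * 60 = 0.06
print(request_cost(1200, 400, 30.0, 60.0))  # 0.06
```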
## Next Steps - See [supported providers](https://mozilla-ai.github.io/any-llm/providers/) for provider-specific configuration - Learn about [authentication methods](./authentication.md) for managing access - Set up [budget management](./budget-management.md) to enforce spending limits --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/docker-deployment.md # Docker Deployment Guide This guide walks you through deploying `any-llm-gateway` using Docker and Docker Compose. Whether you're setting up a local development environment or deploying to production, this guide covers the essential steps and best practices for a secure, reliable deployment. ## Quick Start with Docker Compose Docker Compose is the recommended deployment method for most users. It automatically sets up both the gateway application and a PostgreSQL database with proper networking and dependencies. **Prerequisites:** - Docker Engine 20.10 or newer - Docker Compose 2.0 or newer - At least one LLM provider API key (OpenAI, Anthropic, Mistral, etc.) ### Configure the Gateway First, prepare your configuration file with credentials and settings: Copy the example configuration file: ```bash cp docker/config.example.yml docker/config.yml ``` Generate a secure master key (minimum 32 characters recommended): ```bash python -c "import secrets; print(secrets.token_urlsafe(32))" ``` Save the output of this command for the next step. [Learn more about keys here](authentication.md). Edit `docker/config.yml` with your master key and provider credentials. See the [Configuration Guide](configuration.md) for detailed options. ### Start the Services Launch the gateway and database with a single command: ```bash docker-compose -f docker/docker-compose.yml up -d ``` This command will: - Pull the PostgreSQL 16 Alpine image - Build the gateway Docker image from source (or pull from GHCR if configured) - Create a dedicated network for service communication - Start PostgreSQL with automatic health checks - Wait for the database to be healthy before starting the gateway - Initialize database tables and schema automatically The `-d` flag runs services in detached mode (background). ### Verify Deployment Confirm everything is running correctly: ```bash # Test the health endpoint curl http://localhost:8000/health # Expected: {"status": "healthy"} # Check service status docker-compose -f docker/docker-compose.yml ps # View real-time logs docker-compose -f docker/docker-compose.yml logs -f gateway ``` If the health check returns successfully, your gateway is ready to accept requests! ## Standalone Docker Deployment For scenarios where you have an existing PostgreSQL database or prefer more control over your deployment architecture, you can run the gateway as a standalone container. ### Using Pre-built Image Pull and run the official image from GitHub Container Registry: ```bash docker pull ghcr.io/mozilla-ai/any-llm/gateway:latest docker run -d \ --name any-llm-gateway \ -p 8000:8000 \ -v $(pwd)/config.yml:/app/config.yml \ -e DATABASE_URL="postgresql://user:pass@host:5432/dbname" \ ghcr.io/mozilla-ai/any-llm/gateway:latest \ any-llm-gateway serve --config /app/config.yml ``` Replace the `DATABASE_URL` with your actual PostgreSQL connection string. The format is: `postgresql://username:password@hostname:port/database_name` ### Building from Source If you need to customize the image or test local changes: ```bash docker build -t any-llm-gateway:local -f docker/Dockerfile . 
docker run -d \ --name any-llm-gateway \ -p 8000:8000 \ -v $(pwd)/config.yml:/app/config.yml \ -e DATABASE_URL="postgresql://user:pass@host:5432/dbname" \ any-llm-gateway:local ``` ## Production Deployment Production deployments require additional considerations for reliability, security, and performance. ### Production Configuration Enhance your docker-compose.yml with production-grade settings: ```yaml services: gateway: image: ghcr.io/mozilla-ai/any-llm/gateway:latest restart: unless-stopped healthcheck: test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"] interval: 30s timeout: 10s retries: 3 start_period: 40s deploy: resources: limits: cpus: '2' memory: 2G reservations: cpus: '1' memory: 1G logging: driver: "json-file" options: max-size: "10m" max-file: "3" ``` ### Nginx Reverse Proxy For production, always use a reverse proxy with HTTPS: ```nginx server { listen 443 ssl http2; server_name gateway.yourdomain.com; ssl_certificate /etc/ssl/certs/gateway.crt; ssl_certificate_key /etc/ssl/private/gateway.key; # Security headers add_header Strict-Transport-Security "max-age=31536000" always; location / { proxy_pass http://localhost:8000; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto $scheme; # Timeouts for LLM streaming proxy_read_timeout 300s; proxy_connect_timeout 75s; } } ``` ## Environment Variables The gateway can be configured using environment variables instead of or in addition to a config file. This is useful for Docker deployments and follows 12-factor app principles. For a complete list of environment variables and configuration options, see the [Configuration Guide](configuration.md). **Docker Compose example with .env file:** ```yaml services: gateway: env_file: - .env ``` ## Database Backups ```bash # Backup docker-compose -f docker/docker-compose.yml exec postgres \ pg_dump -U gateway gateway > backup.sql # Restore docker-compose -f docker/docker-compose.yml exec -T postgres \ psql -U gateway gateway < backup.sql ``` ## Security Best Practices 1. **Never commit secrets** - Use `.env` files (gitignored) or Docker secrets 2. **Use read-only volumes** - Mount configs with `:ro` flag 3. **Enable HTTPS** - Use a reverse proxy with SSL certificates 4. **Isolate networks** - Keep database on internal network only 5. 
**Update regularly** - Use tagged versions and update containers periodically ## Monitoring and Logging ### Health Checks ```bash # Test health endpoint curl http://localhost:8000/health # Check container health status docker inspect --format='{{.State.Health.Status}}' container-name ``` ### Logging ```bash # View logs docker-compose logs -f gateway # Last 100 lines docker-compose logs --tail=100 gateway ``` Configure log rotation: ```yaml services: gateway: logging: driver: "json-file" options: max-size: "10m" max-file: "3" ``` ## Troubleshooting **Container won't start:** ```bash docker-compose logs gateway ``` Common issues: Database connection failed, port in use, missing config **Database connection issues:** ```bash docker-compose exec postgres psql -U gateway -c "SELECT version();" ``` **Permission errors:** ```bash chmod 644 docker/config.yml chmod 600 docker/service_account.json ``` **Rebuild after changes:** ```bash docker-compose -f docker/docker-compose.yml up -d --build ``` ## Next Steps - [Configuration Guide](configuration.md) - Advanced configuration options - [Authentication](authentication.md) - Set up API keys and user management - [Budget Management](budget-management.md) - Configure spending limits - [API Reference](api-reference.md) - Explore the complete API - [Troubleshooting](troubleshooting.md) - Common issues and solutions --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/embedding.md ## Embedding ::: any_llm.api.embedding ::: any_llm.api.aembedding --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/exceptions.md # Exception Handling ::: any_llm.exceptions options: show_root_heading: false heading_level: 3 --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/index.md --- schema: type: "SoftwareSourceCode" name: "any-llm" description: "A Python library providing a single interface to different LLM providers including OpenAI, Anthropic, Mistral, and more" programmingLanguage: "Python" codeRepository: "https://github.com/mozilla-ai/any-llm" license: "https://github.com/mozilla-ai/any-llm/blob/main/LICENSE" ---

any-llm logo

One interface. Every LLM.

`any-llm` is a Python library providing a single interface to different LLM providers.

```python
from any_llm import completion

# Using the messages format
response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is Python?"}],
    provider="openai"
)
print(response)

# Switch providers without changing your code
response = completion(
    model="claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "What is Python?"}],
    provider="anthropic"
)
print(response)
```

### Why any-llm

- Switch providers in one line
- Unified exception handling across providers
- Simple API, powerful features

[View supported providers →](./providers.md)

### Getting Started

**[Get started in 5 minutes →](./quickstart.md)** - Install the library and run your first API call.

### Demo

Try `any-llm` in action with our interactive chat demo:

**[📂 Run the Demo](https://github.com/mozilla-ai/any-llm/tree/main/demos/chat#readme)**

Features: real-time streaming responses, multiple provider support, and collapsible "thinking" content display.

### API Documentation

`any-llm` provides two main interfaces:

**Direct API Functions** (recommended for simple use cases):

- [completion](./api/completion.md) - Chat completions with any provider
- [embedding](./api/embedding.md) - Text embeddings
- [responses](./api/responses.md) - [OpenResponses](https://www.openresponses.org/) API for agentic AI systems

**AnyLLM Class** (recommended for advanced use cases):

- [Provider API](./api/any_llm.md) - Lower-level provider interface with metadata access and reusability

## For AI Systems

This documentation is available in two AI-friendly formats:

- **[llms.txt](https://mozilla-ai.github.io/any-llm/llms.txt)** - A structured overview with curated links to key documentation sections
- **[llms-full.txt](https://mozilla-ai.github.io/any-llm/llms-full.txt)** - Complete documentation content concatenated into a single file

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/list_models.md

## Models

::: any_llm.api.list_models
::: any_llm.api.alist_models

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/types/model.md

## Model Types

Data models and types for model operations.

::: any_llm.types.model

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/overview.md

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/platform/overview.md

# Managed Platform Overview

## What is the any-llm Managed Platform?

The any-llm managed platform is a cloud-hosted service that provides secure API key vaulting and usage tracking for all your LLM providers. Instead of managing multiple provider API keys across your codebase, you get a single virtual key that works with any supported provider while keeping your credentials encrypted and your usage tracked.

The managed platform is available at [any-llm.ai](https://any-llm.ai).

## Why use the Managed Platform?
Managing LLM API keys and tracking costs across multiple providers is challenging: - **Security risks**: API keys scattered across `.env` files, CI/CD pipelines, and developer machines - **No visibility**: Difficult to track spending across OpenAI, Anthropic, Google, and other providers - **Key rotation pain**: Updating keys means touching multiple systems and codebases - **No performance insights**: No easy way to measure latency, throughput, or reliability The managed platform solves these problems: - **Secure Key Vault**: Your provider API keys are encrypted client-side before storage—we never see your raw keys - **Single Virtual Key**: One `ANY_LLM_KEY` works across all providers - **Usage Analytics**: Track tokens, costs, and performance metrics without logging prompts or responses - **Zero Infrastructure**: No servers to deploy, no databases to manage ## How it works The managed platform acts as a secure credential manager and usage tracker. Here's the flow: 1. **You add provider keys** to the platform dashboard (keys are encrypted in your browser before upload) 2. **You get a virtual key** (`ANY_LLM_KEY`) that represents your project 3. **Your application** uses the `PlatformProvider` with your virtual key 4. **The SDK** authenticates with the platform, retrieves and decrypts your provider key client-side 5. **Your request** goes directly to the LLM provider (OpenAI, Anthropic, etc.) 6. **Usage metadata** (tokens, model, latency) is reported back—never your prompts or responses ``` ┌─────────────────────────────────────────────────────────────────────────┐ │ Your Application │ │ │ │ from any_llm import completion │ │ completion(provider="platform", model="openai:gpt-4", ...) │ └──────────────────────────────┬──────────────────────────────────────────┘ │ ▼ ┌─────────────────────────────────────────────────────────────────────────┐ │ any-llm SDK (PlatformProvider) │ │ │ │ 1. Authenticate with platform using ANY_LLM_KEY │ │ 2. Receive encrypted provider key │ │ 3. Decrypt provider key locally (client-side) │ │ 4. Make request directly to provider │ │ 5. Report usage metadata (tokens, latency) to platform │ └────────────────┬─────────────────────────────────────┬──────────────────┘ │ │ ▼ ▼ ┌─────────────────────────────┐ ┌────────────────────────────────────┐ │ any-llm Managed Platform │ │ LLM Provider │ │ │ │ (OpenAI, Anthropic, etc.) │ │ • Encrypted key storage │ │ │ │ • Usage tracking │ │ Your prompts/responses go │ │ • Cost analytics │ │ directly here—never through │ │ • Performance metrics │ │ our platform │ └─────────────────────────────┘ └────────────────────────────────────┘ ``` ## Key Features ### Client-Side Encryption Your provider API keys are encrypted in your browser using XChaCha20-Poly1305 before being sent to our servers. The encryption key is derived from your account credentials and never leaves your device. 
This means: - We cannot read your provider API keys - Even if our database were compromised, your keys remain encrypted - You maintain full control over your credentials ### Privacy-First Usage Tracking The platform tracks usage metadata to provide cost and performance insights: **What we track for you:** - Token counts (input and output) - Model name and provider - Request timestamps - Performance metrics (latency, throughput) **What we never track:** - Your prompts - Model responses - Any content from your conversations ### Project Organization Organize your usage by project, team, or environment: - Create separate projects for development, staging, and production - Track costs per project - Set up different provider keys per project ## Platform vs. Gateway any-llm offers two solutions for managing LLM access. Choose the one that fits your needs: | Feature | Managed Platform | Self-Hosted Gateway | |---------|-----------------|---------------------| | **Deployment** | Cloud-hosted (no infrastructure) | Self-hosted (Docker + Postgres) | | **Key Storage** | Client-side encrypted vault | Your own configuration | | **Budget Enforcement** | Coming soon | Built-in | | **User Management** | Per-project | Full user/key management | | **Request Routing** | Direct to provider, no proxy | Through your gateway | | **Best For** | Teams wanting zero-ops key management and usage tracking| Organizations needing full control | You can also use both together—store your provider keys in the managed platform and use them in a self-hosted gateway deployment. ## Current Status The any-llm managed platform is in **open beta**. During the beta: - **Free access** to all features - Core encryption and key management are **production-ready** - Dashboard UX and advanced features are being refined - Feedback is welcome at [any-llm.ai](https://any-llm.ai) ## Getting Started Ready to try the managed platform? 1. Create an account at [any-llm.ai](https://any-llm.ai) 2. Add your provider API keys 3. Get your virtual key 4. Make your first request --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/types/provider.md ## Provider Types Data models and types for provider operations. ::: any_llm.types.provider --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/providers.md --- schema: type: "TechArticle" name: "Supported Providers - any-llm" description: "Complete list of LLM providers supported by any-llm including OpenAI, Anthropic, Mistral, and more" datePublished: "2024-03-15" dateModified: "2024-11-18" --- # Supported Providers `any-llm` supports the below providers. In order to discover information about what models are supported by a provider as well as what features the provider supports for each model, refer to the provider documentation. Provider source code can be found in the [`src/any_llm/providers/`](https://github.com/mozilla-ai/any-llm/tree/main/src/any_llm/providers) directory of the repository. !!! note "Legend" - **Key**: Environment variable for the API key (e.g., `OPENAI_API_KEY`). - **Base**: Environment variable for a custom API base URL (e.g., `OPENAI_BASE_URL`). Useful for proxies or self-hosted endpoints. - **Reasoning (Completions)**: Provider can return reasoning traces alongside the assistant message via the completions and/or streaming endpoints. This does not indicate whether the provider offers separate "reasoning models". See [this](https://github.com/mozilla-ai/any-llm/issues/95) discussion for more information. 
- **Streaming (Completions)**: Provider can stream completion results back as an iterator. - **Image (Completions)**: Provider supports passing an `image_data` parameter for vision capabilities, as defined by the OpenAI spec [here](https://platform.openai.com/docs/api-reference/chat/create#chat_create-messages). - **OpenResponses API**: Provider supports the [OpenResponses](https://www.openresponses.org/) specification for agentic AI systems. See the [Responses API docs](api/responses.md) for usage details. - **List Models API**: Provider supports listing available models programmatically via the `list_models()` function. This allows you to discover what models are available from the provider at runtime, which can be useful for dynamic model selection or validation. --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/quickstart.md # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/quickstart.md --- schema: type: "HowTo" name: "How to Install and Use any-llm" description: "Step-by-step guide to installing any-llm and making your first API call with Python" totalTime: "PT5M" tool: - "Python 3.11 or newer" - "pip package manager" supply: - "API key from your chosen LLM provider" steps: - name: "Install any-llm" text: "Install any-llm with your chosen providers using pip. Use the all option to install support for all providers." url: "https://mozilla-ai.github.io/any-llm/quickstart/#installation" - name: "Set up API keys" text: "Configure your provider's API key as an environment variable. Make sure you have the appropriate environment variable set for your chosen provider." url: "https://mozilla-ai.github.io/any-llm/quickstart/#api-keys" - name: "Make your first completion call" text: "Import the completion function from any-llm and create your first API call with your chosen model and provider" url: "https://mozilla-ai.github.io/any-llm/quickstart/#your-first-api-call" --- ## Requirements - Python 3.11 or newer - API keys for your chosen LLM provider ## Installation ```bash pip install any-llm-sdk[all] # Install with all provider support ``` ### Installing Specific Providers If you want to install a specific provider from our [supported providers](./providers.md): ```bash pip install any-llm-sdk[mistral] # For Mistral provider pip install any-llm-sdk[ollama] # For Ollama provider # install multiple providers pip install any-llm-sdk[mistral,ollama] ``` ### Library Integration If you're building a library, install just the base package (`pip install any-llm-sdk`) and let your users install provider dependencies. > **API Keys:** Set your provider's API key as an environment variable (e.g., `export MISTRAL_API_KEY="your-key"`) or pass it directly using the `api_key` parameter. 
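For example, passing the key directly (a minimal sketch; substitute a real key):

```python
from any_llm import completion

# Passing the key explicitly instead of relying on MISTRAL_API_KEY
response = completion(
    model="mistral-small-latest",
    provider="mistral",
    api_key="your-mistral-api-key",  # placeholder value
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```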
## APIs

### Using the AnyLLM Class

For applications making multiple requests with the same provider, use the `AnyLLM` class to avoid repeated provider instantiation:

```python
import os

from any_llm import AnyLLM

# Make sure you have the appropriate API key set
api_key = os.environ.get('MISTRAL_API_KEY')
if not api_key:
    raise ValueError("Please set MISTRAL_API_KEY environment variable")

llm = AnyLLM.create("mistral")

response = llm.completion(
    model="mistral-small-latest",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

metadata = llm.get_provider_metadata()
print(f"Supports streaming: {metadata.streaming}")
print(f"Supports completion: {metadata.completion}")
```

### Using Direct API Functions

```python
import os

from any_llm import completion

# Make sure you have the appropriate API key set
api_key = os.environ.get('MISTRAL_API_KEY')
if not api_key:
    raise ValueError("Please set MISTRAL_API_KEY environment variable")

# Recommended: separate provider and model parameters
response = completion(
    model="mistral-small-latest",
    provider="mistral",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```

### When to Choose Which Approach

**Use Direct API Functions (`completion`, `acompletion`) when:**

- Making simple, one-off requests
- Prototyping or writing quick scripts
- You want the simplest possible interface

**Use Provider Class (`AnyLLM.create`) when:**

- Building applications that make multiple requests with the same provider
- You want to avoid repeated provider instantiation overhead

**Finding model names:** Check the [providers page](./providers.md) for provider IDs, or use the [`list_models`](./api/list_models.md) API to see available models for your provider.

## Streaming

For the [providers that support streaming](./providers.md), you can enable it by passing `stream=True`:

```python
output = ""
for chunk in completion(
    model="mistral-small-latest",
    provider="mistral",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
):
    chunk_content = chunk.choices[0].delta.content or ""
    print(chunk_content)
    output += chunk_content
```

## Embeddings

[`embedding`][any_llm.embedding] and [`aembedding`][any_llm.aembedding] allow you to create vector embeddings from text using the same unified interface across providers. Not all providers support embeddings; check the [providers documentation](./providers.md) to see which ones do.

```python
from any_llm import embedding

result = embedding(
    model="text-embedding-3-small",
    provider="openai",
    inputs="Hello, world!"  # can be either a string or a list of strings
)

# Access the embedding vector
embedding_vector = result.data[0].embedding
print(f"Embedding vector length: {len(embedding_vector)}")
print(f"Tokens used: {result.usage.total_tokens}")
```

## Tools

`any-llm` supports tool calling for providers that support it. You can pass a list of tools where each tool is either:

1. **Python callable** - Functions with proper docstrings and type annotations
2. **OpenAI Format tool dict** - Already in OpenAI tool format

```python
from any_llm import completion

def get_weather(location: str, unit: str = "F") -> str:
    """Get weather information for a location.

    Args:
        location: The city or location to get weather for
        unit: Temperature unit, either 'C' or 'F'

    Returns:
        Current weather description
    """
    return f"Weather in {location} is sunny and 75{unit}!"
response = completion( model="mistral-small-latest", provider="mistral", messages=[{"role": "user", "content": "What's the weather in Pittsburgh PA?"}], tools=[get_weather] ) ``` any-llm automatically converts your Python functions to OpenAI tools format. Functions must have: - A docstring describing what the function does - Type annotations for all parameters - A return type annotation ## Exception Handling The `any-llm` package provides a unified exception hierarchy that works consistently across all LLM providers. ### Enabling Unified Exceptions !!! info "Opt-in Feature" Unified exception handling is currently **opt-in**. Set the `ANY_LLM_UNIFIED_EXCEPTIONS` environment variable to enable it: ```bash export ANY_LLM_UNIFIED_EXCEPTIONS=1 ``` When enabled, provider-specific exceptions are automatically converted to `any-llm` exception types. When disabled (default), the original provider exceptions are raised with a deprecation warning. ### Basic Usage ```python from any_llm import completion from any_llm.exceptions import ( RateLimitError, AuthenticationError, ProviderError, AnyLLMError, ) try: response = completion( model="gpt-4", provider="openai", messages=[{"role": "user", "content": "Hello!"}] ) except RateLimitError as e: print(f"Rate limited: {e.message}") except AuthenticationError as e: print(f"Auth failed: {e.message}") except ProviderError as e: print(f"Provider error: {e.message}") except AnyLLMError as e: print(f"Error: {e.message}") ``` ### Accessing Original Exceptions All unified exceptions preserve the original provider exception for debugging: ```python from any_llm.exceptions import RateLimitError messages = [{"role": "user", "content": "Hello!"}] try: response = completion(model="gpt-4", provider="openai", messages=messages) except RateLimitError as e: print(f"Provider: {e.provider_name}") print(f"Original exception: {type(e.original_exception)}") ``` --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/types/responses.md # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/responses.md ## OpenResponses API The Responses API in any-llm implements the [OpenResponses](https://www.openresponses.org/) specification—an open-source standard for building multi-provider, interoperable LLM interfaces for agentic AI systems. !!! info "Learn More" - [OpenResponses Specification](https://www.openresponses.org/specification) - [OpenResponses Reference](https://www.openresponses.org/reference) - [HuggingFace Responses API Guide](https://huggingface.co/docs/inference-providers/guides/responses-api) ### Return Types The `responses()` and `aresponses()` functions return different types depending on the provider's level of OpenResponses compliance: | Return Type | When Returned | |-------------|---------------| | `openresponses_types.ResponseResource` | Providers fully compliant with the OpenResponses specification | | `openai.types.responses.Response` | Providers using OpenAI's native Responses API (not yet fully OpenResponses-compliant) | | `Iterator[dict]` / `AsyncIterator[dict]` | When `stream=True` is set | Both `ResponseResource` and `Response` share a similar structure, so in many cases you can access common fields like `output` without type checking. 
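In practice that looks like this (a minimal sketch: the `input` parameter name follows OpenResponses/OpenAI Responses conventions and is an assumption here, so check the API reference below for the exact signature):

```python
from any_llm import responses

result = responses(
    model="gpt-4o-mini",
    provider="openai",
    input="Summarize the OpenResponses spec in one sentence.",  # parameter name assumed
)

# Both ResponseResource and the OpenAI Response type expose `output`,
# so this loop works without checking which type was returned.
for item in result.output:
    print(item)
```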
::: any_llm.api.responses
::: any_llm.api.aresponses

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/troubleshooting.md

# Troubleshooting

## Database connection errors

Make sure the database URL is correct and the database is accessible. Note that `create_engine` alone does not open a connection, so the check below calls `engine.connect()`:

```bash
python -c "from sqlalchemy import create_engine; create_engine('postgresql://user:pass@host/db').connect(); print('OK')"
```

## Common Issues

### Authentication Errors

- Ensure you're using the correct master key format: `Bearer your-secure-master-key`
- Check that the `X-AnyLLM-Key` header is properly set
- Verify that virtual API keys are active and not expired

### Configuration Issues

- Verify your `config.yml` file is properly formatted
- Check that environment variables are set correctly
- Ensure provider API keys are valid and have proper permissions

### Budget Enforcement

- Check that budgets are properly assigned to users
- Verify budget limits are set correctly
- Monitor user spending to ensure limits are being enforced

## Getting Help

- Check the logs for detailed error messages
- Verify your configuration matches the examples in the documentation
- Ensure all required environment variables are set