# any-llm

> Documentation for any-llm

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/any_llm.md

## AnyLLM

::: any_llm.AnyLLM

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/api-reference.md

# API Reference

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/authentication.md

# Authentication

any-llm-gateway offers two authentication methods, each designed for different use cases. Understanding when to use each approach will help you secure your gateway effectively.

## Authentication Methods Overview

| Method | Best For | Key Management | Usage Tracking |
|--------|----------|----------------|----------------|
| **Master Key** | Internal services, admin operations, trusted environments | Single key with full access | Requires manual user specification |
| **Virtual API Keys** | External apps, per-user access, customer-facing services | Multiple scoped keys | Automatic per-key tracking |

### Supported Headers

The gateway accepts authentication via two headers:

- **`X-AnyLLM-Key`** (preferred): The gateway's native authentication header
- **`Authorization`**: Standard HTTP authorization header for OpenAI client compatibility

Both headers use the `Bearer <key>` format. When both headers are present, `X-AnyLLM-Key` takes precedence.

Using the `Authorization` header allows you to use the gateway with OpenAI-compatible clients without modification:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="your-master-key-or-virtual-key",  # Sent as Authorization: Bearer ...
)
```

## Master Key

The master key is the root credential for your gateway installation. It has unrestricted access to all gateway operations and should be treated with the same security as your production database credentials.

### Generating a Master Key

Generate a cryptographically secure master key (minimum 32 characters recommended):

```bash
python -c "import secrets; print(secrets.token_urlsafe(32))"
```

**Example output:**

```
Zx8Q_vKm3nR7wP2sT9yU5iO1eA6hD4fG0bN8cL3jM5k
```

Set the generated key in your configuration:

**Using environment variables:**

```bash
export GATEWAY_MASTER_KEY="Zx8Q_vKm3nR7wP2sT9yU5iO1eA6hD4fG0bN8cL3jM5k"
```

**Using config.yml:**

```yaml
master_key: "Zx8Q_vKm3nR7wP2sT9yU5iO1eA6hD4fG0bN8cL3jM5k"
```

### Creating a User

```bash
curl -X POST http://localhost:8000/v1/users \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user-123", "alias": "Alice"}'
```
With optional metadata:

```bash
curl -X POST http://localhost:8000/v1/users \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user-123",
    "alias": "Alice",
    "metadata": {
      "department": "Engineering",
      "team": "ML",
      "email": "alice@example.com"
    }
  }'
```
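The same call from Python, using the `requests` library (a minimal sketch; the endpoint and headers mirror the curl examples above):

```python
import requests

GATEWAY_URL = "http://localhost:8000"
MASTER_KEY = "your-master-key"  # the value of GATEWAY_MASTER_KEY

response = requests.post(
    f"{GATEWAY_URL}/v1/users",
    headers={"X-AnyLLM-Key": f"Bearer {MASTER_KEY}"},
    json={
        "user_id": "user-123",
        "alias": "Alice",
        "metadata": {"department": "Engineering", "team": "ML"},
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())
```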
### Making Requests with Master Key

When using the master key, you **must** specify which user is making the request using the `user` field:

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai:gpt-4o-mini",
    "messages": [{"role": "user", "content": "Write a haiku on Jupiter"}],
    "user": "user-123"
  }'
```

The `user` field tells the gateway which user's budget and spend tracking to update. Without this field, the request will be rejected.

## Virtual API Keys

Virtual API keys provide scoped access for making completion requests without exposing the master key. Each virtual key can have expiration dates, metadata, and associated users for automatic usage tracking.

### Creating a Virtual API Key

Create a virtual key with a descriptive name:

```bash
curl -X POST http://localhost:8000/v1/keys \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"key_name": "mobile-app"}'
```

> **Important:** Save the `key` value immediately—it's only shown once and cannot be retrieved later.
Example response:

```json
{
  "id": "abc-123",
  "key": "gw-...",
  "key_name": "mobile-app",
  "created_at": "2025-10-20T10:00:00",
  "expires_at": null,
  "is_active": true,
  "metadata": {}
}
```
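The same request from Python (a sketch using `requests`; the returned `key` is captured right away because it cannot be retrieved later):

```python
import requests

response = requests.post(
    "http://localhost:8000/v1/keys",
    headers={"X-AnyLLM-Key": "Bearer your-master-key"},
    json={"key_name": "mobile-app"},
    timeout=30,
)
response.raise_for_status()
virtual_key = response.json()["key"]  # shown only once -- store it securely now
```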
#### Key with Expiration

Create a key that automatically expires on a specific date:

```bash
curl -X POST http://localhost:8000/v1/keys \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "key_name": "trial-access",
    "expires_at": "2025-12-31T23:59:59Z"
  }'
```

### Using Virtual API Keys

Making requests with a virtual key is simpler than using the master key—no `user` field is required:

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "X-AnyLLM-Key: Bearer gw-..." \
  -H "Content-Type: application/json" \
  -d '{"model": "openai:gpt-5-mini", "messages": [{"role": "user", "content": "Write a haiku on Saturn"}]}'
```

The gateway automatically tracks usage based on the virtual key used.

### Managing Virtual Keys

**List all keys:**

```bash
curl http://localhost:8000/v1/keys \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}"
```

**Deactivate a key:**

```bash
curl -X PATCH http://localhost:8000/v1/keys/{key_id} \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"is_active": false}'
```

**Delete a key:**

```bash
curl -X DELETE http://localhost:8000/v1/keys/{key_id} \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}"
```

> See [API Reference](api-reference.md) for complete key management operations.

Note: The actual key values are never returned in list or get operations for security reasons.

## Next Steps

Now that you understand authentication, explore these related topics:

- **[Budget Management](budget-management.md)** - Set spending limits for users and enforce budgets
- **[Configuration](configuration.md)** - Learn about provider setup and pricing configuration
- **[API Reference](api-reference.md)** - Explore all available endpoints for managing keys and users
- **[Quick Start](quickstart.md)** - Complete walkthrough of setting up your first gateway

For questions or issues, refer to the [troubleshooting guide](troubleshooting.md) or check the project's issue tracker.

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/types/batch.md

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/batch.md

# Batch

!!! warning "Experimental API"
    The Batch API is experimental and subject to breaking changes in future versions. Use with caution in production environments.

The Batch API allows you to process multiple requests asynchronously at a lower cost.

## File Path Interface

The `any-llm` batch API requires you to pass a **path to a local JSONL file** containing your batch requests. The provider implementation automatically handles uploading and file management as needed.

Different providers handle batch processing differently:

- **OpenAI**: Requires uploading a file first, then creating a batch with the file ID
- **Anthropic** (future): Expects file content passed directly in the request
- **Other providers**: May have their own unique requirements

By accepting a local file path, `any-llm` abstracts these provider differences and handles the implementation details automatically.

::: any_llm.api.create_batch
::: any_llm.api.acreate_batch
::: any_llm.api.retrieve_batch
::: any_llm.api.aretrieve_batch
::: any_llm.api.cancel_batch
::: any_llm.api.acancel_batch
::: any_llm.api.list_batches
::: any_llm.api.alist_batches

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/budget-management.md

# Budget Management

Budgets provide shared spending limits that can be assigned to multiple users.
This allows you to create budget tiers (like "Free", "Pro", "Enterprise") and enforce spending limits across groups of users. ## Creating a Budget ```bash # Create a budget with a $10.00 spending limit and monthly resets (30 days = 2592000 seconds) curl -X POST http://localhost:8000/v1/budgets \ -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \ -H "Content-Type: application/json" \ -d '{ "max_budget": 10.0, "budget_duration_sec": 2592000 }' ```
Sample response:

```json
{
  "budget_id": "abc-123",
  "max_budget": 10.0,
  "budget_duration_sec": 2592000,
  "created_at": "2025-10-22T10:00:00Z",
  "updated_at": "2025-10-22T10:00:00Z"
}
```
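From Python, the same call might look like this (a sketch using `requests`; the returned `budget_id` is what you assign to users in the next section):

```python
import requests

DAYS_30 = 30 * 24 * 60 * 60  # 2592000 seconds

response = requests.post(
    "http://localhost:8000/v1/budgets",
    headers={"X-AnyLLM-Key": "Bearer your-master-key"},
    json={"max_budget": 10.0, "budget_duration_sec": DAYS_30},
    timeout=30,
)
response.raise_for_status()
budget_id = response.json()["budget_id"]
```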
## Assigning Budgets to Users

When creating or updating a user, specify the `budget_id`:

> **Warning:** If a user has no budget assigned, their spending is unlimited.

```bash
# Create a user with a budget
curl -X POST http://localhost:8000/v1/users \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user-456",
    "alias": "Bob",
    "budget_id": "abc-123"
  }'

# Update an existing user's budget
curl -X PATCH http://localhost:8000/v1/users/user-123 \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"budget_id": "abc-123"}'
```

## Per-User Budget Resets

Budget resets are **per-user**, not global. Each user tracks their own budget period based on when they were assigned the budget.

**Example:**

1. Create a budget with `budget_duration_sec: 604800` (1 week)
2. Assign User A to the budget on Monday
3. Assign User B to the budget on Tuesday
4. User A's budget resets every Monday
5. User B's budget resets every Tuesday

This allows you to create budget tiers (like "Free", "Pro", "Enterprise") without worrying about all users resetting at the same time.

## Automatic Reset Behavior

Budget resets happen automatically using a "lazy reset" approach:

- When a user makes a request, the system checks if their `next_budget_reset_at` has passed
- If yes, the user's `spend` is reset to $0.00 and a new reset date is calculated
- A log entry is created in `budget_reset_logs` for audit purposes
- The request then proceeds normally

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/types/completion.md

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/completion.md

## Completion

::: any_llm.api.completion
::: any_llm.api.acompletion

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/configuration.md

# Configuration

The any-llm-gateway requires configuration to connect to your database, authenticate requests, and route to LLM providers. This guide covers the two main configuration approaches and how to set up model pricing for cost tracking.

You can configure the gateway using either a YAML configuration file or environment variables:

- **Config File (Recommended)**: Best for development and when managing multiple providers with complex settings. Easier to version control and share across teams.
- **Environment Variables**: Best for production deployments, containerized environments, or when following 12-factor app principles.

Both methods can also be combined—environment variables will override config file values.
## Option 1: Config File

Create a `config.yml` file with your database connection, master key, and provider credentials:

> **Generating a secure master key:**
> ```bash
> python -c "import secrets; print(secrets.token_urlsafe(32))"
> ```

```yaml
# Database connection
database_url: "postgresql://gateway:gateway@localhost:5432/gateway_db"

# Master key for admin access
master_key: "your-secure-master-key"

# LLM provider credentials
providers:
  openai:
    api_key: "${OPENAI_API_KEY}"
  gemini:
    api_key: "${GEMINI_API_KEY}"
  vertexai:
    credentials: "/path/to/service_account.json"
    project: "your-gcp-project-id"
    location: "us-central1"

# Model pricing for cost tracking (optional)
pricing:
  openai:gpt-4:
    input_price_per_million: 0.15
    output_price_per_million: 0.6
```

Start the gateway with your config file:

```bash
any-llm-gateway serve --config config.yml
```

## Option 2: Environment Variables

Configure the gateway entirely through environment variables—useful for containerized deployments:

```bash
# Required settings
export DATABASE_URL="postgresql://gateway:gateway@localhost:5432/gateway_db"
export GATEWAY_MASTER_KEY="your-secure-master-key"
export GATEWAY_HOST="0.0.0.0"
export GATEWAY_PORT=8000

any-llm-gateway serve
```

> **Note**: Model pricing cannot be set via environment variables. Use the config file or the [Pricing API](#dynamic-pricing-via-api) instead.

## Model Pricing Configuration

Configure model pricing to automatically track costs. Pricing can be set via the config file or dynamically via the API.

### Config File Pricing

Add pricing for models in your config file using the format `provider:model-name`:

```yaml
pricing:
  openai:gpt-3.5-turbo:
    input_price_per_million: 0.5
    output_price_per_million: 1.5
```

### Dynamic Pricing via API

You can also set or update pricing dynamically using the API:

```bash
curl -X POST http://localhost:8000/v1/pricing \
  -H "X-AnyLLM-Key: Bearer ${GATEWAY_MASTER_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai:gpt-4",
    "input_price_per_million": 30.0,
    "output_price_per_million": 60.0
  }'
```

This is useful for:

- Updating pricing without restarting the gateway
- Managing pricing in production environments
- Adjusting rates as provider pricing changes

**Important notes:**

- Database pricing takes precedence; config values only set initial pricing
- If pricing for a model already exists in the database, the config values are ignored (with a warning logged)

## Provider Client Args

You can pass additional arguments to provider clients via the `client_args` configuration. These arguments are passed directly to the provider's client initialization, enabling custom headers, timeouts, and other provider-specific options.

```yaml
providers:
  openai:
    api_key: "${OPENAI_API_KEY}"
    client_args:
      custom_headers:
        X-Custom-Header: "custom-value"
      timeout: 60
```

Common use cases:

- **Custom headers**: Pass additional headers to the provider (e.g., for proxy authentication or request tracing)
- **Timeouts**: Configure connection and request timeouts
- **Provider-specific options**: Pass any additional arguments supported by the provider's client

The available `client_args` options depend on the provider. See the [any-llm provider documentation](https://mozilla-ai.github.io/any-llm/providers/) for provider-specific options.
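For reference, the cost tracking implied by per-million pricing is simple arithmetic. The sketch below is not the gateway's internal code, just the formula that the `input_price_per_million` and `output_price_per_million` fields above describe:

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """USD cost implied by per-million-token pricing."""
    return (
        input_tokens / 1_000_000 * input_price_per_million
        + output_tokens / 1_000_000 * output_price_per_million
    )

# 1,200 input and 400 output tokens at $30/$60 per million tokens:
# 0.0012 * 30 + 0.0004 * 60 = 0.06
print(request_cost(1200, 400, 30.0, 60.0))  # 0.06
```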
## Next Steps - See [supported providers](https://mozilla-ai.github.io/any-llm/providers/) for provider-specific configuration - Learn about [authentication methods](./authentication.md) for managing access - Set up [budget management](./budget-management.md) to enforce spending limits --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/docker-deployment.md # Docker Deployment Guide This guide walks you through deploying `any-llm-gateway` using Docker and Docker Compose. Whether you're setting up a local development environment or deploying to production, this guide covers the essential steps and best practices for a secure, reliable deployment. ## Quick Start with Docker Compose Docker Compose is the recommended deployment method for most users. It automatically sets up both the gateway application and a PostgreSQL database with proper networking and dependencies. **Prerequisites:** - Docker Engine 20.10 or newer - Docker Compose 2.0 or newer - At least one LLM provider API key (OpenAI, Anthropic, Mistral, etc.) ### Configure the Gateway First, prepare your configuration file with credentials and settings: Copy the example configuration file: ```bash cp docker/config.example.yml docker/config.yml ``` Generate a secure master key (minimum 32 characters recommended): ```bash python -c "import secrets; print(secrets.token_urlsafe(32))" ``` Save the output of this command for the next step. [Learn more about keys here](authentication.md). Edit `docker/config.yml` with your master key and provider credentials. See the [Configuration Guide](configuration.md) for detailed options. ### Start the Services Launch the gateway and database with a single command: ```bash docker-compose -f docker/docker-compose.yml up -d ``` This command will: - Pull the PostgreSQL 16 Alpine image - Build the gateway Docker image from source (or pull from GHCR if configured) - Create a dedicated network for service communication - Start PostgreSQL with automatic health checks - Wait for the database to be healthy before starting the gateway - Initialize database tables and schema automatically The `-d` flag runs services in detached mode (background). ### Verify Deployment Confirm everything is running correctly: ```bash # Test the health endpoint curl http://localhost:8000/health # Expected: {"status": "healthy"} # Check service status docker-compose -f docker/docker-compose.yml ps # View real-time logs docker-compose -f docker/docker-compose.yml logs -f gateway ``` If the health check returns successfully, your gateway is ready to accept requests! ## Standalone Docker Deployment For scenarios where you have an existing PostgreSQL database or prefer more control over your deployment architecture, you can run the gateway as a standalone container. ### Using Pre-built Image Pull and run the official image from GitHub Container Registry: ```bash docker pull ghcr.io/mozilla-ai/any-llm/gateway:latest docker run -d \ --name any-llm-gateway \ -p 8000:8000 \ -v $(pwd)/config.yml:/app/config.yml \ -e DATABASE_URL="postgresql://user:pass@host:5432/dbname" \ ghcr.io/mozilla-ai/any-llm/gateway:latest \ any-llm-gateway serve --config /app/config.yml ``` Replace the `DATABASE_URL` with your actual PostgreSQL connection string. The format is: `postgresql://username:password@hostname:port/database_name` ### Building from Source If you need to customize the image or test local changes: ```bash docker build -t any-llm-gateway:local -f docker/Dockerfile . 
docker run -d \ --name any-llm-gateway \ -p 8000:8000 \ -v $(pwd)/config.yml:/app/config.yml \ -e DATABASE_URL="postgresql://user:pass@host:5432/dbname" \ any-llm-gateway:local ``` ## Production Deployment Production deployments require additional considerations for reliability, security, and performance. ### Production Configuration Enhance your docker-compose.yml with production-grade settings: ```yaml services: gateway: image: ghcr.io/mozilla-ai/any-llm/gateway:latest restart: unless-stopped healthcheck: test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"] interval: 30s timeout: 10s retries: 3 start_period: 40s deploy: resources: limits: cpus: '2' memory: 2G reservations: cpus: '1' memory: 1G logging: driver: "json-file" options: max-size: "10m" max-file: "3" ``` ### Nginx Reverse Proxy For production, always use a reverse proxy with HTTPS: ```nginx server { listen 443 ssl http2; server_name gateway.yourdomain.com; ssl_certificate /etc/ssl/certs/gateway.crt; ssl_certificate_key /etc/ssl/private/gateway.key; # Security headers add_header Strict-Transport-Security "max-age=31536000" always; location / { proxy_pass http://localhost:8000; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto $scheme; # Timeouts for LLM streaming proxy_read_timeout 300s; proxy_connect_timeout 75s; } } ``` ## Environment Variables The gateway can be configured using environment variables instead of or in addition to a config file. This is useful for Docker deployments and follows 12-factor app principles. For a complete list of environment variables and configuration options, see the [Configuration Guide](configuration.md). **Docker Compose example with .env file:** ```yaml services: gateway: env_file: - .env ``` ## Database Backups ```bash # Backup docker-compose -f docker/docker-compose.yml exec postgres \ pg_dump -U gateway gateway > backup.sql # Restore docker-compose -f docker/docker-compose.yml exec -T postgres \ psql -U gateway gateway < backup.sql ``` ## Security Best Practices 1. **Never commit secrets** - Use `.env` files (gitignored) or Docker secrets 2. **Use read-only volumes** - Mount configs with `:ro` flag 3. **Enable HTTPS** - Use a reverse proxy with SSL certificates 4. **Isolate networks** - Keep database on internal network only 5. 
**Update regularly** - Use tagged versions and update containers periodically ## Monitoring and Logging ### Health Checks ```bash # Test health endpoint curl http://localhost:8000/health # Check container health status docker inspect --format='{{.State.Health.Status}}' container-name ``` ### Logging ```bash # View logs docker-compose logs -f gateway # Last 100 lines docker-compose logs --tail=100 gateway ``` Configure log rotation: ```yaml services: gateway: logging: driver: "json-file" options: max-size: "10m" max-file: "3" ``` ## Troubleshooting **Container won't start:** ```bash docker-compose logs gateway ``` Common issues: Database connection failed, port in use, missing config **Database connection issues:** ```bash docker-compose exec postgres psql -U gateway -c "SELECT version();" ``` **Permission errors:** ```bash chmod 644 docker/config.yml chmod 600 docker/service_account.json ``` **Rebuild after changes:** ```bash docker-compose -f docker/docker-compose.yml up -d --build ``` ## Next Steps - [Configuration Guide](configuration.md) - Advanced configuration options - [Authentication](authentication.md) - Set up API keys and user management - [Budget Management](budget-management.md) - Configure spending limits - [API Reference](api-reference.md) - Explore the complete API - [Troubleshooting](troubleshooting.md) - Common issues and solutions --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/embedding.md ## Embedding ::: any_llm.api.embedding ::: any_llm.api.aembedding --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/exceptions.md # Exception Handling ::: any_llm.exceptions options: show_root_heading: false heading_level: 3 --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/index.md --- schema: type: "SoftwareSourceCode" name: "any-llm" description: "A Python library providing a single interface to different LLM providers including OpenAI, Anthropic, Mistral, and more" programmingLanguage: "Python" codeRepository: "https://github.com/mozilla-ai/any-llm" license: "https://github.com/mozilla-ai/any-llm/blob/main/LICENSE" ---

any-llm logo

One interface. Every LLM.

`any-llm` is a Python library providing a single interface to different LLM providers.

```python
from any_llm import completion

# Using the messages format
response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is Python?"}],
    provider="openai"
)
print(response)

# Switch providers without changing your code
response = completion(
    model="claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "What is Python?"}],
    provider="anthropic"
)
print(response)
```

### Why any-llm

- Switch providers in one line
- Unified exception handling across providers
- Simple API, powerful features

[View supported providers →](./providers.md)

### Getting Started

**[Get started in 5 minutes →](./quickstart.md)** - Install the library and run your first API call.

### Demo

Try `any-llm` in action with our interactive chat demo:

**[📂 Run the Demo](https://github.com/mozilla-ai/any-llm/tree/main/demos/chat#readme)**

Features: real-time streaming responses, multiple provider support, and collapsible "thinking" content display.

### API Documentation

`any-llm` provides two main interfaces:

**Direct API Functions** (recommended for simple use cases):

- [completion](./api/completion.md) - Chat completions with any provider
- [embedding](./api/embedding.md) - Text embeddings
- [responses](./api/responses.md) - [OpenResponses](https://www.openresponses.org/) API for agentic AI systems

**AnyLLM Class** (recommended for advanced use cases):

- [Provider API](./api/any_llm.md) - Lower-level provider interface with metadata access and reusability

## For AI Systems

This documentation is available in two AI-friendly formats:

- **[llms.txt](https://mozilla-ai.github.io/any-llm/llms.txt)** - A structured overview with curated links to key documentation sections
- **[llms-full.txt](https://mozilla-ai.github.io/any-llm/llms-full.txt)** - Complete documentation content concatenated into a single file

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/list_models.md

## Models

::: any_llm.api.list_models
::: any_llm.api.alist_models

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/types/model.md

## Model Types

Data models and types for model operations.

::: any_llm.types.model

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/overview.md

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/platform/overview.md

# Managed Platform Overview

## What is the any-llm Managed Platform?

The any-llm managed platform is a cloud-hosted service that provides secure API key vaulting and usage tracking for all your LLM providers. Instead of managing multiple provider API keys across your codebase, you get a single virtual key that works with any supported provider while keeping your credentials encrypted and your usage tracked.

The managed platform is available at [any-llm.ai](https://any-llm.ai).

## Why use the Managed Platform?
Managing LLM API keys and tracking costs across multiple providers is challenging: - **Security risks**: API keys scattered across `.env` files, CI/CD pipelines, and developer machines - **No visibility**: Difficult to track spending across OpenAI, Anthropic, Google, and other providers - **Key rotation pain**: Updating keys means touching multiple systems and codebases - **No performance insights**: No easy way to measure latency, throughput, or reliability The managed platform solves these problems: - **Secure Key Vault**: Your provider API keys are encrypted client-side before storage—we never see your raw keys - **Single Virtual Key**: One `ANY_LLM_KEY` works across all providers - **Usage Analytics**: Track tokens, costs, and performance metrics without logging prompts or responses - **Zero Infrastructure**: No servers to deploy, no databases to manage ## How it works The managed platform acts as a secure credential manager and usage tracker. Here's the flow: 1. **You add provider keys** to the platform dashboard (keys are encrypted in your browser before upload) 2. **You get a virtual key** (`ANY_LLM_KEY`) that represents your project 3. **Your application** uses the `PlatformProvider` with your virtual key 4. **The SDK** authenticates with the platform, retrieves and decrypts your provider key client-side 5. **Your request** goes directly to the LLM provider (OpenAI, Anthropic, etc.) 6. **Usage metadata** (tokens, model, latency) is reported back—never your prompts or responses ``` ┌─────────────────────────────────────────────────────────────────────────┐ │ Your Application │ │ │ │ from any_llm import completion │ │ completion(provider="platform", model="openai:gpt-4", ...) │ └──────────────────────────────┬──────────────────────────────────────────┘ │ ▼ ┌─────────────────────────────────────────────────────────────────────────┐ │ any-llm SDK (PlatformProvider) │ │ │ │ 1. Authenticate with platform using ANY_LLM_KEY │ │ 2. Receive encrypted provider key │ │ 3. Decrypt provider key locally (client-side) │ │ 4. Make request directly to provider │ │ 5. Report usage metadata (tokens, latency) to platform │ └────────────────┬─────────────────────────────────────┬──────────────────┘ │ │ ▼ ▼ ┌─────────────────────────────┐ ┌────────────────────────────────────┐ │ any-llm Managed Platform │ │ LLM Provider │ │ │ │ (OpenAI, Anthropic, etc.) │ │ • Encrypted key storage │ │ │ │ • Usage tracking │ │ Your prompts/responses go │ │ • Cost analytics │ │ directly here—never through │ │ • Performance metrics │ │ our platform │ └─────────────────────────────┘ └────────────────────────────────────┘ ``` ## Key Features ### Client-Side Encryption Your provider API keys are encrypted in your browser using XChaCha20-Poly1305 before being sent to our servers. The encryption key is derived from your account credentials and never leaves your device. 
This means: - We cannot read your provider API keys - Even if our database were compromised, your keys remain encrypted - You maintain full control over your credentials ### Privacy-First Usage Tracking The platform tracks usage metadata to provide cost and performance insights: **What we track for you:** - Token counts (input and output) - Model name and provider - Request timestamps - Performance metrics (latency, throughput) **What we never track:** - Your prompts - Model responses - Any content from your conversations ### Project Organization Organize your usage by project, team, or environment: - Create separate projects for development, staging, and production - Track costs per project - Set up different provider keys per project ## Platform vs. Gateway any-llm offers two solutions for managing LLM access. Choose the one that fits your needs: | Feature | Managed Platform | Self-Hosted Gateway | |---------|-----------------|---------------------| | **Deployment** | Cloud-hosted (no infrastructure) | Self-hosted (Docker + Postgres) | | **Key Storage** | Client-side encrypted vault | Your own configuration | | **Budget Enforcement** | Coming soon | Built-in | | **User Management** | Per-project | Full user/key management | | **Request Routing** | Direct to provider, no proxy | Through your gateway | | **Best For** | Teams wanting zero-ops key management and usage tracking| Organizations needing full control | You can also use both together—store your provider keys in the managed platform and use them in a self-hosted gateway deployment. ## Current Status The any-llm managed platform is in **open beta**. During the beta: - **Free access** to all features - Core encryption and key management are **production-ready** - Dashboard UX and advanced features are being refined - Feedback is welcome at [any-llm.ai](https://any-llm.ai) ## Getting Started Ready to try the managed platform? 1. Create an account at [any-llm.ai](https://any-llm.ai) 2. Add your provider API keys 3. Get your virtual key 4. Make your first request --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/types/provider.md ## Provider Types Data models and types for provider operations. ::: any_llm.types.provider --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/providers.md --- schema: type: "TechArticle" name: "Supported Providers - any-llm" description: "Complete list of LLM providers supported by any-llm including OpenAI, Anthropic, Mistral, and more" datePublished: "2024-03-15" dateModified: "2024-11-18" --- # Supported Providers `any-llm` supports the below providers. In order to discover information about what models are supported by a provider as well as what features the provider supports for each model, refer to the provider documentation. Provider source code can be found in the [`src/any_llm/providers/`](https://github.com/mozilla-ai/any-llm/tree/main/src/any_llm/providers) directory of the repository. !!! note "Legend" - **Key**: Environment variable for the API key (e.g., `OPENAI_API_KEY`). - **Base**: Environment variable for a custom API base URL (e.g., `OPENAI_BASE_URL`). Useful for proxies or self-hosted endpoints. - **Reasoning (Completions)**: Provider can return reasoning traces alongside the assistant message via the completions and/or streaming endpoints. This does not indicate whether the provider offers separate "reasoning models". See [this](https://github.com/mozilla-ai/any-llm/issues/95) discussion for more information. 
- **Streaming (Completions)**: Provider can stream completion results back as an iterator. - **Image (Completions)**: Provider supports passing an `image_data` parameter for vision capabilities, as defined by the OpenAI spec [here](https://platform.openai.com/docs/api-reference/chat/create#chat_create-messages). - **OpenResponses API**: Provider supports the [OpenResponses](https://www.openresponses.org/) specification for agentic AI systems. See the [Responses API docs](api/responses.md) for usage details. - **List Models API**: Provider supports listing available models programmatically via the `list_models()` function. This allows you to discover what models are available from the provider at runtime, which can be useful for dynamic model selection or validation. --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/quickstart.md # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/quickstart.md --- schema: type: "HowTo" name: "How to Install and Use any-llm" description: "Step-by-step guide to installing any-llm and making your first API call with Python" totalTime: "PT5M" tool: - "Python 3.11 or newer" - "pip package manager" supply: - "API key from your chosen LLM provider" steps: - name: "Install any-llm" text: "Install any-llm with your chosen providers using pip. Use the all option to install support for all providers." url: "https://mozilla-ai.github.io/any-llm/quickstart/#installation" - name: "Set up API keys" text: "Configure your provider's API key as an environment variable. Make sure you have the appropriate environment variable set for your chosen provider." url: "https://mozilla-ai.github.io/any-llm/quickstart/#api-keys" - name: "Make your first completion call" text: "Import the completion function from any-llm and create your first API call with your chosen model and provider" url: "https://mozilla-ai.github.io/any-llm/quickstart/#your-first-api-call" --- ## Requirements - Python 3.11 or newer - API keys for your chosen LLM provider ## Installation ```bash pip install any-llm-sdk[all] # Install with all provider support ``` ### Installing Specific Providers If you want to install a specific provider from our [supported providers](./providers.md): ```bash pip install any-llm-sdk[mistral] # For Mistral provider pip install any-llm-sdk[ollama] # For Ollama provider # install multiple providers pip install any-llm-sdk[mistral,ollama] ``` ### Library Integration If you're building a library, install just the base package (`pip install any-llm-sdk`) and let your users install provider dependencies. > **API Keys:** Set your provider's API key as an environment variable (e.g., `export MISTRAL_API_KEY="your-key"`) or pass it directly using the `api_key` parameter. 
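For example, passing the key directly (a minimal sketch; substitute a real key):

```python
from any_llm import completion

# Passing the key explicitly instead of relying on MISTRAL_API_KEY
response = completion(
    model="mistral-small-latest",
    provider="mistral",
    api_key="your-mistral-api-key",  # placeholder value
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```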
## APIs

### Using the AnyLLM Class

For applications making multiple requests with the same provider, use the `AnyLLM` class to avoid repeated provider instantiation:

```python
import os

from any_llm import AnyLLM

# Make sure you have the appropriate API key set
api_key = os.environ.get('MISTRAL_API_KEY')
if not api_key:
    raise ValueError("Please set MISTRAL_API_KEY environment variable")

llm = AnyLLM.create("mistral")

response = llm.completion(
    model="mistral-small-latest",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

metadata = llm.get_provider_metadata()
print(f"Supports streaming: {metadata.streaming}")
print(f"Supports completion: {metadata.completion}")
```

### Using Direct API Functions

```python
import os

from any_llm import completion

# Make sure you have the appropriate API key set
api_key = os.environ.get('MISTRAL_API_KEY')
if not api_key:
    raise ValueError("Please set MISTRAL_API_KEY environment variable")

# Recommended: separate provider and model parameters
response = completion(
    model="mistral-small-latest",
    provider="mistral",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```

### When to Choose Which Approach

**Use Direct API Functions (`completion`, `acompletion`) when:**

- Making simple, one-off requests
- Prototyping or writing quick scripts
- You want the simplest possible interface

**Use Provider Class (`AnyLLM.create`) when:**

- Building applications that make multiple requests with the same provider
- You want to avoid repeated provider instantiation overhead

**Finding model names:** Check the [providers page](./providers.md) for provider IDs, or use the [`list_models`](./api/list_models.md) API to see available models for your provider.

## Streaming

For the [providers that support streaming](./providers.md), you can enable it by passing `stream=True`:

```python
output = ""
for chunk in completion(
    model="mistral-small-latest",
    provider="mistral",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True
):
    chunk_content = chunk.choices[0].delta.content or ""
    print(chunk_content)
    output += chunk_content
```

## Embeddings

[`embedding`][any_llm.embedding] and [`aembedding`][any_llm.aembedding] allow you to create vector embeddings from text using the same unified interface across providers. Not all providers support embeddings; check the [providers documentation](./providers.md) to see which ones do.

```python
from any_llm import embedding

result = embedding(
    model="text-embedding-3-small",
    provider="openai",
    inputs="Hello, world!"  # can be either a string or a list of strings
)

# Access the embedding vector
embedding_vector = result.data[0].embedding
print(f"Embedding vector length: {len(embedding_vector)}")
print(f"Tokens used: {result.usage.total_tokens}")
```

## Tools

`any-llm` supports tool calling for providers that support it. You can pass a list of tools where each tool is either:

1. **Python callable** - Functions with proper docstrings and type annotations
2. **OpenAI Format tool dict** - Already in OpenAI tool format

```python
from any_llm import completion

def get_weather(location: str, unit: str = "F") -> str:
    """Get weather information for a location.

    Args:
        location: The city or location to get weather for
        unit: Temperature unit, either 'C' or 'F'

    Returns:
        Current weather description
    """
    return f"Weather in {location} is sunny and 75{unit}!"
response = completion( model="mistral-small-latest", provider="mistral", messages=[{"role": "user", "content": "What's the weather in Pittsburgh PA?"}], tools=[get_weather] ) ``` any-llm automatically converts your Python functions to OpenAI tools format. Functions must have: - A docstring describing what the function does - Type annotations for all parameters - A return type annotation ## Exception Handling The `any-llm` package provides a unified exception hierarchy that works consistently across all LLM providers. ### Enabling Unified Exceptions !!! info "Opt-in Feature" Unified exception handling is currently **opt-in**. Set the `ANY_LLM_UNIFIED_EXCEPTIONS` environment variable to enable it: ```bash export ANY_LLM_UNIFIED_EXCEPTIONS=1 ``` When enabled, provider-specific exceptions are automatically converted to `any-llm` exception types. When disabled (default), the original provider exceptions are raised with a deprecation warning. ### Basic Usage ```python from any_llm import completion from any_llm.exceptions import ( RateLimitError, AuthenticationError, ProviderError, AnyLLMError, ) try: response = completion( model="gpt-4", provider="openai", messages=[{"role": "user", "content": "Hello!"}] ) except RateLimitError as e: print(f"Rate limited: {e.message}") except AuthenticationError as e: print(f"Auth failed: {e.message}") except ProviderError as e: print(f"Provider error: {e.message}") except AnyLLMError as e: print(f"Error: {e.message}") ``` ### Accessing Original Exceptions All unified exceptions preserve the original provider exception for debugging: ```python from any_llm.exceptions import RateLimitError messages = [{"role": "user", "content": "Hello!"}] try: response = completion(model="gpt-4", provider="openai", messages=messages) except RateLimitError as e: print(f"Provider: {e.provider_name}") print(f"Original exception: {type(e.original_exception)}") ``` --- # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/types/responses.md # Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/api/responses.md ## OpenResponses API The Responses API in any-llm implements the [OpenResponses](https://www.openresponses.org/) specification—an open-source standard for building multi-provider, interoperable LLM interfaces for agentic AI systems. !!! info "Learn More" - [OpenResponses Specification](https://www.openresponses.org/specification) - [OpenResponses Reference](https://www.openresponses.org/reference) - [HuggingFace Responses API Guide](https://huggingface.co/docs/inference-providers/guides/responses-api) ### Return Types The `responses()` and `aresponses()` functions return different types depending on the provider's level of OpenResponses compliance: | Return Type | When Returned | |-------------|---------------| | `openresponses_types.ResponseResource` | Providers fully compliant with the OpenResponses specification | | `openai.types.responses.Response` | Providers using OpenAI's native Responses API (not yet fully OpenResponses-compliant) | | `Iterator[dict]` / `AsyncIterator[dict]` | When `stream=True` is set | Both `ResponseResource` and `Response` share a similar structure, so in many cases you can access common fields like `output` without type checking. 
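In practice that looks like this (a minimal sketch: the `input` parameter name follows OpenResponses/OpenAI Responses conventions and is an assumption here, so check the API reference below for the exact signature):

```python
from any_llm import responses

result = responses(
    model="gpt-4o-mini",
    provider="openai",
    input="Summarize the OpenResponses spec in one sentence.",  # parameter name assumed
)

# Both ResponseResource and the OpenAI Response type expose `output`,
# so this loop works without checking which type was returned.
for item in result.output:
    print(item)
```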
::: any_llm.api.responses
::: any_llm.api.aresponses

---

# Source: https://raw.githubusercontent.com/mozilla-ai/any-llm/refs/heads/main/docs/gateway/troubleshooting.md

# Troubleshooting

## Database connection errors

Make sure the database URL is correct and the database is accessible. Note that `create_engine` alone does not open a connection, so the check below calls `engine.connect()`:

```bash
python -c "from sqlalchemy import create_engine; create_engine('postgresql://user:pass@host/db').connect(); print('OK')"
```

## Common Issues

### Authentication Errors

- Ensure you're using the correct master key format: `Bearer your-secure-master-key`
- Check that the `X-AnyLLM-Key` header is properly set
- Verify that virtual API keys are active and not expired

### Configuration Issues

- Verify your `config.yml` file is properly formatted
- Check that environment variables are set correctly
- Ensure provider API keys are valid and have proper permissions

### Budget Enforcement

- Check that budgets are properly assigned to users
- Verify budget limits are set correctly
- Monitor user spending to ensure limits are being enforced

## Getting Help

- Check the logs for detailed error messages
- Verify your configuration matches the examples in the documentation
- Ensure all required environment variables are set