Virtual Keys (API Key Management)

Overview

Virtual keys are API keys that you issue through the proxy, each with its own:

Custom budgets and rate limits
Model access restrictions
Expiration dates
Team associations
Metadata and tags

Generate a Key

Create a virtual key using the master key:

curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "models": ["gpt-3.5-turbo", "gpt-4"],
    "max_budget": 10.0,
    "duration": "30d"
  }'

Response:

{
  "key": "sk-1234567890abcdef",
  "key_name": null,
  "expires": "2024-04-15T10:30:00Z",
  "models": ["gpt-3.5-turbo", "gpt-4"],
  "max_budget": 10.0,
  "budget_duration": "30d",
  "budget_reset_at": "2024-04-15T10:30:00Z"
}

Key Generation Parameters

Basic Parameters

{
  "key_name": "production-api-key",      // Optional friendly name
  "duration": "30d",                     // Key expiration (e.g., 30d, 24h, null for no expiry)
  "models": ["gpt-3.5-turbo", "gpt-4"], // Allowed models
  "metadata": {                          // Custom metadata
    "environment": "production",
    "team": "backend"
  }
}
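
The duration strings above (30d, 24h) can be previewed client-side before a key is created. The following is a minimal sketch, not the proxy's own parser: parse_duration and preview_expiry are hypothetical helpers, and the sketch assumes only the h and d suffixes shown in this section, with None meaning no expiry.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: only the suffixes that appear in this section are handled.
_UNITS = {"h": "hours", "d": "days"}


def parse_duration(value: Optional[str]) -> Optional[timedelta]:
    """Convert a string such as "30d" or "24h" to a timedelta (None = no expiry)."""
    if value is None:
        return None
    amount, unit = value[:-1], value[-1:]
    if unit not in _UNITS or not amount.isdigit():
        raise ValueError(f"unsupported duration: {value!r}")
    return timedelta(**{_UNITS[unit]: int(amount)})


def preview_expiry(created_at: datetime, duration: Optional[str]) -> Optional[datetime]:
    """Estimate when a key created at `created_at` would expire."""
    delta = parse_duration(duration)
    return None if delta is None else created_at + delta


created = datetime(2024, 3, 16, 10, 30, tzinfo=timezone.utc)
print(preview_expiry(created, "30d"))
```

Rejecting unknown suffixes with an error, rather than guessing, keeps the preview from silently disagreeing with whatever the proxy actually accepts.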

Budget Parameters

{
  "max_budget": 100.0,           // Maximum spend in USD
  "budget_duration": "30d",      // Budget reset period
  "soft_budget": 80.0            // Alert threshold (80% of max_budget)
}

Rate Limiting

{
  "rpm": 100,        // Requests per minute
  "tpm": 100000,     // Tokens per minute
  "max_parallel_requests": 10
}

Team Association

{
  "team_id": "team-abc-123",
  "user_id": "user-xyz-456"
}

Complete Example

curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "key_name": "production-backend",
    "duration": "90d",
    "models": ["gpt-3.5-turbo", "gpt-4"],
    "max_budget": 100.0,
    "budget_duration": "30d",
    "soft_budget": 80.0,
    "rpm": 100,
    "tpm": 100000,
    "metadata": {
      "environment": "production",
      "team": "backend"
    }
  }'
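
In Python, the same request body can be assembled from a small dataclass so that unset fields are left out of the JSON. This is a sketch under stated assumptions: KeyRequest is a hypothetical name, its fields are limited to those used in the example above, and the soft_budget check is a local sanity check rather than documented proxy behavior.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class KeyRequest:
    """Hypothetical container for the /key/generate fields used above."""
    models: List[str]
    key_name: Optional[str] = None
    duration: Optional[str] = None
    max_budget: Optional[float] = None
    budget_duration: Optional[str] = None
    soft_budget: Optional[float] = None
    rpm: Optional[int] = None
    tpm: Optional[int] = None
    metadata: Dict[str, Any] = field(default_factory=dict)

    def to_payload(self) -> Dict[str, Any]:
        """Return a JSON-ready dict, leaving out unset or empty fields."""
        if (self.soft_budget is not None and self.max_budget is not None
                and self.soft_budget > self.max_budget):
            # Assumption: an alert threshold above the hard limit is a mistake.
            raise ValueError("soft_budget should not exceed max_budget")
        return {k: v for k, v in asdict(self).items() if v not in (None, {})}


request = KeyRequest(
    models=["gpt-3.5-turbo", "gpt-4"],
    key_name="production-backend",
    duration="90d",
    max_budget=100.0,
    soft_budget=80.0,
)
print(json.dumps(request.to_payload(), indent=2))
```

The resulting dict can be passed as the JSON body of the POST request shown above.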

Get Key Information

Retrieve information about a key by authenticating with that key:

curl -X GET 'http://localhost:4000/key/info' \
  -H 'Authorization: Bearer sk-1234567890abcdef'

Response:

{
  "key": "sk-1234...def",
  "key_name": "production-backend",
  "team_id": null,
  "max_budget": 100.0,
  "spend": 45.23,
  "budget_reset_at": "2024-04-15T10:30:00Z",
  "models": ["gpt-3.5-turbo", "gpt-4"],
  "rpm": 100,
  "tpm": 100000,
  "expires": "2024-07-15T10:30:00Z",
  "metadata": {
    "environment": "production",
    "team": "backend"
  }
}
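
The response above shows the key in shortened form. Client code that logs keys can apply a similar shortening before writing them out. A minimal sketch: mask_key is a hypothetical helper, and the prefix and suffix lengths are chosen only to reproduce the sample value, not taken from a documented rule.

```python
def mask_key(key: str, prefix: int = 7, suffix: int = 3) -> str:
    """Shorten a key for logs, keeping only its first and last characters."""
    if len(key) <= prefix + suffix:
        # Too short to mask meaningfully; hide it entirely.
        return "*" * len(key)
    return f"{key[:prefix]}...{key[-suffix:]}"


print(mask_key("sk-1234567890abcdef"))  # sk-1234...def
```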

Update a Key

Modify key properties:

curl -X POST 'http://localhost:4000/key/update' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "key": "sk-1234567890abcdef",
    "max_budget": 200.0,
    "models": ["gpt-3.5-turbo", "gpt-4", "claude-3-opus"],
    "rpm": 200
  }'

Updates must be authorized with the master key; a virtual key cannot update itself.

Delete a Key

Revoke a virtual key:

curl -X POST 'http://localhost:4000/key/delete' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "keys": ["sk-1234567890abcdef"]
  }'

List All Keys

Get all virtual keys:

curl -X GET 'http://localhost:4000/key/list' \
  -H 'Authorization: Bearer sk-1234'

Key Auto-Rotation

Configure automatic key rotation:

curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "key_alias": "production-key",
    "auto_rotate": true,
    "rotation_interval": "90d",
    "models": ["gpt-3.5-turbo"],
    "max_budget": 100.0
  }'

The key will automatically rotate every 90 days. The key_alias remains constant while the underlying key changes.

Budget Tracking

Check Spend

Monitor key spending:

curl -X GET 'http://localhost:4000/key/info' \
  -H 'Authorization: Bearer sk-1234567890abcdef'

The response includes:

spend: Amount spent so far
max_budget: Maximum spend allowed for the key
budget_reset_at: Time at which the budget next resets

Budget Alerts

Set a soft budget to trigger alerts before the hard limit is reached:

{
  "max_budget": 100.0,
  "soft_budget": 80.0  // Alert at 80% usage
}
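
The relationship between spend, soft_budget, and max_budget can be sketched as a small classifier. This is an illustration only: budget_state and its return labels are hypothetical, and treating a spend equal to a threshold as having reached it is an assumption, not documented proxy behavior.

```python
from typing import Optional


def budget_state(spend: float, max_budget: Optional[float],
                 soft_budget: Optional[float] = None) -> str:
    """Classify spend as "ok", "soft_limit" or "exhausted"."""
    if max_budget is not None and spend >= max_budget:
        return "exhausted"
    if soft_budget is not None and spend >= soft_budget:
        return "soft_limit"
    return "ok"


# Spend of 45.23 (from the Get Key Information sample) against max 100, soft 80.
print(budget_state(45.23, 100.0, 80.0))
print(budget_state(85.00, 100.0, 80.0))
```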

Configure a Slack webhook for alerts in your config.yaml:

litellm_settings:
  alerting:
    - slack
  alerting_threshold: 0.8  # Alert at 80% budget
  slack_webhook_url: os.environ/SLACK_WEBHOOK_URL

Model Access Control

Restrict to Specific Models

{
  "models": ["gpt-3.5-turbo", "gpt-4"]
}

Requests to other models will be rejected:

{
  "error": {
    "message": "API key does not have access to model: claude-3-opus",
    "type": "invalid_request_error"
  }
}

Allow All Models

Omit the models parameter or use null:

{
  "models": null  // Access to all configured models
}
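
A client can apply the same rule locally to fail fast before sending a request that the proxy would reject. A minimal sketch, assuming exact string matching on model names; is_model_allowed is a hypothetical helper and the proxy remains the authority on access.

```python
from typing import List, Optional


def is_model_allowed(allowed: Optional[List[str]], model: str) -> bool:
    """Mirror the rule above: None means every configured model is allowed."""
    if allowed is None:
        return True
    # How the proxy treats an empty list is not covered in this section;
    # this sketch simply finds no match in it.
    return model in allowed


key_models = ["gpt-3.5-turbo", "gpt-4"]
print(is_model_allowed(key_models, "claude-3-opus"))
print(is_model_allowed(None, "claude-3-opus"))
```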

Rate Limiting

Per-Key Rate Limits

{
  "rpm": 100,        // 100 requests per minute
  "tpm": 100000,     // 100k tokens per minute
  "max_parallel_requests": 10  // Max concurrent requests
}

When a rate limit is exceeded, the request is rejected with an error:

{
  "error": {
    "message": "Rate limit exceeded. Retry after 60 seconds.",
    "type": "rate_limit_error"
  }
}
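
Callers usually handle this error by waiting and retrying. The sketch below shows one common approach, exponential backoff. It is not part of the proxy: RateLimited is a hypothetical exception that your own request code would raise on seeing a rate_limit_error, and the delay schedule is an arbitrary choice.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical: raised by the caller's code on a rate_limit_error response."""


def call_with_backoff(fn: Callable[[], T], max_attempts: int = 4,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on RateLimited with delays of 1s, 2s, 4s, ..."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

The sleep function is injectable so the retry schedule can be exercised in tests without actually waiting.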

Team Keys

Generate keys associated with teams:

Create a Team

curl -X POST 'http://localhost:4000/team/new' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "team_alias": "engineering",
    "max_budget": 1000.0,
    "budget_duration": "30d"
  }'

Generate Team Key

curl -X POST 'http://localhost:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "team_id": "team-abc-123",
    "models": ["gpt-3.5-turbo"],
    "max_budget": 100.0
  }'

Team keys are subject to the team's budget and settings. A key's own max_budget is tracked separately from the team budget.

Key Metadata

Attach custom metadata to keys:

{
  "metadata": {
    "environment": "production",
    "service": "backend-api",
    "owner": "john@example.com",
    "cost_center": "engineering"
  }
}

Use metadata for:

Cost allocation
Usage tracking
Access auditing
Organizational reporting
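
Cost allocation, for example, amounts to grouping spend by a metadata field. The sketch below assumes you have already collected a list of records shaped like the /key/info response (with spend and metadata fields); spend_by_metadata is a hypothetical helper and the sample values are made up.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def spend_by_metadata(keys: Iterable[Dict[str, Any]], field: str) -> Dict[str, float]:
    """Sum the spend of each key under the value of one metadata field."""
    totals: Dict[str, float] = defaultdict(float)
    for key in keys:
        # Keys without the field are grouped under "untagged".
        label = (key.get("metadata") or {}).get(field, "untagged")
        totals[label] += key.get("spend") or 0.0
    return dict(totals)


# Illustrative records only; the values are invented for this example.
keys = [
    {"spend": 45.23, "metadata": {"cost_center": "engineering"}},
    {"spend": 10.00, "metadata": {"cost_center": "engineering"}},
    {"spend": 3.50, "metadata": {}},
]
print(spend_by_metadata(keys, "cost_center"))
```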

Security Best Practices

1. Master Key Protection

Never expose the master key in client applications. Use virtual keys instead.

# Store master key securely
export LITELLM_MASTER_KEY=$(cat /secure/path/master_key.txt)

2. Key Rotation

Rotate keys regularly:

# Generate new key
curl -X POST 'http://localhost:4000/key/generate' ...

# Update applications
# Delete old key
curl -X POST 'http://localhost:4000/key/delete' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{"keys": ["old-key"]}'
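
The three steps above can be sketched as a single function. This is an illustration under assumptions: rotate_key and KeyClient are hypothetical names, the client is anything exposing create_key and delete_key methods (such as the LiteLLMKeyManager class in Programmatic Key Management), and deploy stands in for however your applications receive the new key.

```python
from typing import Any, Callable, Dict, Protocol


class KeyClient(Protocol):
    """Anything with create_key/delete_key methods, e.g. a thin HTTP wrapper."""
    def create_key(self, **params: Any) -> Dict[str, Any]: ...
    def delete_key(self, key: str) -> Any: ...


def rotate_key(client: KeyClient, old_key: str,
               deploy: Callable[[str], None], **params: Any) -> str:
    """Create a replacement key, hand it to `deploy`, then revoke the old key."""
    new_key = client.create_key(**params)["key"]
    # If deploy raises, the old key is left in place so callers keep working.
    deploy(new_key)
    client.delete_key(old_key)
    return new_key
```

Deleting the old key last means a failed rollout never leaves applications without a working key.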

3. Principle of Least Privilege

Grant minimum required access:

{
  "models": ["gpt-3.5-turbo"],  // Only specific model
  "max_budget": 10.0,            // Low budget
  "duration": "7d",              // Short expiration
  "rpm": 10                      // Low rate limit
}

4. Monitor Usage

Regularly audit key usage:

# List all keys
curl -X GET 'http://localhost:4000/key/list' \
  -H 'Authorization: Bearer sk-1234'

# Check spend
curl -X GET 'http://localhost:4000/spend/keys' \
  -H 'Authorization: Bearer sk-1234'
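
Part of a regular audit is spotting keys that are about to expire. The sketch below assumes a list of records shaped like the /key/info response (with key_name and expires fields); the exact shape of the /key/list and /spend/keys responses is not shown in this document, and expiring_within is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Iterable, List


def expiring_within(keys: Iterable[Dict[str, Any]], now: datetime,
                    window: timedelta) -> List[str]:
    """Return the names of keys whose `expires` timestamp falls inside the window."""
    names = []
    for key in keys:
        expires = key.get("expires")
        if not expires:
            continue  # No expiry set.
        # fromisoformat() rejects a trailing "Z" before Python 3.11.
        expires_at = datetime.fromisoformat(expires.replace("Z", "+00:00"))
        if now <= expires_at <= now + window:
            names.append(key.get("key_name") or "<unnamed>")
    return names


now = datetime(2024, 7, 10, tzinfo=timezone.utc)
sample = [{"key_name": "production-backend", "expires": "2024-07-15T10:30:00Z"}]
print(expiring_within(sample, now, timedelta(days=7)))
```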

Programmatic Key Management

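import requests

class LiteLLMKeyManager:
    """Thin wrapper around the proxy's key management endpoints."""

    def __init__(self, base_url, master_key, timeout=10):
        self.base_url = base_url.rstrip('/')
        self.timeout = timeout  # Seconds to wait before giving up on a request
        self.headers = {
            'Authorization': f'Bearer {master_key}',
            'Content-Type': 'application/json'
        }

    def create_key(self, **kwargs):
        """Create a virtual key; kwargs are sent as the /key/generate body."""
        response = requests.post(
            f'{self.base_url}/key/generate',
            headers=self.headers,
            json=kwargs,
            timeout=self.timeout
        )
        response.raise_for_status()  # Raise on 4xx/5xx instead of returning an error body
        return response.json()

    def delete_key(self, key):
        """Revoke a single virtual key."""
        response = requests.post(
            f'{self.base_url}/key/delete',
            headers=self.headers,
            json={'keys': [key]},
            timeout=self.timeout
        )
        response.raise_for_status()
        return response.json()

    def get_key_info(self, key):
        """Fetch key details, authenticating with the virtual key itself."""
        response = requests.get(
            f'{self.base_url}/key/info',
            headers={'Authorization': f'Bearer {key}'},
            timeout=self.timeout
        )
        response.raise_for_status()
        return response.json()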

# Usage
manager = LiteLLMKeyManager(
    base_url='http://localhost:4000',
    master_key='sk-1234'
)

# Create key
key = manager.create_key(
    models=['gpt-3.5-turbo'],
    max_budget=10.0,
    duration='30d'
)
print(f"Created: {key['key']}")

# Get info
info = manager.get_key_info(key['key'])
print(f"Spend: ${info['spend']}")

# Delete key
manager.delete_key(key['key'])

Next Steps

Budget Alerts

Set up spending alerts and notifications

Configuration

Advanced proxy configuration

Quick Start

Get started with the proxy

Docker Deployment

Deploy in production
