LiteLLM Proxy

LiteLLM Proxy is an app in the osFoundry community catalog: a developer-focused LLM gateway. It exposes a single OpenAI-compatible /chat/completions endpoint that translates requests to 100+ provider APIs (OpenAI, Anthropic, Azure, Bedrock, Vertex, Cohere, Together, Replicate, Ollama, vLLM, ...), with automatic retries, fallbacks, rate-limit handling, key rotation, spend tracking per virtual key, and OpenAI Realtime API support. It is the most-starred LLM proxy on GitHub and takes an SDK-first approach, in contrast to one-api's UI-first approach.

Details

Workspace: osfoundry
Category: AI
Pricing: Free
Access: Community

Features

Single OpenAI-compatible endpoint that calls 100+ providers (Anthropic, Bedrock, Vertex, Azure, Ollama, ...)
Virtual keys with per-key budgets + rate limits + model restrictions + expiry
Automatic fallbacks — 'use claude-3-5 if gpt-4o is down or over quota' as one-line config
Spend tracking per-key, per-model, per-team — export to CSV + Prometheus
OpenAI Realtime API support — voice/audio mode passthrough
SQLite default — zero infrastructure for solo + small-team use; Postgres optional

Documentation

# LiteLLM Proxy

## First-boot

Set the `LITELLM_MASTER_KEY`, `LITELLM_SALT_KEY`, and `UI_PASSWORD` environment variables, then restart. The admin UI is served at `/ui`.
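
A deployment script can verify the three variables before the restart. The following is a minimal sketch, not part of LiteLLM: the helper name `missing_env` is hypothetical, and it only checks that each variable is set and non-empty.

```python
import os

# The three variables named in the first-boot step.
REQUIRED_VARS = ("LITELLM_MASTER_KEY", "LITELLM_SALT_KEY", "UI_PASSWORD")


def missing_env(env=None):
    """Return the required variable names that are unset or empty.

    `missing_env` is a hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env()
    if missing:
        print("Set these before restarting:", ", ".join(missing))
    else:
        print("All first-boot variables are set.")
```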

## Add models

Log in to `/ui` → **Models** → **+ Add**. Each model maps a 'public model name' (what clients see, e.g. `gpt-4o`) to an upstream:

- OpenAI: pick OpenAI, paste key, pick model id
- Anthropic: pick Anthropic, paste key, pick `claude-3-5-sonnet-20241022`
- Bedrock / Vertex / Azure: paste the provider-specific creds
- Ollama: pick ollama, set `api_base` to your Ollama URL
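
Conceptually, the mapping from public model name to upstream is a lookup table. The sketch below illustrates that idea only and does not reflect LiteLLM's internal data model: the `Upstream` class, the `resolve` helper, the `local-llama` and `llama3` names, and the localhost URL are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Upstream:
    """Hypothetical record describing where a public model name is routed."""
    provider: str                   # e.g. "openai", "anthropic", "ollama"
    model_id: str                   # the provider's own model identifier
    api_base: Optional[str] = None  # only needed for self-hosted upstreams


# Public model name (what clients send) -> upstream (what the proxy calls).
MODELS = {
    "gpt-4o": Upstream("openai", "gpt-4o"),
    "claude-3-5-sonnet": Upstream("anthropic", "claude-3-5-sonnet-20241022"),
    # Assumed local Ollama instance on its default port.
    "local-llama": Upstream("ollama", "llama3", api_base="http://localhost:11434"),
}


def resolve(public_name):
    """Return the upstream for a public model name, or raise ValueError."""
    try:
        return MODELS[public_name]
    except KeyError:
        raise ValueError(f"unknown model: {public_name!r}") from None


print(resolve("claude-3-5-sonnet"))
```

Keeping the public name separate from the upstream id means clients can keep requesting `claude-3-5-sonnet` while the upstream model id behind it changes.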

## Generate virtual keys

**Keys** tab → **+ Create** — issue per-team or per-app keys with:

- Spend budget (per day / month / total)
- Model restrictions (only certain models accessible)
- Rate limits (RPM, TPM)
- Expiry date
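
The four settings above can also be viewed as one request body. The sketch below only builds such a body and sends nothing. The field names are modeled on upstream LiteLLM's key-generation API; whether this deployment accepts them unchanged is an assumption, and `build_key_request` is a hypothetical helper.

```python
def build_key_request(models=None, max_budget=None, budget_duration=None,
                      rpm_limit=None, tpm_limit=None, duration=None):
    """Collect virtual-key settings into one dict, omitting unset fields.

    Field names are assumed to match upstream LiteLLM's key-generation API.
    """
    fields = {
        "models": models,                    # model restrictions
        "max_budget": max_budget,            # spend budget in dollars
        "budget_duration": budget_duration,  # budget window, e.g. "30d"
        "rpm_limit": rpm_limit,              # requests per minute
        "tpm_limit": tpm_limit,              # tokens per minute
        "duration": duration,                # time until the key expires
    }
    return {name: value for name, value in fields.items() if value is not None}


# Example values are invented for illustration.
request_body = build_key_request(
    models=["gpt-4o"], max_budget=50.0, budget_duration="30d",
    rpm_limit=60, duration="90d",
)
print(request_body)
```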

Give the `sk-...` key to your downstream app:

```python
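from openai import OpenAI

# base_url is the proxy's URL; api_key is a virtual key issued in the Keys tab
client = OpenAI(base_url='https://<your-public-url>', api_key='sk-...')
reply = client.chat.completions.create(model='gpt-4o', messages=[{'role': 'user', 'content': 'Hello'}])
print(reply.choices[0].message.content)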
```

For each request, LiteLLM looks up the model, calls the upstream provider, records the spend, and enforces the key's budget.

## Fallbacks + retries

In the Model config, set `fallbacks: [{ model: 'gpt-4o', fallbacks: ['claude-3-5-sonnet', 'gemini-1.5-pro'] }]`. If `gpt-4o` is down or over quota, requests are automatically routed to Claude first, then to Gemini.
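
The routing order can be sketched as a loop that tries the primary model and then each fallback in turn. This is a simplified illustration of the behaviour described above, not LiteLLM's implementation: `call_with_fallbacks`, `send`, and `demo_send` are hypothetical names, and a real proxy would catch only provider errors rather than every exception.

```python
def call_with_fallbacks(model, fallbacks, send):
    """Try `model`, then each configured fallback in order.

    `fallbacks` maps a model name to its ordered list of fallback models.
    `send` is any callable that takes a model name and returns a response,
    raising an exception on failure. Returns (model_used, response).
    """
    failed = []
    for candidate in [model, *fallbacks.get(model, [])]:
        try:
            return candidate, send(candidate)
        except Exception:  # simplification: a real proxy filters error types
            failed.append(candidate)
    raise RuntimeError(f"all models failed: {failed}")


FALLBACKS = {"gpt-4o": ["claude-3-5-sonnet", "gemini-1.5-pro"]}


def demo_send(model):
    """Stand-in for an upstream call; pretends gpt-4o is unavailable."""
    if model == "gpt-4o":
        raise ConnectionError("upstream unavailable")
    return f"ok from {model}"


print(call_with_fallbacks("gpt-4o", FALLBACKS, demo_send))
```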

## Spend tracking

The Usage tab shows per-key, per-model, and per-team spend in dollars. Export to CSV or Prometheus. Tags on each request let you slice spend by user, app, or feature.
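
Slicing spend by key, model, or tag amounts to grouping log rows and summing their cost. The sketch below shows that aggregation on made-up sample records; the record layout (`key`, `model`, `tags`, `cost_usd`) is an assumption for illustration and is not the proxy's actual export schema.

```python
from collections import defaultdict

# Invented sample rows; the field names are assumptions, not a real schema.
RECORDS = [
    {"key": "team-a", "model": "gpt-4o", "tags": ["feature:search"], "cost_usd": 0.5},
    {"key": "team-a", "model": "claude-3-5-sonnet", "tags": ["feature:chat"], "cost_usd": 0.25},
    {"key": "team-b", "model": "gpt-4o", "tags": ["feature:search"], "cost_usd": 0.25},
]


def spend_by(records, field):
    """Sum cost per distinct value of a scalar field such as 'key' or 'model'."""
    totals = defaultdict(float)
    for record in records:
        totals[record[field]] += record["cost_usd"]
    return dict(totals)


def spend_by_tag(records):
    """Sum cost per tag; a request with several tags counts toward each."""
    totals = defaultdict(float)
    for record in records:
        for tag in record["tags"]:
            totals[tag] += record["cost_usd"]
    return dict(totals)


print(spend_by(RECORDS, "key"))
print(spend_by(RECORDS, "model"))
print(spend_by_tag(RECORDS))
```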

## Storage

SQLite at `/data/litellm.db` stores keys and the spend log. To run multiple instances, switch to Postgres by setting the `DATABASE_URL` environment variable.
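
The choice between the two backends can be pictured as a check on `DATABASE_URL`. The sketch below is an assumption about how such a selection might look, not the app's actual startup logic: `storage_backend` is a hypothetical helper, and the example connection string is invented.

```python
from urllib.parse import urlparse

SQLITE_PATH = "/data/litellm.db"  # default path stated in the Storage section


def storage_backend(env):
    """Return ("sqlite", path) by default, or ("postgres", url) if configured.

    Hypothetical helper: assumes an unset or empty DATABASE_URL means SQLite.
    """
    url = env.get("DATABASE_URL", "")
    if not url:
        return ("sqlite", SQLITE_PATH)
    scheme = urlparse(url).scheme
    if scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unsupported DATABASE_URL scheme: {scheme!r}")
    return ("postgres", url)


print(storage_backend({}))
print(storage_backend({"DATABASE_URL": "postgresql://user:pass@db:5432/litellm"}))
```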

How to use LiteLLM Proxy in osFoundry

Install LiteLLM Proxy into your workspace in one click, then fork it in osStudio to customise the prompts, tools, or configuration for your stack. Anyone in your workspace can pick up where you left off.
