LiteLLM Proxy
LiteLLM Proxy is an app in the osFoundry community catalog. It is a developer-focused LLM gateway: a single OpenAI-compatible /chat/completions endpoint that translates requests to 100+ provider APIs (OpenAI, Anthropic, Azure, Bedrock, Vertex, Cohere, Together, Replicate, Ollama, vLLM, ...) with automatic retries, fallbacks, rate-limit handling, key rotation, spend tracking per virtual key, and OpenAI Realtime API support. It is the most-starred LLM proxy on GitHub and takes an SDK-first approach, in contrast to one-api's UI-first approach.
Details
- Workspace: osfoundry
- Category: AI
- Price: Free
- Access: Community
Features
- Single OpenAI-compatible endpoint that calls 100+ providers (Anthropic, Bedrock, Vertex, Azure, Ollama, ...)
- Virtual keys with per-key budgets + rate limits + model restrictions + expiry
- Automatic fallbacks — 'use claude-3-5 if gpt-4o is down or over quota' as one-line config
- Spend tracking per-key, per-model, per-team — export to CSV + Prometheus
- OpenAI Realtime API support — voice/audio mode passthrough
- SQLite default — zero infrastructure for solo + small-team use; Postgres optional
Documentation
Documentation is maintained in English by the upstream project.
# LiteLLM Proxy
## First-boot
Set `LITELLM_MASTER_KEY` + `LITELLM_SALT_KEY` + `UI_PASSWORD` env. Restart — admin UI lives at `/ui`.
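All three variables need values before the first restart. A minimal sketch for generating them, assuming the master key should carry the same `sk-` prefix as the virtual keys and that long random strings are acceptable for the other two. The helper name `make_first_boot_env` is hypothetical.

```python
import secrets


def make_first_boot_env() -> dict[str, str]:
    """Return freshly generated values for the three first-boot variables."""
    return {
        # Assumption: the master key uses the sk- prefix, like virtual keys.
        "LITELLM_MASTER_KEY": "sk-" + secrets.token_urlsafe(32),
        # Assumption: any long random string is a valid salt.
        "LITELLM_SALT_KEY": secrets.token_urlsafe(32),
        "UI_PASSWORD": secrets.token_urlsafe(16),
    }


if __name__ == "__main__":
    # Print in KEY=value form so the lines can be pasted into an env file.
    for name, value in make_first_boot_env().items():
        print(f"{name}={value}")
```

Generate the values once and keep them in a secret store; running the script again produces different values.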
## Add models
Log into `/ui` → **Models** → **+ Add** — each model maps a 'public model name' (what clients see, e.g. `gpt-4o`) to an upstream:
- OpenAI: pick OpenAI, paste key, pick model id
- Anthropic: pick Anthropic, paste key, pick claude-3-5-sonnet-20241022
- Bedrock / Vertex / Azure: paste the provider-specific creds
- Ollama: pick ollama, set api_base to your Ollama URL
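Conceptually, each entry in the Models tab pairs a public name with upstream parameters. The sketch below mirrors the `model_list` layout of LiteLLM's YAML config (`model_name` plus `litellm_params`). The key placeholders, the Ollama model and URL, and the `resolve` helper are illustrative assumptions, not what the UI stores internally.

```python
# Each entry maps a public model name to the upstream it should call.
model_list = [
    {
        "model_name": "gpt-4o",
        "litellm_params": {"model": "openai/gpt-4o", "api_key": "<openai-key>"},
    },
    {
        "model_name": "claude-3-5-sonnet",
        "litellm_params": {
            "model": "anthropic/claude-3-5-sonnet-20241022",
            "api_key": "<anthropic-key>",
        },
    },
    {
        # Assumption: a local Ollama server on its default port.
        "model_name": "local-llama",
        "litellm_params": {
            "model": "ollama/llama3",
            "api_base": "http://localhost:11434",
        },
    },
]


def resolve(public_name: str) -> dict:
    """Return the upstream parameters for a public model name."""
    for entry in model_list:
        if entry["model_name"] == public_name:
            return entry["litellm_params"]
    raise KeyError(f"unknown model: {public_name}")
```

Clients only ever see the `model_name` values, so an upstream can be swapped without changing client code.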
## Generate virtual keys
**Keys** tab → **+ Create** — issue per-team or per-app keys with:
- Spend budget (per day / month / total)
- Model restrictions (only certain models accessible)
- Rate limits (RPM, TPM)
- Expiry date
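The same options can in principle be set without the UI. The sketch below builds (but does not send) a request to LiteLLM's `/key/generate` management endpoint, assuming this deployment exposes it and accepts the master key as a bearer token. The field names follow LiteLLM's key-management API and should be checked against the upstream docs for your version; `build_key_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_key_request(base_url, master_key, *, models, max_budget,
                      budget_duration, rpm_limit, tpm_limit, duration):
    """Build a POST request that asks the proxy for a new virtual key."""
    payload = {
        "models": models,                    # model restrictions
        "max_budget": max_budget,            # spend budget in USD
        "budget_duration": budget_duration,  # budget reset window, e.g. "30d"
        "rpm_limit": rpm_limit,              # requests per minute
        "tpm_limit": tpm_limit,              # tokens per minute
        "duration": duration,                # time until the key expires
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/key/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the JSON response is expected to contain the new `sk-...` key.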
Give the `sk-...` key to your downstream app:
```python
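from openai import OpenAI
# Point the standard OpenAI client at the proxy and authenticate with the virtual key.
client = OpenAI(base_url='https://<your-public-url>', api_key='sk-...')
messages = [{'role': 'user', 'content': 'Hello'}]
response = client.chat.completions.create(model='gpt-4o', messages=messages)
print(response.choices[0].message.content)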
```
LiteLLM looks up the model, calls the upstream, tracks spend, enforces budgets.
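To make that sequence concrete, here is a toy model of the per-request checks. It is a simplified illustration, not LiteLLM's implementation; `VirtualKey`, `BudgetExceeded`, and `handle_request` are hypothetical names, and the upstream call is a stand-in that returns a reply and its cost.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a key has already used up its budget."""


@dataclass
class VirtualKey:
    max_budget: float          # total spend allowed, in USD
    allowed_models: set[str]   # public model names this key may call
    spend: float = 0.0         # running total, updated after each call


def handle_request(key, model, call_upstream):
    """Check restrictions and budget, call the upstream, then record spend."""
    if model not in key.allowed_models:
        raise PermissionError(f"key may not call {model}")
    if key.spend >= key.max_budget:
        raise BudgetExceeded(f"spent {key.spend} of {key.max_budget}")
    reply, cost = call_upstream(model)  # stand-in for the real provider call
    key.spend += cost
    return reply
```

In this sketch the budget check happens before the call, so a key can overshoot its budget by at most the cost of one request.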
## Fallbacks + retries
In Model config: set `fallbacks: [{ model: 'gpt-4o', fallbacks: ['claude-3-5-sonnet', 'gemini-1.5-pro'] }]` — if gpt-4o is down or over quota, requests auto-route to Claude, then Gemini.
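The routing behaviour amounts to trying each model in order until one succeeds. A simplified sketch of that loop follows; it is not LiteLLM's actual router. The `FALLBACKS` mapping is a flattened stand-in for the config, and `UpstreamError` and the `call` argument are hypothetical.

```python
class UpstreamError(Exception):
    """Raised by the caller-supplied function when a model is down or over quota."""


# Primary model mapped to its ordered list of fallbacks.
FALLBACKS = {"gpt-4o": ["claude-3-5-sonnet", "gemini-1.5-pro"]}


def complete_with_fallbacks(model, fallbacks, call):
    """Try the primary model, then each fallback in order.

    Returns (model_used, result). Raises UpstreamError if every model fails.
    """
    chain = [model] + fallbacks.get(model, [])
    last_error = None
    for candidate in chain:
        try:
            return candidate, call(candidate)
        except UpstreamError as exc:
            last_error = exc  # remember the failure and move to the next model
    raise UpstreamError(f"all models failed: {chain}") from last_error
```

Returning the model that actually served the request makes it possible to record spend against the right upstream.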
## Spend tracking
Usage tab shows per-key, per-model, per-team spend in $. Export to CSV / Prometheus. Tags on each request let you slice by user/app/feature.
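Once exported, the CSV can be sliced further with a few lines of code. The sketch below totals spend by any column. The column names `key_alias`, `model`, and `spend` are assumptions made for illustration; the real export may use different headers.

```python
import csv
import io
from collections import defaultdict


def spend_by(rows_csv: str, column: str) -> dict[str, float]:
    """Sum the spend column, grouped by the given column."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(rows_csv)):
        totals[row[column]] += float(row["spend"])
    return dict(totals)


# Hypothetical export; real column names may differ.
SAMPLE = """key_alias,model,spend
team-a,gpt-4o,0.25
team-a,claude-3-5-sonnet,0.5
team-b,gpt-4o,1.0
"""

if __name__ == "__main__":
    print(spend_by(SAMPLE, "key_alias"))
    print(spend_by(SAMPLE, "model"))
```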
## Storage
SQLite at `/data/litellm.db` for keys + spend log. For multi-instance scale, switch to Postgres via `DATABASE_URL` env.
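A Postgres connection string has to percent-encode any special characters in the credentials. A minimal sketch for composing a `DATABASE_URL` value, assuming the standard `postgresql://user:password@host:port/database` form; the `postgres_url` helper and the example host are hypothetical.

```python
from urllib.parse import quote


def postgres_url(user: str, password: str, host: str, database: str,
                 port: int = 5432) -> str:
    """Compose a Postgres URL, percent-encoding the user name and password."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


if __name__ == "__main__":
    print(postgres_url("litellm", "p@ss/word", "db.internal", "litellm"))
```

Encoding matters because characters such as `@` or `/` in a password would otherwise be read as URL delimiters.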
How to use LiteLLM Proxy in osFoundry
Install LiteLLM Proxy into your workspace in one click, then fork it in osStudio to customise the prompts, tools, or configuration for your stack. Any member of your workspace can pick up right where you left off.
More apps from the community
- CRM: Customer relationship management with contacts, deals, and pipeline tracking.
- Kanban Board: A Trello-style kanban and project board with cards, boards, calendar and table views, and per-board properties. Powered by Focalboard (standalone personal server). Embedded SQLite on a persistent volume.
- Helpdesk: Ticket triage and customer support inbox with SLA tracking.
- Page Builder: Visual drag-and-drop page builder with sections, themes, SEO, and publishing.
- Website Builder: Multi-page website builder with CMS collections, global navigation, footer, themes, and publishing.
- Storefront: E-commerce storefront with product catalog, cart, and checkout.