Text Generation WebUI

Text Generation WebUI is a app in the osFoundry community catalog. oobabooga's text-generation-webui — the swiss army knife for running and experimenting with local language models. Supports llama.cpp (GGUF), transformers, ExLlamaV2, AWQ + GPTQ quantizations, plus a built-in OpenAI-compatible API server. The most-extensible local LLM UI: parameter presets, character cards, persona-driven chat, notebook + chat + instruct modes, training tab for LoRA fine-tuning. CPU mode bundled (no GPU on this host).

Details

Workspace: osfoundry
Category: AI
Pricing: Free
Access: Community

Features

Load + chat with GGUF (llama.cpp) / HuggingFace Transformers / ExLlamaV2 / AWQ / GPTQ models
Three modes: Chat (assistant) / Instruct (single-turn) / Notebook (free-form completion)
OpenAI-compatible API server bundled — drop-in for any OpenAI SDK client
Parameter presets + samplers (mirostat, dynamic temperature, DRY, smoothing factor, ...) — the deepest sampler knobs in the OSS LLM world
Character cards + persona system shared with SillyTavern format
LoRA training tab for fine-tuning (CPU mode is feasible but slow)

Documentation

# Text Generation WebUI

## Drop in a model

The container ships with no models. Get a GGUF or HuggingFace model into `/data/models/`:

```
curl -L -o /data/models/llama-3.1-8b-instruct.gguf \
  https://huggingface.co/.../resolve/main/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf
```

Or in the web UI → **Model** tab → **Download model or LoRA** → paste the HF repo path (e.g. `unsloth/Llama-3.2-1B-Instruct-GGUF`).

## Load + chat

1. **Model** tab → pick a downloaded model from the dropdown → **Load**
2. **Chat** tab → start chatting

## Three modes

- **Chat** — turn-based assistant
- **Instruct** — single-turn instruction following
- **Notebook** — free-form completion / story writing

## OpenAI-compatible API

With `--api` flag (default), an OpenAI-compatible endpoint is at port 7860/v1. Use as drop-in OpenAI for any client:

```python
from openai import OpenAI
client = OpenAI(base_url='https://<your-public-url>/v1', api_key='none')
```

## Character cards

Drop SillyTavern-format PNG cards into `/data/characters/`. They show up under Chat → Character.

## CPU mode caveat

This container is CPU-only. 7B Q4 models run at 2-6 tokens/sec on 2 vCPU. Use small quantized models for usable speed; 30B+ models will be too slow for interactive use.

How to use Text Generation WebUI in osFoundry

Install Text Generation WebUI into your workspace in one click, then fork it in osStudio to customise the prompts, tools, or configuration for your stack. Anyone in your workspace can pick up where you left off.

Other apps from the community

CRM — Customer relationship management with contacts, deals, and pipeline tracking.
Kanban Board — Drag-and-drop task board with swimlanes, labels, and team assignments.
Helpdesk — Ticket triage and customer support inbox with SLA tracking.
Page Builder — Block-based page editor with publishing to public URLs.
Website Builder — Multi-page site builder with CMS, templates, and custom domains.
Storefront — E-commerce storefront with product catalog, cart, and checkout.