Text-to-Speech

Text-to-Speech is an app in the osFoundry community catalog. It is a self-hosted text-to-speech server offering high-quality multi-language TTS models, including XTTS-v2 (voice cloning from a 6-second sample), Tacotron 2, FastSpeech 2, and VITS. It provides a REST API and a browser playground, and is powered by Coqui TTS. It runs on CPU (slower) or GPU (real-time).

Details

Workspace: osfoundry
Category: COMMUNICATION
Pricing: Free
Access: Community

Features

Tacotron 2
FastSpeech 2

Documentation

# Text-to-Speech

Self-hosted TTS server, powered by Coqui TTS.

## ⚠️ GPU recommended for real-time
Coqui's older models (Tacotron 2, FastSpeech 2, VITS) run at usable speeds on CPU. XTTS-v2 (the popular voice-cloning model) needs a GPU for real-time inference. XTTS-v2 still works on CPU, but synthesis is ~10× slower than real time.
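
To put the CPU figure in concrete terms, here is a rough back-of-the-envelope sketch. It takes the ~10× slowdown quoted above at face value; the function name is hypothetical, and actual speed will vary with hardware and text length.

```python
def estimated_synthesis_seconds(audio_seconds, slowdown=10.0):
    """Estimate wall-clock synthesis time for a clip of the given duration.

    `slowdown` is the ratio of synthesis time to audio duration; 10.0 mirrors
    the approximate CPU figure for XTTS-v2 and is only a rough guide.
    """
    return audio_seconds * slowdown


# A 30-second clip at ~10x slower than real time takes about 5 minutes on CPU.
print(estimated_synthesis_seconds(30) / 60, "minutes")
```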

## Features
- 1,100+ pre-trained models across 16+ languages
- XTTS-v2: clone a voice from a 6-second sample
- Voice conversion (transform a voice into another)
- Streaming output (sentence-by-sentence)
- REST API: `/api/tts?text=hello&speaker_idx=0`
- Browser playground at `/`
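
As an illustration of the REST endpoint listed above, the following is a minimal client sketch using only the Python standard library. The base URL (`http://localhost:5002`) and the assumption that the response body is WAV audio are not stated in this document, so adjust them to your deployment; `build_tts_url` and `synthesize` are hypothetical helper names.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Assumption: the server is reachable here; change to match your deployment.
BASE_URL = "http://localhost:5002"


def build_tts_url(text, speaker_idx=0, base_url=BASE_URL):
    """Build the /api/tts URL with properly escaped query parameters."""
    query = urlencode({"text": text, "speaker_idx": speaker_idx})
    return urljoin(base_url, "/api/tts") + "?" + query


def synthesize(text, out_path, speaker_idx=0, base_url=BASE_URL):
    """Request speech for `text` and write the raw response bytes to a file.

    Assumption: the response body is audio data (typically WAV).
    Returns the number of bytes written.
    """
    with urlopen(build_tts_url(text, speaker_idx, base_url)) as resp:
        data = resp.read()
    with open(out_path, "wb") as f:
        f.write(data)
    return len(data)
```

With a running server, `synthesize("hello", "hello.wav")` would save the returned audio next to the script. Encoding the query with `urlencode` keeps spaces and punctuation in the text from breaking the URL.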

## Packaging
Thin wrapper around the official `ghcr.io/coqui-ai/tts-cpu` image (no torch-CUDA, so the image is smaller). Downloaded models are cached at `/root/.local/share/tts` on a 30 GB volume. XTTS-v2 takes ~2 GB, and downloading the full model zoo can fill the volume.
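
Because the model cache can fill the volume, it may help to check how much space it uses. The sketch below sums file sizes under the cache path given above. It assumes the 30 GB volume is measured in decimal gigabytes and that the script runs inside the container; the helper names are hypothetical.

```python
import os

CACHE_DIR = "/root/.local/share/tts"  # cache path from the Packaging section
VOLUME_BYTES = 30 * 1000**3  # assumption: 30 GB means decimal gigabytes


def dir_size_bytes(path):
    """Return the total size of regular files under `path` (0 if missing)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):  # skip symlinks to avoid double counting
                total += os.path.getsize(file_path)
    return total


def usage_fraction(used_bytes, volume_bytes=VOLUME_BYTES):
    """Return the share of the volume that `used_bytes` occupies."""
    return used_bytes / volume_bytes


if __name__ == "__main__":
    used = dir_size_bytes(CACHE_DIR)
    print(f"{used / 1000**3:.2f} GB used ({usage_fraction(used):.1%} of volume)")
```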

How to use Text-to-Speech in osFoundry

Install Text-to-Speech into your workspace in one click, then fork it in osStudio to customise the prompts, tools, or configuration for your stack. Anyone in your workspace can pick up where you left off.
