Voice Cloning

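Voice Cloning is an app in the osFoundry community catalog. It is a generative audio model that clones voices and generates speech with prosody and non-verbal cues (laughs, sighs, hesitations), as well as music and sound effects, from text prompts. It is built on Bark (suno-ai). **A GPU is strongly recommended**: CPU inference takes several minutes per sentence.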

Details

Workspace: osfoundry
Category: COMMUNICATION
Pricing: Free
Access: Community

Features

Generative audio model that clones voices
Generates speech with prosody + non-verbal cues (laughs, sighs, hesitations)
Music and sound effects from text prompts

Documentation

The documentation is maintained in English by the upstream project.

# Voice Cloning

Generative audio model with voice cloning + prosody + non-verbal cues, powered by Bark.

## ⚠️ GPU strongly recommended
Bark on CPU is **impractically slow**: roughly 5 minutes per ~10-second sentence. On a modern GPU (>=8 GB VRAM), generation runs at roughly real-time speed. For CPU work, use Coqui TTS (#162), which has faster CPU models.
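
To put the CPU figure in perspective, here is a back-of-the-envelope sketch. It assumes cost scales linearly with audio length, which this README does not state; the function name is made up for illustration.

```python
# Ratio implied by the figure above: ~5 minutes of compute per ~10 s of audio.
CPU_SECONDS_PER_AUDIO_SECOND = (5 * 60) / 10  # 30x slower than real time


def estimate_cpu_seconds(audio_seconds: float) -> float:
    """Rough CPU wall-clock estimate, ASSUMING linear scaling with audio length."""
    return audio_seconds * CPU_SECONDS_PER_AUDIO_SECOND


if __name__ == "__main__":
    for clip in (10, 60):
        minutes = estimate_cpu_seconds(clip) / 60
        print(f"{clip} s of audio -> about {minutes:.0f} min on CPU")
```

Under that assumption, a one-minute narration would take about half an hour on CPU.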

## Features
- Voice cloning from a short sample
- Non-verbal cues: `[laughs]` `[sighs]` `[music]` `[gasps]` `[clears throat]`
- 100+ pre-built speaker prompts across 13 languages
- Music generation from text prompts
- Sound effect generation
- HuggingFace transformers compatible
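
As a rough illustration of the cue tags and the transformers compatibility listed above, the sketch below calls the Bark port in HuggingFace `transformers` directly rather than going through this app's bundled UI. The `suno/bark-small` checkpoint and the `v2/en_speaker_6` preset are assumptions chosen for the example, and `build_prompt` and `synthesize` are hypothetical helper names. Device placement is left out for brevity.

```python
from typing import Optional

# Cue tags listed in the Features section above.
CUES = ("[laughs]", "[sighs]", "[music]", "[gasps]", "[clears throat]")


def build_prompt(text: str, cue: Optional[str] = None) -> str:
    """Append a non-verbal cue tag to the text, rejecting unknown tags."""
    if cue is None:
        return text
    if cue not in CUES:
        raise ValueError(f"unknown cue: {cue!r}")
    return f"{text} {cue}"


def synthesize(prompt: str,
               voice_preset: str = "v2/en_speaker_6",   # assumed preset
               checkpoint: str = "suno/bark-small"):    # assumed checkpoint
    """Generate audio via the transformers Bark port (downloads weights on first use)."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    from transformers import AutoProcessor, BarkModel

    processor = AutoProcessor.from_pretrained(checkpoint)
    model = BarkModel.from_pretrained(checkpoint)
    inputs = processor(prompt, voice_preset=voice_preset)
    audio = model.generate(**inputs)
    sample_rate = model.generation_config.sample_rate
    return audio.cpu().numpy().squeeze(), sample_rate


if __name__ == "__main__":
    print(build_prompt("Well, that did not go as planned.", "[sighs]"))
```

Calling `synthesize(build_prompt("Hello there.", "[laughs]"))` should return a waveform array and its sample rate, which can then be written to a WAV file with any audio library.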

## Packaging
A thin wrapper around the community `gitmylo/audio-webui` image, which bundles Bark, a Gradio UI, and a model manager. Bark's models (~5 GB) are cached at `/data`.

## CONFIRM-AT-BUILD
There is no official Bark Docker image; we use `gitmylo/audio-webui` (the most actively maintained community package). Verify the version and entrypoint against the pinned tag.
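
One way to run that check is to read the image's config with `docker image inspect`. The sketch below assumes Docker is installed and the image has already been pulled locally. `PINNED_TAG` is a placeholder, since the actual tag is not given here, and the helper names are hypothetical.

```python
import json
import subprocess
from typing import List


def inspect_command(image: str) -> List[str]:
    """Build the `docker image inspect` call that prints the image Config as JSON."""
    return ["docker", "image", "inspect", "--format", "{{json .Config}}", image]


def read_image_config(image: str) -> dict:
    """Return entrypoint, command, and labels for a locally available image."""
    result = subprocess.run(inspect_command(image), check=True,
                            capture_output=True, text=True)
    config = json.loads(result.stdout)
    return {
        "entrypoint": config.get("Entrypoint"),
        "cmd": config.get("Cmd"),
        "labels": config.get("Labels"),
    }


if __name__ == "__main__":
    # PINNED_TAG is a placeholder; substitute the tag the build actually pins.
    print(read_image_config("gitmylo/audio-webui:PINNED_TAG"))
```

Compare the printed entrypoint and any version labels against what the wrapper expects before you publish a build.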

Using Voice Cloning on osFoundry

Install Voice Cloning into your workspace with one click, then fork it in osStudio to customize the prompts, tools, or configuration for your own stack. Any member of the workspace can pick up where you left off.

Other community apps

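CRM: A customer relationship management tool with contacts, deals, and pipeline management.
Kanban Board: A Trello-style kanban project board with cards, boards, calendar and table views, and per-board properties. Based on Focalboard (standalone personal server), with embedded SQLite on a persistent volume.
Helpdesk: A customer support inbox with ticket triage and SLA tracking.
Page Builder: A visual drag-and-drop page builder with sections, themes, SEO, and publishing.
Website Builder: A multi-page website builder with CMS collections, global navigation, footers, themes, and publishing.
Storefront: An e-commerce storefront with a product catalog, cart, and checkout.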