Fine-tune Llama, Mistral, or Qwen with LoRA on osFoundry

osFoundry fine-tunes any supported open-weight base model on your data with LoRA or QLoRA, with no notebook and no command line. Pick a base model, point it at a dataset (your KB, an upload, or a public dataset), set the LoRA rank, and train. The adapter is registered in your model catalog and becomes routable from Maestro and Room Apps as soon as training finishes.

Quick answer

LoRA and QLoRA on 60+ open-weight base models.
Train on your KB, JSONL/CSV uploads, or 250K public datasets.
UI-driven, with no notebook required.
The adapter is workspace-routable as soon as training finishes.

Key capabilities

60+ supported base models (Llama 3, Mistral, Qwen, Phi, Gemma…).
LoRA and QLoRA flows, with a selectable rank of 8, 16, 32, or 64.
Train on KBs (auto-formatted), JSONL/CSV/Parquet files, or 250K public datasets.
Three runtimes: local GPU, osFoundry cloud, or your own infrastructure.
Checkpoints every N steps, so an interrupted job can resume from the last checkpoint.
Adapter export: .safetensors with the full training config.

How to do it in osFoundry

Pick a base and LoRA target: Pick the base model. Configure the LoRA rank, learning rate, epochs, and target modules. The defaults work for most cases.
Point at your dataset: Choose a KB (auto-formatted into instruction pairs), upload JSONL, or pick a public dataset.
Run training: Pick the runtime (local, cloud, or BYO). Watch the loss curve live as the model trains.
Hot-swap the adapter: When training finishes, hot-swap the adapter onto a deployed base-model endpoint. Same handle, new behavior.
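
For the JSONL upload path, a dataset file can be prepared with a few lines of Python. This is a minimal sketch only: the field names "instruction" and "response" are assumptions for illustration, since this page does not state the schema osFoundry expects, and the helper names are hypothetical.

```python
import json

# Hypothetical field names: the schema osFoundry expects is not given on
# this page, so "instruction" and "response" are placeholders.
examples = [
    {"instruction": "How do I reset my password?",
     "response": "Open Settings, choose Security, then select Reset password."},
    {"instruction": "What is your refund window?",
     "response": "Refunds are available within 30 days of purchase."},
]


def write_jsonl(rows, path):
    """Write one JSON object per line, which is the JSONL convention."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    write_jsonl(examples, "train.jsonl")
    print(f"wrote {len(read_jsonl('train.jsonl'))} rows to train.jsonl")
```

Reading the file back before uploading is a cheap way to catch malformed lines early.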

Use cases

Customer support: LoRA-tune Mistral 7B on past tickets. The agent then answers in your tone, with product knowledge.
Legal team: Train Llama 3.1 8B on labelled contracts. Redline new documents on-prem in your firm's style.
Game studio: Per-character LoRAs hot-swapped on one base model. One GPU, many distinct NPC voices.

Frequently asked questions

How long does a LoRA fine-tune take?

A 7B model on 50K rows: ~30 min on an A100. 70B: ~3 hours. Consumer M2/M3 Mac: ~2 hours for 7B.

Which rank should I use?

Start with rank 16. Increase to 32 or 64 for hard domain shifts; decrease to 8 for stylistic tuning.
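
To get a feel for what the rank controls, the sketch below counts the parameters LoRA adds to a single weight matrix. It relies on the standard LoRA construction (two low-rank factors per adapted matrix). The 4096 x 4096 layer size is an assumed example, not a figure from osFoundry.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters LoRA adds to one weight matrix of shape (d_out, d_in).

    LoRA learns two low-rank factors, A (rank x d_in) and B (d_out x rank),
    so the added parameter count is rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)


if __name__ == "__main__":
    D = 4096          # assumed example: a square 4096 x 4096 projection
    full = D * D      # parameters in the frozen full-rank matrix
    for rank in (8, 16, 32, 64):
        added = lora_trainable_params(D, D, rank)
        share = 100 * added / full
        print(f"rank {rank}: {added:,} trainable params ({share:.2f}% of full)")
```

The count grows linearly with rank, so doubling the rank doubles the adapter size for the same set of target modules.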

Can I train on my own knowledge base?

Yes. KBs are auto-formatted into instruction pairs.

Can I export the adapter?

Yes. Download it as .safetensors with the full training config. It is deployable outside osFoundry as well.

Is QLoRA supported?

Yes. QLoRA reduces VRAM usage by quantizing the base model to 4-bit. If your GPU is tight on memory, pick QLoRA in the training config.
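
A rough back-of-the-envelope calculation shows why 4-bit quantization helps. The sketch below estimates memory for the base weights only, assuming 16-bit weights as the unquantized baseline. It ignores activations, optimizer state, adapter weights, and quantization overhead, so real VRAM usage will be higher.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the base weights alone, in GB (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for name, n_params in (("7B", 7e9), ("70B", 70e9)):
        half = weight_memory_gb(n_params, 16)   # assumed 16-bit baseline
        quant = weight_memory_gb(n_params, 4)   # 4-bit quantized base
        print(f"{name}: ~{half:.1f} GB at 16-bit, ~{quant:.1f} GB at 4-bit")
```

Going from 16-bit to 4-bit cuts the weight footprint to a quarter, which is often the difference between a model fitting on one GPU or not.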

How do I evaluate the result?

Compare the adapter against the base model on your eval set using the side-by-side compare view. Promote the adapter once quality clears your bar.
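
If you also score outputs outside the UI, a "quality bar" can be made concrete with a paired comparison. The sketch below is an illustration only: the function names, the per-example scores, and the 0.6 win-rate threshold are all assumptions, not part of osFoundry.

```python
def compare(base_scores, adapter_scores):
    """Summarise paired per-example scores, where higher is better."""
    if len(base_scores) != len(adapter_scores):
        raise ValueError("score lists must be the same length")
    if not base_scores:
        raise ValueError("need at least one scored example")
    n = len(base_scores)
    pairs = list(zip(adapter_scores, base_scores))
    return {
        "win_rate": sum(a > b for a, b in pairs) / n,   # adapter strictly better
        "tie_rate": sum(a == b for a, b in pairs) / n,
        "mean_gain": sum(a - b for a, b in pairs) / n,  # average score change
    }


def should_promote(summary, min_win_rate=0.6):
    """Hypothetical promotion rule: adapter must win often enough."""
    return summary["win_rate"] >= min_win_rate


if __name__ == "__main__":
    base = [3, 4, 2, 5, 3]      # made-up scores for illustration
    adapter = [4, 4, 3, 5, 5]
    summary = compare(base, adapter)
    print(summary, "promote:", should_promote(summary))
```

Tracking ties separately keeps the win rate honest when many examples score the same for both models.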

Pricing

Local: free. Cloud: billed per second of GPU time. On an A100, a 7B LoRA costs about $2-3 per run; 70B costs $20-30.
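
Per-second billing can be sanity-checked with simple arithmetic. In the sketch below, the $5 per GPU-hour rate is an assumption chosen for illustration, not a published osFoundry price. The 30-minute duration is the approximate 7B figure from the FAQ above.

```python
def run_cost_usd(gpu_seconds: float, usd_per_gpu_hour: float) -> float:
    """Cost of a run billed per second of GPU time."""
    return gpu_seconds * usd_per_gpu_hour / 3600


if __name__ == "__main__":
    ASSUMED_RATE = 5.0          # hypothetical USD per GPU-hour, not a quoted price
    seconds = 30 * 60           # ~30 min, the 7B estimate from the FAQ
    print(f"estimated cost: ${run_cost_usd(seconds, ASSUMED_RATE):.2f}")
```

With that assumed rate, a 30-minute run lands inside the $2-3 range quoted for a 7B LoRA.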