Fine-tune Llama, Mistral, or Qwen with LoRA on osFoundry
osFoundry fine-tunes 60+ open-weight base models with LoRA or QLoRA on your data, with no notebook and no command line. Pick a base model, point at a dataset (your KB, an upload, or a public dataset), set the LoRA rank, and train. When training finishes, the adapter is registered in your model catalog and is immediately routable from Maestro and Room Apps.
Quick answer
- LoRA + QLoRA on 60+ open-weight base models.
- Train on your KB, JSONL/CSV uploads, or 250K public datasets.
- UI-driven — no notebook.
- Adapter is workspace-routable the moment training finishes.
Key capabilities
- 60+ supported base models (Llama 3, Mistral, Qwen, Phi, Gemma…).
- LoRA + QLoRA flows; rank 8/16/32/64 selectable.
- Train on KBs (auto-formatted), JSONL/CSV/Parquet uploads, or 250K public datasets.
- Three runtimes: local GPU, osFoundry cloud, your own infrastructure.
- Checkpoints every N steps — resume an interrupted job from the last checkpoint.
- Adapter export: .safetensors with full training config.
How to do it in osFoundry
- Pick a base + LoRA target — Pick the base model. Configure LoRA rank, learning rate, epochs, and target modules. Defaults work for most cases.
- Point at your dataset — Choose a KB (auto-instruction-pair format), upload JSONL, or pick a public dataset.
- Run training — Pick the runtime (local/cloud/BYO). Watch the loss curve live as it trains.
- Hot-swap the adapter — When training finishes, hot-swap the adapter onto a deployed base-model endpoint. Same handle, new behavior.
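The hot-swap step relies on the general LoRA idea that the base weights stay frozen while a small low-rank update is added on top. The sketch below is a toy, pure-Python illustration of that idea, not osFoundry's implementation: the `Endpoint` and `LoraAdapter` classes, the adapter names, and the 2x2 weights are all hypothetical.

```python
from typing import Dict, List, Optional

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


class LoraAdapter:
    """Low-rank update delta_W = (alpha / rank) * B @ A."""

    def __init__(self, a: Matrix, b: Matrix, alpha: float) -> None:
        self.a = a  # shape: rank x d_in
        self.b = b  # shape: d_out x rank
        self.scale = alpha / len(a)  # len(a) is the rank

    def delta(self, x: Vector) -> Vector:
        return [self.scale * y for y in matvec(self.b, matvec(self.a, x))]


class Endpoint:
    """Hypothetical endpoint: frozen base weights plus one active adapter."""

    def __init__(self, base_weight: Matrix) -> None:
        self.base_weight = base_weight
        self.adapters: Dict[str, LoraAdapter] = {}
        self.active: Optional[str] = None

    def register(self, name: str, adapter: LoraAdapter) -> None:
        self.adapters[name] = adapter

    def hot_swap(self, name: str) -> None:
        if name not in self.adapters:
            raise KeyError(f"unknown adapter: {name}")
        self.active = name  # the base weights are never modified

    def forward(self, x: Vector) -> Vector:
        y = matvec(self.base_weight, x)
        if self.active is not None:
            d = self.adapters[self.active].delta(x)
            y = [u + v for u, v in zip(y, d)]
        return y


# Toy data: identity base weight and two rank-1 adapters (illustrative only).
endpoint = Endpoint(base_weight=[[1.0, 0.0], [0.0, 1.0]])
endpoint.register("support", LoraAdapter(a=[[1.0, 0.0]], b=[[0.0], [1.0]], alpha=2.0))
endpoint.register("legal", LoraAdapter(a=[[0.0, 1.0]], b=[[1.0], [0.0]], alpha=1.0))

x = [1.0, 1.0]
base_out = endpoint.forward(x)      # no adapter active
endpoint.hot_swap("support")
support_out = endpoint.forward(x)   # same endpoint, "support" behavior
endpoint.hot_swap("legal")
legal_out = endpoint.forward(x)     # same endpoint, "legal" behavior
print(base_out, support_out, legal_out)
```

Because only the small `A` and `B` matrices change between swaps, many adapters can share one copy of the base model, which is the pattern behind the per-character use case.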
Use cases
- Customer support: LoRA-tune Mistral 7B on past tickets. Agent now answers in your tone with product knowledge.
- Legal team: Train Llama 3.1 8B on labelled contracts. Redline new docs in your firm’s style on-prem.
- Game studio: Per-character LoRAs hot-swapped onto one base model. One GPU, many distinct NPC voices.
Frequently asked questions
How long does a LoRA fine-tune take?
A 7B model on 50K rows takes about 30 minutes on an A100; a 70B model takes about 3 hours. On a consumer M2/M3 Mac, a 7B run takes about 2 hours.
What rank should I use?
Start with rank 16. Increase to 32 or 64 for harder domain shifts; decrease to 8 for stylistic tuning.
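To see what the rank setting buys you, it helps to count trainable parameters. For one weight matrix of shape d_out x d_in, standard LoRA trains two factors with rank * (d_in + d_out) parameters in total. The sketch below assumes a single 4096 x 4096 projection, a size chosen for illustration rather than taken from any specific osFoundry base model.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # A has shape (rank, d_in) and B has shape (d_out, rank).
    return rank * (d_in + d_out)


D_IN = D_OUT = 4096  # illustrative layer size (assumption)
full_params = D_IN * D_OUT

counts = {rank: lora_param_count(D_IN, D_OUT, rank) for rank in (8, 16, 32, 64)}
for rank, n in counts.items():
    share = 100 * n / full_params
    print(f"rank {rank:>2}: {n:>7,} trainable params ({share:.2f}% of the full matrix)")
```

Parameter count, and so adapter size, grows linearly with rank: doubling the rank doubles the adapter's capacity and its file size, which is why lower ranks tend to suffice for light stylistic changes.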
Can I train on my knowledge base?
Yes — KBs are auto-formatted as instruction pairs.
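If you would rather prepare the instruction pairs yourself and upload them as JSONL, the conversion can look roughly like the sketch below. The field names `question`, `answer`, `instruction`, and `response` are assumptions for illustration; the schema osFoundry's auto-formatter actually produces is not specified here.

```python
import json
from typing import Dict, Iterable


def kb_to_jsonl(entries: Iterable[Dict[str, str]]) -> str:
    """Turn KB question/answer entries into JSONL instruction pairs."""
    lines = []
    for entry in entries:
        question = entry.get("question", "").strip()
        answer = entry.get("answer", "").strip()
        if not question or not answer:
            continue  # skip incomplete entries rather than train on them
        record = {"instruction": question, "response": answer}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)


# Hypothetical KB entries.
kb_entries = [
    {"question": "How do I reset my password?", "answer": "Use the reset link on the sign-in page."},
    {"question": "  ", "answer": "Orphaned answer with no question."},
    {"question": "Where are invoices stored?", "answer": "Under Billing in account settings."},
]

jsonl_text = kb_to_jsonl(kb_entries)
print(jsonl_text)
```

Each output line is one standalone JSON object, so the file can be streamed or split without parsing it as a whole.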
Can I export the adapter?
Yes — .safetensors download with full training config. Deployable outside osFoundry too.
Is QLoRA supported?
Yes. QLoRA reduces VRAM use by quantising the base model to 4-bit. Pick QLoRA in the training config if your GPU is tight on memory.
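A rough way to judge whether you need QLoRA is to estimate the memory taken by the base weights alone. The sketch below is a back-of-envelope calculation that assumes 16-bit weights for plain LoRA and 4-bit weights for QLoRA; it deliberately ignores activations, optimizer state, the adapter itself, and quantisation overhead, so real usage will be higher.

```python
def weight_memory_gb(n_params: int, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


PARAMS_7B = 7_000_000_000  # nominal parameter count for a "7B" model

lora_gb = weight_memory_gb(PARAMS_7B, 16)   # base held in 16-bit
qlora_gb = weight_memory_gb(PARAMS_7B, 4)   # base quantised to 4-bit

print(f"16-bit base: ~{lora_gb:.1f} GB, 4-bit base: ~{qlora_gb:.1f} GB")
```

Under these assumptions the 4-bit base needs about a quarter of the weight memory, which is the headroom that lets larger models fit on smaller GPUs.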
How do I evaluate the result?
Compare the adapter against the base on your eval set with the side-by-side compare view. Promote when quality clears your bar.
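The promotion decision can also be expressed as a simple check. The sketch below is one possible scoring scheme, not the metric the compare view uses: it scores base and adapter outputs by case-insensitive exact match against reference answers and promotes only if the adapter clears a threshold and beats the base. The sample data and the 0.7 bar are made up for illustration.

```python
from typing import List


def exact_match_rate(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions equal to the reference, ignoring case and outer whitespace."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("eval set is empty")
    hits = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return hits / len(references)


def should_promote(base_score: float, adapter_score: float, bar: float) -> bool:
    """Promote only if the adapter clears the bar and improves on the base."""
    return adapter_score >= bar and adapter_score > base_score


# Hypothetical eval set and model outputs.
references = ["refund issued", "reset your password", "escalate to tier 2", "plan upgraded"]
base_preds = ["refund issued", "contact support", "contact support", "plan upgraded"]
adapter_preds = ["Refund issued", "reset your password", "escalate to tier 2", "plan downgraded"]

base_score = exact_match_rate(base_preds, references)
adapter_score = exact_match_rate(adapter_preds, references)
promote = should_promote(base_score, adapter_score, bar=0.7)
print(base_score, adapter_score, promote)
```

Exact match suits short, closed-form answers; for free-form generation you would likely swap in a rubric or preference-based score while keeping the same promote-if-better structure.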
Pricing
Local training is free. Cloud training is billed per second of GPU time: a 7B LoRA run on an A100 costs roughly $2-3, and a 70B run roughly $20-30.
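For budgeting, per-second billing reduces to simple arithmetic. The sketch below backs out an implied hourly rate from the figures on this page, assuming the $2-3 run is the roughly 30-minute 7B job described in the FAQ. These are derived estimates, not published rates, and the function names are illustrative.

```python
def implied_hourly_rate(run_cost_usd: float, run_minutes: float) -> float:
    """Hourly GPU rate implied by a run's total cost and duration."""
    return run_cost_usd * 60 / run_minutes


def run_cost(rate_per_hour: float, seconds: float) -> float:
    """Cost of a run billed per second at a given hourly rate."""
    return rate_per_hour * seconds / 3600


# Assumption: the $2-3 figure corresponds to a ~30-minute 7B run.
low_rate = implied_hourly_rate(2.0, 30)
high_rate = implied_hourly_rate(3.0, 30)
print(f"implied rate: ${low_rate:.2f}-${high_rate:.2f} per GPU-hour")

# Estimated cost of a shorter 10-minute run at the same implied rates.
short_low = run_cost(low_rate, 10 * 60)
short_high = run_cost(high_rate, 10 * 60)
print(f"10-minute run: ${short_low:.2f}-${short_high:.2f}")
```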
Related features