Fine-tune Llama, Mistral, or Qwen with LoRA on osFoundry

osFoundry fine-tunes any of its 60+ supported open-weight base models with LoRA or QLoRA on your data, with no notebook and no command line. Pick a base model, point at a dataset (your KB, an upload, or a public dataset), set the LoRA rank, and train. When training finishes, the adapter is registered in your model catalog and is immediately routable from Maestro and Room Apps.

Quick answer

LoRA + QLoRA on 60+ open-weight base models.
Train on your KB, JSONL/CSV/Parquet uploads, or 250K public datasets.
UI-driven — no notebook.
Adapter is workspace-routable the moment training finishes.

Key capabilities

60+ supported base models (Llama 3, Mistral, Qwen, Phi, Gemma…).
LoRA + QLoRA flows; rank 8/16/32/64 selectable.
Train on KBs (auto-formatted as instruction pairs), JSONL/CSV/Parquet uploads, or 250K public datasets.
Three runtimes: local GPU, osFoundry cloud, your own infrastructure.
Checkpoints every N steps — resume an interrupted job from the last checkpoint.
Adapter export: .safetensors with full training config.

How to do it in osFoundry

Pick a base + LoRA target: Choose the base model, then configure LoRA rank, learning rate, epochs, and target modules. The defaults work for most cases.
Point at your dataset: Choose a KB (auto-formatted as instruction pairs), upload a JSONL, CSV, or Parquet file, or pick a public dataset.
Run training: Pick the runtime (local GPU, osFoundry cloud, or your own infrastructure). Watch the loss curve live as the job trains.
Hot-swap the adapter: When training finishes, hot-swap the adapter onto a deployed base-model endpoint. Same handle, new behavior.
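
Hot-swapping works because a LoRA adapter is a small additive update applied on top of frozen base weights, so the base model can stay loaded while adapters change. The following is a minimal pure-Python sketch of that idea, not osFoundry's implementation; the tiny matrices, the `alpha` scaling values, and the function names are illustrative assumptions.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, x, adapter=None):
    """Return W @ x, plus (alpha / r) * B @ (A @ x) when an adapter is attached."""
    y = matvec(W, x)
    if adapter is None:
        return y
    A, B, alpha = adapter["A"], adapter["B"], adapter["alpha"]
    rank = len(A)  # A has shape (rank, d_in); B has shape (d_out, rank)
    delta = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [yi + scale * di for yi, di in zip(y, delta)]


# Hypothetical 2x2 base layer and two rank-1 adapters.
W = [[1.0, 0.0], [0.0, 1.0]]
x = [2.0, 3.0]
adapter_a = {"A": [[1.0, 0.0]], "B": [[1.0], [0.0]], "alpha": 1.0}
adapter_b = {"A": [[0.0, 1.0]], "B": [[0.0], [1.0]], "alpha": 2.0}

base_out = lora_forward(W, x)           # base weights only
out_a = lora_forward(W, x, adapter_a)   # same W, adapter A attached
out_b = lora_forward(W, x, adapter_b)   # same W, adapter B swapped in
```

The base matrix `W` is never modified; only the small `A` and `B` matrices differ between calls, which is why several adapters can share one deployed base model.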

Use cases

Customer support: LoRA-tune Mistral 7B on past tickets. Agent now answers in your tone with product knowledge.
Legal team: Train Llama 3.1 8B on labelled contracts. Redline new docs in your firm’s style on-prem.
Game studio: Per-character LoRAs hot-swapped onto one base model. One GPU, many distinct NPC voices.

Frequently asked questions

How long does a LoRA fine-tune take?

Roughly 30 minutes for a 7B model on 50K rows on an A100, and about 3 hours for a 70B model. On a consumer M2/M3 Mac, a 7B model takes about 2 hours.

What rank should I use?

Start with rank 16. Increase to 32 or 64 for harder domain shifts; decrease to 8 for stylistic tuning.
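
To see what the rank setting changes, it helps to count trainable parameters. For one weight matrix of shape d_out x d_in, LoRA trains two low-rank matrices with rank * (d_in + d_out) parameters in total. The sketch below assumes a hypothetical 4096 x 4096 projection matrix; the actual layer sizes depend on the base model and target modules you choose.

```python
def lora_params(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # A is (rank x d_in) and B is (d_out x rank).
    return rank * (d_in + d_out)


d = 4096          # hypothetical hidden size, for illustration only
full = d * d      # parameters in the frozen full matrix

for rank in (8, 16, 32, 64):
    added = lora_params(d, d, rank)
    print(f"rank {rank:>2}: {added:>7,} trainable params "
          f"({added / full:.2%} of the full matrix)")
```

The adapter size grows linearly with rank, so moving from rank 16 to rank 64 quadruples the trainable parameters per adapted matrix.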

Can I train on my knowledge base?

Yes — KBs are auto-formatted as instruction pairs.
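
This page does not spell out the schema osFoundry uses for instruction pairs, so the following is only a sketch of what an instruction-pair JSONL file could look like. The field names `instruction` and `response`, the helper names, and the sample rows are assumptions made for illustration.

```python
import io
import json

REQUIRED_FIELDS = {"instruction", "response"}  # assumed field names


def write_jsonl(pairs, fh):
    """Write (instruction, response) tuples as one JSON object per line."""
    for instruction, response in pairs:
        row = {"instruction": instruction, "response": response}
        fh.write(json.dumps(row, ensure_ascii=False) + "\n")


def read_jsonl(fh):
    """Parse a JSONL stream and check every row has the required fields."""
    rows = []
    for line_no, line in enumerate(fh, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        row = json.loads(line)
        missing = REQUIRED_FIELDS - row.keys()
        if missing:
            raise ValueError(f"line {line_no}: missing {sorted(missing)}")
        rows.append(row)
    return rows


# Made-up sample rows, written to an in-memory buffer instead of a file.
buffer = io.StringIO()
write_jsonl(
    [
        ("How do I reset my password?",
         "Open Settings, choose Security, then select Reset password."),
        ("Which plans include priority support?",
         "Priority support is included in the Team and Enterprise plans."),
    ],
    buffer,
)
buffer.seek(0)
rows = read_jsonl(buffer)
```

Validating a file this way before uploading catches malformed lines early, whatever the final schema turns out to be.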

Can I export the adapter?

Yes — .safetensors download with full training config. Deployable outside osFoundry too.

Is QLoRA supported?

Yes. QLoRA reduces VRAM use by quantising the base model to 4-bit. Pick QLoRA in the training config if your GPU is tight on memory.
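
A rough back-of-the-envelope calculation shows why 4-bit quantisation helps. The sketch below estimates the memory needed for the base weights alone, assuming 16-bit weights as the unquantised baseline. It ignores activations, optimizer state, the adapter itself, and quantisation overhead, so real VRAM use will be higher.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate memory for the model weights only, in gigabytes (1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


for name, n_params in (("7B", 7e9), ("70B", 70e9)):
    fp16 = weight_memory_gb(n_params, 16)  # assumed 16-bit baseline
    int4 = weight_memory_gb(n_params, 4)   # 4-bit quantised base
    print(f"{name}: {fp16:.1f} GB at 16-bit, {int4:.1f} GB at 4-bit")
```

For a 7B model this gives about 14 GB of weights at 16-bit versus about 3.5 GB at 4-bit, a fourfold reduction in weight memory.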

How do I evaluate the result?

Compare the adapter against the base on your eval set with the side-by-side compare view. Promote when quality clears your bar.
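
If you also want a numeric check alongside the compare view, the promotion decision can be sketched as a simple threshold rule. This is a hypothetical illustration, not an osFoundry API: it assumes you already have one score per eval example (for instance 1.0 for correct and 0.0 for incorrect), and the function name, thresholds, and sample scores are made up.

```python
from statistics import mean


def should_promote(base_scores, adapter_scores, min_score, min_gain=0.0):
    """Promote only if the adapter clears the bar and does not trail the base."""
    base_avg = mean(base_scores)
    adapter_avg = mean(adapter_scores)
    return adapter_avg >= min_score and (adapter_avg - base_avg) >= min_gain


# Made-up per-example scores on the same four eval items.
base_scores = [0.0, 1.0, 0.0, 1.0]
adapter_scores = [1.0, 1.0, 1.0, 0.0]

decision = should_promote(base_scores, adapter_scores, min_score=0.7)
```

Requiring both an absolute bar (`min_score`) and a minimum gain over the base guards against promoting an adapter that merely matches a base model that was already good enough.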

Pricing

Local training is free. Cloud training is billed per second of GPU time. A 7B LoRA run on an A100 costs roughly $2-3; a 70B run costs roughly $20-30.
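
The per-run figures can be related to the run times quoted in the FAQ. The sketch below assumes the priced runs correspond to those durations (about 30 minutes for 7B, about 3 hours for 70B), which this page does not state explicitly, and derives the implied hourly rate. No official per-second rate is given here, so treat the results as estimates.

```python
def implied_hourly_rate(run_cost_usd, run_minutes):
    """Hourly GPU rate implied by a run's total cost and duration."""
    return run_cost_usd / (run_minutes / 60)


def run_cost(seconds, usd_per_hour):
    """Cost of a run billed per second at a given hourly rate."""
    return seconds * usd_per_hour / 3600


# Assumed pairing of the pricing figures with the FAQ run times.
rate_7b_low = implied_hourly_rate(2, 30)     # $2 over 30 minutes
rate_7b_high = implied_hourly_rate(3, 30)    # $3 over 30 minutes
rate_70b_low = implied_hourly_rate(20, 180)  # $20 over 3 hours
rate_70b_high = implied_hourly_rate(30, 180) # $30 over 3 hours

# Per-second billing means a run stopped after 10 minutes costs a third
# of a 30-minute run at the same rate.
short_run = run_cost(600, rate_7b_low)
```

Under that assumption the 7B figures imply roughly $4 to $6 per hour, and the 70B figures roughly $6.67 to $10 per hour.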
