What is LoRA?

Abbreviation: LoRA

LoRA (Low-Rank Adaptation) fine-tunes only a small number of "adapter" parameters on top of a frozen base model, drastically reducing training cost. osFoundry fine-tunes any of 60+ open-weight base models with LoRA in a UI flow — no notebook required.

Detail

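Instead of updating all model parameters during fine-tuning (slow, memory-heavy), LoRA adds a pair of small trainable low-rank matrices alongside selected weight matrices, typically the attention projections. The base model stays frozen; only the adapter weights are trained. Typical result: adapter checkpoints 100-1000× smaller than a full fine-tuned copy of the model, roughly 10× less training time, and comparable quality on most tasks. Exact savings depend on the adapter rank and on which layers are adapted.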
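To make the idea concrete, here is a minimal sketch in plain Python of a single LoRA-adapted layer. It follows the usual formulation, output = W x + (alpha / r) * B (A x), where W is the frozen base weight and A and B are the small trainable matrices. The function names, the toy 2×2 weights, and the 4096-wide example layer are illustrative assumptions, not osFoundry code.

```python
def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(x, W, A, B, alpha, r):
    """Return W x + (alpha / r) * B (A x).

    W: d_out x d_in, frozen base weight (never updated).
    A: r x d_in, trainable down-projection.
    B: d_out x r, trainable up-projection (commonly initialised to zero).
    """
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]


def param_counts(d_in, d_out, r):
    """Return (full fine-tuning params, LoRA params) for one weight matrix."""
    return d_in * d_out, r * (d_in + d_out)


# Toy example (hypothetical values): identity base weight, rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, 0.5]]
x = [1.0, 2.0]

# With B at zero the adapter contributes nothing: output equals the base model's.
untrained = lora_forward(x, W, A, [[0.0], [0.0]], alpha=16, r=1)

# After training, B is non-zero and shifts the output; W itself is untouched.
trained = lora_forward(x, W, A, [[1.0], [-1.0]], alpha=2, r=1)

# Parameter count for one 4096 x 4096 projection at rank 8 (assumed sizes).
full, lora = param_counts(4096, 4096, 8)
print(untrained, trained, full, lora, full // lora)
```

For the assumed 4096×4096 layer at rank 8, the adapter holds 65,536 trainable values against 16,777,216 in the full matrix, a 256× reduction for that layer. A lower rank or fewer adapted layers shrinks the adapter further.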

LoRA adapters are tiny (megabytes instead of gigabytes), portable between deployments of the same base model, and stackable: you can load multiple adapters on one base model and hot-swap between them.
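Hot-swapping works because the base weights are never modified, so each adapter is just an extra pair of small matrices selected per request. The sketch below illustrates that idea only; the SharedBase class, its methods, and the adapter names "support" and "legal" are hypothetical and do not represent any particular serving stack.

```python
def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


class SharedBase:
    """One frozen base weight shared by any number of named adapters."""

    def __init__(self, W):
        self.W = W          # frozen base weight, d_out x d_in
        self.adapters = {}  # name -> (A, B, scale)

    def register(self, name, A, B, scale=1.0):
        """Store an adapter; A is r x d_in, B is d_out x r."""
        self.adapters[name] = (A, B, scale)

    def forward(self, x, adapter=None):
        """Run the base layer, optionally adding one adapter's update."""
        y = matvec(self.W, x)
        if adapter is None:
            return y
        A, B, scale = self.adapters[adapter]
        delta = matvec(B, matvec(A, x))
        return [yi + scale * di for yi, di in zip(y, delta)]


# Toy base weight and two hypothetical rank-1 adapters.
layer = SharedBase([[1.0, 0.0], [0.0, 1.0]])
layer.register("support", A=[[1.0, 0.0]], B=[[1.0], [0.0]])
layer.register("legal", A=[[0.0, 1.0]], B=[[0.0], [1.0]])

x = [1.0, 2.0]
plain = layer.forward(x)                    # base behaviour
support = layer.forward(x, "support")       # same base, adapter 1
legal = layer.forward(x, "legal")           # same base, adapter 2
print(plain, support, legal)
```

Switching adapters here is a dictionary lookup, with no reload of the base weights. That is why one copy of the base model in GPU memory can serve several specialised behaviours.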

How osFoundry approaches LoRA

osFoundry trains LoRA adapters in minutes-to-hours, registers them in your model catalog, and hot-swaps them onto base-model endpoints at inference time — many specialised behaviours on one shared GPU.