What is QLoRA?
Abbreviation: QLoRA
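QLoRA is LoRA fine-tuning in which the base model's weights are quantised to 4-bit during training. Stored this way, the frozen weights take roughly a quarter of the memory of a 16-bit copy, so overall VRAM requirements drop substantially. osFoundry supports QLoRA as an alternative training path when your GPU is tight on memory.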
Detail
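Standard LoRA keeps the frozen base model in full or half precision (typically 16-bit) during training, which still needs significant VRAM for large models. QLoRA stores the base in 4-bit NF4 (NormalFloat4) and dequantises it on the fly for each forward and backward pass, while the LoRA adapter weights are trained in higher precision. The original QLoRA work reports fine-tuning a 65B model on a single 48 GB GPU and a 33B model on a 24 GB consumer GPU. A 70B model does not fit in 24 GB, since its 4-bit weights alone take about 35 GB.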
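As a rough illustration of the memory arithmetic, the sketch below estimates the space taken by the base weights alone at 16-bit and at 4-bit. The function names are hypothetical and not part of osFoundry. The estimate ignores activations, adapter weights, optimiser state and the small overhead of quantisation constants, so real VRAM usage will be higher.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width, with no
    overhead for quantisation constants.
    """
    return n_params * bits_per_param / 8 / 1e9


def compare(n_params: float) -> tuple[float, float]:
    """Return (16-bit GB, 4-bit GB) for a model with n_params parameters."""
    return weight_memory_gb(n_params, 16), weight_memory_gb(n_params, 4)


if __name__ == "__main__":
    # Illustrative model sizes only.
    for name, n in [("7B", 7e9), ("33B", 33e9), ("65B", 65e9)]:
        fp16, four_bit = compare(n)
        print(f"{name}: 16-bit ~{fp16:.1f} GB, 4-bit ~{four_bit:.1f} GB")
```

Comparing the 4-bit figure with your GPU's VRAM gives a quick lower bound: if the weights alone do not fit, training will not fit either.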
Quality is comparable to standard LoRA in most cases, with slightly higher training perplexity. The adapter itself is not quantised, so you can later deploy the trained adapter on a full-precision base.
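To see why the adapter can move between bases, recall the standard LoRA formulation: the adapter is a pair of small matrices A and B, and the effective weight is W + (alpha / r) * B A. The toy sketch below uses plain Python lists and hypothetical helper names. It is not osFoundry's deployment code, only an illustration that the adapter is applied on top of whatever copy of W you supply.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(w, a, b, alpha, r):
    """Return W + (alpha / r) * B @ A.

    w: d_out x d_in base weight (any precision, e.g. a full-precision copy)
    a: r x d_in adapter matrix
    b: d_out x r adapter matrix
    """
    delta = matmul(b, a)
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a full-precision base weight
    lora_a = [[0.5, 0.0]]            # rank r = 1
    lora_b = [[1.0], [2.0]]
    print(apply_lora(base, lora_a, lora_b, alpha=2.0, r=1))
```

Because A and B are stored separately from W, the same pair can be applied to the 4-bit base used in training or to a higher-precision copy at deployment.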