Hot-swap LoRA adapters at inference time on osFoundry
osFoundry hot-swaps LoRA adapters on a single base model with no model reload; a switch takes under a second. Stack multiple personas, domain experts, or fine-tuned skills on one base model and route each request to the adapter it needs. Serving N specialised variants then takes one base-model deployment instead of N.
Quick answer
- Up to 16 active LoRA adapters per base model.
- Sub-second adapter switch — no model reload.
- Pay for one base model, route to many specialised variants.
- Adapters trained inside osFoundry are auto-registered.
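To make per-request routing under the 16-adapter limit concrete, here is a minimal Python sketch. It is not osFoundry's API: the `AdapterRouter` class, its method names, and the adapter names are hypothetical, and the least-recently-used eviction policy is an assumption, since this page does not say what happens when a 17th adapter is requested.

```python
from collections import OrderedDict

MAX_ACTIVE_ADAPTERS = 16  # the per-base-model limit stated on this page


class AdapterRouter:
    """Toy router that tracks which adapters are active on one base model.

    Hypothetical illustration only; eviction by least-recent use is an assumption.
    """

    def __init__(self, registered, max_active=MAX_ACTIVE_ADAPTERS):
        self.registered = set(registered)
        self.max_active = max_active
        # Keys are adapter names, ordered from least to most recently used.
        self.active = OrderedDict()

    def route(self, adapter):
        """Mark `adapter` as in use; return (adapter, evicted_adapter_or_None)."""
        if adapter not in self.registered:
            raise KeyError(f"unknown adapter: {adapter}")
        evicted = None
        if adapter in self.active:
            # Already active: just refresh its recency.
            self.active.move_to_end(adapter)
        else:
            if len(self.active) >= self.max_active:
                # At capacity: drop the least recently used adapter.
                evicted, _ = self.active.popitem(last=False)
            self.active[adapter] = None
        return adapter, evicted


# Usage: 20 registered adapters, 17 distinct requests against a cap of 16.
router = AdapterRouter(registered=[f"adapter-{i}" for i in range(20)])
for i in range(17):
    _, evicted = router.route(f"adapter-{i}")
print(evicted)  # adapter-0
```

The base model is never touched in this sketch; only the small set of active adapters changes, which is the property that makes one deployment sufficient for many variants.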
Frequently asked questions
How many adapters can I hot-swap?
Up to 16 active adapters per base model on a single endpoint.
Does this work with adapters I trained elsewhere?
Yes. Upload the adapter as a .safetensors file and it is registered.
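Before uploading an externally trained adapter, it can help to confirm locally that the file really is in safetensors format. The sketch below uses only the Python standard library and relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). It does not call any osFoundry API; the function names and the example file name are hypothetical.

```python
import json
import struct


def read_safetensors_header(stream):
    """Parse the JSON header from a binary stream positioned at the file start."""
    prefix = stream.read(8)
    if len(prefix) != 8:
        raise ValueError("too short to be a safetensors file")
    # The first 8 bytes hold the header length as an unsigned little-endian integer.
    (header_len,) = struct.unpack("<Q", prefix)
    raw = stream.read(header_len)
    if len(raw) != header_len:
        raise ValueError("truncated safetensors header")
    return json.loads(raw.decode("utf-8"))


def tensor_names(header):
    """List tensor names, skipping the optional "__metadata__" entry."""
    return sorted(name for name in header if name != "__metadata__")


# Usage (hypothetical file name):
# with open("my_adapter.safetensors", "rb") as f:
#     print(tensor_names(read_safetensors_header(f)))
```

Listing the tensor names is a quick way to check that the file contains adapter weights rather than a full model checkpoint.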