What is Self-hosting?

Self-hosting means running an LLM on infrastructure you control: your laptop, your data centre, or a dedicated GPU you provision. osFoundry’s self-host runtime installs any of 76,000 open-weight models in one click.

Detail

Self-hosting an LLM gives you full control over weights, runtime, routing, and data flow. The tradeoff is that you (or your platform) take on the infrastructure operations: provisioning hardware, serving the model, and keeping it running. Common reasons to self-host are privacy, data residency, cost predictability at scale, and running models that aren’t available via API.

Self-hosting only works with open-weight models, because you need the weights themselves to run a model on your own hardware. Proprietary models (GPT-4, Claude) are offered only as hosted APIs.

How osFoundry approaches Self-hosting

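osFoundry removes most of the integration work that self-hosting usually involves: it ships a built-in inference server, one-click model install, and workspace-wide routing, with no llama.cpp setup required. Each model can run on local hardware, on our cloud, or on your own GPU server, and you pick the target per model.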
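
To make the per-model choice concrete, here is a minimal Python sketch of a routing table that maps each model to the place it runs. It is an illustration only and not osFoundry's actual API or configuration format: the `Target`, `Route`, `ROUTES`, and `resolve` names, the model names, and the URLs are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    """The three places a model can run (names are hypothetical)."""
    LOCAL = "local"          # the machine you are working on
    CLOUD = "cloud"          # GPU capacity hosted by the provider
    OWN_GPU = "own_gpu"      # a GPU server you operate yourself


@dataclass(frozen=True)
class Route:
    target: Target
    base_url: str  # where inference requests for this model are sent


# Hypothetical workspace-wide routing table: model name -> route.
ROUTES = {
    "small-chat-model": Route(Target.LOCAL, "http://localhost:8080"),
    "large-code-model": Route(Target.OWN_GPU, "https://gpu.internal.example"),
}

# Models without an explicit entry fall back to this route.
DEFAULT_ROUTE = Route(Target.CLOUD, "https://cloud.example")


def resolve(model: str) -> Route:
    """Return the route for a model, or the default if none is configured."""
    return ROUTES.get(model, DEFAULT_ROUTE)


if __name__ == "__main__":
    for name in ("small-chat-model", "large-code-model", "another-model"):
        route = resolve(name)
        print(f"{name}: {route.target.value} via {route.base_url}")
```

Keeping the table in one place means application code asks only for a model by name, so moving a model from local hardware to a GPU server is a one-line change to the table.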

Related terms

llm
quantization
vpc

Related features

self-host-llms
local-llm-inference
gpu-endpoint
byo-vpc