Question 1

Does osFoundry use Ollama or llama.cpp?

Accepted Answer

osFoundry runs its own inference server. From your perspective, the only step is "Install", after which the model is ready.

Question 2

How much RAM do I need?

Accepted Answer

It depends on model size and quantization. As rough figures at Q4: a 7B model needs ~6 GB of RAM, a 13B ~10 GB, and a 70B ~50 GB.
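To make those figures easier to apply, here is a minimal sketch that checks which of the quoted model sizes would fit in a given amount of RAM. The numbers are the approximate ones from the answer above. The function name `models_that_fit` and the `headroom_gb` reserve for the OS and other applications are assumptions for illustration, not part of osFoundry.

```python
# Approximate RAM needs quoted in the answer above, in GB.
APPROX_RAM_GB = {
    "7B Q4": 6,
    "13B": 10,
    "70B Q4": 50,
}


def models_that_fit(available_gb, headroom_gb=2.0):
    """Return the quoted model sizes that fit in available_gb.

    headroom_gb is a hypothetical reserve for the OS and other apps;
    adjust it for your own machine.
    """
    usable = available_gb - headroom_gb
    return [name for name, need in APPROX_RAM_GB.items() if need <= usable]


print(models_that_fit(16))  # ['7B Q4', '13B']
```

On a 16 GB machine with 2 GB held back, the 7B and 13B entries fit and the 70B does not. Treat the result as a rough guide, since actual usage also varies with context length and other settings.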

Question 3

Can I run multiple local models at once?

Accepted Answer

Yes. The server hot-loads models on demand and unloads idle ones to free memory.
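For intuition, the load-on-demand and unload-when-idle behaviour can be sketched as a small memory-budgeted pool. This is a toy illustration only, not osFoundry's actual implementation: the `ModelPool` class is hypothetical, and it uses least-recently-used eviction as a stand-in for whatever idle policy the real server applies.

```python
from collections import OrderedDict


class ModelPool:
    """Toy load-on-demand pool with LRU eviction (not osFoundry's code)."""

    def __init__(self, budget_gb):
        self.budget_gb = budget_gb
        # name -> size in GB, ordered from least to most recently used
        self.loaded = OrderedDict()

    def used_gb(self):
        return sum(self.loaded.values())

    def request(self, name, size_gb):
        """Ensure `name` is loaded, unloading idle models if memory is short."""
        if name in self.loaded:
            self.loaded.move_to_end(name)  # mark as most recently used
            return
        if size_gb > self.budget_gb:
            raise MemoryError(f"{name} needs {size_gb} GB, budget is {self.budget_gb} GB")
        while self.used_gb() + size_gb > self.budget_gb:
            self.loaded.popitem(last=False)  # unload the least recently used model
        self.loaded[name] = size_gb


pool = ModelPool(budget_gb=16)
pool.request("7b", 6)
pool.request("13b", 10)      # both fit: 16 GB in use
pool.request("7b", 6)        # already loaded, only marked as recent
pool.request("other-7b", 6)  # no room left, so "13b" is unloaded first
print(list(pool.loaded))     # ['7b', 'other-7b']
```

The practical takeaway is that you can switch between several models freely, but only as many stay resident as your memory allows. A model that was unloaded has to be loaded again the next time you use it.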

Question 4

Is local inference billed?

Accepted Answer

No. Local inference runs on your own hardware and is free.