Question 1

Does osFoundry use Ollama or llama.cpp?

Accepted Answer

osFoundry runs its own inference server. From your point of view, you just click "Install" and the model is ready.

Question 2

How much RAM do I need?

Accepted Answer

A Q4 7B model needs ~6 GB. A 13B model needs ~10 GB. A 70B Q4 model needs ~50 GB.
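As a rough way to see where figures like these come from, here is a back-of-the-envelope sketch. It assumes 4 bits per weight for Q4 and a flat 1.5x overhead factor for the KV cache and runtime buffers. Both values, and the function name `estimate_ram_gb`, are illustrative assumptions chosen to land near the numbers above; they are not figures published by osFoundry.

```python
def estimate_ram_gb(
    params_billion: float,
    bits_per_weight: float = 4.0,   # assumption: Q4 treated as exactly 4 bits per weight
    overhead_factor: float = 1.5,   # assumption: headroom for KV cache and runtime buffers
) -> float:
    """Rough RAM estimate in GB for a quantized model (hypothetical helper)."""
    # 1e9 parameters * (bits / 8) bytes per parameter = size in decimal GB
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb * overhead_factor


if __name__ == "__main__":
    for size in (7, 13, 70):
        print(f"{size}B at Q4: about {estimate_ram_gb(size):.1f} GB")
```

Real usage also depends on context length and the exact quantization variant, so treat the result as a starting point rather than a guarantee.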

Question 3

Can I run several local models at once?

Accepted Answer

Yes. The server hot-loads models on demand and unloads idle models to free memory.
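The load-on-demand, unload-when-idle behavior can be pictured with a minimal sketch like the one below. This is not osFoundry's server code: `ModelPool`, `loader`, `idle_seconds`, and `evict_idle` are hypothetical names, and the idle-timeout policy is an assumption used only to illustrate the idea.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


class ModelPool:
    """Hypothetical pool that loads models on first use and drops idle ones."""

    def __init__(
        self,
        loader: Callable[[str], Any],
        idle_seconds: float = 300.0,          # assumption: idle timeout policy
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._idle_seconds = idle_seconds
        self._clock = clock
        # model name -> (model object, time of last use)
        self._models: Dict[str, Tuple[Any, float]] = {}

    def get(self, name: str) -> Any:
        """Return the model, loading it first if it is not in memory."""
        if name in self._models:
            model = self._models[name][0]
        else:
            model = self._loader(name)
        # Record this access so the model counts as recently used.
        self._models[name] = (model, self._clock())
        return model

    def evict_idle(self) -> List[str]:
        """Unload models unused for at least idle_seconds; return their names."""
        now = self._clock()
        idle = [
            name
            for name, (_, last_used) in self._models.items()
            if now - last_used >= self._idle_seconds
        ]
        for name in idle:
            del self._models[name]
        return idle

    def loaded(self) -> List[str]:
        """Names of models currently held in memory, sorted."""
        return sorted(self._models)
```

In this sketch a caller would invoke `pool.get("some-model")` for each request and run `pool.evict_idle()` periodically. A real server might evict under memory pressure instead of on a timer.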

Question 4

Is local inference billed?

Accepted Answer

No. Local inference runs on your own hardware and is free.