Name: Qwen 2.5 72B
Author: Alibaba

Question 1

Is Qwen 2.5 72B free to use?

Accepted Answer

Qwen 2.5 72B is free to run locally on your own hardware. Hosted access through osFoundry is metered at $0.50 per 1M input tokens and $0.70 per 1M output tokens. You can switch between local and hosted at any time.
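As a rough illustration of the hosted pricing, the sketch below estimates a bill from token counts. The rates come from the answer above; the function name and the sample token counts are hypothetical, and actual billing (rounding, minimums, taxes) may differ.

```python
# Hosted rates quoted in the answer above, in USD per 1M tokens.
INPUT_USD_PER_MILLION = 0.50
OUTPUT_USD_PER_MILLION = 0.70


def hosted_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate hosted cost in USD for a given number of tokens.

    Hypothetical helper: it only multiplies token counts by the quoted
    per-million rates and ignores any rounding or minimum charges.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload (made-up numbers): 2M input tokens, 500K output tokens.
    print(f"${hosted_cost_usd(2_000_000, 500_000):.2f}")
```

Running locally costs nothing per token, so the same workload is only worth hosting if the hardware or convenience trade-off justifies the metered price.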

Question 2

Can I use Qwen 2.5 72B commercially?

Accepted Answer

Yes. The license permits commercial use with attribution, subject to some restrictions on misuse. Derivatives must carry a "Built with Qwen" attribution.

Question 3

What is the context window of Qwen 2.5 72B?

Accepted Answer

Qwen 2.5 72B supports a 131K token context window.
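The context window is shared between the prompt and the generated reply, so a long prompt leaves less room for output. A minimal sketch of that budget check follows. It assumes "131K" means 131,072 tokens, and the helper name is hypothetical. Token counts must come from the model's own tokenizer, which is not shown here.

```python
# Assumption: "131K" is read as 131,072 tokens (128 * 1024).
CONTEXT_WINDOW_TOKENS = 131_072


def remaining_output_budget(prompt_tokens: int,
                            context_window: int = CONTEXT_WINDOW_TOKENS) -> int:
    """Return how many tokens are left for the reply after the prompt.

    Hypothetical helper: returns 0 when the prompt alone already fills
    or exceeds the window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


if __name__ == "__main__":
    # A 100,000-token prompt leaves the rest of the window for output.
    print(remaining_output_budget(100_000))
```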

Question 4

How much VRAM does Qwen 2.5 72B need?

Accepted Answer

Approximately 44 GB at Q4 quantisation, or 173 GB at full FP16 precision. The Q4 build fits on a single 80 GB A100 or H100; FP16 exceeds a single 80 GB card and must be split across multiple GPUs.
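These figures are close to a common rule of thumb: parameter count times bytes per parameter, plus some headroom for activations and the KV cache. The sketch below applies that rule with an assumed 20% overhead and a round 72 billion parameters. Both values are assumptions chosen for illustration, not a confirmed description of how the numbers above were derived, and real usage varies with context length and runtime.

```python
# Approximate storage cost per parameter for each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "q8": 1.0, "q4": 0.5}


def estimate_vram_gb(params_billion: float, precision: str,
                     overhead: float = 1.2) -> float:
    """Rule-of-thumb VRAM estimate in GB (1 GB = 1e9 bytes).

    overhead=1.2 is an assumed 20% allowance for activations and
    KV cache; it is not a measured value.
    """
    return params_billion * BYTES_PER_PARAM[precision] * overhead


def fits_on_gpu(params_billion: float, precision: str,
                gpu_vram_gb: float) -> bool:
    """Check whether the estimate fits within a single GPU's memory."""
    return estimate_vram_gb(params_billion, precision) <= gpu_vram_gb


if __name__ == "__main__":
    for precision in ("q4", "fp16"):
        needed = estimate_vram_gb(72, precision)
        verdict = "fits" if fits_on_gpu(72, precision, 80) else "does not fit"
        print(f"{precision}: ~{needed:.1f} GB, {verdict} on one 80 GB GPU")
```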

Question 5

Can I run Qwen 2.5 72B locally?

Accepted Answer

Yes. Qwen 2.5 72B is an open-weights model and runs locally on a workstation GPU with enough VRAM (approximately 44 GB at Q4 quantisation). osFoundry's local runtime handles model loading, quantisation, and routing.

Question 6

What is Qwen 2.5 72B best at?

Accepted Answer

Qwen 2.5 72B is well suited to general chat and Q&A, code generation and review, and mathematical reasoning.

Question 7

How do I use Qwen 2.5 72B in osFoundry?

Accepted Answer

Paste your Alibaba API key into the key dialog, or deploy the open weights to self-host. Then assign Qwen 2.5 72B to a Maestro role in the Pipeline tab and use it in chat, in Room Apps via invokeAI, or in your own apps.