Name: Llama 3.3 Nemotron Super 49B V1.5
Author: NVIDIA

Question 1

How much does Llama 3.3 Nemotron Super 49B V1.5 cost?

Accepted Answer

Llama 3.3 Nemotron Super 49B V1.5 is metered at $0.10 per 1M input tokens and $0.40 per 1M output tokens. Bring your own NVIDIA API key; osFoundry passes through provider pricing without markup.
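
As a rough illustration of how the per-token rates quoted above add up, here is a minimal Python sketch. The function name `estimate_cost_usd` and the example token counts are hypothetical, and the rates are hard-coded from this answer, so check current provider pricing before relying on the result.

```python
# Rates copied from the answer above (USD per 1M tokens); they may change.
INPUT_USD_PER_MILLION = 0.10
OUTPUT_USD_PER_MILLION = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the provider charge for a given number of tokens.

    Hypothetical helper for illustration only; not part of osFoundry
    or the NVIDIA API.
    """
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload: 2M input tokens and 500K output tokens.
    print(f"${estimate_cost_usd(2_000_000, 500_000):.2f}")
```

Because output tokens cost four times as much as input tokens at these rates, workloads that generate long responses are dominated by the output side of the bill.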

Question 2

Can I use Llama 3.3 Nemotron Super 49B V1.5 commercially?

Accepted Answer

Commercial use is allowed with conditions. This is a hosted-only model: no weights are distributed, you bring your own provider key, and usage is governed by the provider's API terms.

Question 3

What is the context window of Llama 3.3 Nemotron Super 49B V1.5?

Accepted Answer

Llama 3.3 Nemotron Super 49B V1.5 supports a 131K-token context window.
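
To check whether a request is likely to fit, a simple budget test can help. The sketch below makes two assumptions not stated in this answer: that "131K" means roughly 131,000 tokens (the exact limit may differ), and that the prompt and the generated output share the same window. The name `fits_in_context` is hypothetical.

```python
# Assumption: "131K" is read as 131,000 tokens; the exact limit may differ.
CONTEXT_WINDOW_TOKENS = 131_000


def fits_in_context(
    prompt_tokens: int,
    max_output_tokens: int,
    window: int = CONTEXT_WINDOW_TOKENS,
) -> bool:
    """Return True if the prompt plus the reserved output budget fits.

    Assumes input and output tokens count against one shared window.
    """
    return prompt_tokens + max_output_tokens <= window


if __name__ == "__main__":
    print(fits_in_context(120_000, 4_000))   # True: 124,000 <= 131,000
    print(fits_in_context(130_000, 4_000))   # False: 134,000 > 131,000
```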

Question 4

Can I run Llama 3.3 Nemotron Super 49B V1.5 locally?

Accepted Answer

Not this version. Llama 3.3 Nemotron Super 49B V1.5 as listed here is hosted only and accessed via the NVIDIA API. An open-weights equivalent is available to self-host; see the cross-link above.

Question 5

What is Llama 3.3 Nemotron Super 49B V1.5 best at?

Accepted Answer

Llama 3.3 Nemotron Super 49B V1.5 is well-suited to low-latency chat, request routing and triage, and text classification.

Question 6

How do I use Llama 3.3 Nemotron Super 49B V1.5 in osFoundry?

Accepted Answer

Paste your NVIDIA API key in the key dialog (or deploy the open weights for self-hostable models), assign Llama 3.3 Nemotron Super 49B V1.5 to a Maestro role in the Pipeline tab, then use it in chat, Room Apps via invokeAI, or your own apps.