Name: NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16
Author: RedHatAI

Question 1

Is NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16 free to use?

Accepted Answer

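NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16 is free to use when you run it locally on your own hardware. Hosted access through osFoundry is billed by usage (listed rates: input Free (local), output Free (local)). You can switch between local and hosted use at any time.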

Question 2

Can I use NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16 commercially?

Accepted Answer

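Commercial use is conditionally permitted. The license terms are not specified here, so check the upstream model card and documentation before any commercial use.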

Question 3

How much VRAM does NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16 need?

Accepted Answer

About 6 GB with Q4 quantization, and about 22 GB at full FP16 precision. It runs on a single 24 GB consumer GPU.
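As a rough sanity check on these figures, the sketch below estimates memory from parameter count and bits per weight. The function names, the 9 billion parameter count (read off the "9B" in the model name), and the 20% overhead for activations and KV cache are all assumptions made for illustration, not values published for this model.

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: float,
                     overhead_fraction: float = 0.2) -> float:
    """Back-of-the-envelope VRAM estimate in GB.

    overhead_fraction is an assumed allowance for activations, KV cache,
    and runtime buffers; real usage depends on context length and runtime.
    """
    weight_gb = params_billion * bits_per_weight / 8  # 1e9 params * bytes per weight / 1e9
    return weight_gb * (1 + overhead_fraction)


def fits_on_gpu(required_gb: float, gpu_gb: float) -> bool:
    """True if the estimate fits within the GPU's memory."""
    return required_gb <= gpu_gb


q4 = estimate_vram_gb(9, 4)     # 4-bit weights
fp16 = estimate_vram_gb(9, 16)  # 16-bit weights
print(round(q4, 1), round(fp16, 1), fits_on_gpu(fp16, 24))
```

Under these assumptions the FP16 estimate lands close to the 22 GB quoted above and fits on a 24 GB card. The 4-bit estimate comes out somewhat below 6 GB, which is plausible because quantized checkpoints also store scale metadata and may keep some layers in higher precision.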

Question 4

Can I run NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16 locally?

Accepted Answer

Yes. NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16 is an open-weight model and can run locally on a workstation GPU. osFoundry's local runtime handles model loading, quantization, and routing.

Question 5

What is NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16 best at?

Accepted Answer

NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16 is well suited to text generation.

Question 6

How do I use NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16 in osFoundry?

Accepted Answer

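Paste your RedHatAI API key into the key dialog (or, for a self-hostable open-weight model, deploy its weights), assign NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16 to a Maestro role in the Pipeline tab, and then use it in chat, in a Room App via invokeAI, or in your own application.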