Name: Llama-3-8B-Instruct-262k-GGUF-smashed
Author: PrunaAI

Question 1

Llama-3-8B-Instruct-262k-GGUF-smashed 可以免费使用吗？

Accepted Answer

Llama-3-8B-Instruct-262k-GGUF-smashed 在您自己的硬件上本地运行时可免费使用。通过 osFoundry 的托管访问按用量计费（输入 Free (local)，输出 Free (local)）。您可随时在本地与托管方式之间切换。

Question 2

我可以将 Llama-3-8B-Instruct-262k-GGUF-smashed 用于商业用途吗？

Accepted Answer

允许有条件的商业使用。 许可证条款未指定——商业使用前请核对上游模型卡。 请查阅上游文档。

Question 3

Llama-3-8B-Instruct-262k-GGUF-smashed 需要多少 VRAM？

Accepted Answer

Q4 量化下约 5 GB，FP16 全精度下约 20 GB。可在单张 24GB 消费级 GPU 上运行。

Question 4

我可以在本地运行 Llama-3-8B-Instruct-262k-GGUF-smashed 吗？

Accepted Answer

可以。Llama-3-8B-Instruct-262k-GGUF-smashed 为开源权重模型，可在工作站 GPU 上本地运行。osFoundry 的本地运行时负责模型加载、量化与路由。

Question 5

Llama-3-8B-Instruct-262k-GGUF-smashed 最擅长什么？

Accepted Answer

Llama-3-8B-Instruct-262k-GGUF-smashed 非常适合低延迟对话与路由, 请求路由与分诊, 文本分类。

Question 6

如何在 osFoundry 中使用 Llama-3-8B-Instruct-262k-GGUF-smashed？

Accepted Answer

在密钥对话框中粘贴您的 PrunaAI API 密钥（若为可自托管的开源权重模型，则部署其权重），在 Pipeline 标签中将 Llama-3-8B-Instruct-262k-GGUF-smashed 分配给某个 Maestro 角色，然后即可在对话、通过 invokeAI 的 Room App 或您自己的应用中使用。