Mistral Small 3
Mistral Small 3 (Mistral AI, 2025) is a compact, open-weights chat model with 24 billion parameters. It is tuned for low-latency chat and tool use, and ships under a permissive licence that allows commercial deployment.
by Mistral AI · 24B parameters · 32K token context window
Best for
- low-latency chat and routing
- tool calling and function use
- edge deployment on consumer GPUs
Ways to use Mistral Small 3 in osFoundry
Connect with your own key (BYOK)
Open the key dialog and paste your Mistral AI API key. osFoundry discovers Mistral Small 3 automatically — assign it to a Maestro role (router, direct, orchestrator, or fallback) in the Pipeline tab and it is live in every chat. Your key, your provider account — no token markup.
Deploy a dedicated endpoint
Mistral Small 3 is open-weights — run it locally for free, or deploy a dedicated GPU endpoint in your workspace for reserved capacity with no rate limits.
Use it in a Room App
Room Apps declare AI features in their manifest, then call them with invokeAI:
import { invokeAI } from '@osfoundry/app-sdk'
// 'summarize' is an AI feature declared in your app manifest.
const result = await invokeAI('summarize', userText)
Call it from your own apps
Once a model is wired into your workspace you can host it as an API and reach it from your own services, scripts, or CI — outside osFoundry.
What hardware can run Mistral Small 3
At Q4 quantisation, Mistral Small 3 needs about 15 GB of VRAM including KV-cache headroom, so it fits on a single consumer GPU: a 24GB card leaves room to spare, while a 16GB card is a tight fit. At FP16 precision it needs about 58 GB and fits on a single H100 80GB.
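The VRAM figures above can be sanity-checked with simple arithmetic. The sketch below is a rough estimate, not osFoundry's sizing method: it assumes exactly 4 bits per weight for Q4 and 16 bits for FP16, counts weights only, and uses a hypothetical helper name (`weightMemoryGB`).

```typescript
// Rough weight-memory estimate: parameters x bits per weight / 8 bits per byte.
// Counts model weights only; KV cache and runtime buffers are extra.
function weightMemoryGB(paramCount: number, bitsPerWeight: number): number {
  const bytes = (paramCount * bitsPerWeight) / 8;
  return bytes / 1e9; // decimal gigabytes
}

const PARAMS = 24e9; // 24B parameters, as stated for Mistral Small 3

// Assumption: Q4 stores exactly 4 bits per weight (real formats add some metadata).
const q4WeightsGB = weightMemoryGB(PARAMS, 4);
const fp16WeightsGB = weightMemoryGB(PARAMS, 16);

console.log(q4WeightsGB, fp16WeightsGB); // prints: 12 48
```

The gap between these weight-only numbers (12 GB and 48 GB) and the quoted totals (about 15 GB and 58 GB) is what remains for the KV cache and runtime overhead.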
Licence
Apache 2.0: commercial use allowed. The licence permits commercial use, modification, and distribution, and includes a royalty-free patent grant.
Attribution required (preserve copyright + licence notices).
Frequently asked about Mistral Small 3
Is Mistral Small 3 free to use?
Mistral Small 3 is free to run locally on your own hardware. Hosted access through osFoundry is metered at $0.10 per 1M input tokens and $0.30 per 1M output tokens. You can switch between local and hosted at any time.
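To budget for hosted usage, the per-token rates quoted above can be turned into a quick estimate. This is a minimal sketch: the function and constant names are hypothetical, and it assumes billing is strictly linear in token count with no minimums or rounding.

```typescript
// Hosted rates quoted on this page, in US dollars per 1M tokens.
const INPUT_USD_PER_MILLION = 0.10;
const OUTPUT_USD_PER_MILLION = 0.30;

// Assumption: cost scales linearly with tokens; no minimum charge or rounding.
function estimateHostedCostUSD(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1e6) * INPUT_USD_PER_MILLION;
  const outputCost = (outputTokens / 1e6) * OUTPUT_USD_PER_MILLION;
  return inputCost + outputCost;
}

// Example: 2M input tokens and 500K output tokens.
const cost = estimateHostedCostUSD(2_000_000, 500_000);
console.log(cost.toFixed(2)); // prints: 0.35
```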
Can I use Mistral Small 3 commercially?
Yes. The Apache 2.0 licence permits commercial use, modification, and distribution, and includes a royalty-free patent grant. Attribution is required: preserve the copyright and licence notices.
What is the context window of Mistral Small 3?
Mistral Small 3 supports a 32K token context window.
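In practice the window has to hold both the prompt and the tokens reserved for the reply. The check below is a sketch under two assumptions: that "32K" means 32,768 tokens, and that you already have a token count from your tokenizer. The names are hypothetical.

```typescript
// Assumption: "32K" means 32,768 tokens; adjust if your deployment reports otherwise.
const CONTEXT_WINDOW_TOKENS = 32768;

// Returns true when the prompt plus the reserved reply budget fits in the window.
function fitsContextWindow(
  promptTokens: number,
  reservedOutputTokens: number,
  windowTokens: number = CONTEXT_WINDOW_TOKENS,
): boolean {
  return promptTokens + reservedOutputTokens <= windowTokens;
}

console.log(fitsContextWindow(30000, 2000)); // true: 32,000 fits within 32,768
console.log(fitsContextWindow(31000, 2000)); // false: 33,000 exceeds 32,768
```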
How much VRAM does Mistral Small 3 need?
Approximately 15 GB at Q4 quantisation, or 58 GB at FP16 precision. The Q4 build fits on a single 24GB consumer GPU.
Can I run Mistral Small 3 locally?
Yes. Mistral Small 3 is open-weights and runs locally on a workstation GPU. osFoundry's local runtime handles model loading, quantisation, and routing.
What is Mistral Small 3 best at?
Mistral Small 3 is well-suited to low-latency chat and routing, tool calling and function use, and edge deployment on consumer GPUs.
How do I use Mistral Small 3 in osFoundry?
Paste your Mistral AI API key in the key dialog, or deploy the open weights yourself. Then assign Mistral Small 3 to a Maestro role in the Pipeline tab and use it in chat, in Room Apps via invokeAI, or from your own apps.
Published by Mistral AI on January 30, 2025.