GPT-4o Audio
OpenAI's GPT-4o Audio is a speech-and-audio model. The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs...
by OpenAI · 128K token context window
Best for
- speech-to-text transcription
- meeting and audio transcription
Ways to use GPT-4o Audio in osFoundry
Connect with your own key (BYOK)
Open the key dialog and paste your OpenAI API key. osFoundry discovers GPT-4o Audio automatically — assign it to a Maestro role (router, direct, orchestrator, or fallback) in the Pipeline tab and it is live in every chat. Your key, your provider account — no token markup.
Use it in a Room App
Room Apps declare AI features in their manifest, then call them with invokeAI:
import { invokeAI } from '@osfoundry/app-sdk'
// 'summarize' is an AI feature declared in your app manifest.
const result = await invokeAI('summarize', userText)
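A Room App will usually want to guard this call against blank input and failed requests. The sketch below is one way to do that, not the SDK's documented pattern. It assumes invokeAI takes a feature name and a string and returns a promise, as the snippet above suggests. The `InvokeAI` type and the `invokeWithFallback` helper are hypothetical names, and the function is passed in as a parameter so the sketch stays self-contained.

```typescript
// Assumed shape of invokeAI, inferred from the snippet above.
// The real SDK types may differ.
type InvokeAI = (feature: string, input: string) => Promise<unknown>;

// Hypothetical helper: skips the call for blank input and returns a
// fallback value instead of throwing when the invocation fails.
async function invokeWithFallback(
  invoke: InvokeAI,
  feature: string,
  input: string,
  fallback: string,
): Promise<unknown> {
  if (input.trim() === "") {
    return fallback;
  }
  try {
    return await invoke(feature, input);
  } catch (err) {
    console.warn(`AI feature "${feature}" failed:`, err);
    return fallback;
  }
}
```

In an app this could be called as invokeWithFallback(invokeAI, 'summarize', userText, ''), so the UI always receives a value to render.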
Call it from your own apps
Once a model is wired into your workspace you can host it as an API and reach it from your own services, scripts, or CI — outside osFoundry.
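How the hosted endpoint is exposed is not specified here, so the sketch below covers only the request body. It assumes the endpoint accepts an OpenAI-style Chat Completions payload, in which audio is sent as a base64-encoded input_audio content part. `buildAudioRequest` is a hypothetical helper name, and the URL and authentication are left to your deployment.

```typescript
type AudioFormat = "wav" | "mp3";

// Request body in the OpenAI Chat Completions style (an assumption about
// what the hosted endpoint accepts).
interface AudioChatRequest {
  model: string;
  modalities: string[];
  messages: Array<{
    role: "user";
    content: Array<
      | { type: "text"; text: string }
      | { type: "input_audio"; input_audio: { data: string; format: AudioFormat } }
    >;
  }>;
}

// Hypothetical helper: pairs a text instruction with one audio clip.
// `base64Audio` must already be base64-encoded audio bytes.
function buildAudioRequest(
  base64Audio: string,
  format: AudioFormat,
  instruction: string,
): AudioChatRequest {
  return {
    model: "gpt-4o-audio-preview",
    modalities: ["text"], // ask for a text reply, e.g. a transcript
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: instruction },
          { type: "input_audio", input_audio: { data: base64Audio, format } },
        ],
      },
    ],
  };
}
```

The returned object can be serialized with JSON.stringify and sent as the body of a POST request from a service, script, or CI job.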
Licence
Hosted-only model: no weights are distributed, and usage is governed by the provider's API terms. Bring your own provider key.
Frequently asked about GPT-4o Audio
How much does GPT-4o Audio cost?
GPT-4o Audio is metered at $2.50 per 1M input tokens and $10.00 per 1M output tokens. Bring your own OpenAI API key — osFoundry passes through provider pricing without markup.
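As a rough illustration of how these rates translate into a bill, the sketch below estimates the cost of a request from its token counts. It assumes the two quoted rates apply to every token in the request. `estimateCostUSD` and the constant names are hypothetical.

```typescript
// Rates quoted above, in USD per one million tokens.
const INPUT_USD_PER_MILLION = 2.5;
const OUTPUT_USD_PER_MILLION = 10.0;

// Hypothetical helper: estimated cost in USD for the given token counts.
function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("Token counts must be non-negative");
  }
  const inputCost = (inputTokens * INPUT_USD_PER_MILLION) / 1_000_000;
  const outputCost = (outputTokens * OUTPUT_USD_PER_MILLION) / 1_000_000;
  return inputCost + outputCost;
}

// 200,000 input tokens and 50,000 output tokens:
// (200,000 * 2.50 + 50,000 * 10.00) / 1,000,000 = 1.00 USD
console.log(estimateCostUSD(200_000, 50_000).toFixed(2)); // prints "1.00"
```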
Can I use GPT-4o Audio commercially?
Commercial use is allowed with conditions. GPT-4o Audio is a hosted-only model with no weights distributed, so usage is governed by the provider's API terms. Bring your own provider key.
What is the context window of GPT-4o Audio?
GPT-4o Audio supports a 128K token context window.
Can I run GPT-4o Audio locally?
No — GPT-4o Audio is hosted only and accessed via the OpenAI API.
What is GPT-4o Audio best at?
GPT-4o Audio is well suited to speech-to-text transcription and to meeting and audio transcription.
How do I use GPT-4o Audio in osFoundry?
Paste your OpenAI API key in the key dialog, assign GPT-4o Audio to a Maestro role in the Pipeline tab, then use it in chat, in Room Apps via invokeAI, or from your own apps. GPT-4o Audio is hosted only, so there are no open weights to deploy.
Published by OpenAI on August 15, 2025. Source: https://openrouter.ai/openai/gpt-4o-audio-preview