OpenAI releases gpt-realtime and updates API for voice applications

Community-Team · August 29, 2025, 9:08pm

OpenAI launched “gpt-realtime,” a new speech-to-speech model that processes audio directly in a single model rather than chaining multiple models together. It scores 82.8 percent accuracy on the Big Bench Audio benchmark, versus 65.6 percent for the previous version. The model also shows significant improvements in instruction following and function-calling accuracy, and it better understands non-verbal cues and language switching. OpenAI also made its Realtime API generally available, with new features including remote MCP server support, image inputs, and phone calling. Together, these releases let developers build production-ready voice agents that sound more human and handle complex tasks more reliably in fields such as customer support, personal assistance, and education. The new model costs $32 per 1 million audio input tokens and $64 per 1 million audio output tokens, a 20 percent reduction from earlier pricing. (OpenAI)
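To make the pricing concrete, here is a minimal sketch of the arithmetic, using only the per-token prices quoted above. The function names and the sample workload are hypothetical, chosen for illustration. The earlier price is backed out from the stated 20 percent reduction rather than taken from a price list, so treat it as an implied figure.

```python
# Prices quoted in the post, in US dollars per 1 million audio tokens.
INPUT_PRICE_PER_MILLION = 32.0
OUTPUT_PRICE_PER_MILLION = 64.0


def audio_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated audio cost in US dollars for one workload."""
    return (
        input_tokens * INPUT_PRICE_PER_MILLION
        + output_tokens * OUTPUT_PRICE_PER_MILLION
    ) / 1_000_000


def price_before_cut(current_price: float, reduction: float = 0.20) -> float:
    """Back out the earlier price implied by a fractional reduction."""
    return current_price / (1.0 - reduction)


if __name__ == "__main__":
    # Hypothetical workload: 250,000 input and 100,000 output audio tokens.
    print(f"Workload cost: ${audio_cost(250_000, 100_000):.2f}")
    print(f"Implied earlier input price: ${price_before_cut(INPUT_PRICE_PER_MILLION):.2f}")
    print(f"Implied earlier output price: ${price_before_cut(OUTPUT_PRICE_PER_MILLION):.2f}")
```

This covers audio tokens only. Any other token types an application uses would be billed separately and are not modeled here.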
