Voxtral by Mistral AI is a new family of open-source speech understanding models. Available in 24B and 3B sizes, it goes beyond transcription to offer Q&A, summarization, and function calling directly from voice with SOTA performance.
Replies
Hi everyone!
Mistral AI's new Voxtral models are a big step for open-source speech AI. They're built to go beyond simple transcription and into true understanding.
This means the AI can answer questions about audio, summarize conversations, and even trigger functions directly from voice commands.
It's great that they've released both a powerful 24B model and a smaller 3B version for on-device use, both under an Apache 2.0 license. This makes high-quality speech understanding much more accessible. For access, you can run it locally, use their API, or try it in Le Chat's voice mode, which is rolling out in the coming weeks.
Mozart AI
Congrats on the launch! This looks like a major leap for open-source voice interfaces. Excited to see function calling and summarization directly from raw audio!
PopPop AI Vocal Remover
Congrats! This will be helpful for many individual developers.
BestPage.ai
Open models with a permissive license? That's a game changer for devs who hate jumping through hoops, tbh. Really loving this direction, props to the team!
Linkinize
Love seeing powerful open-source speech models pushing the boundaries like this. Q&A and function calling directly from voice is a huge leap. Excited to try it out when I get a chance. Congrats on the launch 🚀
Mistral.ai's new models and Voxtral speech-understanding tools are super impressive. Open-source and enterprise-ready is a rare combo.