Qwen2.5-Omni

The End-to-End Model Powering Multimodal Chat

5.0 • 1 review

124 followers

Qwen2.5-Omni is an end-to-end multimodal model from the Qwen team at Alibaba Cloud. It understands text, images, audio, and video, and generates text and natural streaming speech.
Company Info
github.com/QwenLM/Qwen2.5-Omni
Launched in 2025
© 2025 Product Hunt
Free
Launch tags: Open Source • Artificial Intelligence • GitHub
Launch Team
Zac Zuo, chen cheng, Binyuan Hui

Zac Zuo (Hunter)

Hi everyone!

You can now use Voice and Video Chat directly in Qwen Chat! Powering these new multimodal interactions is Qwen's latest open-source model: Qwen2.5-Omni.

This "omni" model is a single system that understands text, audio, images, and video, while outputting both text and natural-sounding audio.

Key aspects:

🔄 End-to-End Multimodal: A single "Thinker-Talker" architecture designed for seamless input/output across modalities.
💬 Real-Time Interaction: Built for streaming, enabling smooth voice and video chat experiences.
🗣️ Natural Speech Output: Claims strong performance in speech generation quality.
💪 Strong Across Modalities: Performs well on benchmarks for vision, audio, and text tasks.
🔓 Openly Available with Apache 2.0 license: Released on Hugging Face, ModelScope, and GitHub, with API access via DashScope.

The Qwen team believes this type of omni model is key for the future of AI agents. While this is still just the 7B version, it's impressive to see this level of multimodality in an open model.

Head over to Qwen Chat, toggle the new voice & video chat button, and experience it!

5mo ago
Ruslan O
It's a very interesting AI! Congratulations on the launch! My three favorite AIs are Qwen, DeepSeek, and ChatGPT, and two of the three are free. I wish you continued growth!
4mo ago
Shushant Lakhyani
Flex-Worthy Templates

Alibaba is shipping fast

5mo ago

Reviews
5.0, based on 1 review
Fitz₿🍷 • 5 reviews
Qwen2.5-Omni seems like a powerful multimodal tool! Its ability to handle both text and multimedia inputs, while generating natural speech, could be a huge asset in various AI applications.
4mo ago