SmolVLA is a compact (450M) open-source Vision-Language-Action model for robotics. Trained on community data, it runs on consumer hardware & outperforms larger models. Released with code & recipes.
Hi everyone!
I think there are a few really important ingredients for bringing AI agents into the physical world. First, they need to be able to interact with real environments. Second, due to the limits of on-robot hardware, the models need to be lightweight and efficient. And third, for the good of the community and wider adoption, these foundational models should ideally be open-source.
SmolVLA is an exciting new release because it squarely addresses these points. It's a compact, 450M-parameter Vision-Language-Action (VLA) model that runs on consumer-grade hardware, is fully open-source, and was trained entirely on open, community-contributed robotics datasets from the LeRobot project.
Despite its small size, SmolVLA outperforms much larger VLAs on both simulation and real-world tasks. The team has also implemented features such as asynchronous inference to make it even more responsive. This is a fantastic contribution toward making capable, real-world robotics research more accessible to everyone.
Hugging Face is doing incredible work! Their open-source model hub—packed with thousands of pre-trained models—makes it a breeze to dive into NLP, vision, and generative AI. I love how the community and APIs make complex AI feel so accessible and fun. Huge kudos to the team for building such a welcoming ecosystem!
SmolVLA is a great example of efficient design meeting real-world usability — compact, open-source, and high-performing. Love that it’s accessible to the broader robotics community right out of the box.
Hugging Face's 9th launch with SmolVLA? 🤖🔥 Running powerful robotics VLA (Vision-Language-Action) on consumer hardware is a breakthrough! Must be using heavy quantization techniques or distilled multi-modal models to achieve this. The "View more ->" tease suggests edge-compute optimizations - possibly ROS 2 integration? Game-changer for indie robotics devs!
This is incredibly cool.
Great to see this. Right application of AI. Way to go!
Interesting mission! How does open science translate into practical tools for users? 🤔
Love the mission of making AI more accessible! How are you planning to balance open-source values with sustainability? 🚀