Qdrant Cloud Inference lets you generate embeddings for text, image, and sparse data directly inside your managed Qdrant cluster. Better latency, lower egress costs, simpler architecture, and no external APIs required.
We built FastEmbed to make local embedding fast and easy. But as customers moved to production, they needed something more scalable and integrated.
Qdrant Cloud Inference brings that next step: embedding models that run inside your cluster, with no external services. It supports dense, sparse, and image data, and runs on AWS, Azure, and GCP (US only for now).
You can embed, store, and search with a single API call. Lower latency, fewer moving parts, and reduced egress costs.
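To make the "single API call" idea concrete, here is a minimal sketch of the request bodies such a call might carry. The payload shapes, endpoint paths in the docstrings, and the model name are assumptions for illustration, not a definitive spec of the Cloud Inference API: the key point is that each point or query carries raw text plus a model name, and the cluster produces the embedding server-side.

```python
import json

# Illustrative model name; your cluster's available models may differ.
MODEL = "sentence-transformers/all-MiniLM-L6-v2"

def build_upsert_payload(doc_id: int, text: str, model: str) -> dict:
    """Sketch of a body for PUT /collections/<name>/points.

    Instead of a precomputed vector, the point carries the raw text
    and a model name; the cluster embeds it at write time.
    """
    return {
        "points": [
            {
                "id": doc_id,
                "vector": {"text": text, "model": model},
                "payload": {"text": text},
            }
        ]
    }

def build_query_payload(text: str, model: str, limit: int = 5) -> dict:
    """Sketch of a body for POST /collections/<name>/points/query.

    The query text is likewise embedded inside the cluster, so one
    call covers both embedding and search.
    """
    return {
        "query": {"text": text, "model": model},
        "limit": limit,
    }

upsert = build_upsert_payload(1, "Qdrant runs inference in-cluster", MODEL)
query = build_query_payload("where do embeddings run?", MODEL)
print(json.dumps(upsert, indent=2))
print(json.dumps(query, indent=2))
```

Because no vector ever leaves or enters the cluster over the wire, this is also where the latency and egress savings come from: the only payloads in transit are the short text strings above.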
Replies
Amazing work Qdrant team and congratulations on the launch @daniel_azoulai2, @andre_zayarni and @generall. This will make managing embeddings and working with existing Qdrant setups much easier.
Amazing! I'm excited to see how you use this and how it helps your projects. Please let us know!
The best one yet! PrompX uses Qdrant for hybrid search and embedding.
Thanks Qdrant.
Really cool addition to your product!! Will def try it out!
@while always great to hear from you!
Single API call for embed, store, and search could simplify a lot of data pipelines. Are there plans to expand beyond US regions for the multi-cloud support? @daniel_azoulai2
@rachitmagon thank you for the comment! We plan to expand to regions beyond the US. We don't have a date set yet, though.
@daniel_azoulai2 Sounds good, thanks for the info.