Daniel Azoulai

Qdrant Cloud Inference - Unify embeddings and vector search across modalities

Qdrant Cloud Inference lets you generate embeddings for text, image, and sparse data directly inside your managed Qdrant cluster. Better latency, lower egress costs, simpler architecture, and no external APIs required.

Daniel Azoulai
We built FastEmbed to make local embedding fast and easy. But as customers moved to production, they needed something more scalable and integrated. Qdrant Cloud Inference brings that next step: embedding models that run inside your cluster with no external services. It supports dense, sparse, and image data, and runs on AWS, Azure, and GCP (US only for now). You can embed, store, and search with a single API call. Lower latency, fewer moving parts, and reduced egress costs.
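To make the "single API call" idea concrete, here is a minimal sketch of what the request bodies could look like when the cluster does the embedding server-side. The collection name, model name, and field layout are illustrative assumptions, not copied from Qdrant's docs; check the official API reference for the exact format.

```python
import json

# Hypothetical upsert body (e.g. PUT /collections/docs/points).
# Instead of a precomputed vector, the point carries a text + model
# object, asking the cluster to embed the document before storing it.
upsert_body = {
    "points": [
        {
            "id": 1,
            "vector": {
                "text": "Qdrant Cloud Inference embeds data inside the cluster",
                "model": "sentence-transformers/all-MiniLM-L6-v2",
            },
            "payload": {"source": "launch-post"},
        }
    ]
}

# Hypothetical query body (e.g. POST /collections/docs/points/query).
# The same inference object is used as the query, so embedding and
# search happen in one round trip with no external embedding API.
query_body = {
    "query": {
        "text": "embedding inside the cluster",
        "model": "sentence-transformers/all-MiniLM-L6-v2",
    },
    "limit": 5,
}

print(json.dumps(query_body, indent=2))
```

The point is architectural: the raw text travels to the cluster once, and the embedding never leaves it, which is where the latency and egress savings come from.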
Saurabh Rai

Amazing work Qdrant team and congratulations on the launch @daniel_azoulai2, @andre_zayarni and @generall. This will make managing embeddings and working with existing Qdrant deployments much easier.

Daniel Azoulai

Amazing! I'm excited to see how you use this and how it helps your projects. Please let us know!

PrompX

The best one yet! PrompX uses Qdrant for hybrid search and embedding.

Thanks Qdrant.

Vilhelm von Ehrenheim

Really cool addition to your product!! Will def try it out!

Daniel Azoulai

@while always great to hear from you!

Rachit Magon

Single API call for embed, store, and search could simplify a lot of data pipelines. Are there plans to expand beyond US regions for the multi-cloud support? @daniel_azoulai2

Daniel Azoulai

@rachitmagon thank you for the comment! We plan to expand to regions beyond the US. We don't have a date set yet, though.

Rachit Magon

@daniel_azoulai2 Sounds good, thanks for the info.