Daniel Azoulai

Qdrant Cloud Inference - Unify embeddings and vector search across modalities

Qdrant Cloud Inference lets you generate embeddings for text, image, and sparse data directly inside your managed Qdrant cluster. Better latency, lower egress costs, simpler architecture, and no external APIs required.

Daniel Azoulai
We built FastEmbed to make local embedding fast and easy. But as customers moved to production, they needed something more scalable and integrated. Qdrant Cloud Inference brings that next step: embedding models that run inside your cluster with no external services. It supports dense, sparse, and image data, and runs on AWS, Azure, and GCP (US only for now). You can embed, store, and search with a single API call. Lower latency, fewer moving parts, and reduced egress costs.
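To make the "single API call" idea concrete, here is a minimal sketch of what the request bodies could look like when the cluster does the embedding server-side. The collection name, model name, and field layout are illustrative assumptions, not copied from Qdrant's docs; check the official API reference for the exact format.

```python
import json

# Hypothetical upsert body (e.g. PUT /collections/docs/points).
# Instead of a precomputed vector, the point carries a text + model
# object, asking the cluster to embed the document before storing it.
upsert_body = {
    "points": [
        {
            "id": 1,
            "vector": {
                "text": "Qdrant Cloud Inference embeds data inside the cluster",
                "model": "sentence-transformers/all-MiniLM-L6-v2",
            },
            "payload": {"source": "launch-post"},
        }
    ]
}

# Hypothetical query body (e.g. POST /collections/docs/points/query).
# The same inference object is used as the query, so embedding and
# search happen in one round trip with no external embedding API.
query_body = {
    "query": {
        "text": "embedding inside the cluster",
        "model": "sentence-transformers/all-MiniLM-L6-v2",
    },
    "limit": 5,
}

print(json.dumps(query_body, indent=2))
```

The point is architectural: the raw text travels to the cluster once, and the embedding never leaves it, which is where the latency and egress savings come from.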
Saurabh Rai

Amazing work Qdrant team and congratulations on the launch @daniel_azoulai2, @andre_zayarni and @generall. This will make managing embeddings and working with existing Qdrant deployments much easier.

Daniel Azoulai

Amazing! I'm excited to see how you use this and how it helps your projects. Please let us know!

PrompX

The best one yet! PrompX uses Qdrant for hybrid search and embedding.

Thanks Qdrant.

Vilhelm von Ehrenheim

Really cool addition to your product!! Will def try it out!

Daniel Azoulai

@while always great to hear from you!

Rachit Magon

Single API call for embed, store, and search could simplify a lot of data pipelines. Are there plans to expand beyond US regions for the multi-cloud support? @daniel_azoulai2

Daniel Azoulai

@rachitmagon thank you for the comment! We plan to expand to regions beyond the US. We don't have a date set yet, though.

Rachit Magon

@daniel_azoulai2 Sounds good, thanks for the info.