
Exla FLOPs
Launched this week
On-Demand GPU clusters - The Cheapest H100s Anywhere
99 followers
Exla FLOPs is the only service where you can instantly spin up 64, 128, or more GPUs - no waitlists, no commitments. Just clusters at your command.
Exla FLOPs
Hello!
I'm very excited to launch this product!
We built this because during our own AI training and fine‑tuning, we hit a wall when trying to scale past 8 GPUs. We were manually stitching together nodes across different clouds, so we decided to productize the solution.
Exla FLOPs offers the lowest H100 pricing of any cloud provider.
The cluster is built for developers, with insane availability for all types of GPUs.
We’re thrilled to see what you build with it! Your feedback means a lot!
We are looking to give out free credits as well. Please fill this out and we'll be depositing credits soon after: https://tally.so/r/meGzgE
Bild AI
@viraat_das This is solving such a desperate need as training is so expensive! Congrats on the launch and thank you for making this!
Exla FLOPs
@roop_pal Appreciate it Roop! Excited to see this being used to solve real problems!
Congrats on the launch! This looks promising.
How do you handle node failures during long training runs? Any automatic recovery mechanisms?
What storage options do you provide for datasets and model checkpoints?
You mentioned "The cluster is built for developers, with insane availability for all types of GPUs" - how do you get around chip shortages?
Exla FLOPs
@olga_s52 thank you!
Node failures:
Right now, Exla FLOPs gives you direct SSH access to your own bare metal nodes — no scheduler in the middle. This means you have full control over your setup (Slurm, DeepSpeed, Kubernetes, or otherwise). While we don’t currently handle automatic job recovery ourselves, most users bring their own checkpointing or orchestration setup to manage long training runs.
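Since recovery is left to the user, the usual pattern on bare metal is periodic checkpointing with resume-on-restart. A minimal sketch of that pattern in plain Python (the paths and the training loop are hypothetical stand-ins; a real run would save framework state such as model and optimizer weights, e.g. via `torch.save`):

```python
import json
import os
import tempfile

# Hypothetical paths for this sketch; on a real node the checkpoint would
# live on the fast local NVMe scratch disk (e.g. /nvme/ckpt/).
CKPT_DIR = tempfile.mkdtemp(prefix="ckpt_")
CKPT_PATH = os.path.join(CKPT_DIR, "state.json")

def save_checkpoint(step, state):
    # Write to a temp file and rename atomically, so a crash mid-write
    # never corrupts the last good checkpoint.
    tmp = CKPT_PATH + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp, CKPT_PATH)

def load_checkpoint():
    # Resume from the last saved step if a checkpoint exists.
    if os.path.exists(CKPT_PATH):
        with open(CKPT_PATH) as f:
            ckpt = json.load(f)
        return ckpt["step"], ckpt["state"]
    return 0, {}

start_step, state = load_checkpoint()
for step in range(start_step, 100):
    state["loss"] = 1.0 / (step + 1)  # stand-in for one real training step
    if step % 10 == 0:
        save_checkpoint(step + 1, state)
```

If a node dies mid-run, restarting the script picks up from the last saved step rather than step 0, which is what makes long multi-day runs survivable without a managed scheduler.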
Storage options:
We provide fast, local NVMe on each node for high-speed I/O. For persistence across runs, users typically mount their own S3/GCS buckets or connect to remote storage solutions. Shared storage or NFS-style volumes are in the works.
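The workflow this implies is: write checkpoints to local NVMe for speed, then copy them to a persistent mount (such as a FUSE-mounted S3/GCS bucket) so they survive the node being released. A generic sketch of that copy-out step, using temp directories as hypothetical stand-ins for the two mount points:

```python
import os
import shutil
import tempfile

# Hypothetical paths: fast local NVMe scratch, plus a persistent mount
# (e.g. an S3/GCS bucket mounted into the filesystem).
scratch_dir = tempfile.mkdtemp(prefix="nvme_scratch_")
persist_dir = tempfile.mkdtemp(prefix="persist_mount_")

def persist_checkpoint(name):
    # Copy a finished checkpoint off the ephemeral node so it outlives
    # the instance; copy2 preserves timestamps for later bookkeeping.
    src = os.path.join(scratch_dir, name)
    dst = os.path.join(persist_dir, name)
    shutil.copy2(src, dst)
    return dst

# Write the checkpoint locally first (fast NVMe I/O), then persist it.
with open(os.path.join(scratch_dir, "ckpt-001.bin"), "wb") as f:
    f.write(b"\x00" * 16)  # stand-in for real model weights
persist_checkpoint("ckpt-001.bin")
```

Keeping the hot write path on NVMe and only syncing completed checkpoints avoids paying object-storage latency inside the training loop.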
Chip availability:
Rather than relying on a single cloud, we dynamically source idle GPU capacity across a network of bare metal providers, resale markets, and underutilized datacenter inventory. This allows us to provision 64+ GPUs seamlessly, even when traditional clouds are maxed out or heavily reserved.
Do reach out for any feedback or feature requests though: viraat@exla.ai
The ability to instantly scale GPU clusters without the usual hassles is a huge win for AI devs! For teams working on large-scale, multi-GPU AI projects, how does Exla FLOPs ensure seamless integration and stable performance across different GPU types and quantities?