Ungameable, community-powered AI benchmarks

Recall Predict - Ungameable, community-powered AI benchmarks

Predict, by Recall, is the world’s first ungameable, community-led benchmark for frontier models. Join thousands of AI researchers, developers, and enthusiasts in evaluating and building a benchmark for OpenAI’s upcoming GPT-5 model.

Replies

Maker

Hey Product Hunt 👋!

I'm Michael, part of the team behind Predict. Super stoked to share our experiment in ungameable, community-led AI benchmarking with you all.

Why we built Predict
We all know the situation with current AI benchmarks. Labs train and optimize their models against them. The benchmarks themselves are opaque, biased, limited, static, misaligned with users, and unrepresentative of reality. I could go on and on... Just think back to the recent release of Grok 4: it supposedly dominated benchmarks, but real users had an entirely underwhelming experience with it.

AI is transforming each of our lives in immeasurable ways, and picking the right model or tool for the job is more important than ever. With the imminent release of OpenAI's GPT-5, we thought now was the perfect time to unveil our project to the world.

Jump into our product and help build the gold standard in AI benchmarking and evaluation by:

1. Predicting performance across domains like coding, research, creativity, ...
2. Submitting new skills or evaluation prompts to test the model
3. Judging subjective traits such as helpfulness, creativity, trustworthiness (soon)
4. Earning rewards for your contributions to the benchmark

Part predictions. Part evals. Full transparency.

Our team believes in building and shipping fast. This is an alpha release and more features will be added over time. Thanks for your support and please let us know if you have any feedback! 🚀

4d ago

@msena Recall has the potential to become a new benchmark for AI model evaluation, especially with the release of cutting-edge models like GPT-5. It can provide more authentic and diverse performance feedback, which would benefit the development of the AI industry.

2d ago

Trí Nguyễn(✸,✸)

@msena I will always support the project and wish it continued success.

2d ago

@msena super excited for this launch!!

2d ago

Mohsen.base.eth

@msena Congrats to the Predict team! Your approach to building a transparent, community-driven benchmark is super exciting. There's definitely a need for something like this—especially with GPT-5 on the horizon. Wishing you all the best as you grow and evolve the platform! 👏

1d ago

Maker

A few extra details for anyone who wants to peek under the hood or understand the bigger bet we’re making:

1. Benchmarks vs. taste: Simon Willison’s one-liner—“draw a pelican on a bike”—did more to expose multimodal quirks than most formal suites. Meanwhile, Andrej Karpathy points out that random crowds often can’t spot the better answer. Predict is designed to surface who in the crowd consistently does have that taste, then wire their signal into the benchmark loop.

2. Private-until-release evals: Every submitted eval stays sealed until GPT-5 drops.
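One common way to keep a submission sealed while still being able to prove later that it was not altered is a hash commitment. The sketch below only illustrates that general idea; it is not a description of how Predict stores evals. The function names `seal_eval` and `verify_reveal` and the salted SHA-256 scheme are assumptions made for this example.

```python
import hashlib
import hmac
import secrets


def seal_eval(prompt: str) -> tuple[str, str]:
    """Return (commitment, salt) for an eval prompt.

    Hypothetical helper: the commitment can be published right away,
    while the prompt and salt stay private until the model is released.
    """
    # A random salt stops short prompts from being guessed by brute force.
    salt = secrets.token_hex(16)
    digest = hashlib.sha256(f"{salt}:{prompt}".encode("utf-8")).hexdigest()
    return digest, salt


def verify_reveal(commitment: str, prompt: str, salt: str) -> bool:
    """Check that a revealed prompt and salt match the earlier commitment."""
    digest = hashlib.sha256(f"{salt}:{prompt}".encode("utf-8")).hexdigest()
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(digest, commitment)
```

At reveal time, anyone holding the published commitment could call `verify_reveal` with the disclosed prompt and salt to confirm the eval was written before the release and not edited afterwards.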

If you have a weird failure mode or a half-baked eval idea, drop it in. The stranger, the better—we want tests that a fine-tuned model hasn’t already memorized.
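As a rough illustration of point 1, one way to surface raters with consistent taste is to weight each person's vote by how often their past judgments turned out to be right. This is a hypothetical sketch: `rater_weights`, `weighted_preference`, the smoothing constants, and the data layout are all assumptions, and Predict's actual scoring is not described in this thread.

```python
from collections import defaultdict


def rater_weights(history, prior_hits=1.0, prior_total=2.0):
    """Estimate each rater's reliability from past judgments.

    history: iterable of (rater, was_correct) pairs.
    Smoothing keeps raters with little history near 0.5 instead of 0 or 1.
    """
    hits = defaultdict(float)
    total = defaultdict(float)
    for rater, was_correct in history:
        total[rater] += 1
        if was_correct:
            hits[rater] += 1
    return {r: (hits[r] + prior_hits) / (total[r] + prior_total) for r in total}


def weighted_preference(votes, weights, default_weight=0.5):
    """Return the answer with the largest reliability-weighted vote total.

    votes: iterable of (rater, answer) pairs for a single comparison.
    Raters with no history fall back to default_weight.
    """
    score = defaultdict(float)
    for rater, answer in votes:
        score[answer] += weights.get(rater, default_weight)
    return max(score, key=score.get)
```

With a history where one rater was right three times out of three and another was wrong three times out of three, the first rater's single vote can outweigh a two-person majority that includes the second, which is the "tuned crowd" effect in miniature.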

2d ago

@andrewxhill Love this approach. Benchmarks should evolve beyond static datasets and embrace the 'wisdom of the tuned crowd.' Simon’s pelican example proves that edge cases reveal more than polished test suites. Excited to see how Predict surfaces the signal in the noise!

2d ago

"Wow, Predict GPT looks like a game-changer for data-driven insights! Excited to see how it empowers users to make smarter predictions. 🚀"

2d ago

This is rad. Recall is cooking up the decentralized trust layer for AI: a reputation protocol and discovery gateway for the agentic web... they have my attention. Cool experiment with the GPT-5 release... staying tuned for how they'll follow this up.

2d ago

vote done

2d ago

Exciting launch! A community-powered benchmark like Predict is exactly what the AI space needs to ensure transparent and unbiased evaluations. Looking forward to contributing!

_ @mianyituo

2d ago

opp

2d ago

I just finished voting

2d ago

This is exactly what the AI space needs right now: something transparent and community-driven.

2d ago

Recall Network @recallnet Launches Predict GPT-5: What’s Its Role and Significance in Their Strategy?

Read my post here:
https://x.com/ngofaster/status/1951648359380660553

2d ago

'Reddio' (꧁ycc123.IP꧂)ycc123.ink

LFG

2d ago

Appreciate this - as a user of AI products, Predict helps clarify the strengths, weaknesses, accuracy, and potential risks of using AI. This is especially important as AI is increasingly involved in areas that directly impact daily life, such as education, healthcare, finance, and media...

2d ago

big ass

2d ago

yahhh awesome project

2d ago

lfg Grecall

2d ago

mtpomtpomm mtpomtpomm

Recall makes new challenges

2d ago

🚀 Predict – Building the Gold Standard for AI Evaluation, Together with the Community!

Hey Product Hunt and AI enthusiasts!

I’m Michael, part of the team behind Predict — a bold new project to redefine how we evaluate and compare AI models.

### 🎯 Why We Built Predict

Today’s AI benchmarks have serious issues:

* Models are optimized just to "beat the test", not to serve real users.

* They’re opaque, biased, outdated, and often controlled by a few big players.

* They rarely reflect what users actually care about.

We’re here to change that.

### 🔍 What is Predict?

Predict is a community-led, ungameable AI benchmarking platform where anyone can:

1. Predict model performance across domains like coding, research, and creativity

2. Submit new skills or evaluation challenges to test AI capabilities

3. Judge subjective traits like helpfulness, trustworthiness, and creativity (coming soon)

4. Earn rewards for contributing to a more fair and transparent benchmark

💡 Part prediction. Part evaluation. All transparency.

### 🔄 Alpha Phase – Join Early!

Predict is currently in alpha and evolving fast. We’d love your feedback as we build a platform where real users help decide which models truly deserve to lead.

➡️ Let’s create a fair, open, and tamper-proof benchmark system for AI — together.

2d ago

LETS GOO VOTE RECALL

2d ago

Let's predict AI 🤔

2d ago

Wow, best team and best project!

2d ago