Join us to solve the problem of cheating in AI benchmarking
🎉
Our paper "Benchmarking is Broken - Don't Let AI be its Own Judge" just got accepted to NeurIPS 2025 (Read on arXiv)
We will be publishing a series of follow-up papers on PeerBench and want to reward those who make them possible. Get your name in a paper by contributing to the community (creating prompts, commenting, reviewing).
PeerBench.ai is an open-source, non-profit community implementation of the NeurIPS paper, bringing the research to life.