Join us in solving the problem of cheating in AI benchmarking

🎉

Our paper "Benchmarking is Broken - Don't Let AI be its Own Judge" just got accepted to NeurIPS 2025

Read the paper on arXiv
General Review
Review general prompts and help improve AI benchmarking quality
Medical Prompts
Review medical domain prompts and contribute to healthcare AI evaluation