Challenge the newest AI models with your hardest PhD-level exercises

— and learn how to use AI in your math research —

Nov 1, 2025: We are releasing our newest public benchmark
We have collected 209 research-level mathematics problems from Combinatorics, Algebra, Geometry, Number Theory, and other research areas.

Benchmarks & Samples

View the public benchmark, or browse several sample prompts including their model answers.


Why Join Our Community?

Let the best models solve your exercises
Study and solve the exercises of others
Show that your exercises go far beyond the capabilities of LLMs
Challenge the following models with your prompts:
DeepSeek-V3.1, GPT-5, Claude Opus 4.1, Grok-4, Gemini 2.5 Pro, o3

For Companies & Institutions

Want to benchmark your models against PhD-level math problems from professional researchers?
Contact us: contact@science-bench.ai