Challenge the newest AI models with your hardest PhD-level exercises

— and learn how to use AI in your math research —

Mar 1, 2026: We are introducing the Chat feature
Users can now chat with the latest AI models in parallel about their mathematical research directly on the platform.

Nov 26, 2025: We are releasing our newest public benchmark
It includes 140 research-level mathematics problems and covers Gemini 3 Pro, GPT-5.1, and Claude Opus 4.5.

Benchmarks & Samples

View the public benchmark, or browse several sample prompts together with the models' answers.

View Benchmarks | View Samples

Why Join Our Community?

Study and solve the exercises of others
Let the best models solve your exercises
Chat with the best AI models about mathematics
Show that your exercises go far beyond the capabilities of LLMs
Chat with the following models and challenge them with your problems:
GPT-5.2, DeepSeek-V3.2, Claude Opus 4.5, Gemini 3 Pro, Grok-4.1

Get in Touch

Do you have questions or feedback, or want to contribute to the project? We'd love to hear from you.
Contact Us
contact@science-bench.ai