Hard image-plus-text reasoning across 30 college subjects, the multimodal counterpart to MMLU-Pro.
MMMU-Pro is the standard test of college-level multimodal reasoning. Every question requires reading a diagram, chart, photo, or scientific figure alongside the text. The Pro version is a harder, more contamination-resistant variant of the original MMMU benchmark.
Each question is multiple-choice with exactly one correct answer, and the model must read the image and the text together to reach it. The Pro variant removes items that can be solved from the text alone and adds harder distractors.
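Because every item has a single correct option, scoring reduces to exact-match accuracy on the chosen letter. The following is a minimal sketch of that idea, not the benchmark's official scorer: the `Answer: X` response convention, the function names `extract_choice` and `accuracy`, and the `num_options` parameter are all assumptions made for illustration.

```python
import re
import string
from typing import Optional, Sequence

# Assumed convention (hypothetical, not from the benchmark): the model ends
# its response with "Answer: X" or "Answer: (X)", where X is an option letter.
ANSWER_RE = re.compile(r"Answer:\s*\(?([A-Z])\b\)?")


def extract_choice(response: str, num_options: int) -> Optional[str]:
    """Return the last stated option letter, or None if none is valid."""
    valid = string.ascii_uppercase[:num_options]
    matches = ANSWER_RE.findall(response)
    if not matches:
        return None
    letter = matches[-1]  # keep the final answer if the model revised itself
    return letter if letter in valid else None


def accuracy(responses: Sequence[str], gold: Sequence[str], num_options: int) -> float:
    """Fraction of responses whose extracted letter equals the gold letter.

    Responses with no parseable answer count as incorrect.
    """
    if len(responses) != len(gold):
        raise ValueError("responses and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(
        extract_choice(r, num_options) == g for r, g in zip(responses, gold)
    )
    return correct / len(gold)
```

Counting unparseable responses as wrong is one possible design choice; it avoids rewarding a model that declines to commit to an option. With k options per question, uniform random guessing would be expected to score about 1/k under this metric.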
No scores yet for this benchmark.
Not enough scored models yet.
No, MMMU-Pro is not the same benchmark as the original MMMU. It is the harder, cleaner variant that drops questions solvable from the text alone and adds harder distractors. Use it for frontier multimodal models.
Based on score correlations across our database.