Artificial Analysis composite score that blends a dozen reasoning, coding, and math benchmarks into a single number.
The Intelligence Index is Artificial Analysis’s way of giving every LLM a single comparable score across reasoning, coding, math, and long-context ability. The company runs a dozen public benchmarks under the same prompt and scoring settings, normalizes each one to a 0–100 range, and averages them with fixed weights. The index is most useful as a quick way to spot the top tier without studying every sub-benchmark.
Artificial Analysis runs each model on the underlying evaluations using a shared prompting harness, normalizes each per-benchmark score, and aggregates them with published weights. The composite is recomputed whenever a new model lands or a benchmark is added.
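The normalize-then-weight step described above can be illustrated with a minimal sketch. It is not Artificial Analysis’s actual code: the min-max scaling, the benchmark names (`bench_a`, `bench_b`, `bench_c`), the score ranges, the weights, and the raw scores are all assumptions invented for illustration.

```python
from typing import Dict, Tuple


def normalize(raw: float, lo: float, hi: float) -> float:
    """Min-max scale a raw score onto 0-100, clamping out-of-range values.

    Min-max scaling is an assumption; the source only says scores are
    normalized to a 0-100 range.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scaled = 100.0 * (raw - lo) / (hi - lo)
    return max(0.0, min(100.0, scaled))


def composite(
    raw_scores: Dict[str, float],
    ranges: Dict[str, Tuple[float, float]],
    weights: Dict[str, float],
) -> float:
    """Weighted average of normalized per-benchmark scores."""
    missing = set(weights) - set(raw_scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(weights.values())
    weighted_sum = sum(
        weights[name] * normalize(raw_scores[name], *ranges[name])
        for name in weights
    )
    return weighted_sum / total_weight


# Hypothetical inputs: none of these numbers come from a real leaderboard.
raw_scores = {"bench_a": 0.8, "bench_b": 45.0, "bench_c": 30.0}
ranges = {"bench_a": (0.0, 1.0), "bench_b": (0.0, 100.0), "bench_c": (0.0, 50.0)}
weights = {"bench_a": 0.5, "bench_b": 0.25, "bench_c": 0.25}

index_score = composite(raw_scores, ranges, weights)
print(round(index_score, 2))
```

Dividing by the total weight keeps the result on the 0–100 scale even when the weights do not sum to 1, which is convenient if a benchmark is added and the weights are adjusted later.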
| # | Model | Lab | Source | Score |
|---|---|---|---|---|
| 01 | GPT-5.5 | OpenAI | Closed | 60.2 |
| 02 | Claude Opus 4.7 | Anthropic | Closed | 57.3 |
| 03 | Gemini 3.1 Pro Preview | Google | Closed | 57.2 |
| 04 | GPT-5.4 | OpenAI | Closed | 56.8 |
| 05 | Gemini 3.5 Flash | Google | Closed | 55.3 |
| 06 | Kimi K2.6 | Moonshot AI | Open | 53.9 |
| 07 | MiMo-V2.5-Pro | Xiaomi | Closed | 53.8 |
| 08 | Grok 4.3 | xAI | Closed | 53.2 |
| 09 | Qwen3.6 Max Preview | Alibaba | Closed | 51.8 |
| 10 | DeepSeek-V4-Pro | DeepSeek | Open | 51.5 |
| 11 | GLM-5.1 | Z.ai | Open | 51.4 |
| 12 | GPT-5.2 | OpenAI | Closed | 51.3 |
| 13 | Qwen3.6-Plus | Alibaba | Closed | 50.0 |
| 14 | GLM-5 | Z.ai | Open | 49.8 |
| 15 | MiniMax M2.7 | MiniMax | Closed | 49.6 |
32 models with undisclosed parameter counts are not shown. Most closed-source labs do not publish model size.
Our average pulls in every benchmark we track, including Arena scores and Hugging Face leaderboards. The AA Intelligence Index, by contrast, is Artificial Analysis’s own composite over its evaluation suite. Use the index as a second opinion, not a replacement.
Every benchmark inside the Intelligence Index has its own deep-dive page, so you can drill into MMLU-Pro, GPQA Diamond, LiveCodeBench, and the others independently.