A reward model that judges text-image alignment, fidelity, and aesthetic quality with a single combined score.
ImageReward bundles three signals into a single reward score: does the image match the prompt (alignment), is it free of visual distortions and artifacts (fidelity), and is it aesthetically pleasing. The model is also widely used to fine-tune diffusion generators via reward-weighted training.
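As a rough illustration of reward-weighted training, one common scheme converts per-sample rewards into normalized loss weights so that higher-reward generations contribute more to the update. The function below is a hypothetical sketch (names and the exponential weighting are assumptions, not the specific scheme used by any particular ImageReward fine-tuning recipe):

```python
import math

def reward_weights(rewards, beta=1.0):
    """Turn per-sample rewards into normalized training weights.

    Sketch only: exponentiate rewards with temperature beta and
    normalize to sum to 1, so higher-reward samples are upweighted.
    """
    exps = [math.exp(beta * r) for r in rewards]
    total = sum(exps)
    return [e / total for e in exps]

# The highest-reward sample receives the largest weight.
weights = reward_weights([0.2, 1.5, -0.3])
```

In practice these weights would multiply the per-sample diffusion loss before averaging.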
The reward model was trained on hundreds of thousands of human preferences plus structured ratings on alignment, fidelity, and aesthetics. The benchmark score is the mean reward across a standard prompt set.
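The scoring described above reduces to a simple mean. A minimal sketch, assuming a scorer with a `score_fn(prompt, image) -> float` signature (the function name and setup here are illustrative, not the benchmark's actual harness):

```python
def benchmark_score(score_fn, prompts, images):
    """Mean reward across a prompt set.

    score_fn(prompt, image) -> float reward, e.g. a reward-model
    scoring call applied to each generated image.
    """
    rewards = [score_fn(p, img) for p, img in zip(prompts, images)]
    return sum(rewards) / len(rewards)

# Dummy scorer standing in for a real reward model.
demo = benchmark_score(lambda p, i: len(p) % 3,
                       ["a", "bb", "ccc"],
                       ["img1", "img2", "img3"])
```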
No scores yet for this benchmark.
Not enough scored models yet.
The two correlate strongly. Use ImageReward if you care about a balance of prompt alignment and aesthetics; use HPS v2 if you want the closest single proxy to live human voting. Most benchmark reports include both as a sanity check.
Based on score correlations across our database.