Methodology

Every model is scored on a normalized scale from 0.00 to 0.99. The score is calculated from a set of underlying factors, and models are then ranked by score against every other model in the field.

What goes into a score

Scores are derived from several input factors, including:

Use-case performance (coding, writing, reasoning, vision)
Price (cost per token; a lower price is weighted as better value)
Speed and latency
Reliability and security posture
Context window size

These raw factor values are only inputs to the calculation and are not displayed individually. Only the resulting score is shown.
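
As an illustration only, one way a score like this could be computed is a weighted average of factors that have each been scaled to the 0 to 1 range. The exact formula is not published here, so everything in this sketch is an assumption: the function names `normalize` and `composite_score`, the min-max scaling, the inversion of price so that cheaper scores higher, and the cap at 0.99.

```python
def normalize(value, lo, hi, lower_is_better=False):
    """Min-max scale a raw value into [0, 1] (assumed scaling method).

    When lower_is_better is True (e.g. price per token), the scale is
    flipped so that a lower raw value yields a higher result.
    """
    if hi == lo:
        return 0.5  # no spread in the field; treat every model as average
    scaled = (value - lo) / (hi - lo)
    scaled = min(max(scaled, 0.0), 1.0)  # clamp out-of-range inputs
    return 1.0 - scaled if lower_is_better else scaled


def composite_score(factors, weights):
    """Blend normalized factors into one score on the 0.00 to 0.99 scale.

    factors: factor name -> normalized value in [0, 1]
    weights: factor name -> non-negative weight (hypothetical values)
    """
    total_weight = sum(weights.values())
    blended = sum(weights[name] * factors[name] for name in weights) / total_weight
    # Round to two decimals and cap at 0.99 to match the published scale.
    return min(round(blended, 2), 0.99)


# Hypothetical inputs: price of 2 units on a field ranging from 0 to 10.
price_factor = normalize(2, 0, 10, lower_is_better=True)
example = composite_score(
    {"performance": 0.8, "price": 0.6},
    {"performance": 3, "price": 1},
)
```

Dividing by the total weight keeps the result in range even when the weights do not sum to 1, which makes it easier to adjust one weight without rebalancing the others.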

Segments

The same factors are weighted differently depending on who the ranking is for:

Overall

A balanced blend of capability, reliability, speed and value.

Enterprise

Weighted toward reliability, security, deep reasoning and large context.

Consumer

Weighted toward value for money, speed and everyday usefulness.
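
To make the idea of segment-specific weighting concrete, the sketch below ranks the same two models under different weight profiles. The weights, the factor names, and the models `model_a` and `model_b` are all hypothetical and chosen only to mirror the descriptions above. They are not the values used in the actual rankings.

```python
# Hypothetical weight profiles; each row sums to 1.0.
SEGMENT_WEIGHTS = {
    "overall":    {"performance": 0.30, "value": 0.20, "speed": 0.20,
                   "reliability": 0.20, "context": 0.10},
    "enterprise": {"performance": 0.30, "value": 0.05, "speed": 0.05,
                   "reliability": 0.40, "context": 0.20},
    "consumer":   {"performance": 0.30, "value": 0.35, "speed": 0.25,
                   "reliability": 0.05, "context": 0.05},
}


def rank_models(models, segment):
    """Return model names ordered best-first for the given segment.

    models: model name -> {factor name -> normalized value in [0, 1]}
    """
    weights = SEGMENT_WEIGHTS[segment]
    total = sum(weights.values())
    scores = {
        name: sum(weights[f] * values[f] for f in weights) / total
        for name, values in models.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Two made-up models: one strong on reliability and context,
# the other strong on value and speed.
models = {
    "model_a": {"performance": 0.8, "value": 0.3, "speed": 0.4,
                "reliability": 0.9, "context": 0.9},
    "model_b": {"performance": 0.7, "value": 0.9, "speed": 0.9,
                "reliability": 0.5, "context": 0.4},
}

enterprise_order = rank_models(models, "enterprise")
consumer_order = rank_models(models, "consumer")
```

With these made-up numbers, `model_a` ranks first for Enterprise and `model_b` ranks first for Consumer, even though the underlying factor values are identical in both cases.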
