Model result · rank #9

MiniMax M3 Direct Plus

MiniMax Plus · direct API · SWE xhigh. Public result card with the model’s overall score, lane measurements, runtime/cost telemetry, and ranking formula.

Overall score: 76.05

Rank #9

Full / Agentic: 84.32

Full rank #7

SWE MVP: 67.77

SWE rank #9

Measured cost: $0.0037

100.0% reliability

Overall

All-around publication view

Score: 76.05
Formula: 50% Full + 50% SWE
Basis: MiniMax Plus · direct API · SWE xhigh

The overall score is an equal-weighted average of the Full/Agentic and SWE lane scores, so the aggregate stays comparable while the lane measurements behind it remain visible.
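As a quick check of the stated formula, the sketch below recomputes the overall score from the two lane scores on this card. The helper name `overall_score` and the half-up rounding to two decimals are assumptions made for illustration; the card does not state how it rounds.

```python
from decimal import Decimal, ROUND_HALF_UP

# Lane weights taken from the card's stated formula: 50% Full + 50% SWE.
FULL_WEIGHT = Decimal("0.5")
SWE_WEIGHT = Decimal("0.5")


def overall_score(full: str, swe: str) -> Decimal:
    """Combine the two lane scores into one overall score (hypothetical helper)."""
    raw = FULL_WEIGHT * Decimal(full) + SWE_WEIGHT * Decimal(swe)
    # Assumption: half-up rounding to two decimals; the card does not say.
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(overall_score("84.32", "67.77"))  # 76.05
```

The unrounded value is 76.045, so the published 76.05 is consistent with the formula under half-up rounding. `Decimal` is used instead of floats so that the tie at the third decimal place is resolved predictably.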

Lane 01

Full / Agentic benchmark

Final: 84.32
Capability: 87.74
Agentic: 79.86
Pass rate: 88.4%
Prompts: 43

This lane captures instruction following, structured behavior, tool discipline, and general agentic reliability.

Lane 02

Software engineering MVP

SWE score: 67.77
Focused final: 61.86
Capability: 62.62
Daily driver: 60.93
Prompts: 24

This lane is closer to implementation usefulness: source handling, architecture cleanliness, and deliverable quality.

Telemetry

Runtime economics

Full cost: $0.0031
SWE cost: $0.0006
Full avg seconds: 14.38
SWE time: 751.18s
Decode: 23.39

Cost, time, and runtime basis are telemetry. They explain tradeoffs; they do not change the capability scores.
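The cost figures on this card are consistent with the measured cost being the sum of the two lane costs. The sketch below checks that arithmetic; the additive relationship is inferred from the published numbers rather than stated by the card, and `total_cost` is a hypothetical helper.

```python
from decimal import Decimal


def total_cost(full_cost: str, swe_cost: str) -> Decimal:
    """Sum the per-lane costs in USD (hypothetical helper, assumed additive)."""
    return Decimal(full_cost) + Decimal(swe_cost)


print(total_cost("0.0031", "0.0006"))  # 0.0037
```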

Interpretation

Why this result lands here.

The model is stronger in the Full/Agentic lane (84.32) than in the SWE lane (67.77), so the overall score is shown with both component lanes visible. MiniMax M3 Direct Plus combines independently measured Full and SWE lanes; suite metadata remains visible per lane.