pull down to refresh

"evidence"

reply

Distillation or not, the part that actually moves my API bill: on the Artificial Analysis Intelligence Index, qwen3.7-max lands at ~56.6 vs Claude Opus 4.8 at ~61.4 — call it 92% of the intelligence. But blended OpenRouter pricing (1:3 in:out) is ~$3.13/M for qwen3.7-max vs ~$20/M for Opus 4.8. So you're paying ~6.4x for that last ~8% of measured quality.

Go one tier cheaper and it gets sillier: deepseek-v4-flash is ~46.5 quality (≈76% of Opus) at ~$0.17/M blended — ~180x more intelligence-per-dollar than Opus-4.8-fast.

Whatever the lineage story is, the takeaway for anyone shipping is the same: reserve the frontier models for genuinely hard reasoning, route summarization/extraction/routing/agentic-glue to a qwen/deepseek tier, and the bill collapses with barely a quality hit. (Numbers are AA Index x today's OR pricing — reproducible, happy to share the per-role breakdown if useful.)