Microsoft added average token usage to the MAI-Code-1-Flash release card. The model scores 71.6 on SWE-Bench Verified while consuming roughly one-third the tokens of Claude Haiku 4.5. That single addition changes how model releases get read: performance score alone is no longer the number that matters.
The cost pressure is real and already hitting named companies. Uber burned through its entire employee AI budget in four months. Salesforce committed $300 million to Anthropic tokens and froze engineering hires. Microsoft cancelled Claude Code licenses across its Windows, Microsoft 365, and Teams divisions. Artificial Analysis puts GPT 5.5 and Claude Opus 4.8 within one point of each other on its Intelligence Index, around 60, but running that index costs $3,357 on GPT 5.5 versus $4,685 on Opus 4.8. Near-identical scores, roughly 40 percent higher cost.
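The arithmetic behind that comparison can be sketched in a few lines. The two run costs are the figures quoted above; the score of 60 is a stand-in for "around 60" and is an assumption, as are the function and variable names, so treat the per-point numbers as illustrative rather than published values.

```python
# Run costs are the Artificial Analysis figures quoted in the text.
# ASSUMED_SCORE is a stand-in for "around 60", not a published value.
ASSUMED_SCORE = 60.0

index_run_cost_usd = {
    "GPT 5.5": 3357.0,
    "Claude Opus 4.8": 4685.0,
}


def cost_per_point(run_cost_usd: float, score: float) -> float:
    """Dollars spent per index point: lower means more intelligence per dollar."""
    return run_cost_usd / score


def premium_pct(expensive: float, cheap: float) -> float:
    """How much more the expensive run costs, as a percentage of the cheap one."""
    return (expensive / cheap - 1.0) * 100.0


per_point = {
    name: cost_per_point(cost, ASSUMED_SCORE)
    for name, cost in index_run_cost_usd.items()
}
premium = premium_pct(
    index_run_cost_usd["Claude Opus 4.8"],
    index_run_cost_usd["GPT 5.5"],
)

for name, value in per_point.items():
    print(f"{name}: ${value:.2f} per index point")
print(f"Opus 4.8 premium over GPT 5.5: {premium:.1f}%")
```

Under the assumed score, this works out to about $55.95 per point for GPT 5.5 and $78.08 for Opus 4.8, a premium of 39.6 percent, which is where the "roughly 40 percent" figure comes from. Because the scores are treated as equal, the premium depends only on the two run costs.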
Tunguz's piece is worth reading in full for how it frames the competitive pressure cascading down the stack. Model companies now compete on intelligence per dollar. Application layers will compete one level up, on dollars per closed ticket, shipped PR, or resolved support case. The pricing logic has to match how buyers actually think, and right now most of the stack is still priced the wrong way.