2 Comments
Jacky:

After a week running high-frequency browser agents via OpenClaw, the “token” gap feels huge.

Codex behaves like a senior employee—goal-driven, self-correcting, and reliable in messy agent workflows. MiniMax is more like an intern: it can execute, but breaks context easily unless instructions are extremely explicit.

This is why I doubt tokens are fully commoditized. Cheap tokens may be “good enough” for low-value tasks—Dario’s “fix my computer for 1 cent” idea—but the upside for high-value work (e.g., discovering new drugs) is essentially unbounded, and that’s where better tokens still matter a lot.

FD:

Thanks for the comment. I agree with you - this is also mentioned in the article:

1/ Are we comparing like-for-like models?

“Cheap can be expensive.” Lower-quality models may require more retries, longer prompts, and additional iterations - inflating token counts per completed task. As one friend put it: “Have you ever tried toilet paper from the dollar store vs. Target? One takes half a roll; the other, just a few pieces.” LMAO.
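
A rough way to see this is to price the completed task rather than the raw token: effective cost is price per token, times tokens per attempt, times expected attempts. Here is a minimal sketch in Python with hypothetical numbers - the prices, token counts, and success rates below are illustrative, not benchmarks:

```python
def effective_cost(price_per_mtok: float, tokens_per_attempt: int,
                   success_rate: float) -> float:
    """Expected cost (USD) to finish one task, assuming independent retries.

    Expected attempts follows a geometric distribution: 1 / success_rate.
    """
    expected_attempts = 1.0 / success_rate
    return (price_per_mtok / 1_000_000) * tokens_per_attempt * expected_attempts

# Hypothetical comparison: a premium model vs. a cheap one on the same agent task.
# The cheap model needs longer prompts / more iterations and fails more often.
premium = effective_cost(price_per_mtok=15.0, tokens_per_attempt=20_000, success_rate=0.9)
cheap   = effective_cost(price_per_mtok=1.0,  tokens_per_attempt=60_000, success_rate=0.1)

print(f"premium: ${premium:.2f} per completed task")  # ~$0.33
print(f"cheap:   ${cheap:.2f} per completed task")    # ~$0.60
```

Under these made-up assumptions, a 15x cheaper sticker price still ends up roughly twice as expensive per finished task once retries and prompt bloat are counted in - and that is before pricing in the cost of reviewing failed runs.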