327 - Anthropomorphic Claims

One of the rare papers in AI that is actually worth reading and mentioning in 2025 has crossed my path today: “Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces!”

The paper concisely argues, from multiple angles, that the wildly anthropomorphic claims of “reasoning”, “thought”, and “interpretability” are complete BS, and it does so far more politely than I would. A large and growing mountain of evidence shows that intermediate tokens are better understood through the non-anthropomorphic, and fairly elegant, charts and terms the paper offers. TLDR: it is simply “prompt augmentation”, wholly reliant on formal verifiers in the currently trending configurations.
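That framing can be made concrete with a toy sketch. This is not the paper’s code, and every name in it (the fake model, the arithmetic task, the helper functions) is hypothetical; it simply illustrates “prompt augmentation” as rejection sampling: emit many candidate intermediate-token traces, keep only the ones a formal verifier approves. Note where the actual guarantee lives: the verifier already encodes the answer, and the trace text carries no independent epistemic weight.

```python
import random

# Hypothetical stand-in for a language model: emits candidate
# intermediate-token traces for a prompt, mostly wrong on purpose.
def fake_model_sample(prompt, rng):
    guess = rng.choice([20, 24, 42, 24, 7])
    return f"step: add 17 and 7 -> {guess}", guess

def formal_verifier(answer):
    # The verifier alone knows the ground truth (17 + 7 = 24),
    # which is exactly why it doesn't generalize past this task.
    return answer == 17 + 7

def augment_and_filter(prompt, n, seed=0):
    # "Prompt augmentation": sample n traces, keep only the
    # verifier-approved ones. The filtering, not the trace prose,
    # is what makes the output correct.
    rng = random.Random(seed)
    samples = [fake_model_sample(prompt, rng) for _ in range(n)]
    return [trace for trace, ans in samples if formal_verifier(ans)]

accepted = augment_and_filter("What is 17 + 7?", n=8)
```

Every trace that survives the filter ends in 24 by construction; reading the surviving “steps” as thought tells you nothing the verifier didn’t already guarantee.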

“Interpretability” has also become an explicit mechanism for fraud in both research and industry: it amounts to reading chicken bones or tea leaves, the “interpretation” of a thing absent any causal relationship or actual verification capacity. Much as having someone feel the bumps on your head won’t let them tell your fortune or diagnose your physical health, the “interpretation” of intermediate tokens is so flagrantly and absurdly baseless that no reasonable doubt remains about the intentionality of the fraud backing the activity.

A dividing line can be drawn between those supplying and benefiting from information and those who merely accept or reject it:

  • Anthropomorphism of these terms by “Scientists” and “Business Leaders” is: Fraud
  • Anthropomorphism of these terms by consumers and enthusiasts is: Naïve

I’ve pointed this out for quite some time, which is why the paper was recommended to me, but the trend has only festered. Fraud has never been more obvious than it is today, and tomorrow is likely to be worse, at least until the tech giants engaged in it are all plowed under.

A litmus test most people can keep in mind: you don’t test capacities like reasoning, thought, and general understanding within trivial closed systems where formal verifiers can be used. If you have the formal verifier, you don’t need the system being “evaluated”, and the results won’t generalize. If a company claims its “general” agent played an Atari game better than anyone else, there is no reasonable doubt that they’re committing fraud. Real scientists study these capacities in open systems, not Atari games.