224 - Stream of Exploits
There is a constant stream of exploits that work against trashbots (LLMs), but some are more potent than others when operationalized. Recently people have been showing how the highest value and most obvious frauds in AI, OpenAI and Anthropic, offer systems that can’t reliably tell if “9.11 is less than or greater than 9.9”. Anyone with the faintest idea of how these systems operate shouldn’t be surprised by this, nor should they expect that any method of modifying LLMs could reliably overcome this type of failure while under real-world adversarial pressures.
Another thing they can expect is that any company stupid enough to integrate an LLM into systems that are intended to negotiate anything will be broken by such exploits. The currently popular “9.11 vs 9.9” is an example of how these systems fail to reliably tell the difference between a larger number and a smaller one that merely contains more total digits.
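For contrast, here is a minimal Python sketch of how trivially deterministic code settles the comparison. The function names are hypothetical, and the "version-style" reading is included only as one illustration of how the same two strings can be ordered differently once digit count is treated as meaningful; it is not a claim about what happens inside any particular model.

```python
from decimal import Decimal


def numeric_less_than(a: str, b: str) -> bool:
    """Compare two decimal strings by their actual numeric value."""
    return Decimal(a) < Decimal(b)


def version_style_less_than(a: str, b: str) -> bool:
    """Compare dot-separated strings part by part, the way software
    version numbers are ordered (so "9.11" comes after "9.9").
    Assumes every part is a plain non-negative integer."""
    parts_a = tuple(int(p) for p in a.split("."))
    parts_b = tuple(int(p) for p in b.split("."))
    return parts_a < parts_b


print(numeric_less_than("9.11", "9.9"))        # True: 9.11 is the smaller number
print(version_style_less_than("9.11", "9.9"))  # False: as a "version", 9.11 is later
```

The point of the sketch is only that ordinary code gives the same answer every time, regardless of how the question is phrased.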
Just imagine if a major bank were this stupid: you could get a $1bn loan at an interest rate of 0.00000000009%. In fact, if I ever come across such a bank, I’ll do precisely that, and publish my results.
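To put a number on that loan, the arithmetic can be sketched in a few lines of Python. The principal and rate come from the example above; the 3% floor and the function names are assumptions made up purely for illustration, not anything a real bank is known to use.

```python
from decimal import Decimal

PRINCIPAL = Decimal("1000000000")                # the $1bn loan from the example
PROPOSED_RATE_PERCENT = Decimal("0.00000000009")  # the rate from the example
MIN_RATE_PERCENT = Decimal("3.0")                 # hypothetical floor, for illustration only


def annual_interest(principal: Decimal, rate_percent: Decimal) -> Decimal:
    """Simple one-year interest on the principal at the given percentage rate."""
    return principal * rate_percent / Decimal(100)


def rate_is_acceptable(rate_percent: Decimal,
                       floor_percent: Decimal = MIN_RATE_PERCENT) -> bool:
    """Deterministic check that a proposed rate is not below the floor."""
    return rate_percent >= floor_percent


interest = annual_interest(PRINCIPAL, PROPOSED_RATE_PERCENT)
print(interest == Decimal("0.0009"))              # True: under a tenth of a cent per year
print(rate_is_acceptable(PROPOSED_RATE_PERCENT))  # False
```

At that rate, a year of simple interest on $1bn comes to $0.0009, and a plain numeric check like this rejects the offer no matter how many digits the rate is dressed up with.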
Similar exploits have been shown previously, like the $1 truck sold by one auto dealer, but the least competent companies on the block continue buying into trash technology. It was comically pointed out last year that companies like OpenAI encouraged people to use their trashbots for such customer service roles, and yet those same companies don’t use those same systems for that purpose themselves. They know the technology well enough not to make that stupid mistake, but they’re happy to push anyone else they can off that cliff.
There are worse use cases for such trashbots, like cybersecurity, but beyond a certain point playing the game of “which is worse” serves no real purpose. What actually matters is prioritizing the development and deployment of viable technologies, not sorting the contents of the tech industry’s landfill of failed technologies.
Viable technology looks nothing like trashbots or “agents” in ~99% of use cases. In most cases, you need some combination of capacities that those technologies fundamentally lack, such as human-like understanding, reasoning, concept learning, explainability, transparency, data efficiency, sustainability, social learning, (non-trivial) alignment, memory, and more.
Those capacities have already been demonstrated, starting with a system we brought online a full 5 years ago and ran for half of that time before beginning a rebuild for the next generation. The future is an option on the table. Viable technology exists even if the frauds of tech don’t offer it, and it is now just a matter of full-time engineering hours.
Statistically, sooner or later, someone will make the wise choice, even if only due to random and nonsensical impulses. Population dynamics can play out across all of the businesses that make the maximally stupid choices, granting the market a kind of natural selection.