256 - Funnier Questions
One of the funnier questions people ask about my team’s technology is how it competes with things like OpenAI’s LLMs in terms of speed. It is funny because OpenAI’s systems fundamentally can’t do what ours do at any speed, and futile attempts to do so, like “o1” (aka “Fraudberry”), just end up looking absurd.
Even at the modest level of complexity found in ARC-AGI puzzles, OpenAI’s newest absurdity took over 18 times longer to attempt the puzzles than the fragment of our systems that we applied to them. For that 18x+ increase in runtime, they managed to score all of 21% on the same benchmark where a fragment of our systems scored 83% (88% with an ensemble of 2 runs). A simpler system may always be able to produce a wrong answer faster than our systems, but CoT (“Chain of Thought”) is very far from even matching our speed on modest-complexity problems, and the performance of CoT is demonstrably crap.
They also likely required 10x as much hardware, but the precise depth of stupidity that their latest systems reach in terms of wasted hardware remains a closely guarded secret. They look bad enough already, with their finances telling the horrifying story of how even extremely steep discounts from Microsoft for compute can’t make them profitable, not even close.
Above the level of complexity required for ARC-AGI, you find practically every real-world problem that people want to apply AI to, and as that level of complexity increases, the performance of systems like LLMs, which were fundamentally never designed for it, crumbles. You could run 100 billion GPUs and boil the ocean without such systems overcoming the difference, because the difference between toy problems and real-world problems is astronomical.
The real world is full of combinatorial explosions. Chaos Theory and the Three-body Problem both illustrate why attempting to predict outputs based on the shadow of a multi-step process is feeble and futile. That is what neural networks do, and neural networks aren’t even designed to natively handle graph-format data without shredding the vast majority of the value that format offers.
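To make the chaos point concrete, here is a minimal Python sketch of sensitivity to initial conditions. It uses the logistic map, a standard textbook chaotic system chosen purely for illustration; it is not taken from ARC-AGI or from any system discussed here, and the function names, starting value, and threshold are all assumptions.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x) for `steps` steps."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def first_divergence_step(x0, delta, steps, threshold=0.1, r=4.0):
    """Return the first step at which two trajectories that start `delta`
    apart differ by more than `threshold`, or None if they never do."""
    a = logistic_trajectory(x0, steps, r)
    b = logistic_trajectory(x0 + delta, steps, r)
    for step, (u, v) in enumerate(zip(a, b)):
        if abs(u - v) > threshold:
            return step
    return None


if __name__ == "__main__":
    # Two starting points that agree to ten decimal places (illustrative values).
    step = first_divergence_step(x0=0.2, delta=1e-10, steps=100)
    print("trajectories visibly diverge at step:", step)
```

In the chaotic regime (r = 4) the gap between the two runs grows roughly exponentially, so a difference far too small to measure in practice ends up dominating the outcome after a few dozen steps.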
Our systems are graph-native, with the graph dynamically growing and being refined, in both content and connectome, while storing actual data of any type desired, not the “weights” produced by running brute-force compute over that data. This structure and set of dynamics are hard requirements for handling complexity and aligning systems with humans in any meaningful sense.
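As a rough illustration only of what storing actual data in a growing, refinable graph can look like in general terms, here is a toy Python sketch. It is not a depiction of how our systems are implemented; every name in it (`Node`, `Graph`, `add_node`, `connect`, `disconnect`, `refine`) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class Node:
    """A node holding an arbitrary payload plus the keys of its neighbors."""
    payload: Any
    neighbors: Set[str] = field(default_factory=set)


class Graph:
    """Toy undirected graph whose content and connections can change over time."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}

    def add_node(self, key: str, payload: Any) -> None:
        # Growth: new data of any type enters the graph as a node.
        self.nodes[key] = Node(payload)

    def connect(self, a: str, b: str) -> None:
        # Connectivity change: add an undirected edge between existing nodes.
        self.nodes[a].neighbors.add(b)
        self.nodes[b].neighbors.add(a)

    def disconnect(self, a: str, b: str) -> None:
        # Connectivity refinement: drop an edge that is no longer wanted.
        self.nodes[a].neighbors.discard(b)
        self.nodes[b].neighbors.discard(a)

    def refine(self, key: str, payload: Any) -> None:
        # Content refinement: replace the stored data, keeping the edges.
        self.nodes[key].payload = payload


if __name__ == "__main__":
    g = Graph()
    g.add_node("note", "plain text")
    g.add_node("reading", {"value": 3.5, "unit": "s"})
    g.connect("note", "reading")
    g.refine("note", "revised text")
    print(g.nodes["note"])
```

The point of the sketch is only the contrast drawn above: the nodes hold the data itself, and both the data and the connections can be edited directly rather than being baked into trained weights.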
You can fail at any speed you like, and on any budget, if your technology and vision don’t mesh with reality.