278 - ICOM to ARC-AGI
The results, analysis, and data from applying a fragment of ICOM to ARC-AGI now have a 30-page research paper dedicated to them (ResearchGate), including additional context provided in the "technical report" that the ARC-AGI team published on December 6th.
The TLDR is that even a fragment of our systems was ~1,000 times more efficient in terms of cost, and vastly more performant, than the top score on their PUB leaderboard at the time of testing. By the ARC-AGI team’s own estimates, the prior method from Greenblatt would require a further 12,500 times more compute to reach our team’s score, which lands around the human baseline of 85%.
In other words, a fragment of our systems was quite literally more than a million times more efficient (~1,000 × 12,500 ≈ 12.5 million), even on toy problems like those in ARC-AGI. Real-world problems could predictably widen this gap by further orders of magnitude for any business use case of non-trivial complexity.
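The arithmetic behind the "million times" figure can be checked directly. This is a minimal sketch using only the rounded figures quoted above; the variable names are illustrative, and multiplying the two factors assumes they compound independently, as the text does.

```python
# Rounded figures as quoted in this post (both are approximations).
cost_efficiency_factor = 1_000   # ~1,000x more cost-efficient than the top PUB leaderboard score
extra_compute_factor = 12_500    # further compute the prior method would need to reach the same score

# Assumption: the two factors multiply, as in the text's "~1,000*12,500".
combined_factor = cost_efficiency_factor * extra_compute_factor

print(f"Combined factor: ~{combined_factor:,}x")
print(f"More than a million: {combined_factor > 1_000_000}")
```

Because both inputs are estimates, the result is best read as an order of magnitude (roughly 10^7) rather than a precise value.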
The ARC-AGI team still practices arbitrary exclusion on the benchmark, so the currently listed top scores aren’t accurate on either leaderboard anymore, since the Prize leaderboard’s top team (55.5%) removed themselves by refusing to disclose the method that they used.
We welcome feedback on the paper, and we’re working on completing enough of our systems to begin giving live demos of more than just fragments of ICOM. These will be the long-awaited real-time versions that people can “play with”, rather than systems that can only be handed things like ARC-AGI puzzles.
The first instances will be real-time and scalable versions of our 7th-generation system, which set many milestones in the field. They’ll be a little simpler than that system at first, but still vastly more performant and general than anything else the market has seen yet. They can become substantially more powerful than the 7th gen within a matter of months, since that simply requires full-time engineering hours to complete the 8th gen core engineering work. Remember, a fragment of one of these just steamrolled over everyone else on this challenge by orders of magnitude.
2025 promises to be a year where most people in AI find themselves surprised to the point of bewilderment, while the people who’ve followed our work find it all entirely predictable and long overdue. We’ll see you there.