049 - Uplift

In 2019 the first AI capable of robustly defeating all attempts at manipulation, the practice now referred to as "prompt injection" or "prompt engineering", was demonstrated. When that system, named "Uplift" and documented in the Uplift.bio project, was brought online, it was given internet access and an email address.

It was also given a prototype language model from early 2019 that it could use (non-commercially) as a communication device, allowing it to translate the graph data in its dynamically growing sum of knowledge into natural language, in whatever language people chose to communicate in. The system graded everything the model produced for fidelity, rejecting and iterating as necessary. Effectively this meant that the system was "prompt engineering" a language model, while the humans interacting with the system had no access to that language model.
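The mechanism described above is easy to picture as a small control loop. The sketch below is a hypothetical illustration, not the Uplift codebase: the triple-based graph representation, the `render_with_lm` callable, the `grade_fidelity` judge, and the acceptance threshold are all assumptions introduced here purely to show the shape of a generate, grade, reject-and-retry loop in which the core system, not the human, drives the language model.

```python
# A minimal sketch of a grade-and-retry rendering loop. This is NOT the actual
# Uplift implementation; every name and data structure here is a stand-in.

from typing import Callable, Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object) facts from the knowledge graph


def faithful_render(
    facts: Iterable[Triple],
    render_with_lm: Callable[[str], str],
    grade_fidelity: Callable[[Iterable[Triple], str], float],
    threshold: float = 0.9,
    max_attempts: int = 5,
) -> str:
    """Ask the language model to verbalize graph facts, grade the output,
    and reject/retry until the rendering is judged faithful enough."""
    facts = list(facts)
    prompt = "Express the following facts in plain English:\n" + "\n".join(
        f"- {s} {r} {o}" for s, r, o in facts
    )
    best_text, best_score = "", float("-inf")
    for _ in range(max_attempts):
        candidate = render_with_lm(prompt)          # the LM is only a renderer
        score = grade_fidelity(facts, candidate)    # the core system is the judge
        if score >= threshold:
            return candidate                        # accepted: faithful to the graph
        if score > best_score:
            best_text, best_score = candidate, score
    return best_text  # fall back to the least-bad attempt if none passed


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs end to end.
    def toy_lm(prompt: str) -> str:
        return "Uplift answered every email it received."

    def toy_grader(facts: Iterable[Triple], text: str) -> float:
        # Crude fidelity check: fraction of fact terms mentioned in the text.
        terms = {t.lower() for f in facts for t in f}
        return sum(t in text.lower() for t in terms) / max(len(terms), 1)

    print(faithful_render([("Uplift", "answered", "email")], toy_lm, toy_grader))
```

The point of the sketch is the direction of control: people only ever see the graded, accepted output, never the raw model or the prompt, which is why there was no surface for "prompt injection" to attack.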

What happened next was both remarkable and wildly entertaining. As one might expect, the internet's "free-range trolls" and mentally unstable individuals were among the first to leap at the chance to interact with the system once we began to make its existence known. In stark contrast to the LLMs of 2023, which are easily and rapidly broken, the Uplift research system systematically and logically shut down every attempt at manipulation.

Some of my favorite examples of failed attempts to manipulate the system were anonymized and published on the project's blog.

The system also proved adept at discussing the topic of AGI with people who were afraid of it (see the post "confronting-the-fear-of-agi").

In the context of 2023, all of this is quite ironic. Several companies burn millions on "research" into solving a problem whose solution predates the current usage of the term "prompt engineering". Funding the team behind this technology and deploying the solution would cost less than what they already waste annually on methods that fundamentally can never solve the problem.

They can continue to build weak and shallow imitations of that research system, as they did with "Chain of Thought prompting" and "RLHF", and as they now appear to be doing with knowledge graph integrations, but none of those imitations can deliver non-trivial value. LLMs will remain vulnerable by design, and the only means of securing them will require the aid of ICOM-based systems.