September 24, 2023

071 - Features vs. Vulnerabilities

For those with a dark sense of humor: the AI security researchers behind some of the leading papers and recent OWASP documents on AI security have entered a noteworthy debate. The question is whether the fundamental flaws in systems like LLMs, including confabulation/hallucination, misalignment, and prompt injection, can actually be called "vulnerabilities", since they are technically "features".

The debate is largely semantic, but it highlights a critically important lesson that most people still need to learn. The cybersecurity "vulnerability" of these systems is a "feature", not a bug, and that "feature" is added to every system they are integrated into. Why is Microsoft getting breached every month? Possibly because they've added a new "feature" to their product lines.

These systems have also been accurately described as "an auto-complete function that ate the internet". Such a function is specifically designed to produce a plausible continuation whether or not that continuation is true, so confabulation/hallucination is a feature of the architecture.
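As a rough illustration of the auto-complete framing, here is a minimal sketch of a toy word-level autocomplete. It is a deliberate simplification and an assumption for illustration only: real LLMs use neural networks over tokens, not a lookup table, and the names `train_bigrams`, `autocomplete`, and the tiny `corpus` are hypothetical. The point it shows is narrow: the function picks whatever commonly followed the last word, and nothing in it checks whether the result is true.

```python
import random
from collections import defaultdict


def train_bigrams(text):
    """Map each word to the list of words that followed it in the text."""
    table = defaultdict(list)
    words = text.split()
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return table


def autocomplete(table, prompt, max_words=8, seed=0):
    """Extend the prompt by repeatedly sampling a word seen after the last one."""
    rng = random.Random(seed)
    words = prompt.split()
    if not words:
        return ""
    for _ in range(max_words):
        candidates = table.get(words[-1])
        if not candidates:
            break  # nothing ever followed this word in the training text
        # No notion of truth here: any word that followed before is fair game.
        words.append(rng.choice(candidates))
    return " ".join(words)


# Hypothetical training text containing two true statements.
corpus = "the capital of france is paris . the capital of spain is madrid ."
table = train_bigrams(corpus)

# Both "paris" and "madrid" followed "is", so either may be produced.
print(autocomplete(table, "the capital of france is", max_words=1))
```

In this toy, "the capital of france is madrid" is as valid an output as the correct answer, because both cities followed the word "is" in the training text. The wrong completion is not a malfunction of the function; it is the function doing exactly what it was built to do.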

Systems like LLMs are also architected specifically not to include any factual grounding or philosophical understanding. The data they train on is effectively turned into token-based confetti as a feature of the process, since they can't store or process human-like concepts, including those required for alignment.

The one known exception to this is when an LLM is tightly bound on both sides and used as a communication device by a working cognitive architecture. This is a very different process from how humans interact with such systems: it puts the LLM to work in one very specific way, where it can function reasonably well.

If any government, corporation, organization, or other entity demands Cybersecurity, Factuality & Explainability, and/or Alignment, what they demand is mutually exclusive with the architecture of LLMs.

LLMs alone do not, and fundamentally cannot, offer these things, any more than a toaster oven can do your taxes. In both cases, starting a fire is the predictable outcome.