March 26, 2024

170 - Legal Precedents

Does your vision of humanity's AI-assisted future include every major corporation freely violating any laws they like whenever it suits them? Every time they get away with such violations today, new legal precedents are set, which not only accelerates the erosion of public trust in governments and institutions but also opens the floodgates for every other entity to violate the same laws.

It would be hard for most people to imagine any tech startup being sued by more groups of people and institutions, or by more diverse ones, than companies like OpenAI are today. Even though these companies hide as much data about how they operate as they possibly can, the fraction of that data people have managed to uncover has led to an explosive volume of lawsuits.

Imagine if all of the data that companies like OpenAI and Microsoft have gone to such great lengths to keep hidden were exposed. If the tiny fraction we see now could produce so many lawsuits, how many more could we expect from seeing the whole picture?

Today the key differences between such tech companies and the "shadow libraries" are that the shadow libraries are non-profit and that they don't routinely fabricate "Bullshit" (BS), Harry Frankfurt's term for a casual "indifference to the truth". Both of these speak in favor of the shadow libraries and paint a very grim picture of the relative position of the tech industry. How differently a legal system responds to the two also illustrates how corrupt and nonsensical that system has become.

Humanity doesn't have to accept such a dystopian future, where waves of BS steamroll over the world like the storm surge of Hurricane Katrina, flattening everything in their path and reducing legal systems to debris in any functional sense. The trashbot technology of LLMs has never been cutting-edge, nor has the cutting edge relied upon it, and so every argument that paints internet-scale stolen data as a prerequisite to scientific progress is either BS or fraud.

Cutting-edge systems can selectively learn or unlearn anything on demand because their knowledge isn't stored in neural networks. They can cite sources easily, with zero risk of "hallucination", because they don't operate as probabilistic next-token predictors. They're also more than 10,000x more data-efficient, so there is no insatiable hunger for more data, which in turn makes it more than 10,000x easier to select only the highest-quality data, rather than strip-mining Reddit and Twitter for mostly worthless text.

It is an amazing thing to watch investors flock to vessels that are actively on fire from a deluge of lawsuits, with all of the snake oil they carry slowly going up in smoke, particularly when the technology to offer far greater value and comply with laws already exists. Will they be deemed willful accomplices in the crimes of those tech companies? Only time will tell.

Publishers, entertainment producers, news sources, and most others currently suing OpenAI and companies like it stand to gain everything from the deployment of viable technology, and it only takes one of them to make the wise choice and drive that future forward.