260 - Privacy and Copyright

Privacy and copyright are hot topics in AI, but many of the arguments and justifications made on the subject are blatantly false, even as other critical flaws in the popular architectures are swept under the rug.

One of the key arguments, made by certain large companies, is that AI couldn’t perform as well as it does without them stealing everything that they possibly can, both legally and illegally. This is 100% false, as trash like LLMs have never reached the cutting edge, and the architecture that has stood alone at the cutting edge for the past half a decade didn’t require stealing anything. As I noted previously, if your method and/or architecture requires “internet-scale data” then you will NEVER get general intelligence from it, at any scale, amount of compute, or volume of data.

The "...but we have to..." argument is often made in an attempt to sweep a particular critical flaw under the rug: the only way to truly remove data from such trash AI systems is to delete the models. Just as data is never truly “stored” in these systems, it can never be fully removed from them either, because the shadows and fragments of training data exist in a kind of superposition across the network weights. While research has attempted this kind of removal, it isn’t robust, nor can it ever be without severely crippling the models.

TLDR: What both of these points mean for ordinary people is that the offending companies steal all the data they possibly can, including personally identifying information, and build AI models trained on that stolen data. Once the models are trained, the stolen data can never be selectively removed from them.

The only realistic way to stop this via regulation is to give such tech executives a choice: delete one or more models that they burned tens or hundreds of millions of dollars’ worth of compute on, or face life in prison. If that were to happen even once, then the tech industry might see the oncoming train at the end of the tunnel.

The ICOM cognitive architecture that my team works with, the one that has stood alone at the cutting edge for so long, stores actual data, so any part of that data can be selectively and easily removed at any desired point in time. Rather than relying on “training” neural networks by running brute-force math to convert data into “weights”, our architecture keeps the actual data and gradually improves the connectome and contents of that data over time.

What this means for users, clients, and regulatory compliance is that users and clients can freely change the data these systems use, both adding and removing it, and if regulations change then removing any data required for compliance is not only possible but fairly trivial. Again, the tech industry created a fake problem in an attempt to justify theft, but the cutting edge has no such problem.
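To make the contrast concrete, here is a minimal toy sketch of what "selective removal" looks like when a system keeps records explicitly and links them in a graph. This is not the ICOM implementation, and it says nothing about how ICOM is actually built; the `RecordStore` class, its fields, and the record IDs are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RecordStore:
    """Toy store (hypothetical): keeps records verbatim and links them in a graph."""
    records: dict = field(default_factory=dict)  # record_id -> content
    links: dict = field(default_factory=dict)    # record_id -> set of linked ids

    def add(self, record_id, content, linked_to=()):
        """Store a record and connect it to existing records."""
        self.records[record_id] = content
        self.links.setdefault(record_id, set())
        for other in linked_to:
            # Only link to records that exist, and never to itself.
            if other in self.records and other != record_id:
                self.links[record_id].add(other)
                self.links[other].add(record_id)

    def remove(self, record_id):
        """Delete one record and every link that points at it."""
        if record_id not in self.records:
            return False
        for other in self.links.pop(record_id, set()):
            self.links[other].discard(record_id)
        del self.records[record_id]
        return True


# Usage: add a public record and a personal one, then remove only the personal one.
store = RecordStore()
store.add("doc-1", "public article")
store.add("user-42", "personal details", linked_to=["doc-1"])
removed = store.remove("user-42")
```

After the `remove` call, the personal record and its links are gone while `doc-1` is untouched. Nothing comparable exists for a single training example once it has been folded into network weights.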
