Sam Altman was right, the 'Dead Internet Theory' could kill the web within 3 years — "LLMs can suffer from brain rot!"
A Cornell University study reveals that the prolonged exposure of large language models to low-quality training data negatively impacts accuracy, comprehension, and thought process.

Generative AI has evolved rapidly, scaling to greater heights across a wide range of fields in computing, education, medicine, and more. The technology has come a long way from its early days, when it was synonymous with hallucinations and outright incorrect responses to queries.
As you may know, top AI labs like Anthropic, OpenAI, Google, and more are heavily dependent on content uploaded and otherwise shared by humans on the internet to train their LLMs (large language models). Last year, a report suggested that these companies had hit a wall due to a lack of high-quality content for training, preventing them from developing advanced AI models.
And as it now seems, the same issue continues to haunt AI development. According to a new study by Cornell University, LLMs can get "brain rot" from prolonged exposure to low-quality online data, and that exposure contributes heavily to a decline in their cognitive capabilities.
For context, this kind of internet "brain rot" refers to prolonged exposure and consumption of low-quality and trivial online content. Studies show that this negatively impacts human cognitive capabilities, reasoning, and focus. The same can also be said about AI-powered models.
The researchers used two measures to identify internet junk content. The first measured engagement, flagging short, viral posts with high interaction counts, while the second assessed semantic quality, flagging posts judged low-quality or written in a clickbait style.
The researchers then used these measures to construct datasets containing varying proportions of junk and high-quality content, and trained LLMs like Llama 3 and Qwen 2.5 on them to determine the impact of low-quality content.
The goal behind the study was to determine the impact on AI systems when they continuously depend on low-quality content uploaded to the web, which is seemingly flooded with short, viral, or machine-generated content.
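The dataset-construction step described above can be sketched as a simple sampling procedure. This is a minimal illustration, not the study's actual pipeline; the junk heuristic, thresholds, and field names below are assumptions made for the example.

```python
import random

def looks_like_junk(post):
    # Hypothetical heuristic mirroring the study's two measures:
    # (1) short, highly viral posts; (2) clickbait-style wording.
    short_and_viral = len(post["text"].split()) < 30 and post["likes"] > 10_000
    clickbait = any(
        phrase in post["text"].lower()
        for phrase in ("you won't believe", "shocking", "this one trick")
    )
    return short_and_viral or clickbait

def build_mixed_dataset(posts, junk_ratio, size, seed=0):
    """Sample a training set with a fixed proportion of junk content."""
    rng = random.Random(seed)
    junk = [p for p in posts if looks_like_junk(p)]
    clean = [p for p in posts if not looks_like_junk(p)]
    n_junk = int(size * junk_ratio)
    return rng.sample(junk, n_junk) + rng.sample(clean, size - n_junk)
```

Sweeping `junk_ratio` from 0.0 to 1.0 and benchmarking a model trained on each mix is what lets researchers observe the dose-response effect the study reports.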
Perhaps more concerning, the study revealed that the accuracy of AI models trained purely on junk content fell from 74.9% to 57.2%. Their long-context comprehension capabilities were also negatively impacted, dropping from 84.4% to 52.3%. The researchers further found that the models' cognitive and comprehension capabilities would only worsen with prolonged exposure to low-quality training content, a phenomenon they referred to as a dose-response effect.
The study also revealed that prolonged exposure to low-quality content negatively impacted the models' ethical consistency, prompting a "personality drift". As a result, the models were even more prone to generating incorrect responses to queries, making them less reliable.
Exposure to junk data also degraded the models' reasoning: they often skipped steps in their chain of thought, rushing through the process and generating superficial responses.
The Dead Internet Theory is turning into a reality
Over the past few months, top figures in the tech industry, including Reddit co-founder Alexis Ohanian and OpenAI CEO Sam Altman, have sparked interesting conversations about the "dead internet theory" becoming a reality in the agentic AI era.
Ohanian recently claimed that much of the internet today is dead because of the rise of bots and quasi-AI. However, he predicted the emergence of a next generation of social media that's verifiably human.
You all prove the point that so much of the internet is now just dead—this whole dead internet theory, right? Whether it’s botted, whether it’s quasi-AI, LinkedIn slop. Having proof of life, like live viewers and live content, is really f–king valuable to hold attention.
— Reddit co-founder Alexis Ohanian
OpenAI CEO Sam Altman seemingly shares similar sentiments, suggesting that the dead internet theory is manifesting right before our eyes. He further claimed that most X accounts are being managed by LLMs.
Last year, a study by Amazon Web Services (AWS) researchers suggested that 57% of content published online is AI-generated or translated using an AI algorithm, negatively impacting the quality of search results.
Former Twitter CEO and co-founder Jack Dorsey warned that it'll become impossible to tell what's real from what's fake "because of the way images are created, deep fakes, and videos." He cautioned that users will need to be more vigilant and experience things for themselves to verify their authenticity.

Kevin Okemwa is a seasoned tech journalist based in Nairobi, Kenya with lots of experience covering the latest trends and developments in the industry at Windows Central. With a passion for innovation and a keen eye for detail, he has written for leading publications such as OnMSFT, MakeUseOf, and Windows Report, providing insightful analysis and breaking news on everything revolving around the Microsoft ecosystem. While AFK and not busy following the ever-emerging trends in tech, you can find him exploring the world or listening to music.