Copilot and ChatGPT went up against a 4 KB Atari chess game from the 70s, with an embarrassing effort from Microsoft's AI
Despite promising "a strong fight", Microsoft's golden AI chatbot couldn't even remember the chessboard state.

Citrix engineer Robert Caruso recently posted on LinkedIn about testing both Microsoft's Copilot chatbot and ChatGPT against an Atari chess game from the 1970s. The challenge left the 46-year-old, 4 KB Atari 2600 Video Chess game undefeated against modern artificial intelligence.
According to Caruso, the decision to pit ChatGPT against an Atari-built "AI" was born from a conversation with the chatbot about the differences between the open-source chess engine Stockfish and the program AlphaZero. The conversation took a turn, Caruso shared, when ChatGPT claimed it was "a strong player in its own right and would easily beat Atari's Video Chess."
So, the engineer set up a match in Atari 2600 Video Chess, originally released in 1979, via the Stella emulator. During the match-up, ChatGPT confused the game pieces and lost track of the board state, even with Caruso stepping in to correct its awareness of the board. The 90-minute match ultimately ended in the chatbot's defeat at the beginner level. Not a great start.
The battle between humans, AI, and the game of chess has been raging since 1997, when IBM's Deep Blue supercomputer defeated Russian chess grandmaster and former World Chess Champion Garry Kasparov.
It was only natural that Caruso couldn't stop with a failed match between ChatGPT and Atari. He decided to repeat the experiment with Microsoft's golden child, Copilot. "Imagine everyone's head exploding if a MICROSOFT product outperformed ChatGPT," Caruso wrote.
As he had with ChatGPT, he began with a pre-game "conversation" with the chatbot. Copilot reportedly claimed that, unlike ChatGPT, it could keep track of the board, but once Caruso asked Microsoft's chatbot to render the board as it imagined it, the reality of the match became clearer.
Copilot's board was different from the previous screenshot Caruso had fed into it.
"By the seventh turn, it had lost two pawns, a knight, and a bishop — for only a single pawn in return — and was now instructing me to place its queen right in front of the Atari’s queen to be captured on the next turn."
Atari 2600 Video Chess: 2.
Modern LLMs: 0.
Caruso has suggested he may even retry the experiment with other LLMs such as Google Gemini.
I can't get on board with these AI chatbots
With the rise of consumer-facing AI technology comes an overwhelming need for companies to boast about what their Large Language Model chatbots are capable of. We're constantly fed a steady drip of news posts about how AI is going to replace software engineers and middle managers.
When put to the actual test, however, the cracks in modern AI's PR image start to show. These AI chatbots can't even best a nearly 50-year-old program that consists of 4 KB of data at a game we teach children to play.
LLM-based chatbots like ChatGPT and Microsoft Copilot simply aren't capable of abstract thinking or persistent memory, even when an engineer is desperately anthropomorphizing the chatbot and giving it every opportunity to succeed. The models don't actually "learn" anything in a meaningful capacity; they simply regurgitate what has been fed to them, like a glorified predictive text module.
Even Microsoft co-founder Bill Gates doesn't believe that AI can be trained to replicate creativity and human judgment. Still, companies like OpenAI, Google, and Microsoft continue to push for humans to be replaced by AI, even in managerial roles.
Microsoft executives touted AI tools for resume writing on LinkedIn following a recent wave of mass layoffs at Microsoft, the most valuable company in the world. Those layoffs, in and of themselves, were a tool for Microsoft to further bolster its investments in AI.
If AI can't tell the difference between a rook and a bishop in an 8-bit game from the 70s, why on earth are we supposed to believe that this technology is capable of tracking sensitive medical data or generating solutions for energy systems? And why are we so determined to burn up our planet trying to do it?

Cole is the resident Call of Duty know-it-all and indie game enthusiast for Windows Central. She's a lifelong artist with two decades of experience in digital painting, and she will happily talk your ear off about budget pen displays.