ChatGPT Got Destroyed in Chess by a 1970s Atari Console. But Should You Be Surprised?

ChatGPT’s chess skills falter against a 46-year-old video game in a quirky AI test.

Tibi Puiu
June 12, 2025 @ 10:05 pm

Credit: ZME Science/SORA.

By most measures, ChatGPT 4o is one of the most advanced language models ever created. It can write essays, code entire apps from scratch, translate languages, draft complex legal arguments, and — depending on who you ask — flirt with the very boundaries of human-like intelligence.

But last weekend, it lost a game of chess. Not to a human grandmaster or even to some other fancy AI.

It lost to an Atari 2600 that first appeared in the 1970s and can only calculate one or two chess moves in advance.

An Unlikely Matchup

The Atari 2600. Credit: Wikimedia Commons.

Robert Caruso, a Citrix engineer and self-proclaimed tinkerer, wasn’t out to humiliate the most expensive AI on the market today. He just wanted to see what would happen.

“I was curious how quickly ChatGPT would beat a chess computer that can only think one or two moves ahead,” Caruso said in a detailed post on LinkedIn.

So, he dusted off an emulation of the 1979 game Video Chess (originally designed for the Atari 2600, a home console released in 1977) and set up a match between the game and ChatGPT 4o, the latest model from OpenAI that cost around $60 million to train. He used screenshots to show the board and asked ChatGPT to suggest moves in real time.

Expectations were modest. Video Chess is notoriously simple. The Atari’s processor ran at just 1.19 MHz — millions of times slower than the systems that now power modern AI. Its chess engine is severely outdated.

And yet, as Caruso described it, “ChatGPT got absolutely wrecked on the beginner level.”
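
For readers curious what "thinking one or two moves ahead" can look like in code, here is a minimal sketch of a depth-limited minimax search over a made-up game tree. It is not the actual Video Chess routine, which this article does not describe. The `Position` class, the scores, and the function names are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Position:
    """Hypothetical toy position, not a real chess board."""
    score: int  # static estimate from the searching player's point of view
    children: List["Position"] = field(default_factory=list)


def minimax(pos: Position, depth: int, maximizing: bool) -> int:
    # Stop at the depth limit or when no further moves exist.
    if depth == 0 or not pos.children:
        return pos.score
    values = [minimax(child, depth - 1, not maximizing) for child in pos.children]
    return max(values) if maximizing else min(values)


def best_move(pos: Position, depth: int) -> int:
    """Index of the move the searching player prefers with a `depth`-ply lookahead."""
    # After our move it is the opponent's turn, so the children are minimizing nodes.
    values = [minimax(child, depth - 1, False) for child in pos.children]
    return values.index(max(values))


# Move 0 wins material right away (+3) but allows a crushing reply (-9).
# Move 1 looks quiet (0) and stays safe against every reply.
root = Position(0, [
    Position(3, [Position(-9), Position(2)]),
    Position(0, [Position(1), Position(0)]),
])

one_ply_choice = best_move(root, 1)  # sees only the immediate gain
two_ply_choice = best_move(root, 2)  # also considers the opponent's reply
```

In this toy tree, a one-ply search grabs the material with move 0, while a two-ply search notices the reply and picks move 1. Even a very shallow search is still checking each candidate move against the opponent's reply.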

A Comical Collapse

What Atari Video Chess looks like. The pictograms for the chess pieces gave ChatGPT problems, but the LLM also struggled when it was fed moves in standard chess notation. Credit: YouTube.

The game lasted about 90 minutes and ChatGPT struggled from the outset. It misidentified pieces, confused rooks for bishops, and missed obvious tactical threats like pawn forks. At some points, it even lost track of the board entirely.

“It made enough blunders to get laughed out of a 3rd-grade chess club,” Caruso wrote.

At first, the AI blamed the Atari’s abstract icons. So, Caruso tried switching to standard chess notation, giving ChatGPT a more familiar frame of reference. It didn’t help. Even with Caruso gently steering it away from the worst blunders, the chatbot fell apart. Eventually, it asked if they could “start over.”

“It conceded,” Caruso confirmed.

To be clear, ChatGPT isn’t a chess engine. It wasn’t designed to calculate variations or evaluate board positions with pinpoint accuracy. Unlike specialized chess programs like Stockfish (which boasts an Elo rating above 3600, hundreds of points more than the best human grandmasters), ChatGPT is a general-purpose large language model. Its job is to predict the next best word in a sentence, not the next best move on a chessboard.

Still, this loss stings for a platform hailed by many as a milestone on the road to artificial general intelligence.

But ChatGPT Is Not a Chess Genius

Since at least the 1950s, chess has served as a kind of benchmark for machine intelligence. IBM’s Deep Blue shocked the world in 1997 when it beat then-world champion Garry Kasparov. That machine used brute force, evaluating up to 200 million positions per second.

Today’s chess engines are far stronger. They can destroy the world’s best human players. Even modest engines running on smartphones can do the same.

So, how did ChatGPT, backed by billions in research and powered by data centers humming with cutting-edge hardware, lose to a four-decade-old 8-bit console?

The simple reason is that not all AIs are built the same.

Language models like ChatGPT are built to understand and generate human language, not to reason symbolically about rules and logic-heavy games like chess. They can describe chess. They can explain strategy. But they don’t play chess in the traditional sense. They simulate what a conversation about chess might sound like.

That distinction can be subtle, but it’s important.

ChatGPT can explain what a Sicilian Defense is. It can discuss the brilliance of Magnus Carlsen’s endgames. But when asked to play, it’s merely guessing what someone might say if they were playing chess.

In essence, it wasn’t really thinking about the board or even playing — it was narrating.
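
The gap between predicting plausible text and evaluating a position can be illustrated with a deliberately tiny sketch. The code below is a bigram counter trained on a handful of invented move lists. It is a drastic simplification used only for illustration and is not how ChatGPT works internally. The `GAMES` data and both function names are assumptions made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical "training corpus": opening moves written as text tokens.
GAMES = [
    ["e4", "e5", "Nf3", "Nc6", "Bb5"],
    ["e4", "e5", "Nf3", "Nc6", "Bc4"],
    ["e4", "c5", "Nf3", "d6", "d4"],
    ["d4", "d5", "c4", "e6", "Nc3"],
]


def train_bigrams(games):
    """Count which token tends to follow which in the corpus."""
    followers = defaultdict(Counter)
    for game in games:
        for prev, nxt in zip(game, game[1:]):
            followers[prev][nxt] += 1
    return followers


def predict_next(followers, prev_token):
    """Return the most frequent follower of `prev_token`, or None if unseen."""
    counts = followers.get(prev_token)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


model = train_bigrams(GAMES)
reply_to_e4 = predict_next(model, "e4")    # "e5": the most common continuation
reply_to_nf3 = predict_next(model, "Nf3")  # "Nc6": chosen by frequency alone
```

Nothing in this sketch stores a board, so it has no way to check whether a suggested move is legal or sensible in the current position. It only reports what usually comes next in the text it has seen.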

The Limits of Language Intelligence

The Atari chess engine that beat ChatGPT was built for a single task. ChatGPT was not. Its generality — its ability to talk about everything from Shakespeare to statistical mechanics — is what makes it remarkable. But it’s also what makes it vulnerable to failure in specific, rule-based environments like chess.

More recently, neural network-based engines like Leela Chess Zero (LCZero) have taken a different route. Instead of relying on brute force like Stockfish, they rely on pattern recognition and deep learning, training by playing millions of games against themselves. In 2018, AlphaZero, a closed system from Google’s DeepMind on which LCZero is based, redefined what was possible when it learned chess from scratch and then trounced Stockfish in a series of games. These AIs are built for one thing, playing chess, and they can destroy not only the best human champions but also most other chess computers.

Despite these radically different approaches, the top engines are now neck-and-neck. In fact, according to the Swedish Chess Computer Association (SSDF), Stockfish and LCZero are separated by just four Elo points.
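
To put a four-point gap in perspective, the standard Elo formula converts a rating difference into an expected score, where a win counts as 1, a draw as 0.5 and a loss as 0. The short sketch below applies that formula. The function name `expected_score` is an illustrative choice, and the result says nothing about how the SSDF computes its own list.

```python
def expected_score(rating_gap: float) -> float:
    """Expected score for the higher-rated side under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** (-rating_gap / 400.0))


# A four-point gap gives the stronger side an expected score of about 0.506.
four_point_gap = expected_score(4)
```

An expected score of roughly 0.506 per game is very close to the 0.5 of an even match, which is why a four-point gap counts as neck-and-neck.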

To its credit, ChatGPT did not gloat, protest, or flip the board over in a huff. It simply asked to try again.

That humility might be the most human thing about it. Just don’t ask it to play white.
