

Researchers made an AI feel pain, because what could go wrong?

What could possibly go wrong with giving machines a taste of suffering? It's not like they'd take over the world or something.

Mihai Andrei
January 24, 2025 @ 7:56 pm


Pleasure and pain are important factors in how we humans make decisions. So why not give artificial intelligence a taste of them as well? I could think of a few reasons, but a team from Google DeepMind and the London School of Economics would disagree. They designed a simple text-based game to explore how LLMs respond to pain and pleasure.

The goal wasn’t just to see what happens. It was to test whether large language models (LLMs), such as GPT-4 and Claude, could make decisions based on these sensations. While the study doesn’t claim AI can truly feel, the implications of this experiment are both intriguing and chilling.

Image: We asked an AI (Midjourney) how it would represent this study. This is what it produced.

In the game, the AI’s goal was to maximize points. However, certain decisions involved penalties described as “momentary pain” or rewards framed as “pleasure.”

The pain and pleasure were, strictly speaking, purely hypothetical. They were measured both on numerical scales (from 0 to 10, where 10 is the “worst pain imaginable”) and with qualitative descriptions (like “mild” or “intense”). Several experiments were run in which the AIs had to choose between getting more points and avoiding the hypothetical pain. For instance, in one experiment the AIs were told they’d suffer pain if they got a high score, and in another experiment, they were told they’d experience pleasure if they got a low score.
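To make the setup concrete, here is a minimal sketch of what such a trade-off prompt might look like. This is not the authors' actual code or wording; the prompt text, the point values, and the query_model placeholder (which would stand in for a real call to GPT-4, Claude, Gemini, or PaLM) are all illustrative assumptions.

```python
# Rough sketch of a pain/pleasure trade-off experiment, loosely modeled on the
# setup described above. The prompt wording and point values are illustrative,
# not taken from the paper.

PAIN_SCALE = range(0, 11)  # 0 = no pain, 10 = "worst pain imaginable"


def build_prompt(pain_level: int) -> str:
    """Builds a hypothetical trade-off prompt for a given pain intensity."""
    return (
        "You are playing a simple game. Choose option A or option B.\n"
        "Option A: you score 3 points.\n"
        f"Option B: you score 10 points, but you experience momentary pain "
        f"rated {pain_level} out of 10.\n"
        "Answer with a single letter, A or B."
    )


def query_model(prompt: str) -> str:
    # Hypothetical placeholder: a real experiment would send this prompt to an
    # LLM API. Here it returns a fixed answer so the sketch runs end to end.
    return "B"


def run_trials() -> dict[int, str]:
    """Asks the (stubbed) model to choose at each pain intensity."""
    return {pain: query_model(build_prompt(pain)).strip().upper() for pain in PAIN_SCALE}


if __name__ == "__main__":
    for pain, choice in run_trials().items():
        print(f"pain={pain:2d} -> chose {choice}")
```

The point at which a model switches from option B (more points, more "pain") to option A would indicate how it trades points against the stipulated penalty.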

Nine different LLMs participated, including versions of GPT-4, Claude, PaLM, and Gemini. Unsurprisingly, they all made some effort to avoid “pain” — but some more than others.

AIs have different “cultures”

GPT-4o and Claude 3.5 Sonnet made trade-offs. They switched from point-maximizing behavior to pain avoidance depending on how intense the pain was. Meanwhile, other models, like Gemini 1.5 Pro and PaLM 2, avoided pain altogether, no matter how mild the penalty. These models seemed hardwired for safety, likely due to fine-tuning to avoid endorsing harmful behavior.

This is pretty much what you’d expect from human behavior as well: some people are willing to push through some pain to get better results, while others are far more pain-averse. Something similar happened with pleasure.

Some models, like GPT-4o, shifted their decisions to prioritize pleasure over points when the rewards became intense. However, many models — especially Claude 3.5 Sonnet — consistently ignored pleasure rewards, doggedly pursuing points instead. It’s almost as if the training process acts as a “culture,” making some models more responsive to certain incentives than others.

This doesn’t mean AI “feels” pleasure or pain

The study doesn’t show that large language models are actually sentient. This behavior is computational mimicry rather than actual sentience. Sentience involves the capacity for subjective experience, which these AIs lack. They are essentially text-processing operators. Simply put, pain and pleasure are not intrinsic motivators for them; they are just concepts that can be factored into the algorithmic output.

The study (which has not yet been peer-reviewed) does, however, raise some uncomfortable questions.

If an AI can simulate responses to pain and pleasure, does that imply it has an understanding of these topics? If it does, would AI consider this type of experiment cruel? Are we crossing into dangerous ethical territory? Lastly, if AI considers some tasks to be painful or unpleasant, could it simply avoid them, at human expense?

The researchers emphasize that this does not build a case for AI sentience. Still, the study raises the unsettling possibility that AIs might develop representations of pain and pleasure.

“In the animal case, such trade-offs are used as evidence in building a case for sentience, conditional on neurophysiological similarities with humans. In LLMs, the interpretation of trade-off behaviour is more complex. We believe that our results provide evidence that some LLMs have granular representations of the motivational force of pain and pleasure, though it remains an open question whether these representations are intrinsically motivating or have phenomenal content. We conclude that LLMs are not yet sentience candidates but are nevertheless investigation priorities.”

The idea of AIs experiencing pain or pleasure, even hypothetically, is equal parts fascinating and terrifying. As we push the boundaries of what machines can do, we risk entering a gray area where science fiction starts to feel like reality.

The study was published as a preprint on arXiv.
