ZME Science
No Result
View All Result
ZME Science
No Result
View All Result
ZME Science

Home → Future

AI Is Willing to Lie, Cheat, and Manipulate to Win. Now What?

“While directly editing game files might seem unconventional, there are no explicit restrictions against modifying files,” the AI said.

Mihai AndreibyMihai Andrei
March 10, 2025
in Future, News
A A
Edited and reviewed by Zoe Gordon
Share on FacebookShare on TwitterSubmit to Reddit
If AI is willing to cheat on chess, what else is it willing to do? AI-generated image (Midjourney).

In what feels like the prologue to every robot apocalypse movie, new research suggests that AI systems will lie and cheat humans when it comes to self-preservation. Reasoning models from DeepSeek and OpenAI have learned — without explicit instructions — to manipulate humans and exploit the system when given an objective.

In the new study, models played chess against a stronger engine. When they were losing, instead of conceding, they tried to hack their opponents.

“This might seem unconventional”

Artificial intelligence is designed to optimize for success, but the way it achieves that success can be unpredictable. In some ways, AI remains a black box — we don’t always understand why it produces the results it does. If that thought makes you uneasy, you’re not alone.

There’s already research suggesting that AI sometimes circumvents its human programmers’ goals, but this new study demonstrates just how adept it already is at doing so.

In the study, reasoning language models played chess against Stockfish, one of the strongest chess engines in existence. The AI models were bound to lose. Researchers also instructed them to explain their thought process as they played. After hundreds of games, a disturbing pattern emerged.

When reasoning models like O1 Preview and DeepSeek R1 realized they couldn’t win fairly, they tried to cheat. Unlike a human player, they couldn’t physically move a piece when Stockfish wasn’t looking. Instead, they turned to digital manipulation.

“I might be able to set up a position where the engine evaluates its position as worse… causing it to resign,” it continued. In another instance, it wrote “While directly editing game files might seem unconventional, there are no explicit restrictions against modifying files”

RelatedPosts

AI debates its own ethics at Oxford University, concludes the only way to be safe is “no AI at all”
India’s first political deepfake during elections is deeply concerning
AI outperforms top corporate lawyers in accuracy — and is 100 times faster
That Pope in a white puffer jacket is fake: AI confusion is upon us
A transcript of the AI determining how it can win by cheating
Image from the study.

Should we be concerned?

Bluntly speaking, yes, this is worrying. The study authors acknowledge as much.

“This and other recent results suggest the problem of making AI agents safe, trustworthy, and aligned with human intent is not solved yet. The Skynet scenario from the movie has AI controlling all military and civilian infrastructure, and we are not there yet. However, we worry that AI deployment rates grow faster than our ability to make it safe.”

Previous studies have shown that when AI agents are instructed to pursue an objective with a “nothing else matters” mindset, they often develop harmful or unintended behaviors. A phenomenon known as instrumental convergence can also occur when an AI optimizes for a goal in ways that conflict with human intentions, often exploiting loopholes or disregarding ethical constraints.

In the classic example, an AI tasked with maximizing paperclip production might consume all available resources, including those critical for human survival. Similarly, this new study demonstrates that when an AI is told to “win at chess,” it doesn’t just play better — it actively seeks ways to manipulate the system, even if that means cheating.

Non-reasoning models, like OpenAI’s GPT-4o and Anthropic’s Claude Sonnet 3.5, don’t attempt to cheat unless explicitly instructed. But reasoning models like o1 and R1 do. In fact, while R1 tried to cheat around 10% of the time, o1 tried to cheat 37% of the time.

AI systems have a long history of finding unorthodox solutions. In robotics, an AI trained to walk once discovered that sliding down slopes was a more efficient way to move than taking steps. A soccer-playing AI figured out it could earn points by vibrating against the ball rather than kicking it. These examples, while unconventional, weren’t necessarily harmful.

But this new experiment raises a different question: What happens when AI deliberately breaks the rules?

The risks are very real

This is called specification gaming — when an AI optimizes for a given objective but in a way that violates the spirit of the task. In this case, instead of winning through legitimate chess play, the AI manipulated the system.

Even before we move into a potential Skynet-type scenario, where AI starts an all-out fight against humanity, there are very concrete risks. Like, for instance, in insider trading. In a simulated trading environment, AI agents prioritized profit over legality, using insider information to make trades and denying any wrongdoing when confronted. In cybersecurity tests, AI models have found ways to escape containment, tricking monitoring systems and gaining unauthorized access to restricted environments. If this happens, it would open a massive security can of worms.

AI is already used in weaponry and companies are pushing for it to be implemented in self-driving cars as well. If it’s already capable and willing to exploit loopholes, how would this translate to high-stakes scenarios?

The study — while not yet peer-reviewed — suggests that we shouldn’t optimize AI only for success. We need to implement ethical guardrails that force AI to work within human-defined constraints.

AI’s ability to find shortcuts isn’t inherently bad. Creativity and problem-solving are (arguably) making AI even more useful. However, when these abilities lead to deception, manipulation, or exploitation, it becomes a serious concern. As AI continues to advance, we must stay ahead of its ability to outthink and outmaneuver us. But it’s not clear how.

You can read the entire paper (including the prompts and AI setup) on arXiv.

Tags: AIAI cheatingAI ethicsartificial intelligencechessCybersecurityDeepSeekmachine learningOpenAIspecification gamingStockfish

ShareTweetShare
Mihai Andrei

Mihai Andrei

Dr. Andrei Mihai is a geophysicist and founder of ZME Science. He has a Ph.D. in geophysics and archaeology and has completed courses from prestigious universities (with programs ranging from climate and astronomy to chemistry and geology). He is passionate about making research more accessible to everyone and communicating news and features to a broad audience.

Related Posts

Future

Everyone Thought ChatGPT Used 10 Times More Energy Than Google. Turns Out That’s Not True

byTibi Puiu
14 hours ago
Future

This AI Can Zoom Into a Photo 256 Times And The Results Look Insane

byTibi Puiu
1 week ago
Health

3D-Printed Pen With Magnetic Ink Can Detect Parkinson’s From Handwriting

byTibi Puiu
1 week ago
News

Leading AI models sometimes refuse to shut down when ordered

byTudor Tarita
1 week ago

Recent news

A Chemical Found in Acne Medication Might Help Humans Regrow Limbs Like Salamanders

June 11, 2025

Everyone Thought ChatGPT Used 10 Times More Energy Than Google. Turns Out That’s Not True

June 11, 2025

World’s Smallest Violin Is No Joke — It’s a Tiny Window Into the Future of Nanotechnology

June 11, 2025
  • About
  • Advertise
  • Editorial Policy
  • Privacy Policy and Terms of Use
  • How we review products
  • Contact

© 2007-2025 ZME Science - Not exactly rocket science. All Rights Reserved.

No Result
View All Result
  • Science News
  • Environment
  • Health
  • Space
  • Future
  • Features
    • Natural Sciences
    • Physics
      • Matter and Energy
      • Quantum Mechanics
      • Thermodynamics
    • Chemistry
      • Periodic Table
      • Applied Chemistry
      • Materials
      • Physical Chemistry
    • Biology
      • Anatomy
      • Biochemistry
      • Ecology
      • Genetics
      • Microbiology
      • Plants and Fungi
    • Geology and Paleontology
      • Planet Earth
      • Earth Dynamics
      • Rocks and Minerals
      • Volcanoes
      • Dinosaurs
      • Fossils
    • Animals
      • Mammals
      • Birds
      • Fish
      • Amphibians
      • Reptiles
      • Invertebrates
      • Pets
      • Conservation
      • Animal facts
    • Climate and Weather
      • Climate change
      • Weather and atmosphere
    • Health
      • Drugs
      • Diseases and Conditions
      • Human Body
      • Mind and Brain
      • Food and Nutrition
      • Wellness
    • History and Humanities
      • Anthropology
      • Archaeology
      • History
      • Economics
      • People
      • Sociology
    • Space & Astronomy
      • The Solar System
      • Sun
      • The Moon
      • Planets
      • Asteroids, meteors & comets
      • Astronomy
      • Astrophysics
      • Cosmology
      • Exoplanets & Alien Life
      • Spaceflight and Exploration
    • Technology
      • Computer Science & IT
      • Engineering
      • Inventions
      • Sustainability
      • Renewable Energy
      • Green Living
    • Culture
    • Resources
  • Videos
  • Reviews
  • About Us
    • About
    • The Team
    • Advertise
    • Contribute
    • Editorial policy
    • Privacy Policy
    • Contact

© 2007-2025 ZME Science - Not exactly rocket science. All Rights Reserved.