

From hallucinations to discovery: For the first time a large language model finds new solutions to math problems

DeepMind's FunSearch revolutionizes AI, mastering complex challenges like the Cap Set Problem and ushering in a new age of machine-led scientific breakthroughs.

Tibi Puiu
December 15, 2023 @ 5:31 pm


illustration of AI finding new solutions in math
Credit: DALL-E 3.
Key takeaways:
  • 🤖 DeepMind’s FunSearch is a groundbreaking AI tool pairing a language model with an evaluator for problem-solving.
  • 🧮 FunSearch found new constructions for the Cap Set Problem in combinatorics, producing the largest increase in cap set size in 20 years.
  • 🌟 This represents a significant leap in AI-assisted scientific discovery, with potential applications in various fields.

Large Language Models (LLMs) like ChatGPT have a lot of things going for them. These powerful AI systems can synthesize and interpret vast amounts of information and are surprisingly human-like with language. At the same time, they’re also notorious for making up facts with confidence. Put simply, they “hallucinate”, as people have come to describe this annoying behavior.

A huge question ever since this technology was released is whether LLMs are capable of discovering new knowledge, rather than repurposing and rehashing existing information. As it turns out, they can.

Researchers at Google’s DeepMind have unveiled a new AI method called FunSearch, which can find new solutions to complex problems in mathematics and computer science.

The innovation of FunSearch lies in the pairing of a pre-trained LLM with an automated evaluator. This setup is designed to leverage the LLM’s strength in generating creative solutions in the form of computer code, while the evaluator rigorously checks these solutions for accuracy. The highest-performing solutions are continuously fed back into the cycle, fostering a self-improving loop of problem-solving and innovation.

This partnership enables an iterative refinement process, transforming initial creative outputs into verified, novel knowledge. The focus on discovering “functions” in computer code is what gives FunSearch its distinctive name and operational approach.

FunSearch process schematic
Schematic of how FunSearch works in finding novel solutions to open problems in math and computer science. Credit: DeepMind.
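The generate, evaluate, and feed-back cycle described above can be sketched in a few lines of Python. This is only a toy illustration, not DeepMind's implementation: the candidates here are tuples of integers rather than programs, a random mutation (the hypothetical `propose` function) stands in for the LLM, and the objective in `evaluate` is made up for the example.

```python
import random

def evaluate(candidate):
    """Automated evaluator: score a candidate, higher is better (toy objective)."""
    target = (3, -1, 4)  # hidden optimum, invented for this sketch
    return -sum((c - t) ** 2 for c, t in zip(candidate, target))

def propose(parent, rng):
    """Stand-in for the LLM: return a slightly modified copy of a good candidate."""
    child = list(parent)
    i = rng.randrange(len(child))
    child[i] += rng.choice([-1, 1])
    return tuple(child)

def search(iterations=500, pool_size=5, seed=0):
    """Keep a small pool of top-scoring candidates and keep improving on them."""
    rng = random.Random(seed)
    start = (0, 0, 0)
    pool = [(evaluate(start), start)]
    for _ in range(iterations):
        _, parent = rng.choice(pool)            # pick a high performer
        child = propose(parent, rng)            # generate a variation
        pool.append((evaluate(child), child))   # score it automatically
        pool = sorted(set(pool), reverse=True)[:pool_size]  # keep the best
    return pool[0]  # (score, candidate) of the best candidate found

best_score, best_candidate = search()
```

Because only the top-scoring candidates survive each round, the best score in the pool can never get worse, which is the self-improving property the article describes.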

According to DeepMind, this marks the first time an LLM has been used to make a new discovery on an open problem in mathematics or science. FunSearch found novel solutions to the cap set problem, a long-standing mathematical challenge.

The Cap Set Problem asks for the largest possible set of points in an n-dimensional grid, where every coordinate is 0, 1, or 2 (so there are 3^n points in total), such that no three points in the set lie on a line. Equivalently, no three distinct points in the set may add up to zero in every coordinate when the sums are taken modulo 3. It’s a challenge in combinatorics, a field concerned with counting, arrangement, and structure. Terence Tao, a Fields Medalist and one of the world’s leading mathematicians, once described the cap set problem as perhaps his favorite open question.
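To make the definition concrete, here is a minimal Python sketch. It treats each element as a vector of n digits, each 0, 1, or 2, and calls a set a cap set when no three distinct vectors sum to zero modulo 3 in every coordinate. The function names and the optional `priority` argument are hypothetical, chosen for illustration; the greedy builder is a simple baseline, not the construction FunSearch discovered.

```python
from itertools import combinations, product

def is_cap_set(points):
    """True if no three distinct points sum to zero mod 3 in every coordinate."""
    for a, b, c in combinations(points, 3):
        if all((x + y + z) % 3 == 0 for x, y, z in zip(a, b, c)):
            return False
    return True

def greedy_cap_set(n, priority=None):
    """Build a cap set in dimension n by adding points one by one when allowed.

    `priority` is an optional scoring function (an assumption of this sketch);
    points with a higher score are tried first. Without it, points are tried
    in lexicographic order.
    """
    points = list(product(range(3), repeat=n))
    if priority is not None:
        points.sort(key=priority, reverse=True)
    chosen = []
    for p in points:
        if is_cap_set(chosen + [p]):
            chosen.append(p)
    return chosen

cap = greedy_cap_set(2)
```

In two dimensions this greedy pass returns four points, which matches the known maximum for n = 2. The difficulty is that the grid grows as 3^n, so brute force stops being practical after a few dimensions, and the order in which points are tried starts to matter a great deal.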

FunSearch succeeded in discovering new, larger cap sets, contributing valuable insights to the problem and demonstrating the potential of AI in advancing mathematical research. FunSearch’s contribution marks the largest increase in the size of cap sets in the past two decades.

“These results demonstrate that the FunSearch technique can take us beyond established results on hard combinatorial problems, where intuition can be difficult to build. We expect this approach to play a role in new discoveries for similar theoretical problems in combinatorics, and in the future it may open up new possibilities in fields such as communication theory,” wrote the DeepMind researchers in a blog post.

Moreover, FunSearch has proven itself further by enhancing algorithms for the “bin-packing” problem. The bin-packing problem is a classic algorithmic challenge. It involves efficiently packing objects of different sizes into a finite number of bins or containers in a way that minimizes the number of bins used.

Illustrative example of bin packing using existing heuristic – Best-fit heuristic (left), and using a heuristic discovered by FunSearch (right).
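For readers who want to see what a bin-packing heuristic looks like in code, the following Python sketch packs items one at a time and lets a scoring function decide which open bin receives each item. The best-fit rule shown is the classic baseline mentioned in the caption; the function names and signatures are assumptions made for this example, and the heuristic FunSearch discovered is not reproduced here.

```python
def pack(items, capacity, priority):
    """Place each item in the fitting bin with the highest priority score.

    Opens a new bin when no existing bin has room. Returns the number of bins.
    """
    remaining = []  # free space left in each open bin
    for item in items:
        fits = [i for i, free in enumerate(remaining) if free >= item]
        if fits:
            best = max(fits, key=lambda i: priority(item, remaining[i]))
            remaining[best] -= item
        else:
            remaining.append(capacity - item)
    return len(remaining)

def best_fit(item, free):
    """Best-fit rule: prefer the bin that would be left with the least free space."""
    return -(free - item)

bins_used = pack([4, 8, 1, 4, 2, 1], capacity=10, priority=best_fit)
```

Swapping in a different `priority` function changes the packing strategy without touching the rest of the code, which is why a single small function is a convenient target for an automated search.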

Unlike many computational tools that act as a “black box” and offer solutions without explanation, FunSearch outputs programs that show how its solutions are constructed.

“This show-your-working approach is how scientists generally operate, with new discoveries or phenomena explained through the process used to produce them,” add the DeepMind researchers.

The ability of FunSearch to not only generate innovative solutions but also provide the details of the problem-solving process holds immense potential. With the continual advancement of LLM technology, the capabilities of tools like FunSearch are expected to expand, paving the way for groundbreaking discoveries and solutions to some of society’s most pressing scientific and engineering challenges.

The findings were reported in the journal Nature.
