

AI passes math test like an average high-school student

Researchers from the University of Washington and the Allen Institute for Artificial Intelligence (AI2) have developed computer software that scored 49% on high-school geometry SAT tests, an average score for a human but a great one for current AIs.

Mihai Andrei
September 22, 2015 @ 4:06 am


Image via Pixabay.

Considering how computers work, you’d think they should ace any test, especially a math test, but the key difference is the way the test was presented. It wasn’t presented in binary or in some other form that the AI could process natively. It was presented as actual text, just like a regular student would receive it. This means the software had to understand not only the written questions, but also the accompanying diagrams and charts.

“Unlike the Turing Test, standardized tests such as the SAT provide us today with a way to measure a machine’s ability to reason and to compare its abilities with that of a human,” said Oren Etzioni, CEO of AI2. “Much of what we understand from text and graphics is not explicitly stated, and requires far more knowledge than we appreciate. Creating a system to successfully take these tests is challenging, and we are proud to achieve these unprecedented results.”

The AI is called GeoS, and its breakthrough is indeed significant. While programmers have no trouble expressing a problem in a form that software can process and crunch, they struggle when they have to make a computer understand things the way a human does.

It works by reading and interpreting the text and diagrams, matching them with possible logical interpretations, and passing those to its geometry solver. It then compares its solution with the multiple-choice options given in the paper.
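To make those steps concrete, here is a minimal sketch in Python of the same three-stage idea: rank candidate readings of a question, run the best one through a solver, and pick the closest multiple-choice option. It is not GeoS's actual code. The `Interpretation` class, the relation names, the plausibility scores and the sample question (a right triangle with legs 3 and 4) are all assumptions made up for illustration, and the hard part, turning text and diagrams into such readings, is skipped entirely.

```python
import math
from dataclasses import dataclass


@dataclass
class Interpretation:
    """A hypothetical machine-readable reading of one question."""
    relation: str   # illustrative label, e.g. "right_triangle_hypotenuse"
    values: tuple   # quantities assumed to be extracted from text and diagram
    score: float    # assumed plausibility of this reading


def solve(interp):
    """Toy stand-in for a geometry solver; knows only two relations."""
    if interp.relation == "right_triangle_hypotenuse":
        a, b = interp.values
        return math.hypot(a, b)
    if interp.relation == "circle_area":
        (r,) = interp.values
        return math.pi * r * r
    return None  # this reading cannot be solved


def answer(interpretations, choices):
    """Try readings from most to least plausible, then pick the closest option."""
    ranked = sorted(interpretations, key=lambda i: i.score, reverse=True)
    for interp in ranked:
        value = solve(interp)
        if value is not None:
            label = min(choices, key=lambda k: abs(choices[k] - value))
            return label, value
    return None, None


# Two made-up readings of "what is the hypotenuse of a right triangle with legs 3 and 4?"
candidates = [
    Interpretation("right_triangle_hypotenuse", (3.0, 4.0), score=0.9),
    Interpretation("circle_area", (3.0,), score=0.2),
]
choices = {"A": 4.0, "B": 5.0, "C": 7.0, "D": 12.0}

label, value = answer(candidates, choices)
print(label, value)
```

With these made-up inputs the sketch selects option B, since the higher-scored reading solves to a hypotenuse of 5. In a real system the readings would come from language and diagram understanding rather than being written by hand.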

“We’re excited about GeoS’s performance on real-world tasks,” said Ali Farhadi, a senior research manager at AI2. “Our biggest challenge was converting the question to a computer-understandable language. One needs to go beyond standard pattern matching approaches for problems like solving geometry questions that require in-depth understanding of text, diagrams and reasoning.”

GeoS is just one of many projects currently trying to pass exams designed for humans. The Allen Institute’s Project Aristo is trying to master fourth-grade science, while Fujitsu and IBM are working on passing the University of Tokyo entrance exam.
