AI passes math test like an average high-school student

Researchers from the University of Washington and the Allen Institute for Artificial Intelligence (AI2) have developed software that scored 49% on high-school SAT geometry questions - an average score for a human, but a great one for current AIs.

Mihai Andrei
September 22, 2015 @ 4:06 am



Considering how computers work, you'd think they should ace any test, especially a math test, but the key difference is the way the test was presented. It wasn't given in binary or in any other form that an AI could process directly. It was presented as actual text, just as a regular student would receive it. This means that the software had to understand not only the written questions, but also the accompanying diagrams and charts.

“Unlike the Turing Test, standardized tests such as the SAT provide us today with a way to measure a machine's ability to reason and to compare its abilities with that of a human,” said Oren Etzioni, CEO of AI2. “Much of what we understand from text and graphics is not explicitly stated, and requires far more knowledge than we appreciate. Creating a system to successfully take these tests is challenging, and we are proud to achieve these unprecedented results.”

The AI is called GeoS, and its breakthrough is indeed significant. Programmers have little trouble restating a problem in a form that software can process, but they struggle when they have to make a computer interpret a problem the way a human reads it.

It works by reading and interpreting the text and diagrams, matching them with possible logical interpretations, and passing those to its geometry solver. It then compares its solution with the multiple-choice options given in the paper.
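The last step, comparing a computed solution with the multiple-choice options, can be illustrated with a toy sketch. This is not GeoS's actual code: the function name `pick_choice`, the tolerance, the choice to abstain when nothing matches, and the sample question are all assumptions made up for illustration.

```python
import math

def pick_choice(computed, options, tolerance=1e-2):
    """Return the label of the option closest to the computed value.

    Returns None when no option lies within the tolerance. Abstaining
    this way is an assumed design choice, not something the article states.
    """
    best_label, best_gap = None, math.inf
    for label, value in options.items():
        gap = abs(value - computed)
        if gap < best_gap:
            best_label, best_gap = label, gap
    return best_label if best_gap <= tolerance else None

# Hypothetical question: the hypotenuse of a right triangle with legs 3 and 4.
computed = math.hypot(3, 4)
options = {"A": 4.0, "B": 5.0, "C": 6.0, "D": 7.0, "E": 12.0}
answer = pick_choice(computed, options)
```

Here `answer` ends up as the label whose value matches the solver's result. The hard part the researchers describe, turning text and diagrams into something a solver can use, is not shown.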

“We’re excited about GeoS’s performance on real-world tasks,” said Ali Farhadi, a senior research manager at AI2. “Our biggest challenge was converting the question to a computer-understandable language. One needs to go beyond standard pattern matching approaches for problems like solving geometry questions that require in-depth understanding of text, diagram and reasoning.”

GeoS is just one of many projects trying to get software to pass exams written for humans. The Allen Institute’s Project Aristo is trying to master fourth-grade science, while Fujitsu and IBM are working on passing the University of Tokyo entrance exam.
