AI solves, explains, and proposes new university-level math problems

Although computers famously calculate far faster and more accurately than humans, they are lackluster in terms of general intelligence. Machine learning algorithms often get stumped even by rather narrow and well-defined math questions, including some simple high school-level problems.

Now, researchers at MIT have taken things to the next level with a neural network model that not only solves university-level math problems in an instant but can also explain the solutions step by step, as if a professor were guiding a student. Moreover, the AI can come up with its own math problems.

The applications could be quite useful and immediate — and no, I don’t mean cheating on your math homework. Many students across the world are enrolled in so-called massive open online courses (MOOCs), some of which see thousands of concurrent students participating in them. The main shortcoming of MOOCs compared to a traditional classroom is that student-teacher interactions are minimal or nonexistent. After all, how many emails can a teacher answer in only 24 hours? The new AI could fill this gap, acting as an automated tutor that shows undergrad students the steps required to solve a math problem.

“We think this will improve higher education,” says Iddo Drori, a lecturer in the MIT Department of Electrical Engineering and Computer Science (EECS) and the study’s lead author. “It will help students improve, and it will help teachers create new content, and it could help increase the level of difficulty in some courses. It also allows us to build a graph of questions and courses, which helps us understand the relationship between courses and their prerequisites, not just by historically contemplating them, but based on data.”

When Drori and colleagues first embarked on their ambitious task of making a new AI that can solve more complex math problems, they ran into a lot of roadblocks. When they tried out models pre-trained using text only, the accuracy on high school math problems was atrocious: the models got the right answer only 8% of the time. They had much better luck with graph neural networks, which answered questions from a machine learning course very accurately, but the drawback was that they would need at least a week to train.

The turning point was when the researchers applied some out-of-the-box thinking. They presented their model with a bunch of questions from undergrad math courses it had never seen before, and turned the math questions into programming tasks. For instance, rather than asking the AI ‘find the distance between points A and B’, the researchers prompted the computer program to ‘write a program that finds the distance between two points.’ That’s pretty meta, but it worked.
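To make the reframing concrete, here is a minimal sketch of the kind of program such a prompt could yield for the distance question. It is illustrative only: the function name `distance` and the sample points are assumptions, not output from the researchers' model.

```python
import math


def distance(point_a, point_b):
    """Return the Euclidean distance between two 2D points given as (x, y) pairs."""
    dx = point_b[0] - point_a[0]
    dy = point_b[1] - point_a[1]
    # math.hypot computes sqrt(dx**2 + dy**2)
    return math.hypot(dx, dy)


# Hypothetical sample points A = (0, 0) and B = (3, 4)
print(distance((0, 0), (3, 4)))  # 5.0
```

The point of the reframing is that the answer comes from running the program rather than from the model guessing a number directly.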

As you may imagine, converting a math question into a programming task is no trivial undertaking. Many such problems require some additional context in order to be parsed and solved correctly — context that students typically pick up from attending courses but which a neural network doesn’t necessarily have access to unless it is ‘spoon fed’ by the researchers.

To work around these many challenges, the researchers used a pre-trained neural network called Codex that was shown millions of examples of code from online repositories like GitHub, but also millions of natural language words. Essentially, the model they built could understand both pieces of text and code. With just a few question-to-code examples, the new AI could then interpret a text question, such as a math problem, and then run code that answers the problem.
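The few-shot idea can be sketched roughly as follows. This is not the researchers' pipeline and it does not call Codex: the example questions, the prompt layout, and the helper names `build_prompt`, `stub_model`, and `run_generated_code` are all assumptions made for illustration, and `stub_model` is a hard-coded stand-in for a real code-generation model.

```python
# Hypothetical question-to-code pairs shown to the model as examples.
FEW_SHOT_EXAMPLES = [
    (
        "Find the distance between the points (0, 0) and (3, 4).",
        "import math\nanswer = math.hypot(3 - 0, 4 - 0)",
    ),
    (
        "Compute the sum of the integers from 1 to 100.",
        "answer = sum(range(1, 101))",
    ),
]

INSTRUCTION = "Write a program that computes the answer."


def build_prompt(question, examples=FEW_SHOT_EXAMPLES):
    """Assemble a prompt: solved question/code pairs, then the new question."""
    parts = []
    for example_question, example_code in examples:
        parts.append(f'"""{example_question}\n{INSTRUCTION}"""\n{example_code}\n')
    # The new question is left without code for the model to complete.
    parts.append(f'"""{question}\n{INSTRUCTION}"""\n')
    return "\n".join(parts)


def stub_model(prompt):
    """Stand-in for a code-generation model; always returns the same program."""
    return "import math\nanswer = math.factorial(7)"


def run_generated_code(code):
    """Execute generated code and read the result from a variable named 'answer'.

    Real model output is untrusted and should run in a sandbox, not a bare exec.
    """
    namespace = {}
    exec(code, namespace)
    return namespace["answer"]


prompt = build_prompt("What is 7 factorial?")
generated = stub_model(prompt)
print(run_generated_code(generated))  # 5040
```

In this sketch the convention that every program stores its result in `answer` is what lets the caller read the result back after execution.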

“When you just ask a question in text, it is hard for a machine-learning model to come up with an answer, even though the answer may be in the text,” Drori explains. “This work fills in that missing piece of using code and program synthesis.”

The text-to-code approach registered an accuracy of over 80% in solving math problems, compared to just 8% for previous models.

The model was also used to generate new questions. The neural network was first given a series of math problems on a topic and then asked to create new problems. When students from campus were shown ten math problems for their undergrad math course (five of which were created by humans and the other five by the AI), they couldn’t tell which were machine-generated.

“In some topics, it surprised us. For example, there were questions about quantum detection of horizontal and vertical lines, and it generated new questions about quantum detection of diagonal lines. So, it is not just generating new questions by replacing values and variables in the existing questions,” Drori says.

The researchers have now extended the model to also handle math proofs, which are, technically speaking, even more challenging. The most immediate practical goal for the neural network is to improve course design and curricula, which is why the researchers at MIT plan on scaling the model up to hundreds of different courses.

The findings appeared in the Proceedings of the National Academy of Sciences.