

More than 24,000 AI-readable coronavirus scientific articles go online

The sum of human knowledge on the new coronavirus is now online, in a format readable by artificial intelligence.

Tibi Puiu
March 19, 2020 @ 2:05 am



Scientists all over the world are working around the clock on candidate vaccines, antiviral treatments, and just about anything else they can throw at the novel coronavirus. To aid and accelerate these efforts, a database that pools more than 24,000 research papers on SARS-CoV-2 (the virus that causes COVID-19) and other coronaviruses is now online in a single place.

The most comprehensive coronavirus scientific database

The COVID-19 Open Research Dataset (CORD-19) is the work of several philanthropic and research organizations, including the National Library of Medicine (NLM) at the National Institutes of Health, the Allen Institute for AI, Georgetown University, the Chan Zuckerberg Initiative, Kaggle, Microsoft, and the White House Office of Science and Technology Policy (OSTP).

Each organization contributed its own resources and know-how. For instance, the NLM provided access to scientific literature, while Microsoft used its engineering abilities to index and map the thousands of articles that were scattered across the web. The Allen Institute for Artificial Intelligence (AI2), a non-profit, converted all the articles into a common structured format that can be parsed by algorithms.

Additionally, the entire dataset is machine-readable, allowing artificial intelligence (AI) systems to access and interpret the huge body of knowledge. This way, scientists might find existing safe drugs and therapies designed to treat other conditions that could prove useful in the current war on the coronavirus. Or perhaps they might find a chink in the coronavirus’ armor that has so far escaped scientists.
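To give a rough sense of what "machine-readable" makes possible, here is a minimal sketch in Python. It assumes each paper is stored as a structured record with an identifier, a title, and body text; the field names, the sample records, and the search term are hypothetical placeholders for illustration, not the actual CORD-19 format.

```python
import json

# Hypothetical records: the field names and contents below are illustrative
# assumptions, not the actual CORD-19 schema or real papers.
SAMPLE_PAPERS = json.loads("""
[
  {"paper_id": "example-1", "title": "Placeholder title A",
   "body_text": "This placeholder text mentions ExampleDrug in passing."},
  {"paper_id": "example-2", "title": "Placeholder title B",
   "body_text": "This placeholder text is about something else."}
]
""")


def papers_mentioning(papers, term):
    """Return ids of papers whose title or body contains `term` (case-insensitive)."""
    needle = term.lower()
    return [
        paper["paper_id"]
        for paper in papers
        if needle in paper["title"].lower() or needle in paper["body_text"].lower()
    ]


matches = papers_mentioning(SAMPLE_PAPERS, "exampledrug")
print(matches)
```

A simple keyword match like this is only a starting point. The appeal of a common structured format is that the same loop can run over tens of thousands of papers at once, and more capable language models can be swapped in for the plain substring test.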

Previously, Microsoft researchers had employed machine learning and natural language analysis to interpret the content of thousands of biomedical papers. This initiative led to a representation of cellular regulatory networks that was exploited to make recommendations for cancer therapies.

According to MIT Technology Review, the dataset is part of AI2’s Semantic Scholar service, which employs natural language models like ELMo and BERT to plot relationships between papers.

For a long time, there has been a fierce debate among scholars regarding access to scientific papers, many of which are behind paywalls controlled by a handful of publishers.

Proponents of open access (free, unrestricted access to scientific papers) will at least be happy to learn that, in this case, great efforts have been made to ensure the global research community has unhindered access to coronavirus-related papers.

“It’s my hope that the machine-readable content will stimulate advances in computing methods that can help investigators to develop deeper understandings and approaches to addressing the COVID-19 pandemic. Developing tools to help scientists to do research and synthesize new understandings has been a long-term aspiration in AI. Work has been underway over years on methods that can answer questions, analyze and summarize the content of numerous scientific papers, assess the credibility of clinical trials, generate and test hypotheses, and guide experimentation,” Eric Horvitz, Technical Fellow and Chief Scientific Officer at Microsoft, wrote in a recent blog post.

The dataset also includes pre-publication research posted on servers like medRxiv and bioRxiv, which are open access archives for pre-print health sciences and biology research.

“Sharing vital information across scientific and medical communities is key to accelerating our ability to respond to the coronavirus pandemic,” Chan Zuckerberg Initiative Head of Science Cori Bargmann said, referring to the CORD-19 project.
