Researchers encode data in DNA hundreds of times faster than before — with panda pics

Two images were stored in and retrieved from DNA sequences faster than ever before. This could be a game-changer for our data storage.

Mihai Andrei
October 24, 2024 @ 6:23 pm

DNA could soon become a reliable medium of storage. AI-generated image.

DNA can hold a staggering amount of information. Not only is it the blueprint for all life on Earth, but a single gram of DNA can store the equivalent of 215 million gigabytes of data. That’s enough to hold every digital book, song, and movie ever created. Gram for gram, DNA can store up to a billion times more data than silicon-based storage.

The traditional method of storing data in DNA involves encoding binary information (the ones and zeros of computing) into sequences of the nucleotide bases adenine (A), thymine (T), guanine (G) and cytosine (C), and then synthesizing these sequences chemically. This method is promising, but high costs and slow writing speeds hamper it. The new study sidesteps these limitations by introducing a method that encodes data without synthesizing new DNA sequences.

For the new system, the research team, led by Cheng Zhang, used epigenetic modifications to encode data. Epigenetic modifications are chemical changes to DNA that do not alter its sequence but can influence its function. One common type is DNA methylation, where methyl groups are added to cytosine bases in the DNA sequence.

“It’s encouraging to see that epigenetic principles from biochemistry textbooks and taught in my classroom can be applied seamlessly to DNA data storage applications to solve some of the unmet challenges in this field,” says corresponding author Hao Yan.

How it all works

The team’s approach essentially “prints” data onto DNA using these methylation marks as binary data bits, or “epi-bits.” By using a library of prefabricated DNA templates and short DNA strands known as bricks, the researchers could guide where methyl groups are placed on the DNA, allowing them to encode complex information without having to synthesize new DNA molecules from scratch.
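To make the idea of epi-bits concrete, here is a minimal sketch in Python. It is a toy model, not the researchers' actual protocol: it assumes each template has a fixed number of methylation sites, that a methylated site stands for 1 and an unmethylated site for 0, and the function names and the `sites_per_template` parameter are hypothetical.

```python
def bits_to_epibits(bits, sites_per_template):
    """Split a bit string into per-template sets of sites to methylate.

    Toy assumption: a methylated site encodes 1, an unmethylated site encodes 0.
    """
    if any(b not in "01" for b in bits):
        raise ValueError("bits must contain only '0' and '1'")
    patterns = []
    for start in range(0, len(bits), sites_per_template):
        chunk = bits[start:start + sites_per_template]
        # Record the positions on this template that should carry a methyl group.
        patterns.append({i for i, b in enumerate(chunk) if b == "1"})
    return patterns


def epibits_to_bits(patterns, sites_per_template, total_bits):
    """Rebuild the bit string from the methylated sites read off each template."""
    bits = []
    for methylated in patterns:
        for site in range(sites_per_template):
            bits.append("1" if site in methylated else "0")
    # Drop padding zeros introduced by a partially filled last template.
    return "".join(bits)[:total_bits]


message = "10110010"
patterns = bits_to_epibits(message, sites_per_template=4)
restored = epibits_to_bits(patterns, sites_per_template=4, total_bits=len(message))
print(patterns, restored)
```

In this picture, choosing which sites go into each set plays the role of choosing which bricks to add to a template; the underlying DNA sequence never changes.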

One of the most remarkable features of this new approach is its ability to write data in parallel. Traditional DNA synthesis is a serial process — each nucleotide must be added one at a time, which is time-consuming and costly. However, the new system allows the researchers to add multiple epi-bits of information simultaneously, increasing the speed and scalability of data storage.

Think of writing a letter by hand: you form every character one by one, which is not very efficient. When you print something, you lay down an entire row at once, which is much faster.

“This new approach demonstrates how one can harness molecular mechanisms for innovative data solutions, bridging the fields of biology and digital information,” says Laura Na Liu, a co-author of the new study.

Coding panda pics into DNA

Recovered tiger images from samples 1 to 4 with stepwise improved writing-reading pipelines. 

The team tested their approach by storing two images: a picture of a panda and an ancient Chinese rubbing in the shape of a tiger. They then retrieved both images by reading the DNA with a sequencer.

In their experiments, the researchers stored approximately 275,000 bits of information using their new system (about 34 kilobytes). They achieved this using a set of 700 DNA “movable types” (i.e., pre-made short DNA sequences) and five universal DNA templates. This allowed them to write 350 bits of data in a single reaction, a significant improvement over traditional methods. The approach was also reliable, with error rates below 3%.
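A quick back-of-the-envelope check puts the quoted figures in perspective. The sketch below converts the bit count to kilobytes and estimates how many 350-bit reactions the full data set would take. The reaction count is an assumption of this sketch (it supposes every reaction is filled to capacity), not a number reported in the article.

```python
import math

TOTAL_BITS = 275_000       # approximate total quoted in the article
BITS_PER_REACTION = 350    # bits written in a single reaction, per the article


def bits_to_kilobytes(bits):
    """Convert bits to kilobytes (8 bits per byte, 1,000 bytes per kilobyte)."""
    return bits / 8 / 1000


def reactions_needed(total_bits, bits_per_reaction):
    """Estimate the reaction count, assuming each reaction is fully used."""
    return math.ceil(total_bits / bits_per_reaction)


print(f"{bits_to_kilobytes(TOTAL_BITS):.1f} kB")
print(reactions_needed(TOTAL_BITS, BITS_PER_REACTION), "reactions (rough estimate)")
```

Under these assumptions the data set comes to roughly 34 kB, written in a few hundred reactions rather than hundreds of thousands of single-bit steps.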

Compression and error correction coding scheme for panda image (i), and a schematic of the retrieved epi-bits on sequencing reads along with the restored image (ii).

To ensure that the data stored using epigenetic modifications could be accurately read, the researchers used high-throughput nanopore sequencing, a technology that reads DNA sequences by passing them through a tiny pore and detecting changes in electrical current.

The researchers also demonstrated how accessible their technology is. They ran a pilot experiment called “iDNAdrive,” in which 60 student volunteers with no professional biolab experience successfully encoded their own data into DNA using a simple kit. This suggests that the system is not only scalable but also user-friendly.

This marks a significant departure from earlier DNA data storage methods, which could only be carried out in a lab. In a distributed system like this, users could “write” data to DNA in their own homes and then retrieve it later through sequencing.

Big promise, big challenges

This research highlights the incredible potential of DNA as a medium for storing vast amounts of data in a compact, stable, and durable form. The innovative use of epigenetic modifications to encode data provides a new way to overcome the limitations of traditional DNA synthesis methods.

DNA is much more stable than silicon and other traditional storage media. Properly stored, DNA can last for thousands of years, making it ideal for archival purposes, such as preserving cultural artifacts, historical records, or scientific data. This method’s potential for distributed data storage could revolutionize personal data privacy and security. Instead of relying on cloud storage or data centers, individuals could store their most sensitive information in DNA, which could be kept in a secure location and accessed only when needed.

However, there are also enormous challenges ahead. For starters, only very small amounts of information were stored, and the error rates, while relatively low (<3%), are not acceptable for data we work with routinely.
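To see why a raw error rate of a few percent matters, and how redundancy can help, consider a simple repetition code. This is only an illustration: the article does not describe the error correction scheme the researchers used, and the sketch assumes the 3% figure can be read as an independent per-bit error probability. Under that assumption, 275,000 bits would contain up to about 8,250 wrong bits before any correction.

```python
def encode_repetition(bits, copies=3):
    """Store every bit several times in a row."""
    return "".join(b * copies for b in bits)


def decode_repetition(coded, copies=3):
    """Recover each bit by majority vote over its copies."""
    decoded = []
    for i in range(0, len(coded), copies):
        group = coded[i:i + copies]
        decoded.append("1" if group.count("1") * 2 > len(group) else "0")
    return "".join(decoded)


def majority_error(p):
    """Chance that a 3-copy majority vote is wrong, given per-bit error p.

    The vote fails when at least two of the three copies are flipped.
    """
    return 3 * p**2 * (1 - p) + p**3


coded = encode_repetition("1011")            # "111000111111"
damaged = coded[:1] + "0" + coded[2:]        # flip one stored copy
print(decode_repetition(damaged))            # the single flip is voted out
print(f"{majority_error(0.03):.6f}")
```

Tripling every bit cuts the effective error rate by roughly a factor of ten in this toy model, at the cost of three times the storage. Practical systems use far more efficient codes, but the trade-off between redundancy and reliability is the same.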

Another challenge is the speed of data retrieval. Although nanopore sequencing allows for high-throughput reading of DNA, it is still slower than the reading speeds of conventional digital storage devices. Advances in sequencing technology will be crucial to making DNA data storage competitive with silicon-based systems.

The study “Parallel molecular data storage by printing epigenetic bits on DNA” was published in Nature.
