We have become very good at storing data, with hard drives closing in on 20 terabytes, but even our best 21st-century engineering can’t come close to the elegance and density of DNA. Most of the cells in your body contain a complete copy of the genetic instructions that make you a human being, and DNA is surprisingly durable compared to chips and spinning platters that will probably end up in a landfill within a decade. DNA might even be viable for storing digital data, and for that purpose we’re not limited by the way human DNA works. Researchers from the University of Illinois Urbana-Champaign have expanded the capabilities of DNA data storage by adding more letters to its alphabet.

The genetic information in your cells relies on four primary bases, the information-carrying parts of the nucleotides that make up nucleic acids such as DNA. There’s adenine, guanine, cytosine, and thymine: the A, G, C, and T you’ve seen when genetic information is written out. The human body also uses another base called uracil in place of thymine when transcribing genetic information into RNA to make proteins.

Even without any modifications, DNA is a very dense storage medium. The researchers note that the world creates several petabytes of new data every day, and a single gram of DNA could store it all. That’s what you get with the standard four-base system from life on Earth, but there are plenty more nucleotides in chemistry that can link up to form a DNA strand. The team created an encoding scheme relying on 11 different bases, which gives the synthetic DNA much higher data density than a system of just four bases. 
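To see why a larger alphabet raises density, note that each position in a strand can carry at most log2(N) bits when there are N possible bases. The sketch below is only an illustration of that arithmetic, not the encoding scheme from the study: the seven extra symbols ("1" through "7") are hypothetical placeholders for the modified bases, and real DNA storage systems also add error correction and avoid sequences that are hard to synthesize or read.

```python
import math

NATURAL = "ACGT"
# Hypothetical 11-letter alphabet: the four natural bases plus seven
# placeholder symbols standing in for chemically modified bases.
EXPANDED = "ACGT" + "1234567"


def bits_per_base(alphabet_size: int) -> float:
    """Upper bound on the information one base can carry."""
    return math.log2(alphabet_size)


def encode(data: bytes, alphabet: str) -> str:
    """Write the data as one big number in base len(alphabet)."""
    n = int.from_bytes(data, "big")
    base = len(alphabet)
    symbols = []
    while n > 0:
        n, digit = divmod(n, base)
        symbols.append(alphabet[digit])
    # An all-zero input still needs at least one symbol.
    return "".join(reversed(symbols)) or alphabet[0]


def decode(strand: str, alphabet: str, length: int) -> bytes:
    """Invert encode(); the original byte length must be known."""
    base = len(alphabet)
    n = 0
    for symbol in strand:
        n = n * base + alphabet.index(symbol)
    return n.to_bytes(length, "big")


message = b"DNA storage"
natural_strand = encode(message, NATURAL)
expanded_strand = encode(message, EXPANDED)

print(f"4 bases:  {bits_per_base(4):.2f} bits per base, {len(natural_strand)} bases")
print(f"11 bases: {bits_per_base(11):.2f} bits per base, {len(expanded_strand)} bases")
```

With four bases each position holds 2 bits; with 11 it holds about 3.46 bits, so the same message needs noticeably fewer bases in the expanded alphabet.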

So why aren’t we all using DNA hard drives? While DNA can last for thousands of years without irreparable data loss, it’s difficult to encode and decode that data. You need advanced laboratory equipment, and most tools can’t even interpret the 11-base DNA strands created in the new study. The team found that ring-like proteins known as MspA nanopores, which are commonly used in DNA sensing, could correctly read the synthetic and natural DNA. Interpreting the recovered data required machine learning and artificial intelligence, but the result is a system that correctly read all 77 different combinations of bases used in the study. They believe this system could roughly double the data density of DNA, which is already much higher than any technology we’ve devised. 

This work is still very early, but it’s a fascinating proof of concept. The addition of synthetic chemistry to natural biological storage mechanisms could unlock functionally unlimited data storage. And it works, with just a little AI assistance. Such a technology would be limited to long-term archival storage at first, but no one knows what the future may bring.

Source From Extremetech
Author: Ryan Whitwam