10 March 2022

400 petabytes per gram of DNA

Scientists have doubled the capacity of DNA memory

Stepan Ikaev, Hi-tech+

American scientists have turned the molecular makeup of DNA into a reliable and stable data storage platform. A team of engineers and biologists expanded the DNA "alphabet" and then developed a new sequencing method for reading the stored digital information. According to the project's authors, such a system could eventually fit all the information on the modern Internet into a container the size of a shoebox.

"Several petabytes of data are generated on the Internet every day. Just one gram of DNA is enough to store this data. That's how dense DNA is as a carrier of information," Kasra Tabatabaei, co-author of the study, said in the Beckman Institute press release "Expanded alphabet, precise sequencing make DNA the next data storage solution."

As the scientists explained, DNA's natural data storage system is noticeably superior to any technology in its potential for information processing. Natural DNA consists of combinations of four nitrogenous bases: adenine, guanine, cytosine and thymine, denoted by the letters "A", "G", "C" and "T". Grouped into different sequences, these letters form the reproductive "blueprints" of living organisms. DNA also offers record storage density: a single gram can hold up to 215 petabytes of data.

Scientists from the University of Illinois at Urbana-Champaign have doubled the capacity of the natural data storage system and, at the same time, developed their own approach to reading information from DNA. In addition to A, G, C and T, the developers added seven new "letters" to DNA. These take the form of chemically modified nucleotides, which open up more diverse combinations and allow more information to be stored in the same physical volume. As a result, the potential capacity of DNA has doubled, reaching up to 400 petabytes per gram.
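As a rough illustration of why a larger alphabet stores more, the information carried by one symbol is the base-2 logarithm of the alphabet size. The Python sketch below is not taken from the study. It computes this idealized figure for alphabets of 4 and 11 letters and ignores synthesis constraints and error correction, so it should not be expected to reproduce the reported per-gram numbers exactly.

```python
import math


def bits_per_symbol(alphabet_size: int) -> float:
    """Idealized information content of one symbol, in bits."""
    return math.log2(alphabet_size)


# Four natural bases versus the expanded set of 4 + 7 = 11 letters.
natural = bits_per_symbol(4)
expanded = bits_per_symbol(11)
gain = expanded / natural

print(f"natural alphabet:  {natural:.3f} bits per nucleotide")
print(f"expanded alphabet: {expanded:.3f} bits per nucleotide")
print(f"idealized gain:    {gain:.2f}x")
```

Under these assumptions each letter of the expanded alphabet carries about 3.46 bits instead of 2, a gain of about 1.7 times per nucleotide.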

"Imagine the English alphabet. If you had only four letters, you could make up so many words. If you had a complete alphabet, you could create an unlimited number of word combinations. It's the same with DNA. Instead of converting zeros and ones to A, G, C and T, we can convert zeros and ones to A, G, C, T and seven new letters in the storage alphabet," Tabatabai added.

Adding the new nucleotides meant the scientists had to abandon existing methods of reading DNA data, which are built around the four-letter ACGT "alphabet". The team therefore created a new readout system that uses machine learning algorithms. In the new platform, DNA strands pass through nanopores in a specially designed protein that identifies individual data units regardless of whether they are natural or synthetic. The AI then decodes the stored information and outputs the result.

A series of experiments confirmed the viability of the concept. The scientists tested 77 different combinations of the 11 nucleotides, and the platform distinguished every one of them perfectly. The authors said they will continue to improve the system and plan to move on to practical tests in the near future.

The article by Tabatabaei et al., "Expanding the Molecular Alphabet of DNA-Based Data Storage Systems with Neural Network Nanopore Readout Processing," is published in the journal Nano Letters – VM.

Portal "Eternal youth" http://vechnayamolodost.ru
