Microsoft is digging into the building blocks of life to find a way to encode and store humanity’s digital future, with the vendor flagging a recent demonstration of what it claims is the first fully automated system to store and retrieve data in manufactured DNA.
DNA (deoxyribonucleic acid), which codes genetic information, has long been touted as a potential contender for the future of data storage.
But as with any novel technology, development takes time, and many challenges need to be addressed before anything coming close to a commercially viable solution emerges. However, the potential of such storage technology is undeniable.
According to Microsoft, using DNA to archive data is an attractive possibility because it is extremely dense – up to about 1 exabyte per cubic millimetre – and durable, with a half-life of over 500 years.
And although it is not practical yet due to the current state of DNA synthesis and sequencing, these technologies are improving quite rapidly with advances in the biotech industry, according to Microsoft.
Clearly, things are coming along.
In a recent proof-of-concept test, researchers from Microsoft and the University of Washington (UW) successfully encoded the word “hello” in snippets of fabricated DNA and converted it back to digital data using a fully automated end-to-end system.
The test, detailed in a paper published on March 21 in Scientific Reports, a Nature Research journal, has been described by Microsoft as a key step in moving the technology out of the research lab and into commercial data centres.
But despite Microsoft’s optimism, there still appears to be quite a way to go before the tech industry can begin ordering DNA storage drives from suppliers.
While machines such as synthesisers and sequencers have already been developed to perform key parts of the process, many of the intermediate steps have, until now, required manual labour in the research lab, according to Microsoft.
Performing such tasks manually isn’t viable in a commercial setting, noted Chris Takahashi, senior research scientist at the UW’s Paul G. Allen School of Computer Science & Engineering.
“You can’t have a bunch of people running around a data centre with pipettes — it’s too prone to human error, it’s too costly and the footprint would be too large,” Takahashi said.
That is why the successful trial of the automated DNA storage and retrieval system is something Microsoft wants to make a big song and dance about.
So, how does it work?
As is the case for DNA data storage techniques generally, information is stored in synthetic DNA molecules created in a lab, not DNA from humans or other living things. In the process Microsoft is developing, this information can be encrypted before it is sent to the storage system.
The automated DNA data storage system uses software developed by the Microsoft and UW team that converts the ones and zeros of digital data into the As, Ts, Cs and Gs that make up the building blocks of DNA, according to Microsoft.
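To give a feel for what converting ones and zeros into As, Ts, Cs and Gs can look like, here is a minimal Python sketch. It assumes the simplest possible scheme, two bits per base, with a mapping (00=A, 01=C, 10=G, 11=T) and a function name, encode_to_dna, that are purely hypothetical. The article does not describe the encoding the Microsoft and UW software actually uses, and a practical scheme would likely add redundancy to cope with synthesis and sequencing errors.

```python
# Hypothetical two-bits-per-base mapping, chosen for illustration only.
# It is NOT the encoding used by the Microsoft/UW software.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def encode_to_dna(data: bytes) -> str:
    """Turn bytes into a string of DNA bases, two bits per base."""
    # Write every byte as eight binary digits, e.g. b"h" -> "01101000".
    bits = "".join(f"{byte:08b}" for byte in data)
    # Map each pair of bits to one base.
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


strand = encode_to_dna(b"hello")
print(strand)       # CGGACGCCCGTACGTACGTT
print(len(strand))  # 20 bases for 5 bytes
```

Under this toy mapping, each byte becomes four bases, so the five-letter word from the proof of concept would fit in a 20-base strand.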
It then uses inexpensive, largely off-the-shelf lab equipment to flow the necessary liquids and chemicals into a synthesiser, which builds the manufactured snippets of DNA, and to push those snippets into a storage vessel, the vendor said.
When the system needs to retrieve the information, it adds other chemicals to properly prepare the DNA and uses microfluidic pumps to push the liquids into other parts of the system that ‘read’ the DNA sequences and convert them back to information that a computer can understand.
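The final step, turning a sequenced strand back into bytes, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the researchers' software: the two-bits-per-base mapping (A=00, C=01, G=10, T=11) and the function name decode_from_dna are hypothetical, and the sketch assumes an error-free read, which real sequencing does not guarantee.

```python
# Hypothetical mapping for illustration only: each base stands for two bits.
# It is NOT the decoding used by the Microsoft/UW system.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def decode_from_dna(strand: str) -> bytes:
    """Turn a string of DNA bases back into bytes, assuming a perfect read."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    if len(bits) % 8 != 0:
        raise ValueError("strand does not contain a whole number of bytes")
    # Regroup the bits into bytes, eight at a time.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


# This 20-base strand is what the toy mapping gives for the word "hello".
print(decode_from_dna("CGGACGCCCGTACGTACGTT"))  # b'hello'
```

A strand containing any letter other than A, C, G or T raises a KeyError here, a reminder that a production decoder would need to detect and correct read errors rather than simply fail.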
The goal of the project was not to prove how fast or inexpensively the system could work, the researchers noted, but simply to demonstrate that automation is possible.
“Our ultimate goal is to put a system into production that, to the end-user, looks very much like any other cloud storage service — bits are sent to a data centre and stored there and then they just appear when the customer wants them,” said Microsoft principal researcher Karin Strauss. “To do that, we needed to prove that this is practical from an automation perspective.”