Audio Recordings Perpetually Stored on DNA For the First Time

September 30 2017, 03:00
Twist Bioscience, a company dedicated to DNA synthesis, working with Microsoft and University of Washington researchers, announced that they have successfully encoded and stored in DNA archival-quality audio recordings of two important music performances from the archives of the world-renowned Montreux Jazz Festival. This is the first time DNA has been used as a long-term archival-quality storage medium. The tiny specks of DNA will preserve a part of UNESCO’s Memory of the World Archive, where valuable cultural heritage collections are recorded.
 
Solely for the purpose of illustration, the lyrics of Deep Purple's Smoke on the Water are shown encoded into DNA. Each letter, space and punctuation mark is represented by a unique triplet of the four bases (A, T, G, C), the building blocks of DNA. For example, "smoke" becomes GACCGACGTCAGAGC. In general, to encode digital data into DNA, a quaternary code is used, which allows each base to encode two bits (e.g. A = 00, C = 01, G = 10, T = 11). Courtesy of Twist Bioscience; image developed by Martin Krzywinski.
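The two-bits-per-base idea in the caption can be sketched in a few lines of Python. This is only an illustration of the quaternary code given as an example above (A = 00, C = 01, G = 10, T = 11); the function names are hypothetical, and the triplet-per-character code used in the image is not reproduced here because its table is not given.

```python
# Illustrative only: the example quaternary code from the caption.
# The position of each base in this string is its 2-bit value:
# A = 00, C = 01, G = 10, T = 11.
BASES = "ACGT"


def bytes_to_bases(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def bases_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_bases: every four bases become one byte."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


encoded = bytes_to_bases(b"smoke")
print(encoded, len(encoded))      # 5 bytes become 20 bases
print(bases_to_bytes(encoded))    # round trip back to the original bytes
```

At two bits per base, one byte always takes four bases, so under this simple code the 140MB mentioned in the article would correspond to roughly 560 million bases of payload before any indexing or redundancy is added.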

The Montreux Jazz Digital Project is a collaboration between the Claude Nobs Foundation, curator of the Montreux Jazz Festival audio-visual collection, and the École Polytechnique Fédérale de Lausanne (EPFL) to digitize, enrich, store, show, and preserve this notable legacy created by Claude Nobs, the Festival’s founder.

In this proof-of-principle project, two quintessential music performances from the Montreux Jazz Festival – "Smoke on the Water," performed by Deep Purple, and "Tutu," performed by Miles Davis – have been encoded onto DNA and read back with 100 percent accuracy. After being decoded, the songs were played on September 29th at the ArtTech Forum (see below) in Lausanne, Switzerland. "Smoke on the Water" was selected as a tribute to Claude Nobs, the Montreux Jazz Festival’s founder. The song memorializes a fire and Funky Claude’s rescue efforts at the Casino Barrière de Montreux during a Frank Zappa concert promoted by Claude Nobs. Miles Davis’ "Tutu" was selected for the role he played in music history and the Montreux Jazz Festival’s success.

“We archived two magical musical pieces on DNA of this historic collection, equating to 140MB of stored data in DNA,” says Karin Strauss, Ph.D., a Senior Researcher at Microsoft, and one of the project’s leaders. “The amount of DNA used to store these songs is much smaller than one grain of sand. Amazingly, storing the entire six petabyte Montreux Jazz Festival’s collection would result in DNA smaller than one grain of rice.”

Luis Ceze, Ph.D., a professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, says, “DNA, nature’s preferred information storage medium, is an ideal fit for digital archives because of its durability, density and eternal relevance. Storing items from the Montreux Jazz Festival is a perfect way to show how fast DNA digital data storage is becoming real.”

“With advancements in nanotechnology, I believe we can expect to see people living prolonged lives, and with that, we can also expect to see more developments in the enhancement of how we live. For me, life is all about learning where you came from in order to get where you want to go, but in order to do so, you need access to history! And with the unreliability of how archives are often stored, I sometimes worry that our future generations will be left without such access... So, it absolutely makes my soul smile to know that EPFL, Twist Bioscience and others are coming together to preserve the beauty and history of the Montreux Jazz Festival for our future generations, on DNA!... I've been a part of this festival for decades and it truly is a magnificent representation of what happens when different cultures unite for the sake of music. Absolute magic. And I'm proud to know that the memory of this special place will never be lost,” added Quincy Jones.

“Our partnership with EPFL in digitizing our archives aims not only at their positive exploration, but also at their preservation for the next generations,” says Thierry Amsallem, president of the Claude Nobs Foundation. “By taking part in this pioneering experiment which writes the songs into DNA strands, we can be certain that they will be saved on a medium that will never become obsolete!”
 
A few drops of DNA would be enough to store all the world's music! © 2017 EPFL / Alain Herzog

Nature’s Preferred Storage Medium
Nature selected DNA as its hard drive billions of years ago to encode all the genetic instructions necessary for life. These instructions include all the information necessary for survival. DNA molecules encode information with sequences of discrete units. In computers, these discrete units are the 0s and 1s of “binary code,” whereas in DNA molecules, the units are the four distinct nucleotide bases: adenine (A), cytosine (C), guanine (G) and thymine (T).

Like music, which can be widely varied with a finite number of notes, DNA encodes individuality with only four different letters in varied combinations. When using DNA as a storage medium, there are several advantages in addition to the universality of the format and incredible storage density. DNA can be stable for thousands of years when stored in a cool dry place and is easy to copy using polymerase chain reaction to create back-up copies of archived material. In addition, because of PCR, small data sets can be targeted and recovered quickly from a large dataset without needing to read the entire file.

Each cell within the human body contains approximately three billion base pairs of DNA. With 75 trillion cells in the human body, this equates to the storage of 150 zettabytes (1 zettabyte = 10^21 bytes) of information within each body. By comparison, the largest data centers can occupy hundreds of thousands or even millions of square feet to hold a comparable amount of stored data.

“DNA is a remarkably efficient molecule that can remain stable for millennia,” explains Bill Peck, Ph.D., chief technology officer of Twist Bioscience. “This is a very exciting project: we are now in an age where we can use the remarkable efficiencies of nature to archive master copies of our cultural heritage in DNA. As we develop the economies of this process new performances can be added any time. Unlike current storage technologies, nature’s media will not change and will remain readable through time. There will be no new technology to replace DNA, nature has already optimized the format.”
 
DNA is the densest and most reliable storage medium ever. © 2017 EPFL / Alain Herzog

How to Store Digital Data in DNA
To encode the music performances into archival storage copies in DNA, Twist Bioscience worked with Microsoft and University of Washington researchers to complete four steps: coding, synthesis/storage, retrieval and decoding. First, the digital files were converted from binary code, made up of 0s and 1s, into sequences of A, C, T and G. For purposes of the example, 00 represents A, 10 represents C, 01 represents G and 11 represents T. Twist Bioscience then synthesized the DNA in short segments in the sequence order provided. The short DNA segments each contain about 12 bytes of data as well as a sequence number to indicate their place within the overall sequence. This is the process of storage. Finally, to confirm that the file was stored accurately, the sequence was read back and checked for 100 percent accuracy, and then decoded from A, C, T or G into the two-digit binary representation.
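The four steps above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the method the collaborators used: the 4-byte sequence number, the function names and the sample message are all hypothetical, synthesis and sequencing are replaced by an in-memory shuffled list, and the error correction a real system would need is left out. The bit-to-base mapping is the example one from the paragraph above (00 = A, 10 = C, 01 = G, 11 = T).

```python
import random

# Example mapping from the paragraph above: 00 = A, 10 = C, 01 = G, 11 = T.
BITS_TO_BASE = {0b00: "A", 0b10: "C", 0b01: "G", 0b11: "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

PAYLOAD_BYTES = 12  # "about 12 bytes of data" per segment, as described above
INDEX_BYTES = 4     # assumption: size of the sequence number is not given


def encode_bytes(data: bytes) -> str:
    """Coding: two bits per base, most significant bit pair first."""
    return "".join(
        BITS_TO_BASE[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode_bases(seq: str) -> bytes:
    """Decoding: every four bases become one byte again."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASE_TO_BITS[base]
        out.append(byte)
    return bytes(out)


def to_strands(data: bytes) -> list[str]:
    """Split data into short segments, each prefixed with its sequence number."""
    strands = []
    for number, start in enumerate(range(0, len(data), PAYLOAD_BYTES)):
        chunk = data[start:start + PAYLOAD_BYTES]
        strands.append(encode_bytes(number.to_bytes(INDEX_BYTES, "big") + chunk))
    return strands


def from_strands(strands: list[str]) -> bytes:
    """Retrieval: decode each segment, then reorder by sequence number."""
    records = []
    for strand in strands:
        raw = decode_bases(strand)
        records.append((int.from_bytes(raw[:INDEX_BYTES], "big"), raw[INDEX_BYTES:]))
    records.sort(key=lambda record: record[0])
    return b"".join(chunk for _, chunk in records)


message = b"Montreux Jazz Festival archive sample"
pool = to_strands(message)
random.shuffle(pool)  # segments in a tube have no inherent order
assert from_strands(pool) == message
print(len(pool), "segments,", len(pool[0]), "bases in the first one")
```

The sequence number is what makes the shuffle harmless: because every segment carries its own position, the file can be reassembled regardless of the order in which the segments are read.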

Importantly, to encapsulate and preserve encoded DNA, the collaborators are working with Professor Dr. Robert Grass of ETH Zurich. Grass has developed an innovative technology inspired by preservation of DNA within prehistoric fossils. With this technology, digital data encoded in DNA remains preserved for millennia.
www.arttechfoundation.org | www.epfl.ch | www.claudenobsfoundation.com
Learn more about UNESCO’s Memory of the World Register
www.twistbioscience.com