Improving the Practical Capacity of Random-Access based DNA Storage

Wed Sep 18 | 11:35am
Location: Lafayette, San Tomas
Abstract

Deoxyribonucleic acid (DNA), with its ultra-high storage density and long durability, is a promising medium for long-term archival storage and is attracting considerable attention. A DNA storage system encodes digital data into synthetic DNA sequences for storage and decodes those sequences back into digital data via sequencing. Many encoding schemes have been proposed to enlarge DNA storage capacity by increasing encoding density. However, increasing encoding density alone is insufficient, because enhancing DNA storage capacity is a multifaceted problem. This talk will introduce the factors that affect the capacity of random-access based DNA storage under current technologies and systematically investigate the practical capacity achieved by several popular encoding schemes. The investigation shows that collisions between primers and DNA payload sequences are a major factor limiting DNA storage capacity.

Based on this discovery, we will introduce our proposed encoding scheme, Collision Aware Code (CAC), which trades some encoding density for a reduction in primer-payload collisions. Compared with the best result among the five existing encoding schemes, CAC frees 120% more primers from collisions and increases the DNA tube capacity from 211.96 GB to 295.11 GB (about 39% more). We will also demonstrate CAC's recoverability from DNA storage errors. The results show that CAC's recoverability is comparable to that of existing encoding schemes.
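To make the notion of a primer-payload collision concrete, the following is a minimal sketch and not the CAC scheme itself. It assumes a naive mapping of two bits per nucleotide and treats a primer as colliding when it matches some window of a payload within a given number of mismatches. The mapping, the function names, and the mismatch threshold are all hypothetical choices made for illustration; the abstract does not specify the collision criterion used in the talk.

```python
# Hypothetical 2-bits-per-nucleotide mapping (an assumption, not CAC).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def encode(data: bytes) -> str:
    """Encode bytes as a DNA payload string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def collides(primer: str, payload: str, max_mismatches: int = 0) -> bool:
    """True if any payload window is within max_mismatches of the primer.

    The threshold is an illustrative stand-in for a real similarity rule.
    """
    n = len(primer)
    return any(
        hamming(primer, payload[i:i + n]) <= max_mismatches
        for i in range(len(payload) - n + 1)
    )


def usable_primers(primers, payloads, max_mismatches=0):
    """Keep only the primers that collide with none of the payloads."""
    return [
        p for p in primers
        if not any(collides(p, s, max_mismatches) for s in payloads)
    ]
```

Under these assumptions, every primer that collides with a stored payload is unavailable for addressing, so an encoding that produces fewer colliding payloads leaves more primers usable, which is the trade-off the abstract describes.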

Learning Objectives

DNA storage background
Limitations of the practical capacity of random-access based DNA storage
A new scheme to improve DNA storage capacity


---

Bingzhe Li
University of Texas at Dallas
  • Yixun Wei
    University of Minnesota, Twin Cities
  • David Du
    University of Minnesota, Twin Cities