Improved read/write cost tradeoff in DNA-based data storage using LDPC codes

2019
Author(s):  
Shubham Chandak ◽  
Kedar Tatwawadi ◽  
Billy Lau ◽  
Jay Mardia ◽  
Matthew Kubit ◽  
...  

Abstract: With the amount of data being stored increasing rapidly, there is significant interest in exploring alternative storage technologies. In this context, DNA-based storage systems can offer significantly higher storage densities (petabytes/gram) and durability (thousands of years) than current technologies. Specifically, DNA has been found to be stable over extended periods of time, as demonstrated by the analysis of organisms long since extinct. Recent advances in DNA sequencing and synthesis pipelines have made DNA-based storage a promising candidate for the storage technology of the future.

Recently, there have been multiple efforts in this direction, focusing on aspects such as error correction for synthesis/sequencing errors and erasure correction for handling missing sequences. The typical approach is to use separate codes for handling errors and erasures, but there is limited understanding of the efficiency of this framework. Furthermore, the existing techniques use short block-length codes and rely heavily on read consensus, both of which are known to be suboptimal in coding theory.

In this work, we study the tradeoff between the writing and reading costs involved in DNA-based storage and propose a practical scheme to achieve an improved tradeoff between these quantities. Our scheme breaks with the traditional separation framework and instead uses a single large block-length LDPC code for both erasure and error correction. We also introduce novel techniques to handle insertion and deletion errors introduced by the synthesis process. For a range of writing costs, the proposed scheme achieves 30-40% lower reading costs than state-of-the-art techniques on experimental data obtained using array synthesis and Illumina sequencing.

The code, data, and Supplementary Material are available at https://github.com/shubhamchandak94/LDPC_DNA_storage.
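As a concrete illustration of the modulation layer such schemes build on, here is a minimal sketch that packs binary data into DNA bases at two bits per nucleotide. The mapping and function names are assumptions for illustration only; the authors' actual encoder (including the LDPC parity computation and insertion/deletion handling) is in the linked repository.

# Minimal sketch: pack bits into DNA bases at 2 bits/base and back.
# Illustrative only; a real pipeline would first add LDPC parity bits
# and then protect against synthesis/sequencing errors.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def bits_to_dna(bits):
    """Encode an even-length bitstring as a DNA sequence."""
    assert len(bits) % 2 == 0
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def dna_to_bits(seq):
    """Decode a DNA sequence back to the original bitstring."""
    return "".join(BASE_TO_BITS[base] for base in seq)

assert dna_to_bits(bits_to_dna("01110010")) == "01110010"  # round trip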

2021
Vol 12 (1)
Author(s):  
George D. Dickinson ◽  
Golam Md Mortuza ◽  
William Clay ◽  
Luca Piantanida ◽  
Christopher M. Green ◽  
...  

Abstract: DNA is a compelling alternative to non-volatile information storage technologies due to its information density, stability, and energy efficiency. Previous studies have used artificially synthesized DNA to store data and automated next-generation sequencing to read it back. Here, we report digital Nucleic Acid Memory (dNAM) for applications that require a limited amount of data to have high information density, redundancy, and copy number. In dNAM, data is encoded by selecting combinations of single-stranded DNA with (1) or without (0) docking-site domains. When self-assembled with scaffold DNA, staple strands form DNA origami breadboards. Information encoded into the breadboards is read by monitoring the binding of fluorescent imager probes using DNA-PAINT super-resolution microscopy. To enhance data retention, a multi-layer error correction scheme that combines fountain and bi-level parity codes is used. As a prototype, fifteen origami encoded with ‘Data is in our DNA!\n’ are analyzed. Each origami encodes unique data-droplet, index, orientation, and error-correction information. The error-correction algorithms fully recover the message when individual docking sites, or entire origami, are missing. Unlike other approaches to DNA-based data storage, reading dNAM does not require sequencing. As such, it offers an additional path to explore the advantages and disadvantages of DNA as an emerging memory material.
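The "bi-level parity" layer lends itself to a small worked example. The sketch below (matrix size and names are assumptions, not the paper's actual layout) shows how row and column parities over a bit matrix jointly locate and correct a single flipped bit; dNAM combines this idea with fountain codes and per-origami indexing.

# Toy illustration of bi-level (row/column) parity on a bit matrix.
# Row and column parities together locate and flip a single bit error.
def parities(matrix):
    rows = [sum(r) % 2 for r in matrix]
    cols = [sum(c) % 2 for c in zip(*matrix)]
    return rows, cols

data = [[1, 0, 1],
        [0, 1, 1],
        [1, 1, 0]]
row_p, col_p = parities(data)

# Simulate a single bit flip during readout.
corrupted = [row[:] for row in data]
corrupted[1][2] ^= 1

# Mismatched parities identify the flipped bit's row and column.
new_row_p, new_col_p = parities(corrupted)
bad_r = next(i for i, (a, b) in enumerate(zip(row_p, new_row_p)) if a != b)
bad_c = next(j for j, (a, b) in enumerate(zip(col_p, new_col_p)) if a != b)
corrupted[bad_r][bad_c] ^= 1
assert corrupted == data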


2019
Author(s):  
Md. Jakaria ◽  
Kowshika Sarker ◽  
Mostofa Rafid Uddin ◽  
Md. Mohaiminul Islam ◽  
Trisha Das ◽  
...  

Abstract: The propitious developments in molecular biology and next-generation sequencing have enabled the possibility of DNA storage technologies. However, the power of the genomic revolution has not been fully utilized in clinical medicine, given the lack of transition from research to real-world clinical practice. This points to an increasing need for an operating system that allows for the transition from research to clinical use. We present eMED-DNA, an in silico operating system for archiving and managing all forms of electronic health records (EHRs) within one's own copy of the sequenced genome, to aid in the application and integration of genomic medicine within real-world clinical practice. We incorporated an efficient and sophisticated in-DNA file management system for the lossless management of EHRs within a genome. This represents the first in silico integrative system to bring closer the utopian ideal of integrating genotypic data with phenotypic clinical data for future medical practice.


Author(s):  
Rohitkumar R Upadhyay

Abstract: Hamming codes are the first nontrivial family of error-correcting codes: they can correct one error in a block of binary symbols. In this paper we extend the notion of error correction to error reduction and present several decoding methods with the goal of improving the error-reducing capabilities of Hamming codes. First, the error-reducing properties of Hamming codes with standard decoding are demonstrated and explored. We show a lower bound on the average number of errors present in a decoded message when two errors are introduced by the channel, for general Hamming codes. Other decoding algorithms are investigated experimentally, and it is found that these algorithms improve the error-reduction capabilities of Hamming codes beyond the aforementioned lower bound of standard decoding.

Keywords: coding theory, Hamming codes, Hamming distance
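For reference, a minimal sketch of the standard Hamming(7,4) encoder and syndrome decoder that the paper's error-reduction analysis builds on (variable and function names are illustrative):

# Hamming(7,4) sketch: encode 4 data bits, correct any single bit error.
# Positions are 1-indexed; parity bits sit at positions 1, 2, and 4.
def encode(d):  # d = [d1, d2, d3, d4]
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p3 = d[1] ^ d[2] ^ d[3]
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]

def decode(c):  # returns the corrected data bits
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-indexed position of the error
    if syndrome:
        c = c[:]
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

word = encode([1, 0, 1, 1])
word[4] ^= 1  # channel flips one bit
assert decode(word) == [1, 0, 1, 1]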


2011
Vol 341-342
pp. 700-704
Author(s):  
Bai Yi Huang

Flash-based solid state disks (SSDs) are a data storage technology that exploits flash memory to implement data storage, in contrast to mechanical data storage technologies. It has been argued in both theory and practice that SSD devices outperform mechanical devices. To improve the efficiency of a flash-memory SSD device, it is important that the device be designed to support parallel operations.
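A hypothetical sketch of what channel-level parallelism means in practice: a write is striped across flash channels that can be programmed concurrently. The channel count, function names, and in-memory stand-ins below are assumptions for illustration, not a real controller design.

# Hypothetical sketch of channel-level parallelism in an SSD controller:
# a write is striped across independent flash channels that can be
# programmed concurrently. The dicts stand in for real flash packages.
from concurrent.futures import ThreadPoolExecutor

NUM_CHANNELS = 4
PAGE_SIZE = 4096
channels = [dict() for _ in range(NUM_CHANNELS)]  # channel -> {page: data}

def program_page(channel_id, page_no, data):
    channels[channel_id][page_no] = data  # stands in for a flash program op

def striped_write(start_page, buf):
    pages = [buf[i:i + PAGE_SIZE] for i in range(0, len(buf), PAGE_SIZE)]
    with ThreadPoolExecutor(max_workers=NUM_CHANNELS) as pool:
        for k, page in enumerate(pages):
            page_no = start_page + k
            pool.submit(program_page, page_no % NUM_CHANNELS,
                        page_no // NUM_CHANNELS, page)

striped_write(0, b"x" * (PAGE_SIZE * 8))  # 8 pages land on 4 channels, 2 each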


Author(s):  
Anupama C. Raman

Unstructured data is growing exponentially. Present-day storage infrastructures like Storage Area Networks and Network Attached Storage are not well suited to storing huge volumes of unstructured data. This has led to the development of new types of storage technologies like object-based storage. Huge amounts of structured and unstructured data that need to be made available in real time for analytical insights are referred to as Big Data. On account of the distinct nature of big data, the storage infrastructures for storing it should possess some specific features. In this chapter, the authors examine the various storage technology options that are available today and their suitability for storing big data. The chapter also provides a bird's-eye view of cloud storage technology, which is widely used for big data storage.
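A minimal sketch of the object-storage abstraction discussed above, assuming a toy in-memory store: objects live in a flat namespace, addressed by key, with metadata kept alongside the data rather than in a file-system hierarchy. Real object stores add replication, erasure coding, and distributed placement.

# Toy object store: flat key namespace, data plus metadata per object.
class ObjectStore:
    def __init__(self):
        self._objects = {}

    def put(self, key, data, metadata=None):
        self._objects[key] = (bytes(data), dict(metadata or {}))

    def get(self, key):
        data, metadata = self._objects[key]
        return data, metadata

store = ObjectStore()
store.put("logs/2021/01/events.json", b'{"clicks": 42}',
          {"content-type": "application/json"})
data, meta = store.get("logs/2021/01/events.json")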


Author(s):  
Julian Ray

This chapter identifies and discusses issues associated with integrating technologies for storing spatial data into business information technology frameworks. A new taxonomy of spatial data storage systems is developed, differentiating storage systems by the system architectures used to enable interaction between client applications and physical spatial data stores, and by the methods used by client applications to query and return spatial data. Five distinct storage models are identified and discussed, along with current examples of vendor implementations. Building on this initial discussion, the chapter identifies a variety of issues pertaining to spatial data storage systems that affect three distinct aspects of technology adoption: systems design, systems implementation, and management of completed systems. Current issues associated with each of these three aspects are described and illustrated, along with a discussion of emerging trends in spatial data storage technologies. As spatial data and the technologies designed to store and manipulate it become more prevalent, understanding the potential impacts these technologies may have on other technology decisions within an organization becomes increasingly important. Furthermore, understanding how these technologies can introduce security risks and other vulnerabilities into a computing framework is critical to successful implementation.
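As a toy illustration of the "query and return spatial data" axis of the taxonomy, the sketch below filters point features by a bounding box on the client side (the function name and data are made up); production spatial stores push this predicate into the database via spatial indexes such as R-trees.

# Simplest spatial query: client-side bounding-box filter over points.
def bbox_query(points, min_x, min_y, max_x, max_y):
    """Return the points falling inside the axis-aligned bounding box."""
    return [(x, y) for (x, y) in points
            if min_x <= x <= max_x and min_y <= y <= max_y]

stores = [(-71.06, 42.36), (-73.99, 40.73), (-87.63, 41.88)]
print(bbox_query(stores, -75.0, 40.0, -70.0, 43.0))  # the two eastern points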


Author(s):  
D. Chakraborty ◽  
G. Chakraborty ◽  
N. Shiratori

The advancement in optical fiber and switching technologies has resulted in a new generation of high-speed networks that can achieve speeds of up to a few gigabits per second. Also, progress in audio, video, and data storage technologies has given rise to new distributed real-time applications. These applications may involve multimedia, which requires low end-to-end delay. The applications' requirements, such as end-to-end delay, delay jitter, and loss rate, are expressed as QoS parameters, which must be guaranteed. In addition, many of these new applications involve multiple users, hence the importance of multicast communication. Multimedia applications are becoming increasingly important, as networks are now capable of carrying continuous media traffic, such as voice and video, to the end user. When a large amount of information must be transmitted to a subset of hosts, multicast is the most efficient way to deliver it. This article addresses different multicast routing algorithms and protocols. We also discuss QoS multicast routing and conclude the article with mobile multicasting.
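One concrete multicast routing approach such surveys cover is the source-based shortest-path tree; the sketch below builds one centrally with Dijkstra's algorithm. The topology and names are made up, and real protocols (e.g. DVMRP or PIM) compute comparable trees in a distributed fashion.

# Source-based multicast tree via Dijkstra: each receiver reaches the
# source along shortest-path parent pointers.
import heapq

def shortest_path_tree(graph, source):
    """graph: {node: [(neighbor, cost), ...]} -> {node: parent} tree."""
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(heap, (d + w, v))
    return parent

net = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("C", 5)],
       "B": [("C", 1)], "C": []}
tree = shortest_path_tree(net, "S")  # C <- B <- A <- S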


Author(s):  
Mehdi Asheghi

The magnetic data storage industry has followed a density (and data rate) improvement curve similar to that of semiconductor technology (Moore's Law) for the past decade. However, whether storage densities will continue to increase at this rate and keep up with improvements in processor technology is under near-term threat from the fundamental physics upon which hard disk drives are based. It is expected that novel, more unconventional technological solutions will become necessary to overcome these limitations; however, many of these technologies rely heavily on heating and energy transport at extremely short time and length scales. It is widely believed that further advances in high-technology data storage systems will be difficult, if not impossible, without rigorous treatment of nano-scale energy transport. The nano-scale heat transfer research effort at the Data Storage Systems Center (DSSC) has focused on three interwoven areas: thermal design, failure analysis, and metrology of micro/nano-devices and structures relevant to data storage technologies. In this presentation, the underlying physics and fundamentals of heat transport at the nanoscale will be discussed. In addition, applications of nanoscale heat transfer to the thermal analysis of magnetic and phase-change optical data storage technologies will be presented.
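As a point of contrast for the nanoscale regime, the sketch below integrates the classical 1-D Fourier heat-diffusion baseline with explicit finite differences (all parameters are illustrative); the abstract's point is precisely that at nanometer and nanosecond scales this continuum picture breaks down and sub-continuum treatments are needed.

# Classical 1-D heat diffusion, explicit finite differences.
def diffuse(T, alpha, dx, dt, steps):
    """March dT/dt = alpha * d2T/dx2 forward in time (fixed ends)."""
    r = alpha * dt / dx**2
    assert r <= 0.5, "explicit scheme stability limit"
    T = list(T)
    for _ in range(steps):
        T = [T[0]] + [T[i] + r * (T[i+1] - 2*T[i] + T[i-1])
                      for i in range(1, len(T) - 1)] + [T[-1]]
    return T

# A hot spot relaxing in a cold wire:
profile = [300.0] * 20
profile[10] = 400.0
print(diffuse(profile, alpha=1e-6, dx=1e-4, dt=4e-3, steps=100))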

