Codes With Run-Length and GC-Content Constraints for DNA-Based Data Storage

2018 ◽  
Vol 22 (10) ◽  
pp. 2004-2007 ◽  
Author(s):  
Wentu Song ◽  
Kui Cai ◽  
Mu Zhang ◽  
Chau Yuen
2016 ◽  
Vol 358 ◽  
pp. 103-107 ◽  
Author(s):  
Shuhei Yoshida ◽  
Yosuke Takahata ◽  
Shuma Horiuchi ◽  
Manabu Yamamoto

Author(s):  
Eka Prayoga ◽  
Kristien Margi Suryaningrum

The growing use of digital media in everyday life indirectly increases the demand for data storage, so methods are needed to manage it; one of them is data compression. Compression reduces the size of data to save storage space, and it is also useful for other purposes such as data backup, data transmission, and data security. Compression is generally applied on computers, because every displayed symbol is represented by a different bit pattern. The authors use the Huffman and Run Length Encoding algorithms in the compression process, with TXT files as input. The study aims to determine how the combination of the two algorithms can be implemented and to measure the size ratio between the original file and the compressed file. The system is implemented as a web-based application so that users can easily access its features, which cover both compression and decompression. The compression stage reduces the file size, and the decompression stage returns the file to its original form and size. The study used 5 test files and found that the decompressed file size did not match the original, which the authors attribute to the compression process being lossy.

Keywords: Compression, TXT, Decompression, Huffman, Run Length Encoding
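The combination of the two algorithms can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say in which order the stages are applied or how the output is stored, so the order below (Run Length Encoding first, then Huffman coding of the resulting pairs) and all function names are assumptions. Both stages are lossless in this sketch, so the round trip reproduces the input exactly.

```python
import heapq
from collections import Counter
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(ch, len(list(group))) for ch, group in groupby(text)]


def rle_decode(pairs):
    """Expand (char, count) pairs back into the original string."""
    return "".join(ch * count for ch, count in pairs)


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:  # a single distinct symbol still needs one bit
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def compress(text):
    """Assumed pipeline: RLE first, then Huffman-code the (char, count) pairs."""
    pairs = rle_encode(text)
    code = huffman_code(pairs)
    bits = "".join(code[p] for p in pairs)
    return bits, code


def decompress(bits, code):
    """Invert compress(): decode the prefix code, then expand the runs."""
    inverse = {c: s for s, c in code.items()}
    pairs, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            pairs.append(inverse[current])
            current = ""
    return rle_decode(pairs)
```

A compression ratio in the sense of the abstract could then be estimated by comparing `len(bits)` with `8 * len(text)`, keeping in mind that a real file format would also have to store the code table.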


PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0255376
Author(s):  
Li Xiaoru ◽  
Guo Ling

The development of information technology has produced massive amounts of data, which has brought severe challenges to information storage. Traditional electronic storage media cannot keep up with the ever-increasing demand for data storage, but in its place DNA has emerged as a feasible storage medium with high density, large storage capacity and strong durability. In DNA data storage, many different approaches can be used to encode data into codewords. DNA coding is a key step in DNA storage and can directly affect storage performance and data integrity. However, since errors are prone to occur in DNA synthesis and sequencing, and non-specific hybridization is prone to occur in the solution, how to effectively encode DNA has become an urgent problem to be solved. In this article, we propose a DNA storage coding method based on the equilibrium optimization random search (EORS) algorithm, which meets the Hamming distance, GC content and no-runlength constraints and can reduce the error rate in storage. Simulation experiments have shown that the size of the DNA storage code set constructed by the EORS algorithm that meets the combination constraints has increased by an average of 11% compared with previous work. The increase in the code set means that shorter DNA chains can be used to store more data.
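The three combinatorial constraints named in the abstract can be sketched as simple predicates. The code below is not the EORS algorithm; it swaps in a plain lexicographic greedy search purely to show what a code set satisfying the constraints looks like. Reading the GC-content constraint as exactly 50% and the no-runlength constraint as "no two adjacent bases are identical" are assumptions, as are all function names.

```python
from itertools import product


def hamming_distance(a, b):
    """Number of positions at which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def has_no_run(seq):
    """True when no two adjacent bases are identical."""
    return all(x != y for x, y in zip(seq, seq[1:]))


def greedy_code_set(n, d):
    """Collect length-n words with 50% GC content, no adjacent repeats,
    and pairwise Hamming distance of at least d (lexicographic greedy)."""
    code_set = []
    for word in map("".join, product("ACGT", repeat=n)):
        gc_count = sum(base in "GC" for base in word)
        if 2 * gc_count != n or not has_no_run(word):
            continue
        if all(hamming_distance(word, c) >= d for c in code_set):
            code_set.append(word)
    return code_set
```

Calling `greedy_code_set(4, 3)` enumerates all 256 words of length 4 and keeps those compatible with the words already chosen. A greedy pass of this kind gives only a lower bound on the achievable code set size, which is the quantity that search heuristics such as the one in the abstract try to increase.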


2019 ◽  
Author(s):  
Zhi Ping ◽  
Shihong Chen ◽  
Guangyu Zhou ◽  
Xiaoluo Huang ◽  
Sha Joe Zhu ◽  
...  

Motivation: DNA has been reported as a promising medium of data storage for its remarkable durability and space-efficient storage capacity. Here, we propose a robust DNA-based data storage method based on a new codec algorithm, namely ‘Yin-Yang’.

Results: Using this strategy, we successfully stored different file formats in a single synthetic DNA oligonucleotide pool. Compared to most well-established DNA-based data storage coding schemes presented to date, this codec system can achieve a variety of user goals (e.g. reduce homopolymer length to 3 or 4 at most, maintain balanced GC content between 40% and 60% and simple secondary structure with the Gibbs free energy above −30 kcal/mol). It also shows enhanced robustness in transcoding of different data structure and practical feasibility. We tested this codec with an end-to-end experiment including encoding, DNA synthesis, sequencing and decoding. Through successful retrieval of 3 files totaling 2.02 Megabits after sequencing and decoding, our strategy exhibits great qualities of achieving high storing capacity per nucleotide (427.1 PB/gram) and high fidelity of data recovery.
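Two of the user goals listed above, the homopolymer limit and the GC-content window, are easy to state as a screening check on a candidate sequence. The sketch below is not the Yin-Yang codec itself, which the abstract does not describe in detail; the function names and default thresholds are assumptions taken from the figures quoted in the abstract, and the secondary-structure (free energy) criterion is left out.

```python
from itertools import groupby


def max_homopolymer(seq):
    """Length of the longest run of one repeated base (0 for an empty sequence)."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)


def gc_fraction(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def passes_screen(seq, max_run=4, gc_low=0.40, gc_high=0.60):
    """True when seq respects the run-length limit and the GC window."""
    return (max_homopolymer(seq) <= max_run
            and gc_low <= gc_fraction(seq) <= gc_high)
```

For instance, `passes_screen("ACGTACGTAC")` accepts the sequence, while a sequence containing `AAAAA` is rejected by the run-length check.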


Author(s):  
Richard S. Chemock

One of the most common tasks in a typical analysis lab is the recording of images. Many analytical techniques (TEM, SEM, and metallography for example) produce images as their primary output. Until recently, the most common method of recording images was by using film. Current PS/2® systems offer very large capacity data storage devices and high resolution displays, making it practical to work with analytical images on PS/2s, thereby sidestepping the traditional film and darkroom steps. This change in operational mode offers many benefits: cost savings, throughput, archiving and searching capabilities as well as direct incorporation of the image data into reports. The conventional way to record images involves film, either sheet film (with its associated wet chemistry) for TEM or Polaroid® film for SEM and light microscopy. Although film is inconvenient, it does have the highest quality of all available image recording techniques. The fine grained film used for TEM has a resolution that would exceed a 4096x4096x16 bit digital image.
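To put the closing figure in storage terms, the short calculation below gives the raw size of such an image. It assumes that "4096x4096x16 bit" means 4096 by 4096 pixels at 16 bits per pixel with no compression or file header; the function name is illustrative.

```python
def raw_image_bytes(width, height, bits_per_pixel):
    """Uncompressed size in bytes of a width x height image."""
    return width * height * bits_per_pixel // 8


size = raw_image_bytes(4096, 4096, 16)
print(size, "bytes =", size / 2**20, "MiB")  # 33554432 bytes = 32.0 MiB
```

Under that assumption a single image of this size occupies 32 MiB before any compression.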


Author(s):  
T. A. Dodson ◽  
E. Völkl ◽  
L. F. Allard ◽  
T. A. Nolan

The process of moving to a fully digital microscopy laboratory requires changes in instrumentation, computing hardware, computing software, data storage systems, and data networks, as well as in the operating procedures of each facility. Moving from analog to digital systems in the microscopy laboratory is similar to the instrumentation projects being undertaken in many scientific labs. A central problem of any of these projects is to create the best combination of hardware and software to effectively control the parameters of data collection and then to actually acquire data from the instrument. This problem is particularly acute for the microscopist who wishes to "digitize" the operation of a transmission or scanning electron microscope. Although the basic physics of each type of instrument and the type of data (images & spectra) generated by each are very similar, each manufacturer approaches automation differently. The communications interfaces vary as well as the command language used to control the instrument.

