Linear and nonlinear constructions of DNA codes with Hamming distance $d$, constant GC-content and a reverse-complement constraint

Niema Aboluion; Derek H. Smith; Stephanie Perkins

doi:10.1016/j.disc.2011.11.021

Thermodynamic Post-Processing versus GC-Content Pre-Processing for DNA Codes Satisfying the Hamming Distance and Reverse-Complement Constraints

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2014.2299815 ◽

2014 ◽

Vol 11 (2) ◽

pp. 441-452 ◽

Cited By ~ 7

Author(s):

Dan Tulpan ◽

Derek H. Smith ◽

Roberto Montemanni

Keyword(s):

Hamming Distance ◽

GC Content ◽

Post Processing ◽

DNA Codes ◽

Reverse Complement


Bounds for DNA Codes with Constant GC-Content

The Electronic Journal of Combinatorics ◽

10.37236/1726 ◽

2003 ◽

Vol 10 (1) ◽

Cited By ~ 35

Author(s):

Oliver D. King

Keyword(s):

Lower Bounds ◽

Hamming Distance ◽

GC Content ◽

Maximum Size ◽

Additional Constraint ◽

Upper And Lower Bounds ◽

DNA Codes ◽

Minimum Hamming Distance ◽

Reverse Complement

We derive theoretical upper and lower bounds on the maximum size of DNA codes of length $n$ with constant GC-content $w$ and minimum Hamming distance $d$, both with and without the additional constraint that the minimum Hamming distance between any codeword and the reverse-complement of any codeword be at least $d$. We also explicitly construct codes that are larger than the best previously-published codes for many choices of the parameters $n$, $d$ and $w$.
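The three constraints named in this abstract can be made concrete with a short check. The following is a minimal sketch, not code from the paper: the function names and the toy codewords in the usage note are assumptions chosen for illustration.

```python
from itertools import combinations

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def hamming(x, y):
    """Number of positions at which two equal-length words differ."""
    return sum(a != b for a, b in zip(x, y))


def reverse_complement(x):
    """Reverse the word, then swap A<->T and C<->G."""
    return "".join(COMPLEMENT[b] for b in reversed(x))


def gc_content(x):
    """Number of positions holding G or C."""
    return sum(b in "GC" for b in x)


def is_dna_code(words, d, w, check_rc=True):
    """Check constant GC-content w and minimum Hamming distance d.

    With check_rc=True, also require distance >= d between every codeword
    and the reverse-complement of every codeword (itself included).
    """
    if any(gc_content(x) != w for x in words):
        return False
    if any(hamming(x, y) < d for x, y in combinations(words, 2)):
        return False
    if check_rc:
        return all(hamming(x, reverse_complement(y)) >= d
                   for x in words for y in words)
    return True
```

For example, `is_dna_code(["AAGG", "ACAG"], d=2, w=2)` holds, while the single word `"ACGT"` fails the reverse-complement check for any positive `d` because it equals its own reverse-complement.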


Run-Length Constraint of Cyclic Reverse-Complement and Constant GC-Content DNA Codes

IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences ◽

10.1587/transfun.2019eap1053 ◽

2020 ◽

Vol E103.A (1) ◽

pp. 325-333 ◽

Cited By ~ 3

Author(s):

Ramy TAKI ELDIN ◽

Hajime MATSUI

Keyword(s):

GC Content ◽

Run Length ◽

DNA Codes ◽

Length Constraint ◽

Reverse Complement


Construction of Dual Cyclic Codes over $\mathbb{F}_2[u,v]/\langle u^2, v^2 - v, uv - vu \rangle$ for DNA Computation

Defence Science Journal ◽

10.14429/dsj.68.12344 ◽

2018 ◽

Vol 68 (5) ◽

pp. 467-472

Author(s):

Manoj Kumar Singh ◽

Abhay Kumar Singh ◽

Narendra Kumar ◽

Pooja Mishra ◽

Indivar Gupta

Keyword(s):

Cyclic Codes ◽

GC Content ◽

Inner Product ◽

Dual Codes ◽

DNA Computation ◽

DNA Codes ◽

Reverse Complement

Here, we consider the construction of cyclic codes over ℜ = $\mathbb{F}_2[u,v]/\langle u^2, v^2 - v, uv - vu \rangle$. In particular, dual cyclic codes over ℜ = $\mathbb{F}_2[u]/\langle u^2 \rangle$ with respect to the Euclidean inner product are discussed. The cyclic dual codes over ℜ are studied with respect to DNA codes (reverse and reverse-complement). Several results are obtained, and examples are provided to illustrate the main ones. The GC-content and DNA codes over ℜ are discussed. We summarise the article by giving a special DNA table.


Linear and nonlinear constructions of DNA codes with Hamming distance d and constant GC-content

Discrete Mathematics ◽

10.1016/j.disc.2010.03.005 ◽

2011 ◽

Vol 311 (13) ◽

pp. 1207-1219 ◽

Cited By ~ 5

Author(s):

Derek H. Smith ◽

Niema Aboluion ◽

Roberto Montemanni ◽

Stephanie Perkins

Keyword(s):

Hamming Distance ◽

DNA Codes ◽

Linear And Nonlinear


An Intelligent Optimization Algorithm for Constructing a DNA Storage Code: NOL-HHO

International Journal of Molecular Sciences ◽

10.3390/ijms21062191 ◽

2020 ◽

Vol 21 (6) ◽

pp. 2191 ◽

Cited By ~ 4

Author(s):

Qiang Yin ◽

Ben Cao ◽

Xue Li ◽

Bin Wang ◽

Qiang Zhang ◽

...

Keyword(s):

Optimization Algorithm ◽

DNA Sequences ◽

Learning Strategy ◽

Hamming Distance ◽

Experimental Testing ◽

GC Content ◽

Smooth Transition ◽

Local Optima ◽

DNA Storage

The high density, large capacity, and long-term stability of DNA molecules make them an emerging storage medium that is especially suitable for the long-term storage of large datasets. The DNA sequences used in storage need to consider relevant constraints to avoid nonspecific hybridization reactions, such as the No-runlength constraint, GC-content, and the Hamming distance. In this work, a new nonlinear control parameter strategy and a random opposition-based learning strategy were used to improve the Harris hawks optimization algorithm (for the improved algorithm NOL-HHO) in order to prevent it from falling into local optima. Experimental testing was performed on 23 widely used benchmark functions, and the proposed algorithm was used to obtain better coding lower bounds for DNA storage. The results show that our algorithm can better maintain a smooth transition between exploration and exploitation and has stronger global exploration capabilities as compared with other algorithms. At the same time, the improvement of the lower bound directly affects the storage capacity and code rate, which promotes the further development of DNA storage technology.
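To illustrate what a "coding lower bound" means here: any explicitly constructed code that satisfies the constraints gives a lower bound on the maximum code size. The sketch below uses a plain greedy search, not the NOL-HHO algorithm of the paper. It assumes that the no-runlength constraint forbids two identical adjacent bases and that the GC-content constraint requires exactly ⌊n/2⌋ bases to be G or C; both readings, and all function names, are assumptions for illustration.

```python
from itertools import product


def no_runlength(seq):
    """True if no two adjacent bases are identical (assumed constraint)."""
    return all(a != b for a, b in zip(seq, seq[1:]))


def gc_balanced(seq):
    """True if exactly floor(n/2) bases are G or C (assumed constraint)."""
    return sum(b in "GC" for b in seq) == len(seq) // 2


def hamming(x, y):
    """Number of positions at which two equal-length words differ."""
    return sum(a != b for a, b in zip(x, y))


def greedy_code(n, d):
    """Scan all 4**n words in lexicographic order and keep each word that
    meets the constraints and is at distance >= d from every word kept so far.
    The size of the result is a lower bound on the maximum code size."""
    code = []
    for word in map("".join, product("ACGT", repeat=n)):
        if (no_runlength(word) and gc_balanced(word)
                and all(hamming(word, c) >= d for c in code)):
            code.append(word)
    return code
```

Exhaustive enumeration is only practical for short lengths, which is one reason metaheuristic searches are used for larger parameters.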


New DNA Codes from Cyclic Codes over Mixed Alphabets

Mathematics ◽

10.3390/math8111977 ◽

2020 ◽

Vol 8 (11) ◽

pp. 1977

Author(s):

Hai Q. Dinh ◽

Sachin Pathak ◽

Ashish Kumar Upadhyay ◽

Woraphon Yamaka

Keyword(s):

Cyclic Codes ◽

Sufficient Conditions ◽

Necessary And Sufficient Conditions ◽

Block Length ◽

DNA Codes ◽

Reverse Complement ◽

Gray Maps ◽

Cyclic DNA Codes ◽

One To One ◽

Necessary And Sufficient

Let $R=\mathbb{F}_4+u\mathbb{F}_4$, with $u^2=u$, and $S=\mathbb{F}_4+u\mathbb{F}_4+v\mathbb{F}_4$, with $u^2=u$, $v^2=v$, $uv=vu=0$. In this paper, we study $\mathbb{F}_4RS$-cyclic codes of block length $(\alpha,\beta,\gamma)$ and construct cyclic DNA codes from them. $\mathbb{F}_4RS$-cyclic codes can be viewed as $S[x]$-submodules of $\mathbb{F}_q[x]/\langle x^{\alpha}-1\rangle \times R[x]/\langle x^{\beta}-1\rangle \times S[x]/\langle x^{\gamma}-1\rangle$. We discuss their generator polynomials as well as the structure of separable codes. Using the structure of separable codes, we study cyclic DNA codes. By using Gray maps $\psi_1$ from $R$ to $\mathbb{F}_4^2$ and $\psi_2$ from $S$ to $\mathbb{F}_4^3$, we give a one-to-one correspondence between DNA codons of the alphabets $\{A,T,G,C\}^2$, $\{A,T,G,C\}^3$ and the elements of $R$, $S$, respectively. Then we discuss necessary and sufficient conditions for cyclic codes over $\mathbb{F}_4$, $R$, $S$ and $\mathbb{F}_4RS$ to be reversible and reverse-complement. As applications, we provide examples of new cyclic DNA codes constructed by our results.


On cyclic DNA codes over the rings Z4 + wZ4 and Z4 + wZ4 + vZ4 + wvZ4

BIOMATH ◽

10.11145/j.biomath.2017.12.167 ◽

2017 ◽

Vol 6 (2) ◽

pp. 1712167 ◽

Cited By ~ 2

Author(s):

Abdullah Dertli ◽

Yasemin Cengellenmis

Keyword(s):

Cyclic Codes ◽

Finite Ring ◽

Finite Rings ◽

Binary Images ◽

DNA Codes ◽

Reverse Complement ◽

Skew Cyclic Codes ◽

Cyclic DNA Codes

The structures of cyclic DNA codes of odd length over the finite rings R = Z4 + wZ4, w^2 = 2, and S = Z4 + wZ4 + vZ4 + wvZ4, w^2 = 2, v^2 = v, wv = vw, are studied. The links between the elements of the rings R and S and the 16 and 256 codons, respectively, are established. Cyclic codes of odd length over R that satisfy the reverse-complement constraint, and cyclic codes of odd length over S that satisfy the reverse constraint and the reverse-complement constraint, are studied. The binary images of the cyclic DNA codes over R and S are determined. Moreover, a family of DNA skew cyclic codes over R is constructed, and its property of being reverse-complement is studied.


A Quantitative Genomic View of the Coronaviruses: SARS-COV2

10.20944/preprints202003.0344.v1 ◽

2020 ◽

Cited By ~ 4

Author(s):

Sk Sarif Hassan ◽

Ranjeet Kumar Rout ◽

Vipul Sharma

Keyword(s):

Hurst Exponent ◽

Hamming Distance ◽

GC Content ◽

Large Family ◽

Healthcare Facility ◽

Viral Disease ◽

Complete Sequences ◽

Quantitative Understanding ◽

The Common ◽

Purine Pyrimidine

Coronaviruses (CoV), the cause of the 2020 pandemic, are a large family of viruses that cause illness ranging from the common cold to more severe diseases such as Middle East Respiratory Syndrome (MERS-CoV) and Severe Acute Respiratory Syndrome (SARS-CoV2). The Coronavirus disease (COVID-19) is a new strain that was discovered in 2019 and has not been previously identified in humans. It is high time to investigate the quantitative and/or qualitative genomic information of the virus SARS-CoV2 in order to strengthen healthcare facilities in the fight against this viral disease. In this article, a thorough quantitative understanding of the purine and pyrimidine spatial distribution/organization of all 89 complete sequences of SARS-CoV (available to date in the NCBI virus database) is developed using different parameters such as fractal dimension, Hurst exponent, Shannon entropy and GC content of the nucleotide sequences of the genome of SARS-CoV2. The SARS-CoV nucleotide sequences are also clustered according to a phylogeny built from their closeness (Hamming distance) with respect to their purine-pyrimidine distributions.
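Two of the simpler quantities mentioned in this abstract, Shannon entropy and GC content, together with the purine-pyrimidine recoding, can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline; the function names are assumptions, and the fractal dimension and Hurst exponent are not covered.

```python
import math
from collections import Counter

PURINES = set("AG")


def purine_pyrimidine(seq):
    """Recode a sequence as R for purines (A, G) and Y for any other
    symbol (C, T or U are expected)."""
    return "".join("R" if b in PURINES else "Y" for b in seq)


def shannon_entropy(seq):
    """Shannon entropy, in bits per symbol, of the symbol frequencies."""
    n = len(seq)
    return -sum(c / n * math.log2(c / n) for c in Counter(seq).values())


def gc_fraction(seq):
    """Fraction of positions holding G or C."""
    return sum(b in "GC" for b in seq) / len(seq)


def hamming(x, y):
    """Number of positions at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(x, y))
```

Two aligned genomes of equal length can then be compared on their recoded forms, for instance `hamming(purine_pyrimidine(s1), purine_pyrimidine(s2))`.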


Development of a program for in silico optimized selection of oligonucleotide-based molecular barcodes

PLoS ONE ◽

10.1371/journal.pone.0246354 ◽

2021 ◽

Vol 16 (2) ◽

pp. e0246354

Author(s):

In Seok Yang ◽

Sang Won Bae ◽

BeumJin Park ◽

Sangwoo Kim

Keyword(s):

Simple Sequence Repeats ◽

Hamming Distance ◽

GC Content ◽

Computation Time ◽

High Tech ◽

Molecular Barcoding ◽

DNA Oligonucleotides ◽

Comparable Performance ◽

Simple Sequence ◽

Molecular Barcodes

Short DNA oligonucleotides (~4 mer) have been used to index samples from different sources, such as in multiplex sequencing. Presently, longer oligonucleotides (8–12 mer) are being used as molecular barcodes with which to distinguish among raw DNA molecules in many high-tech sequence analyses, including low-frequency mutation detection, quantitative transcriptome analysis, and single-cell sequencing. Although molecular barcodes with random sequences have some advantages, such an approach makes it impossible to know the exact sequences used in an experiment and can lead to inaccurate interpretation due to misclustering of barcodes arising from unexpected mutations in the barcodes. The present study introduces a tool developed for selecting an optimal barcode subset during molecular barcoding. The program considers five barcode factors: GC content, homopolymers, simple sequence repeats with repeated units of dinucleotides, Hamming distance, and complementarity between barcodes. To evaluate a selected barcode set, penalty scores for the factors are defined based on their distributions observed in random barcodes. The algorithm employed in the program comprises two steps: i) random generation of an initial set and ii) optimal barcode selection via iterative replacement. Users can execute the program by inputting barcode length and the number of barcodes to be generated. Furthermore, the program accepts a user’s own values for other parameters, including penalty scores, for advanced use, allowing it to be applied in various conditions. In many test runs to obtain 100,000 barcodes with lengths of 12 nucleotides, the program showed fast performance, efficient enough to generate optimal barcode sequences with merely the use of a desktop PC. We also showed that VFOS has comparable performance, flexibility in program running, consideration of simple sequence repeats, and fast computation time in comparison with two other tools (DNABarcodes and FreeBarcodes). Owing to the versatility and fast performance of the program, we expect that many researchers will opt to apply it for selecting optimal barcode sets during their experiments, including next-generation sequencing.
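The two-step scheme described in this abstract (a random initial set followed by iterative replacement) can be sketched as below. This is a simplified illustration under stated assumptions, not the published program: the penalty function is hypothetical, covers only GC imbalance, homopolymers and Hamming distance, and is not derived from distributions in random barcodes as the abstract describes. All names and default values are assumptions.

```python
import random
from itertools import combinations


def hamming(x, y):
    """Number of positions at which two equal-length words differ."""
    return sum(a != b for a, b in zip(x, y))


def longest_run(seq):
    """Length of the longest homopolymer run in seq."""
    best = run = 1
    for a, b in zip(seq, seq[1:]):
        run = run + 1 if a == b else 1
        best = max(best, run)
    return best


def penalty(barcodes, min_dist=3):
    """Hypothetical penalty score for a barcode set; lower is better."""
    score = 0.0
    for s in barcodes:
        score += abs(sum(b in "GC" for b in s) / len(s) - 0.5)  # GC imbalance
        score += max(0, longest_run(s) - 2)                     # runs longer than 2
    for x, y in combinations(barcodes, 2):
        score += max(0, min_dist - hamming(x, y))               # pairs too close
    return score


def select_barcodes(length, count, iterations=2000, seed=0):
    """Return (barcodes, penalty) after random start and iterative replacement."""
    rng = random.Random(seed)

    def draw():
        return "".join(rng.choice("ACGT") for _ in range(length))

    current = [draw() for _ in range(count)]   # step i: random initial set
    best = penalty(current)
    for _ in range(iterations):                # step ii: iterative replacement
        trial = list(current)
        trial[rng.randrange(count)] = draw()   # swap one barcode for a new one
        score = penalty(trial)
        if score < best:                       # keep the swap only if it helps
            current, best = trial, score
    return current, best
```

A call such as `select_barcodes(12, 50)` returns the selected set and its penalty. Recomputing the full penalty on every trial is quadratic in the set size, so a practical tool for very large sets would need incremental scoring.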

