Algorithms for the search of amino acid patterns in nucleic acid sequences

Abstract

Motivation: The alignment of biological sequences is fundamental to their further interpretation. Current alignment algorithms typically align either nucleic acid or amino acid sequences. Using nucleic acid sequence similarity alone, divergent sequences cannot be aligned reliably because of the limited alphabet and genetic saturation. Divergent coding nucleic acid sequences can instead be aligned through their translated amino acid sequences, but this requires detecting the correct open reading frame, is prone to frameshift errors, and typically requires treating each gene separately. Our motivation was to design an algorithm that aligns a nucleic acid sequence against a (reference) genome sequence, works equally well for similar and divergent sequences, and produces an optimal alignment that simultaneously considers the alignment of all annotated coding sequences.

Results: We define a genome alignment score for evaluating the quality of an alignment of a nucleic acid query sequence against a reference genome sequence for which coding sequence features have been annotated (for example in a GenBank record). The genome alignment score combines an affine gap score for the nucleic acid sequence with an affine gap score for all amino acid alignments resulting from coding sequences in open reading frames contained within the query sequence. We present a dynamic programming algorithm that computes the optimal global or local alignment under this genome alignment score, and we provide a formal proof of correctness. The algorithm aligns nucleic acid sequences from closely related and from highly divergent sequences within the same software and with the same parameters, automatically corrects any frameshift errors, and at the same time produces the aligned translated amino acid sequences of all relevant coding sequence features.

Availability: The software is available as a web application at http://www.genomedetective.com/app/aga and as a command-line application at https://github.com/emweb/aga
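As a rough illustration of the affine gap scoring mentioned in the abstract, the following minimal Python sketch computes the score of an optimal global alignment of two nucleotide strings with a three-state dynamic programming recurrence (Gotoh-style). It covers only the nucleic acid component; the amino acid terms of the genome alignment score are not reproduced here. The function name affine_global_score and all scoring parameters are assumptions chosen for illustration and are not taken from the AGA software.

```python
NEG_INF = float("-inf")

def affine_global_score(a, b, match=2, mismatch=-1, gap_open=-5, gap_extend=-1):
    """Best global alignment score of a and b under affine gap penalties.

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    All parameter values are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]
    # X: a[i-1] aligned to a gap (gap in b)
    # Y: b[j-1] aligned to a gap (gap in a)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Extend any state with a (mis)matched pair.
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Open a new gap from M or Y, or extend an existing gap in X.
            X[i][j] = max(M[i - 1][j] + gap_open,
                          Y[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend)
            # Symmetric case for gaps in a.
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          X[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(affine_global_score("AAAATTTT", "AAAACCTTTT"))
```

Keeping separate matrices for "in a gap" and "not in a gap" is what lets one long gap cost less than several short ones; the score described in the abstract would add further terms for every annotated coding sequence.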


Evolutionary Viewpoint on GnRH (gonadotropin-releasing hormone) in Chordata - Amino Acid and Nucleic Acid Sequences

Development & Reproduction ◽

10.12717/dr.2018.22.2.119 ◽

2018 ◽

Vol 22 (2) ◽

pp. 119-132 ◽

Cited By ~ 4

Author(s):

Donchan Choi

Keyword(s):

Amino Acid ◽

Nucleic Acid ◽

Gonadotropin Releasing Hormone ◽

Releasing Hormone ◽

Nucleic Acid Sequences


Gold-Aptamer-Nanoconstructs Engineered to Detect Conserved Enteroviral Nucleic Acid Sequences

10.26434/chemrxiv.8312324.v1 ◽

2019 ◽

Author(s):

Veeren Chauhan ◽

Mohamed M Elsutohy ◽

C Patrick McClure ◽

Will Irving ◽

Neil Roddis ◽

...

Keyword(s):

Nucleic Acid ◽

In Silico ◽

Point Of Care ◽

Lateral Flow ◽

Nucleic Acid Sequence ◽

In Silico Screening ◽

Lateral Flow Assays ◽

Life Threatening ◽

Software And Hardware ◽

Nucleic Acid Sequences

Enteroviruses are ubiquitous mammalian pathogens that can produce mild to life-threatening disease. Bearing this in mind, we have developed a rapid, accurate and economical point-of-care biosensor that can detect a nucleic acid sequence conserved amongst 96% of all known enteroviruses. The biosensor harnesses the physicochemical properties of gold nanoparticles and aptamers to provide colourimetric, spectroscopic and lateral flow-based identification of an exclusive enteroviral RNA sequence (23 bases), which was identified through in silico screening. Aptamers were designed to demonstrate specific complementarity towards the target enteroviral RNA to produce aggregated gold-aptamer nanoconstructs. The conserved target enteroviral nucleic acid sequence (≥ 1×10⁻⁷ M, ≥ 1.4×10⁻¹⁴ g/mL) initiates disaggregation of the gold-aptamer nanoconstructs and a signal transduction mechanism, producing a colourimetric and spectroscopic blueshift (from 544 nm (purple) to 524 nm (red)). Furthermore, lateral flow assays that utilise gold-aptamer nanoconstructs were unaffected by contaminating human genomic DNA, demonstrated rapid detection of the conserved target enteroviral nucleic acid sequence (< 60 s) and could be interpreted with a bespoke software and hardware electronic interface. We anticipate that our methodology will translate in silico screening of nucleic acid databases into a tangible enteroviral desktop detector, which could be readily adapted to related organisms. This will pave the way forward in the clinical evaluation of disease and complement existing strategies for overcoming antimicrobial resistance.


Molecular biomarker analysis. General definitions and requirements for microarray detection of specific nucleic acid sequences

10.3403/30245292 ◽

2013 ◽

Keyword(s):

Nucleic Acid ◽

Molecular Biomarker ◽

Biomarker Analysis ◽

Microarray Detection ◽

Nucleic Acid Sequences


Patenting Nucleic Acid Sequences: More Ambiguity From the High Court in D'Arcy v Myriad Genetics Inc.?

SSRN Electronic Journal ◽

10.2139/ssrn.3187730 ◽

2018 ◽

Cited By ~ 1

Author(s):

Charles Lawson

Keyword(s):

Nucleic Acid ◽

High Court ◽

Myriad Genetics ◽

Nucleic Acid Sequences


Dampable Waves along Nucleic Acid Sequences Mediating Nucleotides' Interactions

DNA Sequence ◽

10.1080/10425170410001683476 ◽

2004 ◽

Vol 15 (2) ◽

pp. 135-139

Author(s):

Tao Li ◽

Bin Han

Keyword(s):

Nucleic Acid ◽

Nucleic Acid Sequences


KEC: unique sequence search by K-mer exclusion

Bioinformatics ◽

10.1093/bioinformatics/btab196 ◽

2021 ◽

Author(s):

Pavel Beran ◽

Dagmar Stehlíková ◽

Stephen P Cohen ◽

Vladislav Čurn

Keyword(s):

Amino Acid ◽

Nucleic Acid ◽

Source Code ◽

Unique Sequence ◽

Supplementary Information ◽

Supplementary Data ◽

Laptop Computers ◽

Sequence Search ◽

Target Sequences ◽

Cross Reference

Abstract

Summary: Searching for amino acid or nucleic acid sequences unique to one organism may be challenging, depending on the size of the available datasets. K-mer elimination by cross-reference (KEC) allows users to quickly and easily find unique sequences by providing target and non-target sequences. Due to its speed, it can be used for datasets of genomic size and can be run on desktop or laptop computers with modest specifications.

Availability and implementation: KEC is freely available for non-commercial purposes. Source code and executable binary files compiled for Linux, Mac and Windows can be downloaded from https://github.com/berybox/KEC.

Supplementary information: Supplementary data are available at Bioinformatics online.
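The general idea of excluding k-mers by cross-reference can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not KEC's implementation: the function names kmers and unique_regions, the default k, and the choice to merge unique k-mers at consecutive positions into one region are all hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_regions(target, non_targets, k=4):
    """Return regions of target built from k-mers absent in all non_targets.

    Unique k-mers that start at consecutive positions are merged into
    a single region (an illustrative choice, not necessarily KEC's).
    """
    # Collect every k-mer seen in any non-target sequence.
    excluded = set()
    for seq in non_targets:
        excluded |= kmers(seq, k)

    regions = []
    start = prev = None
    for i in range(len(target) - k + 1):
        if target[i:i + k] in excluded:
            continue
        if start is None:
            start = prev = i          # first unique k-mer of a new region
        elif i == prev + 1:
            prev = i                  # extends the current region
        else:
            regions.append(target[start:prev + k])
            start = prev = i          # gap found, begin a new region
    if start is not None:
        regions.append(target[start:prev + k])
    return regions


if __name__ == "__main__":
    print(unique_regions("AAAACGTGGGG", ["AAAAGGGG"], k=4))
```

Holding the non-target k-mers in a hash set makes each lookup constant time on average, which is one plausible reason such an approach scales to genome-sized inputs; memory use grows with the number of distinct non-target k-mers.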
