tRNAscan-SE 2.0: Improved Detection and Functional Classification of Transfer RNA Genes

2019 ◽  
Author(s):  
Patricia P. Chan ◽  
Brian Y. Lin ◽  
Allysia J. Mak ◽  
Todd M. Lowe

tRNAscan-SE has been widely used for whole-genome transfer RNA gene prediction for nearly two decades. With the increased availability of new genomes, a vastly larger training set has enabled creation of nearly one hundred specialized isotype-specific models, greatly improving tRNAscan-SE’s ability to identify and classify both typical and atypical tRNAs. We employ a new multi-model annotation strategy where predicted tRNAs are scored against a full set of isotype-specific covariance models. A post-filtering feature also better identifies tRNA-derived SINEs that are abundant in many eukaryotic genomes, and provides a “high confidence” tRNA gene set which improves upon prior pseudogene prediction. These new enhancements of tRNAscan-SE will provide researchers with more accurate detection and more comprehensive annotation of tRNA genes.

1999 ◽  
Vol 45 (9) ◽  
pp. 791-796 ◽  
Author(s):  
Andrew M Kropinski ◽  
Mary Jo Sibbald

Using tRNAscan-SE and FAStRNA, we have identified four tRNA genes in the delayed early region of the bacteriophage D3 genome (GenBank accession No. AF077308). These are specific for methionine (AUG), glycine (GGA), asparagine (AAC), and threonine (ACA). The D3 Thr- and Gly-tRNAs recognize codons that are rarely used in Pseudomonas aeruginosa and presumably influence the rate of translation of phage proteins. BLASTN searches revealed that the D3 tRNA genes have homology to tRNA genes from Gram-positive bacteria. Analysis of codon usage in the 91 ORFs discovered in D3 indicates patterns of codon usage reminiscent of Escherichia coli or P. aeruginosa.

Key words: bacteriophage, Pseudomonas, D3, tRNA, codon usage.
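As a rough illustration of the codon assignments listed in the D3 abstract above, the following Python sketch derives the anticodon that would pair with each codon under strict Watson-Crick rules. The function name `anticodon_for` and the dictionary `D3_CODONS` are hypothetical, and the sketch ignores wobble pairing and base modifications; the abstract itself does not report anticodon sequences.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the 5'->3' anticodon pairing with a 5'->3' RNA codon.

    Assumption: strict Watson-Crick pairing, no wobble, no modified bases.
    """
    codon = codon.upper()
    if len(codon) != 3 or any(base not in COMPLEMENT for base in codon):
        raise ValueError(f"not a valid RNA codon: {codon!r}")
    # Codon and anticodon pair antiparallel, so reverse before complementing.
    return "".join(COMPLEMENT[base] for base in reversed(codon))


# Codons taken from the abstract; the amino-acid keys are just labels.
D3_CODONS = {"Met": "AUG", "Gly": "GGA", "Asn": "AAC", "Thr": "ACA"}

if __name__ == "__main__":
    for amino_acid, codon in D3_CODONS.items():
        print(amino_acid, codon, anticodon_for(codon))
```

Real tRNAs often read several codons through wobble pairing at the first anticodon position, so the output should be read as the simplest possible pairing rather than the anticodons actually present in phage D3.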


2016 ◽  
Vol 14 (2) ◽  
pp. 215-224
Author(s):  
Lê Thanh Hòa ◽  
Nguyễn Thị Khuê ◽  
Nguyễn Thị Bích Nga ◽  
Đỗ Thị Roan ◽  
Đỗ Trung Dũng ◽  
...  

The small intestinal fluke, Haplorchis taichui Nishigori, 1924, belonging to the genus Haplorchis (family Heterophyidae, class Trematoda, phylum Platyhelminthes), is a zoonotic pathogen causing disease in humans and animals. The complete mitochondrial genome (mtDNA) of H. taichui (strain HTAQT, collected from Quang Tri) was obtained and characterized structurally, providing valuable data for studies on epidemiology, species identification, diagnosis, classification, molecular phylogenetic relationships and prevention of the disease. The entire mtDNA nucleotide sequence of H. taichui (HTAQT) is 15,119 bp in length and contains 36 genes: 12 protein-coding genes (cox1, cox2, cox3, nad1, nad2, nad3, nad4L, nad4, nad5, nad6, atp6 and cob); 2 ribosomal RNA genes, rrnL (16S) and rrnS (12S); and 22 transfer RNA genes (tRNA or trn). It also contains a non-coding region (NR), divided into two sub-regions, a short non-coding region (SNR) and a long non-coding region (LNR). The LNR, 1,692 bp in length and located between trnG (transfer RNA-Glycine) and trnE (Glutamic acid), contains 6 tandem repeats (TR), arranged as TR1A, TR2A, TR1B, TR2B, TR3A and TR3B, respectively. All protein-coding genes (12), ribosomal RNA genes (2) and tRNA genes (22) were analyzed; in particular, the protein-coding genes were characterized by length and by start and stop codons, and the rRNA and tRNA genes by secondary structure.


2021 ◽  
Author(s):  
Valerie Cognat ◽  
Gael Pawlak ◽  
David Pflieger ◽  
Laurence Drouard

PlantRNA (http://plantrna.ibmp.cnrs.fr/) is a comprehensive database of transfer RNA (tRNA) gene sequences retrieved from fully annotated nuclear, plastidial and mitochondrial genomes of photosynthetic organisms. In the first release (PlantRNA 1.0), tRNA genes from 11 organisms were annotated. In this second version, the annotation was extended to 48 photosynthetic species covering the whole phylogenetic tree of photosynthetic organisms, from the most basal group of Archaeplastida, the glaucophyte Cyanophora paradoxa, to various land plants. Transfer RNA genes from lower photosynthetic organisms such as streptophyte algae or lycophytes, as well as from extremophile photosynthetic species such as Eutrema parvulum, were incorporated in the database. In total, circa 35 000 tRNA genes were accurately annotated. While annotating the tRNA genes from the genome of the rhodophyte Chondrus crispus, putative unconventional splicing sites in the D- or T-regions of tRNA molecules were experimentally determined to strengthen the quality of the database. As in PlantRNA 1.0, comprehensive biological information is included: flanking sequences, A and B box sequences, regions of transcription initiation and poly(T) transcription termination stretches, tRNA intron sequences and tRNA mitochondrial import.


Genes ◽  
2019 ◽  
Vol 10 (3) ◽  
pp. 198 ◽  
Author(s):  
Tiezhu Yang ◽  
Guolyu Xu ◽  
Bingning Gu ◽  
Yanmei Shi ◽  
Hellen Lucas Mzuka ◽  
...  

The mitochondrial genome (mitogenome) can provide information for phylogenetic analyses and evolutionary biology. We first sequenced, annotated, and characterized the mitogenome of Philomycus bilineatus in this study. The complete mitogenome was 14,347 bp in length, containing 13 protein-coding genes (PCGs), 23 transfer RNA genes, two ribosomal RNA genes, and two non-coding regions (A + T-rich region). There were 15 overlap locations and 18 intergenic spacer regions found throughout the mitogenome of P. bilineatus. The A + T content in the mitogenome was 72.11%. All PCGs used a standard ATN as a start codon, with the exception of cytochrome c oxidase 1 (cox1) and ATP synthase F0 subunit 8 (atp8) with TTG and GTG. Additionally, TAA or TAG was identified as the typical stop codon. All transfer RNA (tRNA) genes had a typical clover-leaf structure, except for trnS1 (AGC), trnS2 (TCA), and trnK (TTT). A phylogenetic analysis with another 37 species of gastropods was performed using Bayesian inference, based on the amino acid sequences of 13 mitochondrial PCGs. The results indicated that P. bilineatus shares a close ancestry with Meghimatium bilineatum. It seems more appropriate to reclassify it as Arionoidea rather than Limacoidea, as previously thought. Our research may provide a new meaningful insight into the evolution of P. bilineatus.


1985 ◽  
Vol 232 (1) ◽  
pp. 223-228 ◽  
Author(s):  
T Samuelsson ◽  
P Elias ◽  
F Lustig ◽  
Y S Guindy

As part of an investigation of the tRNA genes of Mycoplasma mycoides, two HindIII fragments of mycoplasma DNA comprising 0.4 and 2.5 kilobases (kb), respectively, were cloned in pBR322 and their nucleotide sequences determined. Only one tRNA gene was found in the 0.4 kb fragment, the gene for tRNAArg with the anticodon TCT, while the 2.5 kb fragment contained nine different tRNA genes arranged in a cluster which presumably constitutes a transcriptional unit. The clustered tRNA genes, with their respective anticodons, were as follows: Arg (ACG), Pro (TGG), Ala (TGC), Met (CAT), Ile (CAT), Ser (TGA), fMet (CAT), Asp (GTC), and Phe (GAA).


2021 ◽  
Vol 11 (9) ◽  
pp. 3974
Author(s):  
Laila Bashmal ◽  
Yakoub Bazi ◽  
Mohamad Mahmoud Al Rahhal ◽  
Haikel Alhichri ◽  
Naif Al Ajlan

In this paper, we present an approach for the multi-label classification of remote sensing images based on data-efficient transformers. During the training phase, we generated a second view for each image from the training set using data augmentation. Then, both the image and its augmented version were reshaped into a sequence of flattened patches and fed to the transformer encoder. The encoder extracts a compact feature representation from each image with the help of a self-attention mechanism, which can handle the global dependencies between different regions of the high-resolution aerial image. On top of the encoder, we mounted two classifiers, a token classifier and a distiller classifier. During training, we minimized a global loss consisting of two terms, each corresponding to one of the two classifiers. In the test phase, we took the average of the two classifiers' outputs as the final class labels. Experiments on two datasets acquired over the cities of Trento and Civezzano with a ground resolution of two centimeters demonstrated the effectiveness of the proposed model.
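The two mechanical steps in the abstract above, reshaping an image into a sequence of flattened patches and averaging the two classifier outputs at test time, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the single-channel list-of-lists image, and the 0.5 decision threshold are all hypothetical.

```python
from typing import List, Sequence


def to_patch_sequence(image: Sequence[Sequence[float]], patch: int) -> List[List[float]]:
    """Split a 2-D single-channel image into non-overlapping patch x patch
    tiles (row-major order) and flatten each tile into a 1-D list."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    sequence = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            flat = [image[r][c]
                    for r in range(top, top + patch)
                    for c in range(left, left + patch)]
            sequence.append(flat)
    return sequence


def average_heads(token_probs: Sequence[float],
                  distill_probs: Sequence[float],
                  threshold: float = 0.5) -> List[int]:
    """Average per-label probabilities from the two classifiers and
    threshold them into multi-label 0/1 decisions.
    Assumption: the 0.5 threshold is illustrative, not from the paper."""
    return [int((t + d) / 2 >= threshold)
            for t, d in zip(token_probs, distill_probs)]


if __name__ == "__main__":
    image = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(to_patch_sequence(image, 2))
    print(average_heads([0.9, 0.2, 0.6], [0.7, 0.4, 0.3]))
```

In a real vision transformer each flattened patch would then be linearly projected and combined with a position embedding before entering the encoder; those steps are omitted here.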


Author(s):  
K Sooknunan ◽  
M Lochner ◽  
Bruce A Bassett ◽  
H V Peiris ◽  
R Fender ◽  
...  

With the advent of powerful telescopes such as the Square Kilometer Array and the Vera C. Rubin Observatory, we are entering an era of multiwavelength transient astronomy that will lead to a dramatic increase in data volume. Machine learning techniques are well suited to address this data challenge and rapidly classify newly detected transients. We present a multiwavelength classification algorithm consisting of three steps: (1) interpolation and augmentation of the data using Gaussian processes; (2) feature extraction using wavelets; (3) classification with random forests. Augmentation provides improved performance at test time by balancing the classes and adding diversity into the training set. In the first application of machine learning to the classification of real radio transient data, we apply our technique to the Green Bank Interferometer and other radio light curves. We find we are able to accurately classify most of the eleven classes of radio variables and transients after just eight hours of observations, achieving an overall test accuracy of 78%. We fully investigate the impact of the small sample size of 82 publicly available light curves and use data augmentation techniques to mitigate the effect. We also show that on a significantly larger simulated representative training set the algorithm achieves an overall accuracy of 97%, illustrating that the method is likely to provide excellent performance on future surveys. Finally, we demonstrate the effectiveness of simultaneous multiwavelength observations by showing how incorporating just one optical data point into the analysis improves the accuracy of the worst performing class by 19%.


2014 ◽  
Vol 539 ◽  
pp. 181-184
Author(s):  
Wan Li Zuo ◽  
Zhi Yan Wang ◽  
Ning Ma ◽  
Hong Liang

Accurate classification of text is a basic premise for efficiently extracting various types of information from the Web and making proper use of network resources. In this paper, a new text classification method is proposed. The consistency analysis method is an iterative algorithm that trains different classifiers (weak classifiers) on the same training set and then combines them, testing how consistently the various classification methods label the same text and thereby drawing on the knowledge of each type of classifier. It determines the weight of each sample mainly according to whether that sample was classified correctly in each training round, as well as the accuracy of the previous overall classification, and then passes the reweighted data set to the next classifier for training. Finally, the classifiers obtained during training are integrated into the final decision classifier. A classifier built with consistency analysis can discard some unnecessary features of the training data and concentrate on the key training data. According to the experimental results, the average accuracy of this method is 91.0%, while the average recall rate is 88.1%.
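The reweighting loop described above resembles boosting. The abstract does not give the exact update rule, so the Python sketch below substitutes the standard AdaBoost update as a stand-in: it is an assumption, not the authors' formula, and the function name `reweight` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def reweight(weights: Sequence[float],
             correct: Sequence[bool]) -> Tuple[List[float], float]:
    """One AdaBoost-style round: raise the weight of misclassified samples,
    lower the weight of correctly classified ones, then renormalise.

    Returns the new weights and alpha, the vote given to this round's
    classifier when the final decision classifier is assembled.
    """
    total = sum(weights)
    # Weighted error of the current weak classifier.
    error = sum(w for w, ok in zip(weights, correct) if not ok) / total
    if not 0.0 < error < 1.0:
        raise ValueError("weighted error must lie strictly between 0 and 1")
    alpha = 0.5 * math.log((1.0 - error) / error)
    updated = [w * math.exp(-alpha if ok else alpha)
               for w, ok in zip(weights, correct)]
    norm = sum(updated)
    return [w / norm for w in updated], alpha


if __name__ == "__main__":
    new_weights, alpha = reweight([0.25, 0.25, 0.25, 0.25],
                                  [True, True, True, False])
    print(new_weights, alpha)
```

Under this update the misclassified samples always carry half of the total weight in the next round, which is what forces the next classifier to concentrate on the hard examples.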


Zootaxa ◽  
2017 ◽  
Vol 4363 (4) ◽  
pp. 506
Author(s):  
HUAXUAN LIU ◽  
LIYUN YAN ◽  
GUOFANG JIANG

In this study, we report for the first time the complete mitochondrial genome (mitogenome) of Sinopodisma pieli, the type species of the genus Sinopodisma, obtained by the polymerase chain reaction method. The mitogenome is a circular DNA molecule 15,625 bp in length, with 76.0% A+T, and contains 13 protein-coding genes (PCGs), 22 transfer RNA genes, two ribosomal RNA genes and one A+T-rich control region. The overall base composition of the S. pieli mitogenome is 42.8% A, 33.2% T, 13.5% C, and 10.5% G. All 13 mitochondrial PCGs share the start codon ATN. Twelve of the PCGs end with the termination codon TAA or TAG, while cytochrome c oxidase subunit 1 (COI) uses an incomplete T as its terminator codon. All tRNA genes can be folded into the typical cloverleaf secondary structure, except trnS(AGN), which lacks the dihydrouridine arm. The sizes of the large and small ribosomal RNA genes are 1379 bp and 794 bp, respectively. The A+T-rich region is 798 bp in length with an A+T content of 88.5%. A phylogenetic analysis based on the 13 PCGs, using Bayesian inference (BI) and maximum likelihood (ML), revealed that Sinopodisma is not a monophyletic group. We consider the name and taxonomic status of S. tsinlingensis to be correct, and it should not be moved into the genus Pedopodisma. These data will provide important information for a better understanding of the population genetics and species identification of Sinopodisma.
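The base-composition figures above can be checked with a little arithmetic: 42.8% A plus 33.2% T gives the stated 76.0% A+T. The Python sketch below shows how such percentages are typically computed from a sequence; the function names are hypothetical, the short test string is made up for illustration, and only the `REPORTED` values come from the abstract.

```python
from collections import Counter
from typing import Dict


def base_composition(sequence: str) -> Dict[str, float]:
    """Percentage of A, C, G and T in a DNA sequence.
    Characters other than A/C/G/T (e.g. N) are ignored."""
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(sequence: str) -> float:
    """Combined A+T percentage of a DNA sequence."""
    composition = base_composition(sequence)
    return composition["A"] + composition["T"]


# Percentages reported in the abstract for the S. pieli mitogenome.
REPORTED = {"A": 42.8, "T": 33.2, "C": 13.5, "G": 10.5}

if __name__ == "__main__":
    print(base_composition("AATTGC"))
    print(round(REPORTED["A"] + REPORTED["T"], 1))
```

The same functions could be applied to the full mitogenome sequence once it is downloaded; no accession number is given in the abstract, so none is assumed here.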

