Exploring Repetitive DNA Landscapes Using REPCLASS, a Tool That Automates the Classification of Transposable Elements in Eukaryotic Genomes

Conversion of DNA Sequences: From a Transposable Element to a Tandem Repeat or to a Gene

Genes ◽

10.3390/genes10121014 ◽

2019 ◽

Vol 10 (12) ◽

pp. 1014 ◽

Cited By ~ 1

Author(s):

Ana Paço ◽

Renata Freitas ◽

Ana Vieira-da-Silva

Keyword(s):

Repetitive Dna ◽

Dna Sequences ◽

Tandem Repeats ◽

Evolutionary Relationship ◽

Repetitive Dna Sequences ◽

Dna Remodeling ◽

Eukaryotic Genomes ◽

Similar Organization ◽

Dispersed Repeats

Eukaryotic genomes are rich in repetitive DNA sequences, which are grouped into two classes according to their genomic organization: tandem repeats and dispersed repeats. In tandem repeats, copies of a short DNA sequence are positioned one after another within the genome, while in dispersed repeats, these copies are randomly distributed. In this review we provide evidence that both tandem and dispersed repeats can have a similar organization, which leads us to suggest an update to their classification based on sequence features, specifically the presence or absence of retrotransposon/transposon-specific domains. In addition, we analyze several studies showing that a repetitive element can be remodeled into repetitive non-coding or coding sequences, suggesting (1) an evolutionary relationship among DNA sequences, and (2) that the evolution of genomes involved frequent repetitive sequence reshuffling, a process that we have designated as a “DNA remodeling mechanism”. The alternative classification of repetitive DNA sequences proposed here will provide a novel theoretical framework that recognizes the importance of DNA remodeling for the evolution and plasticity of eukaryotic genomes.
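The organizational distinction drawn in this abstract (copies placed one after another versus copies scattered through the genome) can be illustrated with a minimal Python sketch. It is not taken from the review: the function names `find_copies` and `organization` are hypothetical, and it assumes exact, non-overlapping copies of a known motif, whereas real repeats diverge in sequence and are usually detected with alignment-based tools.

```python
def find_copies(genome, motif):
    """Return start positions of non-overlapping exact copies of motif."""
    positions = []
    start = genome.find(motif)
    while start != -1:
        positions.append(start)
        # Resume the search just past the copy that was found.
        start = genome.find(motif, start + len(motif))
    return positions


def organization(genome, motif):
    """Label a motif as 'tandem', 'dispersed', or 'not repeated'.

    Assumption for illustration only: a repeat is called tandem when
    every copy starts exactly where the previous copy ends.
    """
    positions = find_copies(genome, motif)
    if len(positions) < 2:
        return "not repeated"
    gaps = [b - a - len(motif) for a, b in zip(positions, positions[1:])]
    return "tandem" if all(gap == 0 for gap in gaps) else "dispersed"


print(organization("TTACGACGACGTT", "ACG"))        # copies are adjacent
print(organization("ACGTTTTACGTTTTTTACG", "ACG"))  # copies are separated
```

A strict all-or-nothing rule like this is exactly the kind of dichotomy the review questions, since a family can have some adjacent copies and some scattered ones; the sketch only makes the textbook definitions concrete.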

TEsorter: lineage-level classification of transposable elements using conserved protein domains

10.1101/800177 ◽

2019 ◽

Cited By ~ 7

Author(s):

Ren-Gang Zhang ◽

Zhao-Xuan Wang ◽

Shujun Ou ◽

Guang-Yuan Li

Keyword(s):

Transposable Elements ◽

Phylogenetic Relationships ◽

Protein Domains ◽

Ltr Retrotransposons ◽

Link Type ◽

Eukaryotic Genomes

Abstract. Summary: Transposable elements (TEs) constitute an important part of eukaryotic genomes, but their classification, especially at the lineage or clade level, is still challenging. For this purpose, we propose TEsorter, which is based on conserved protein domains of TEs. It is easy to use, fast with multiprocessing, and sensitive and precise in classifying TEs, especially LTR retrotransposons (LTR-RTs). Its results can also directly reflect the phylogenetic relationships and diversity of the classified LTR-RTs. Availability: The code in Python is freely available at https://github.com/zhangrengang/TEsorter.

Computational approaches for identification and classification of transposable elements in eukaryotic genomes

Hereditas (Beijing) ◽

10.3724/sp.j.1005.2012.01009 ◽

2012 ◽

Vol 34 (8) ◽

pp. 1009-1019

Author(s):

Hong-En XU ◽

Hua-Hao ZHANG ◽

Min-Jin HAN ◽

Yi-Hong SHEN ◽

Xian-Zhi HUANG ◽

...

Keyword(s):

Transposable Elements ◽

Computational Approaches ◽

Eukaryotic Genomes

TERL: classification of transposable elements by convolutional neural networks

Briefings in Bioinformatics ◽

10.1093/bib/bbaa185 ◽

2020 ◽

Author(s):

Murilo Horacio Pereira da Cruz ◽

Douglas Silva Domingues ◽

Priscila Tiemi Maeda Saito ◽

Alexandre Rossi Paschoal ◽

Pedro Henrique Bugatti

Keyword(s):

Neural Networks ◽

Transposable Elements ◽

Convolutional Neural Networks ◽

Dimensional Space ◽

Hierarchical Level ◽

Deep Convolutional Neural Networks ◽

One Dimensional ◽

Space Data ◽

Eukaryotic Genomes

Abstract: Transposable elements (TEs) are the most represented sequences occurring in eukaryotic genomes. Few methods classify these sequences at deeper levels, such as the superfamily level, which could provide useful and detailed information about them. Most methods that classify TE sequences use handcrafted features such as k-mers and homology-based search, which could be inefficient for classifying non-homologous sequences. Here we propose an approach, called transposable elements representation learner (TERL), that preprocesses and transforms one-dimensional sequences into two-dimensional space data (i.e., image-like data of the sequences) and applies deep convolutional neural networks to them. This classification method tries to learn the best representation of the input data to classify it correctly. We have conducted six experiments to test the performance of TERL against other methods. Our approach obtained macro mean accuracies and F1-scores of 96.4% and 85.8% for superfamilies and 95.7% and 91.5% for order-level sequences from RepBase, respectively. We also obtained macro mean accuracies and F1-scores of 95.0% and 70.6% for sequences from seven databases at the superfamily level and 89.3% and 73.9% at the order level, respectively. We surpassed the accuracy, recall and specificity obtained by other methods in the experiment classifying order-level sequences from seven databases, and were by far faster than any other method in all experiments. Therefore, TERL can learn how to predict any hierarchical level of the TE classification system and is about 20 times and three orders of magnitude faster than TEclass and PASTEC, respectively (https://github.com/muriloHoracio/TERL).
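The step of turning a one-dimensional sequence into image-like two-dimensional data can be sketched as a one-hot encoding, one row per nucleotide and one column per position. This is only an illustration under stated assumptions: the abstract does not spell out the encoding, and the names `ALPHABET` and `one_hot_encode`, the fixed target length, the zero padding, and the treatment of unknown symbols are choices made here, not details of the TERL code.

```python
# Hypothetical helper for illustration; not taken from the TERL repository.
ALPHABET = "ACGT"


def one_hot_encode(sequence, length):
    """Return a len(ALPHABET) x length matrix of 0/1 values.

    Assumptions: the sequence is truncated or zero-padded to `length`,
    and symbols outside ALPHABET (such as N) become all-zero columns.
    """
    seq = sequence.upper()[:length]
    matrix = [[0] * length for _ in ALPHABET]
    for col, base in enumerate(seq):
        row = ALPHABET.find(base)
        if row >= 0:
            matrix[row][col] = 1
    return matrix


# Each printed row is one channel of the image-like representation.
encoded = one_hot_encode("ACGTN", 8)
for base, row in zip(ALPHABET, encoded):
    print(base, row)
```

A fixed-size matrix of this kind is the sort of input a convolutional network can consume directly, which is why padding or truncating to a common length is assumed in the sketch.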

A systematic review of the application of machine learning in the detection and classification of transposable elements

PeerJ ◽

10.7717/peerj.8311 ◽

2019 ◽

Vol 7 ◽

pp. e8311 ◽

Cited By ~ 5

Author(s):

Simon Orozco-Arias ◽

Gustavo Isaza ◽

Romain Guyot ◽

Reinel Tabares-Soto

Keyword(s):

Machine Learning ◽

Transposable Elements ◽

Nuclear Dna ◽

Deep Impact ◽

Different Types ◽

Review Protocol ◽

Research Questions ◽

Hidden Patterns ◽

Eukaryotic Genomes

Background: Transposable elements (TEs) constitute the most common repeated sequences in eukaryotic genomes. Recent studies demonstrated their deep impact on species diversity, adaptation to the environment and diseases. Although there are many conventional bioinformatics algorithms for detecting and classifying TEs, none have achieved reliable results on different types of TEs. Machine learning (ML) techniques can automatically extract hidden patterns and novel information from labeled or non-labeled data and have been applied to solving several scientific problems. Methodology: We followed the Systematic Literature Review (SLR) process, applying the six stages of its review protocol, but added a preliminary stage that aims to detect the need for a review. Search equations were then formulated and executed in several literature databases. Relevant publications were scanned and used to extract evidence to answer the research questions. Results: Several ML approaches have already been tested on other bioinformatics problems with promising results, yet few algorithms and architectures available in the literature focus specifically on TEs, even though TEs represent the majority of the nuclear DNA of many organisms. Only 35 articles were found and categorized as relevant in TE or related fields. Conclusions: ML is a powerful tool that can be used to address many problems. Although ML techniques have been used widely in other biological tasks, their use in TE analyses is still limited. The SLR showed that the use of ML for TE analyses (detection and classification) is an open problem and that this new field of research is attracting growing interest.

Cluster and Grid Based Classification of Transposable Elements in Eukaryotic Genomes

Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06) ◽

10.1109/ccgrid.2006.1630938 ◽

2006 ◽

Cited By ~ 2

Author(s):

N. Ranganathan ◽

C. Feschotte ◽

D. Levine

Keyword(s):

Transposable Elements ◽

Eukaryotic Genomes ◽

Grid Based

Transposable Elements and Teleost Migratory Behaviour

International Journal of Molecular Sciences ◽

10.3390/ijms22020602 ◽

2021 ◽

Vol 22 (2) ◽

pp. 602

Author(s):

Elisa Carotti ◽

Federica Carducci ◽

Adriana Canapa ◽

Marco Barucca ◽

Samuele Greco ◽

...

Keyword(s):

Transposable Elements ◽

Environmental Changes ◽

Chromosomal Rearrangements ◽

Quantitative Difference ◽

Regulatory Elements ◽

Phylogenetic Position ◽

Migratory Behaviour ◽

Relative Contribution ◽

Migratory Routes ◽

Eukaryotic Genomes

Transposable elements (TEs) represent a considerable fraction of eukaryotic genomes, thereby contributing to genome size, chromosomal rearrangements, and to the generation of new coding genes or regulatory elements. An increasing number of works have reported a link between the genomic abundance of TEs and the adaptation to specific environmental conditions. Diadromy represents a fascinating feature of fish, protagonists of migratory routes between marine and freshwater for reproduction. In this work, we investigated the genomes of 24 fish species, including 15 teleosts with a migratory behaviour. The expected higher relative abundance of DNA transposons in ray-finned fish compared with the other fish groups was not confirmed by the analysis of the dataset considered. The relative contribution of different TE types in migratory ray-finned species did not show clear differences between oceanodromous and potamodromous fish. On the contrary, a remarkable relationship between migratory behaviour and the quantitative difference reported for short interspersed nuclear (retro)elements (SINEs) emerged from the comparison between anadromous and catadromous species, independently from their phylogenetic position. This aspect is likely due to the substantial environmental changes faced by diadromous species during their migratory routes.

A systematic search and classification of T2 family miniature inverted-repeat transposable elements (MITEs) in Xenopus tropicalis suggests the existence of recently active MITE subfamilies

Molecular Genetics and Genomics ◽

10.1007/s00438-009-0496-9 ◽

2009 ◽

Vol 283 (1) ◽

pp. 49-62 ◽

Cited By ~ 9

Author(s):

Akira Hikosaka ◽

Akira Kawahara

Keyword(s):

Transposable Elements ◽

Inverted Repeat ◽

Xenopus Tropicalis ◽

Systematic Search

Top-down strategies for hierarchical classification of transposable elements with neural networks

2017 International Joint Conference on Neural Networks (IJCNN) ◽

10.1109/ijcnn.2017.7966165 ◽

2017 ◽

Cited By ~ 9

Author(s):

Felipe Kenji Nakano ◽

Walter Jose Pinto ◽

Gisele Lobo Pappa ◽

Ricardo Cerri

Keyword(s):

Neural Networks ◽

Transposable Elements ◽

Hierarchical Classification ◽

Top Down

Software Evaluation for de novo Detection of Transposons

10.1101/2021.02.08.430290 ◽

2021 ◽

Author(s):

Matias Rodriguez ◽

Wojciech Makałowski

Keyword(s):

Transposable Elements ◽

Genome Evolution ◽

De Novo ◽

Simulated Data ◽

Genomic Sequences ◽

Software Evaluation ◽

Easy Task ◽

Eukaryotic Genomes

Abstract: Transposable elements (TEs) are major genomic components in most eukaryotic genomes and play an important role in genome evolution. However, despite their relevance, the identification of TEs is not an easy task, and a number of tools have been developed to tackle this problem. To better understand how they perform, we tested several widely used tools for de novo TE detection and compared their performance on both simulated data and well-curated genomic sequences. The results will be helpful for identifying common issues associated with TE annotation and for evaluating how comparable the results obtained with different tools are.
