McClintock: An integrated pipeline for detecting transposable element insertions in whole genome shotgun sequencing data

2016 ◽  
Author(s):  
Michael G. Nelson ◽  
Raquel S. Linheiro ◽  
Casey M. Bergman

Abstract

Background: Transposable element (TE) insertions are among the most challenging types of variants to detect in genomic data because of their repetitive nature and complex mechanisms of replication. Nevertheless, the recent availability of large resequencing datasets has spurred the development of many new methods to detect TE insertions in whole genome shotgun sequences. These methods generate output in diverse formats and have a large number of software and data dependencies, making their comparative evaluation challenging for potential users.

Results: Here we develop an integrated bioinformatics pipeline for the detection of TE insertions in whole genome shotgun data, called McClintock (https://github.com/bergmanlab/mcclintock), that automatically runs and generates standardized output for multiple TE detection methods. We demonstrate the utility of the McClintock system by performing a comparative evaluation of six TE detection methods using simulated and real genome data from the model microbial eukaryote, Saccharomyces cerevisiae. We find substantial variation among McClintock component methods in their ability to detect non-reference insertions in the yeast genome, but show that non-reference TEs at nearly all biologically realistic locations can be detected in simulated data by combining multiple methods that use split-read and read-pair evidence. Overall, our results reveal that split-read methods detect fewer non-reference TE insertions than read-pair methods but have much higher positional accuracy. Analysis of a large sample of real yeast genomes reveals that most, but not all, McClintock component methods can recover known aspects of TE biology in yeast such as the transpositional activity status of families, tRNA gene target preferences, and target site duplication structure, albeit with varying levels of positional accuracy.

Conclusions: Our results suggest that no single TE detection method currently provides comprehensive detection of non-reference TEs, even in the context of a simplified model eukaryotic genome like S. cerevisiae. In spite of these limitations, the McClintock system provides a framework for testing, developing and integrating results from multiple TE detection methods to achieve this ultimate aim, as well as useful guidance for yeast researchers to select appropriate TE detection tools.
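The idea of combining standardized calls from several detection methods can be illustrated with a minimal sketch. This is not McClintock's own code or output format: the `Call` record, the `merge_calls` function, the 100 bp window, and the method labels are all hypothetical and chosen only for illustration. The sketch groups calls by chromosome and TE family, then clusters calls whose positions lie within the window of the previous call, recording which methods support each cluster.

```python
from collections import defaultdict
from typing import NamedTuple


class Call(NamedTuple):
    """One non-reference TE call (hypothetical record, not a McClintock format)."""
    chrom: str
    pos: int
    family: str
    method: str


def summarize(cluster):
    """Collapse a position-sorted cluster of calls into one merged record."""
    return {
        "chrom": cluster[0].chrom,
        "family": cluster[0].family,
        "start": cluster[0].pos,
        "end": cluster[-1].pos,
        "methods": sorted({c.method for c in cluster}),
    }


def merge_calls(calls, window=100):
    """Cluster calls of the same chromosome and family that lie within
    `window` bp of the previous call in the cluster (window is an assumption)."""
    groups = defaultdict(list)
    for call in calls:
        groups[(call.chrom, call.family)].append(call)

    merged = []
    for _, group in sorted(groups.items()):
        group.sort(key=lambda c: c.pos)
        cluster = [group[0]]
        for call in group[1:]:
            if call.pos - cluster[-1].pos <= window:
                cluster.append(call)
            else:
                merged.append(summarize(cluster))
                cluster = [call]
        merged.append(summarize(cluster))
    return merged


# Hypothetical calls from two imaginary methods.
calls = [
    Call("chrI", 1000, "Ty1", "split_read_tool"),
    Call("chrI", 1040, "Ty1", "read_pair_tool"),
    Call("chrI", 5000, "Ty1", "read_pair_tool"),
    Call("chrII", 1000, "Ty2", "split_read_tool"),
]
merged = merge_calls(calls)
```

In this toy input, the two chrI calls 40 bp apart collapse into one record supported by both methods, while the remaining calls stay separate. A real comparison would also need to account for strand, target site duplications, and each method's positional uncertainty.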

2021 ◽  
Vol 160 (6) ◽  
pp. S-569
Author(s):  
Manoj Dadlani ◽  
Kelly Moffat ◽  
Huai Li ◽  
Xin Zhou ◽  
Rita Colwell

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Chong Chu ◽  
Rebeca Borges-Monroy ◽  
Vinayak V. Viswanadham ◽  
Soohyun Lee ◽  
Heng Li ◽  
...  

Abstract: Transposable elements (TEs) help shape the structure and function of the human genome. When inserted into some locations, TEs may disrupt gene regulation and cause diseases. Here, we present xTea (x-Transposable element analyzer), a tool for identifying TE insertions in whole-genome sequencing data. Whereas existing methods are mostly designed for short-read data, xTea can be applied to both short-read and long-read data. Our analysis shows that xTea outperforms other short read-based methods for both germline and somatic TE insertion discovery. With long-read data, we created a catalogue of polymorphic insertions with full assembly and annotation of insertional sequences for various types of retroelements, including pseudogenes and endogenous retroviruses. Notably, we find that individual genomes have an average of nine groups of full-length L1s in centromeres, suggesting that centromeres and other highly repetitive regions such as telomeres are a significant yet unexplored source of active L1s. xTea is available at https://github.com/parklab/xTea.


Author(s):  
Daniela Loconsole ◽  
Francesca Centrone ◽  
Caterina Morcavallo ◽  
Silvia Campanella ◽  
Anna Sallustio ◽  
...  

Epidemiological and virological studies have revealed that SARS-CoV-2 variants of concern (VOCs) are emerging globally, including in Europe. The aim of this study was to evaluate the spread of B.1.1.7-lineage SARS-CoV-2 in southern Italy from December 2020 to March 2021 through the detection of the S gene target failure (SGTF), which could be considered a robust proxy of VOC B.1.1.7. SGTF was assessed on 3075 samples from week 52/2020 to week 10/2021. A subset of positive samples identified in the Apulia region during the study period was subjected to whole-genome sequencing (WGS). A descriptive and statistical analysis of the demographic and clinical characteristics of cases according to SGTF status was performed. Overall, 20.2% of samples showed SGTF; 155 strains were confirmed as VOC 202012/01 by WGS. The proportion of SGTF-positive samples rapidly increased over time, reaching 69.2% in week 10/2021. SGTF-positive cases were more likely to be symptomatic and to result in hospitalization (p < 0.0001). Despite the implementation of large-scale non-pharmaceutical interventions (NPIs), such as the closure of schools and local lockdowns, a rapid spread of VOC 202012/01 was observed in southern Italy. Strengthened NPIs and rapid vaccine deployment, first among priority groups and then among the general population, are crucial both to contain the spread of VOC 202012/01 and to flatten the curve of the third wave.


2018 ◽  
Vol 6 (26) ◽  
Author(s):  
Zhong Liang ◽  
Melissa Stephens ◽  
Victoria A. Ploplis ◽  
Shaun W. Lee ◽  
Francis J. Castellino

Whole-genome shotgun sequences and bottom-up assembly of contigs of six skin isolates of Streptococcus pyogenes, viz., NS88.3 (emm98.1), NS223 (emm91), NS455 (emm52), SS1448 (emm86.2), SS1572 (emm223), and SS1574 (emm224), are presented here. All contigs were annotated, and the gene arrangements and the inferred proteins were consistent with a pattern D classification.


2011 ◽  
Vol 193 (19) ◽  
pp. 5553-5554 ◽  
Author(s):  
W. Ghosh ◽  
A. George ◽  
A. Agarwal ◽  
P. Raj ◽  
M. Alam ◽  
...  
