short read mapping
Recently Published Documents


Total documents: 55 (five years: 6)

H-index: 16 (five years: 0)

2021 ◽  
Vol 11 (02) ◽  
pp. 01-07
Author(s):  
Alvin Chon ◽  
Xiaoqiu Huang

Short Read Alignment Mapping Metrics (SRAMM) is an efficient and versatile command line tool providing additional short read mapping metrics, filtering, and graphs. Short read aligners report MAPping Quality (MAPQ), but the methods used to compute it are generally neither standardized nor well described in the literature or software manuals. Additionally, third party mapping quality programs are typically computationally intensive or designed for specific applications. SRAMM efficiently generates multiple different concept-based mapping scores to support an informative post-alignment examination and filtering of aligned short reads for various downstream applications. SRAMM is compatible with Python 2.6+ and Python 3.6+ on all operating systems. It works with any short read aligner that generates SAM/BAM/CRAM file outputs and reports 'AS' tags. It is freely available under the MIT license at http://github.com/achon/sramm.
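To make the role of the 'AS' tag concrete, the following is a minimal sketch of reading MAPQ and the alignment score from SAM records and filtering on a per-base score. It is not SRAMM's code or one of its metrics: the function names, the `min_score_per_base` threshold, and the per-base score itself are assumptions chosen only for illustration.

```python
def parse_sam_line(line):
    """Return (read_name, mapq, alignment_score, read_length) for one SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    # Mandatory SAM columns: QNAME is column 1, MAPQ column 5, SEQ column 10.
    name, mapq, seq = fields[0], int(fields[4]), fields[9]
    score = None
    for tag in fields[11:]:              # optional fields have the form TAG:TYPE:VALUE
        if tag.startswith("AS:i:"):
            score = int(tag[5:])
    return name, mapq, score, len(seq)


def filter_by_score(lines, min_score_per_base=0.9):
    """Keep alignments whose AS per read base meets a threshold (hypothetical metric)."""
    kept = []
    for line in lines:
        if line.startswith("@"):         # skip SAM header lines
            continue
        name, mapq, score, length = parse_sam_line(line)
        if score is not None and length > 0 and score / length >= min_score_per_base:
            kept.append((name, mapq, score))
    return kept


# Two made-up records: r1 has a high alignment score, r2 a low one.
sam = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tAS:i:10\tNM:i:0",
    "r2\t0\tchr1\t200\t3\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tNM:i:4\tAS:i:2",
]
print(filter_by_score(sam))  # [('r1', 60, 10)]
```

The scale of 'AS' differs between aligners, so any real threshold would have to be chosen per aligner and scoring scheme.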


2021 ◽  
Author(s):  
Kristoffer Sahlin

Short-read genome alignment is a fundamental computational step used in many bioinformatic analyses. It is therefore desirable to align such data as fast as possible. Most alignment algorithms use a seed-and-extend approach. Several popular programs perform the seeding step based on the Burrows-Wheeler Transform with a low memory footprint, but they are relatively slow compared to more recent approaches that use a minimizer-based seeding-and-chaining strategy. Recently, syncmers and strobemers were proposed for sequence comparison. Both protocols were designed for improved conservation of matches between sequences under mutations. Syncmers are a thinning protocol proposed as an alternative to minimizers, while strobemers are a linking protocol for gapped sequences, proposed as an alternative to k-mers. The main contribution of this work is a new seeding approach that combines syncmers and strobemers. We use a strobemer protocol (randstrobes) to link together syncmers (i.e., in syncmer-space) instead of over the original sequence. Our protocol allows us to create longer seeds while preserving mapping accuracy. A longer seed length reduces the number of candidate regions, which allows faster mapping and alignment. We also contribute the insight that, speed-wise, this protocol is particularly effective when syncmers are canonical. Canonical syncmers can be created for specific parameter combinations and reduce the computational burden of computing the non-canonical randstrobes in reverse complement. We implement our idea in a proof-of-concept short-read aligner, strobealign, that aligns short reads 3-4x faster than minimap2 and 15-23x faster than BWA and Bowtie2. Many implementations of, e.g., BWA achieve high speed on specific hardware. Our contribution is algorithmic and requires no hardware architecture or system-specific instructions. Strobealign is available at https://github.com/ksahlin/StrobeAlign.
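The idea of linking in syncmer-space can be sketched in a few lines. This is a simplified illustration, not strobealign's implementation: the CRC32 stand-in hash, the linking rule (minimizing a hash of the concatenated pair), the parameter names `k`, `s`, `t`, `w_min`, `w_max`, and the example sequence are all assumptions, and canonical (strand-independent) syncmers are not handled.

```python
import zlib


def h(s):
    """Deterministic stand-in hash (an assumption; not the hash used by strobealign)."""
    return zlib.crc32(s.encode())


def open_syncmers(seq, k, s, t=0):
    """Return (position, k-mer) for every k-mer whose smallest s-mer starts at offset t."""
    out = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        smer_hashes = [h(kmer[j:j + s]) for j in range(k - s + 1)]
        # Ties are broken towards the leftmost s-mer by list.index.
        if smer_hashes.index(min(smer_hashes)) == t:
            out.append((i, kmer))
    return out


def link_in_syncmer_space(syncmers, w_min=1, w_max=3):
    """Pair each syncmer with one of the next w_min..w_max syncmers (randstrobe-like)."""
    seeds = []
    for i, (pos1, m1) in enumerate(syncmers):
        window = syncmers[i + w_min:i + w_max + 1]   # window counted in syncmers, not bases
        if not window:
            continue
        # Pick the partner that minimizes a hash of the pair (simplified linking rule).
        pos2, m2 = min(window, key=lambda pm: h(m1 + pm[1]))
        seeds.append((pos1, pos2, m1 + m2))
    return seeds


seq = "ACGTTGCATGTCGCATGATGCATGAGAGCTTACGGATCCA"   # made-up example sequence
k, s = 6, 3
syncs = open_syncmers(seq, k, s)
seeds = link_in_syncmer_space(syncs)
for pos1, pos2, seed in seeds:
    print(pos1, pos2, seed)
```

Because the window is measured in syncmers rather than bases, each seed spans a longer stretch of the original sequence than two adjacent k-mers would, which is the property the abstract credits for reducing the number of candidate regions.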


2021 ◽  
Vol 32 (6) ◽  
pp. 1465-1478
Author(s):  
Yen-Lung Chen ◽  
Bo-Yi Chang ◽  
Chia-Hsiang Yang ◽  
Tzi-Dar Chiueh

SoftwareX ◽  
2021 ◽  
Vol 14 ◽  
pp. 100692
Author(s):  
Sebastian Deorowicz ◽  
Adam Gudyś

2020 ◽  
Vol 5 (1) ◽  
Author(s):  
C. Trier ◽  
G. Fournous ◽  
J. M. Strand ◽  
A. Stray-Pedersen ◽  
R. D. Pettersen ◽  
...  

2019 ◽  
Author(s):  
Sebastian Deorowicz ◽  
Adam Gudyś

Summary: Whisper 2 is a short-read-mapping software providing superior quality of indel variant calling. Its running times place it among the fastest existing tools. Availability and Implementation: https://github.com/refresh-bio/[email protected] Supplementary information: Supplementary data are available at publisher’s Web site.


2019 ◽  
Author(s):  
Hsin-Nan Lin ◽  
Wen-Lian Hsu

With the advance of next-generation sequencing (NGS) technologies, a growing number of medical and biological studies adopt NGS to characterize the genetic variations between individuals. The identification of personal genome variants using NGS technology is a critical factor for the success of clinical genomics studies. It requires an accurate and consistent analysis procedure to distinguish functional or disease-associated variants from false discoveries due to sequencing errors or misalignments. In this study, we integrate the algorithms for read mapping and variant calling to develop an efficient and versatile NGS analysis tool, called MapCaller. It not only maps every short read onto a reference genome, but also detects single nucleotide variants, indels, inversions and translocations at the same time. We compare the performance of MapCaller with existing variant calling pipelines using three simulated datasets and four real datasets. The results show that MapCaller can identify variants accurately. Moreover, MapCaller runs much faster than existing methods. It is available at https://github.com/hsinnan75/MapCaller.


2019 ◽  
Vol 5 (2) ◽  
Author(s):  
Matthew J Ballinger ◽  
Derek J Taylor

How insects combat RNA virus infection is a subject of intensive research owing to its importance in insect health, virus evolution, and disease transmission. In recent years, a pair of potentially linked phenomena have come to light as a result of this work: first, the pervasive production of viral DNA from exogenous nonretroviral RNA in infected individuals, and second, the widespread distribution of nonretroviral integrated RNA virus sequences (NIRVs) in the genomes of diverse eukaryotes. The evolutionary consequences of NIRVs for viruses are unclear and the field would benefit from studies of natural virus infections co-occurring with recent integrations, an exceedingly rare circumstance in the literature. Here, we provide evidence that a novel insect-infecting phasmavirus (Order Bunyavirales) has been persisting in a phantom midge host, Chaoborus americanus, for millions of years. Interestingly, the infection persists despite the host’s acquisition (during the Pliocene), fixation, and expression of the viral nucleoprotein gene. We show that virus prevalence and geographic distribution are high and broad, comparable to the host-specific infections reported in other phantom midges. Short-read mapping analyses identified a lower abundance of the nucleoprotein-encoding genome segment in this virus relative to related viruses. Finally, the novel virus has facilitated the first substitution rate estimation for insect-infecting phasmaviruses. Over a period of approximately 16 million years, we find rates of (0.6–1.6) × 10⁻⁷ substitutions per site per year in protein coding genes, extraordinarily low for negative-sense RNA viruses, but consistent with the few estimates produced over comparable evolutionary timescales.

