Enhancing breakpoint resolution with deep segmentation model: a general refinement method for read-depth based structural variant callers

2019 ◽  
Author(s):  
Yao-zhong Zhang ◽  
Seiya Imoto ◽  
Satoru Miyano ◽  
Rui Yamaguchi

Motivation: For short-read sequencing, it is difficult for read-depth based structural variant (SV) callers to find breakpoints at single-nucleotide resolution because of the bin-size limitation. Results: In this paper, we present RDBKE, which enhances the breakpoint resolution of read-depth SV callers using the deep segmentation model UNet. We show that UNet can be trained with a small amount of data and applied for breakpoint enhancement both in-sample and cross-sample. On both simulated and real data, RDBKE significantly increases the number of SVs with more precise breakpoints. Availability: Source code of RDBKE is available at https://github.com/yaozhong/[email protected]

2021 ◽  
Vol 17 (10) ◽  
pp. e1009186
Author(s):  
Yao-zhong Zhang ◽  
Seiya Imoto ◽  
Satoru Miyano ◽  
Rui Yamaguchi

Read-depths (RDs) are frequently used to identify structural variants (SVs) from sequencing data. Existing RD-based SV callers have difficulty determining breakpoints at single-nucleotide resolution because RD data are noisy and the calculation is bin-based. In this paper, we propose using the deep segmentation model UNet to learn base-wise RD patterns surrounding the breakpoints of known SVs. We integrate the model predictions with an RD-based SV caller to refine breakpoints to single-nucleotide resolution. We show that UNet can be trained with a small amount of data and can be applied both in-sample and cross-sample. An enhancement pipeline named RDBKE significantly increases the number of SVs with more precise breakpoints on simulated and real data. The source code of RDBKE is freely available at https://github.com/yaozhong/deepIntraSV.
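As a toy illustration of the bin-size limitation described above, the following sketch compares a breakpoint located from binned read depth with one located from base-wise depth. It is not the RDBKE pipeline and does not involve UNet; the function names, the noise-free depth profile, the bin size of 100, and the half-coverage threshold are all assumptions chosen for the example.

```python
def binned_means(depth, bin_size):
    """Average per-base read depth over consecutive fixed-size bins."""
    return [
        sum(depth[i:i + bin_size]) / len(depth[i:i + bin_size])
        for i in range(0, len(depth), bin_size)
    ]


def first_drop(values, threshold):
    """Index of the first value below threshold, or None if there is none."""
    for i, v in enumerate(values):
        if v < threshold:
            return i
    return None


# Hypothetical, noise-free profile: 30x coverage with a deletion
# covering bases 1234..1999 (assumed values, for illustration only).
true_start = 1234
depth = [30.0] * 3000
for pos in range(true_start, 2000):
    depth[pos] = 0.0

bin_size = 100
bins = binned_means(depth, bin_size)

# Bin-level call: the breakpoint can only be reported as a bin boundary.
bin_idx = first_drop(bins, threshold=15.0)
bin_level_start = bin_idx * bin_size

# Base-level call: the resolution a base-wise segmentation aims for.
base_level_start = first_drop(depth, threshold=15.0)

print(bin_level_start, base_level_start)  # prints "1200 1234"
```

In this idealized case the bin-level estimate is off by 34 bases, and in general it can be off by up to one bin width. Real RD data are noisy, so a simple per-base threshold would not work there, which is the gap a learned base-wise model is meant to address.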


2019 ◽  
Author(s):  
Iñigo Prada-Luengo ◽  
Anders Krogh ◽  
Lasse Maretty ◽  
Birgitte Regenberg

Circular DNA has recently been identified across different species, including in normal and cancerous human tissue, but short-read mappers are unable to align many of the reads crossing circle junctions, which limits the detection of circular DNA from short-read sequencing data. Here, we propose a new method, Circle-Map, that guides the realignment of partially aligned reads using information from discordantly mapped reads. We demonstrate how this approach dramatically increases sensitivity for detection of circular DNA on both simulated and real data while retaining high precision.


2019 ◽  
Vol 35 (16) ◽  
pp. 2859-2861
Author(s):  
Linfang Jin ◽  
Jinhuo Lai ◽  
Yang Zhang ◽  
Ying Fu ◽  
Shuhang Wang ◽  
...  

Summary: Here we developed a tool called Breakpoint Identification (BreakID) to identify fusion events from targeted sequencing data. Taking discordant read pairs and split reads as supporting evidence, BreakID can identify gene fusion breakpoints at single-nucleotide resolution. After validation with confirmed fusion events in cancer cell lines, we show that BreakID achieves a high sensitivity of 90.63% with a PPV of 100% at a sequencing depth of 500× and performs better than other available fusion detection tools. We anticipate that BreakID will be widely used in the detection and analysis of fusions in clinical and research sequencing scenarios. Availability and implementation: Source code is freely available at https://github.com/SinOncology/BreakID. Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 12 ◽  
Author(s):  
Valentina Grosso ◽  
Luca Marcolungo ◽  
Simone Maestri ◽  
Massimiliano Alfano ◽  
Denise Lavezzari ◽  
...  

Traditional methods for the analysis of repeat expansions, which underlie genetic disorders such as fragile X syndrome (FXS), lack single-nucleotide resolution in repeat analysis and the ability to characterize causative variants outside the repeat array. These drawbacks can be overcome by long-read and short-read sequencing, respectively. However, the routine application of next-generation sequencing in the clinic requires target enrichment, and none of the available methods allows parallel analysis of long DNA fragments using both sequencing technologies. In this study, we investigated the use of indirect sequence capture (Xdrop technology) coupled to Nanopore and Illumina sequencing to characterize FMR1, the gene responsible for FXS. We achieved efficient enrichment (> 200×) of large target DNA fragments (~60–80 kbp) encompassing the entire FMR1 gene. The analysis of Xdrop-enriched samples by Nanopore long-read sequencing allowed the complete characterization of repeat lengths in samples with normal, pre-mutation, and full mutation status (> 1 kbp), and correctly identified repeat interruptions relevant for disease prognosis and transmission. Single-nucleotide variants (SNVs) and small insertions/deletions (indels) could be detected in the same samples by Illumina short-read sequencing, completing the mutational testing through the identification of pathogenic variants within the FMR1 gene when no typical CGG repeat expansion is detected. The study successfully demonstrated the parallel analysis of repeat expansions and SNVs/indels in the FMR1 gene at single-nucleotide resolution by combining Xdrop enrichment with two next-generation sequencing approaches. With the appropriate optimization necessary for clinical settings, the system could both facilitate the study of genotype–phenotype correlation in FXS and enable more efficient diagnosis and genetic counseling for patients and their relatives.


FEBS Letters ◽  
1988 ◽  
Vol 234 (2) ◽  
pp. 295-299 ◽  
Author(s):  
M. Vojtíšková ◽  
S. Mirkin ◽  
V. Lyamichev ◽  
O. Voloshin ◽  
M. Frank-Kamenetskii ◽  
...  
