Very Low Gene Duplication Rate in the Yeast Genome

Science ◽ 2004 ◽ Vol 306 (5700) ◽ pp. 1367-1370
Author(s): L.-z. Gao

2021
Author(s): Gunjan Baid ◽ Daniel E Cook ◽ Kishwar Shafin ◽ Taedong Yun ◽ Felipe Llinares-Lopez ◽ ...

Pacific Biosciences (PacBio) circular consensus sequencing (CCS) generates long (10-25 kb), accurate "HiFi" reads by combining serial observations of a DNA molecule into a consensus sequence. The standard approach to consensus generation uses a hidden Markov model (pbccs). Here, we introduce DeepConsensus, which uses a unique alignment-based loss to train a gap-aware transformer-encoder (GATE) for sequence correction. Compared to pbccs, DeepConsensus reduces read errors in the same dataset by 42%. This increases the yield of PacBio HiFi reads at Q20 by 9%, at Q30 by 27%, and at Q40 by 90%. With two SMRT Cells of HG003, reads from DeepConsensus improve hifiasm assembly contiguity (NG50 4.9 Mb to 17.2 Mb), increase gene completeness (94% to 97%), reduce the false gene duplication rate (1.1% to 0.5%), improve assembly base accuracy (Q43 to Q45), and reduce variant calling errors by 24%.
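The Q20, Q30, Q40, Q43, and Q45 figures above are presumably Phred-scaled quality values, where Q = -10 log10(p) and p is the per-base error probability. The abstract does not define the scale, so this reading is an assumption. A minimal Python sketch of the conversion follows; the function names are illustrative and do not come from DeepConsensus or pbccs.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality value Q to a per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a per-base error probability to a Phred quality value Q."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Quality thresholds mentioned in the abstract (assumed to be Phred-scaled).
    for q in (20, 30, 40, 43, 45):
        p = phred_to_error_prob(q)
        print(f"Q{q}: error probability {p:.2e}, "
              f"about 1 error per {round(1 / p):,} bases")
```

On this scale each 10-point step is a tenfold drop in error probability, so the reported assembly improvement from Q43 to Q45 would correspond to a reduction in per-base error rate of roughly 37%.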


2013 ◽ Vol 46 (06)
Author(s): LK Kollmannsberger ◽ NC Gassen ◽ A Bultmann ◽ J Hartmann ◽ P Weber ◽ ...
