QuantiSNP: an Objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data

2007 ◽  
Vol 35 (6) ◽  
pp. 2013-2025 ◽  
Author(s):  
Stefano Colella ◽  
Christopher Yau ◽  
Jennifer M. Taylor ◽  
Ghazala Mirza ◽  
Helen Butler ◽  
...  
Author(s):  
Hai Yang ◽  
Daming Zhu

Copy number variation (CNV) is a prevalent kind of genetic structural variation that leads to an abnormal number of copies of large genomic regions, such as the gain or loss of DNA segments larger than 1 kb. CNV exists not only in the human genome but also in plant genomes. Current research has shown that CNV is associated with many complex diseases. In this paper, guanine-cytosine (GC) bias, mappability and their effect on read depth signals in sequencing data are discussed first. Subsequently, a new correction method for GC bias and an improved combinatorial detection algorithm for CNV using high-throughput sequencing reads based on a hidden Markov model (CNV-HMM) are proposed. The corrected read depth signals have lower correlation with GC content, the mappability of reads and the width of the analysis window. We then create a hidden Markov model that maps the reads onto the reference genome and records the unmapped reads. The unmapped reads are counted and normalized. The CNV-HMM detects abnormal read count signals and obtains candidate CNVs using the expectation-maximization (EM) algorithm. Finally, we filter the candidate CNVs using split reads to improve the performance of our algorithm. The experimental results indicate that the CNV-HMM algorithm has higher accuracy and sensitivity for CNV detection than most current detection algorithms.
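
To make the idea of GC-bias correction concrete, the following is a minimal sketch of a widely used median-scaling scheme, not the new correction method the abstract proposes (the abstract does not give its details). Each window's read depth is rescaled by the ratio of the overall median depth to the median depth of windows sharing the same GC percentage. The function name `gc_correct` and the toy inputs are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def gc_correct(read_depth, gc_percent):
    """Rescale per-window read depth by overall median / same-GC median.

    Illustrative median-scaling correction (hypothetical helper),
    not the method proposed in the paper.
    """
    if len(read_depth) != len(gc_percent):
        raise ValueError("read_depth and gc_percent must have equal length")

    overall = median(read_depth)

    # Group window depths by their (integer) GC percentage.
    by_gc = defaultdict(list)
    for depth, gc in zip(read_depth, gc_percent):
        by_gc[gc].append(depth)
    gc_median = {gc: median(depths) for gc, depths in by_gc.items()}

    corrected = []
    for depth, gc in zip(read_depth, gc_percent):
        m = gc_median[gc]
        # Windows in a GC stratum with zero median depth carry no signal.
        corrected.append(depth * overall / m if m > 0 else 0.0)
    return corrected


# Toy data: windows at 60% GC are sequenced at about half the depth.
depths = [100, 110, 90, 50, 55, 45]
gc = [40, 40, 40, 60, 60, 60]
corrected = gc_correct(depths, gc)
```

After correction, windows that differ only by their GC stratum end up on the same scale, so the remaining variation in depth is more plausibly attributable to copy number.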


2018 ◽  
Author(s):  
Hyoyoung Choo-Wosoba ◽  
Paul S Albert ◽  
Bin Zhu

Abstract

Background: Somatic copy number alteration (SCNA) is a common feature of the cancer genome and is associated with cancer etiology and prognosis. The allele-specific SCNA analysis of a tumor sample aims to identify the allele-specific copy numbers of both alleles, adjusting for the ploidy and the tumor purity. Next-generation sequencing platforms produce abundant read counts at base-pair resolution across the exome or whole genome, which are susceptible to hypersegmentation, a phenomenon in which numerous very short regions are falsely identified as SCNA.

Results: We propose hsegHMM, a hidden Markov model approach that accounts for hypersegmentation in allele-specific SCNA analysis. hsegHMM provides statistical inference of copy number profiles by using an efficient E-M algorithm procedure. Through simulation and application studies, we found that hsegHMM handles hypersegmentation effectively with a t-distribution as part of the emission probability distribution structure and a carefully defined state space. We also compared hsegHMM with FACETS, a current method for allele-specific SCNA analysis. For the application, we use a renal cell carcinoma sample from The Cancer Genome Atlas (TCGA) study.

Conclusions: We demonstrate the robustness of hsegHMM to hypersegmentation. Furthermore, hsegHMM provides the quantification of uncertainty in identifying allele-specific SCNAs over the entire chromosomes. hsegHMM performs better than FACETS when read depth (coverage) is uneven across the genome.
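
The role of a heavy-tailed emission distribution can be illustrated with a small sketch. The code below is not hsegHMM: it is a simplified three-state HMM (loss, neutral, gain) over a single log-ratio-like signal, decoded with the Viterbi algorithm, with Student-t emissions. All state means, the scale, the degrees of freedom, the `stay_prob` transition value and the toy observations are assumptions chosen for illustration.

```python
import math


def student_t_logpdf(x, mean, scale, df):
    """Log density of a location-scale Student-t distribution."""
    z = (x - mean) / scale
    return (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
            - 0.5 * math.log(df * math.pi) - math.log(scale)
            - (df + 1) / 2 * math.log1p(z * z / df))


def viterbi(obs, means, scale, df, stay_prob):
    """Most likely state path for an HMM with Student-t emissions."""
    n = len(means)
    switch = (1 - stay_prob) / (n - 1)
    log_trans = [[math.log(stay_prob if i == j else switch)
                  for j in range(n)] for i in range(n)]

    # Uniform initial state distribution.
    score = [-math.log(n) + student_t_logpdf(obs[0], m, scale, df)
             for m in means]
    back = []
    for x in obs[1:]:
        prev, pointers, score = score, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + log_trans[i][j])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][j]
                         + student_t_logpdf(x, means[j], scale, df))
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical states: 0 = loss, 1 = neutral, 2 = gain.
means = [-1.0, 0.0, 0.58]
# Toy signal: one outlier (1.5) inside a neutral run, then a real loss.
obs = [0.0, 0.05, -0.02, 1.5, 0.01,
       -0.9, -1.1, -1.0, -0.95, -1.05,
       0.0, 0.02, -0.03]
path = viterbi(obs, means, scale=0.2, df=3, stay_prob=0.99)
```

In this toy run the single outlier is absorbed by the neutral state rather than producing a one-point segment, while the sustained drop is still called as a loss. That is the kind of behavior the abstract attributes to the t-distribution, because heavy tails penalize an isolated extreme value less than a Gaussian would.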


Genetica ◽  
2015 ◽  
Vol 143 (2) ◽  
pp. 145-155 ◽  
Author(s):  
A. Gurgul ◽  
I. Jasielczuk ◽  
T. Szmatoła ◽  
K. Pawlina ◽  
T. Ząbek ◽  
...  

PLoS ONE ◽  
2014 ◽  
Vol 9 (5) ◽  
pp. e96841 ◽  
Author(s):  
Yen-Jen Lin ◽  
Yu-Tin Chen ◽  
Shu-Ni Hsu ◽  
Chien-Hua Peng ◽  
Chuan-Yi Tang ◽  
...  

2008 ◽  
Vol 6 (4) ◽  
pp. 231-234 ◽  
Author(s):  
Ji-Hong Kim ◽  
Seon-Hee Yim ◽  
Yong-Bok Jeong ◽  
Seong-Hyun Jung ◽  
Hai-Dong Xu ◽  
...  
