Model-Integrated Estimation of Normal Tissue Contamination for Cancer SNP Allelic Copy Number Data

2011 ◽  
Vol 10 ◽  
pp. CIN.S6873 ◽  
Author(s):  
Susann Stjernqvist ◽  
Tobias Rydén ◽  
Chris D. Greenman

SNP allelic copy number data provide intensity measurements for the two alleles separately. We present a method that estimates the number of copies of each allele at each SNP position, using a continuous-index hidden Markov model. The method is especially suited for cancer data, since it incorporates into the model the fraction of normal tissue contamination that is often present in data from tumor samples. The continuous-index structure takes the distances between SNPs into account, and is therefore appropriate even when SNPs are unequally spaced. In a simulation study we show that the method performs favorably compared to previous methods even with as much as 70% normal contamination. We also provide results from applications to clinical data produced using the Affymetrix genome-wide SNP 6.0 platform.
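The two modelling ideas in this abstract, normal tissue contamination and a distance-aware (continuous-index) Markov chain, can be illustrated with a minimal sketch. This is not the authors' model: the linear mixing of tumor and normal copy numbers, the heterozygous normal genotype (one copy of each allele), the reduction to a two-state chain, and all function and parameter names below are assumptions made only for illustration.

```python
import math


def expected_allele_copies(tumor_a, tumor_b, normal_fraction,
                           normal_a=1, normal_b=1):
    """Expected copies of alleles A and B in a contaminated sample.

    Assumption: the measured signal is a linear mixture of normal cells
    (fraction `normal_fraction`) and tumor cells (the remainder).
    """
    if not 0.0 <= normal_fraction <= 1.0:
        raise ValueError("normal_fraction must lie in [0, 1]")
    p = normal_fraction
    mixed_a = p * normal_a + (1.0 - p) * tumor_a
    mixed_b = p * normal_b + (1.0 - p) * tumor_b
    return mixed_a, mixed_b


def stay_probability(distance, leave_rate, return_rate):
    """Probability that a two-state continuous-time Markov chain that starts
    in state 0 is again in state 0 after `distance` (closed-form solution).

    Assumption: hypothetical rates `leave_rate` (0 -> 1) and `return_rate`
    (1 -> 0), expressed per unit of genomic distance.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    total = leave_rate + return_rate
    return return_rate / total + (leave_rate / total) * math.exp(-total * distance)


if __name__ == "__main__":
    # Tumor with two copies of A and none of B (copy-neutral loss of
    # heterozygosity), observed with 70% normal contamination.
    print(expected_allele_copies(2, 0, 0.7))
    # Nearby SNPs are more likely to share a state than distant ones.
    for d in (1e3, 1e5, 1e7):
        print(d, stay_probability(d, leave_rate=1e-6, return_rate=1e-6))
```

Under these assumptions, heavy contamination pulls both allele signals toward the normal value of one copy, which is why an unmodelled normal fraction can mask allelic imbalance. The stay probability decreases with the distance between SNPs, which is what lets a continuous-index chain handle unequal spacing.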

2007 ◽  
Vol 23 (8) ◽  
pp. 1006-1014 ◽  
Author(s):  
Susann Stjernqvist ◽  
Tobias Rydén ◽  
Martin Sköld ◽  
Johan Staaf

Biostatistics ◽  
2009 ◽  
Vol 11 (1) ◽  
pp. 164-175 ◽  
Author(s):  
C. D. Greenman ◽  
G. Bignell ◽  
A. Butler ◽  
S. Edkins ◽  
J. Hinton ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Weihua Pan ◽  
Desheng Gong ◽  
Da Sun ◽  
Haohui Luo

Abstract: Owing to the high complexity of the cancer genome, it has so far been too difficult to generate a complete cancer genome map containing the sequence of every DNA molecule. Phasing each chromosome of the cancer genome into two haplotypes according to germline mutations nevertheless provides a suboptimal solution for understanding the cancer genome. However, phasing a cancer genome is also challenging because of the limits of experimental and computational technologies. Hi-C data have been widely used for phasing in recent years because of their long-range linkage information, and they provide an opportunity to solve the problem of phasing cancer genomes. Existing Hi-C based phasing methods cannot be applied to cancer genomes directly, because somatic mutations such as somatic SNPs, copy number variations and structural variations greatly reduce correctness and completeness. Here, we propose a new Hi-C based pipeline for phasing cancer genomes, called HiCancer. HiCancer handles different kinds of somatic mutations and variations, and takes advantage of allelic copy number imbalance and linkage disequilibrium to improve the correctness and completeness of phasing. According to our experiments on the K562 and KBM-7 cell lines, HiCancer is able to generate very high-quality chromosome-level haplotypes for cancer genomes using only Hi-C data.


2011 ◽  
Vol 27 (11) ◽  
pp. 1473-1480 ◽  
Author(s):  
Guoqiang Yu ◽  
Bai Zhang ◽  
G. Steven Bova ◽  
Jianfeng Xu ◽  
Ie-Ming Shih ◽  
...  

PLoS Genetics ◽  
2021 ◽  
Vol 17 (7) ◽  
pp. e1009331
Author(s):  
Young-Lim Lee ◽  
Haruko Takeda ◽  
Gabriel Costa Monteiro Moreira ◽  
Latifa Karim ◽  
Erik Mullaart ◽  
...  

Clinical mastitis (CM) is an inflammatory disease occurring in the mammary glands of lactating cows. CM is under genetic control, and a prominent CM resistance QTL located on chromosome 6 has been reported in various dairy cattle breeds. Nevertheless, the biological mechanism underpinning this QTL has remained unknown. Herein, we mapped, fine-mapped, and discovered the putative causal variant underlying this CM resistance QTL in the Dutch dairy cattle population. We identified a ~12 kb multi-allelic copy number variant (CNV) that is in perfect linkage disequilibrium with a lead SNP as a promising candidate variant. Through fine-mapping and expression QTL mapping, we showed that the group-specific component gene (GC), a gene encoding a vitamin D binding protein, is an excellent candidate causal gene for the QTL. The multiplicated alleles are associated with increased GC expression and low CM resistance. Ample evidence from functional genomics data supports the presence of an enhancer within this CNV, which would exert a cis-regulatory effect on GC. We observed that strong positive selection swept the region near the CNV, and haplotypes associated with the multiplicated allele were strongly selected for. Moreover, the multiplicated allele showed pleiotropic effects for increased milk yield and reduced fertility, hinting that a shared underlying biology for these effects may revolve around the vitamin D pathway. These findings together suggest a putative causal variant of a CM resistance QTL, where a cis-regulatory element located within a CNV can alter gene expression and affect multiple economically important traits.


Author(s):  
David Shorthouse ◽  
Eric Rahrmann ◽  
Cassandra Kosmidou ◽  
Benedict Greenwood ◽  
Michael W J Hall ◽  
...  

We present evidence that KCNQ genes are drivers and suppressors of gastrointestinal (GI) cancer in humans. The KCNQ family of genes encodes subunits of a potassium channel complex involved in membrane polarisation, and little is known about their role in cancer. We use human cancer data and a multidisciplinary computational approach, including structural modelling and simulation, coupled with in vitro experiments to show that KCNQ1 is a tumor suppressor, and that KCNQ3 and KCNQ5 are oncogenic across human GI cancers. We link the expression of KCNQ genes to WNT signalling, EMT, and survival, and propose that mutation or copy number alteration of KCNQ genes can significantly alter patient prognosis in GI cancers.


2014 ◽  
Vol 13s4 ◽  
pp. CIN.S13978
Author(s):  
Yen-Tsung Huang ◽  
Thomas Hsu ◽  
David C. Christiani

The effects of copy number alterations make up a significant part of the tumor genome profile, but pathway analyses of these alterations are still not well established. We propose a novel method to analyze multiple copy numbers of genes within a pathway, termed Test for the Effect of a Gene Set with Copy Number data (TEGS-CN). TEGS-CN was adapted from TEGS, a method that we previously developed for gene expression data using a variance component score test. With additional development, we extend the method to analyze DNA copy number data, accounting for different gene sizes and thus varying numbers of copy number probes per gene. The test statistic follows a mixture of χ² distributions that can be obtained using permutation with a scaled χ² approximation. We conducted simulation studies to evaluate the size and the power of TEGS-CN and to compare its performance with TEGS. We analyzed genome-wide copy number data from 264 patients with non-small-cell lung cancer. With the Molecular Signatures Database (MSigDB) pathway database, the genome-wide copy number data can be classified into 1814 biological pathways or gene sets. We investigated associations of the copy number profile of the 1814 gene sets with pack-years of cigarette smoking. Our analysis revealed five pathways with significant P values after Bonferroni adjustment (<2.8 × 10⁻⁵), including the PTEN pathway (7.8 × 10⁻⁷), the gene set up-regulated under heat shock (3.6 × 10⁻⁶), the gene sets involved in the immune profile for rejection of kidney transplantation (9.2 × 10⁻⁶) and for transcriptional control of leukocytes (2.2 × 10⁻⁵), and the ganglioside biosynthesis pathway (2.7 × 10⁻⁵). In conclusion, we present a new method for pathway analyses of copy number data; the causal mechanisms of the five pathways require further study.
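The Bonferroni threshold quoted in this abstract can be checked with a short sketch. The family-wise significance level of 0.05 is an assumption (the abstract does not state it), chosen because 0.05 divided by 1814 gene sets matches the reported cutoff of about 2.8 × 10⁻⁵; the P values and pathway descriptions are copied from the abstract, and the variable names are illustrative.

```python
# Assumption: a conventional family-wise error rate of 0.05, which the
# abstract does not state explicitly.
FAMILY_WISE_ALPHA = 0.05
N_GENE_SETS = 1814  # number of MSigDB pathways / gene sets tested

# Bonferroni adjustment: divide the family-wise level by the number of tests.
threshold = FAMILY_WISE_ALPHA / N_GENE_SETS

# P values as reported in the abstract.
reported_p_values = {
    "PTEN pathway": 7.8e-7,
    "up-regulated under heat shock": 3.6e-6,
    "rejection of kidney transplantation": 9.2e-6,
    "transcriptional control of leukocytes": 2.2e-5,
    "ganglioside biosynthesis": 2.7e-5,
}

# Keep only the gene sets whose P value falls below the adjusted threshold.
significant = {name: p for name, p in reported_p_values.items() if p < threshold}

if __name__ == "__main__":
    print(f"Bonferroni threshold: {threshold:.2e}")
    for name, p in significant.items():
        print(f"{name}: {p:.1e}")
```

Under the assumed 0.05 level the threshold is roughly 2.76 × 10⁻⁵, so all five reported P values pass, with the ganglioside biosynthesis pathway only just below the cutoff.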


Biostatistics ◽  
2020 ◽  
Author(s):  
Andrea Rau ◽  
Regina Manansala ◽  
Michael J Flister ◽  
Hallgeir Rui ◽  
Florence Jaffrézic ◽  
...  

Summary: Malignant progression of normal tissue is typically driven by complex networks of somatic changes, including genetic mutations, copy number aberrations, epigenetic changes, and transcriptional reprogramming. To delineate aberrant multi-omic tumor features that correlate with clinical outcomes, we present a novel pathway-centric tool based on the multiple factor analysis framework called padma. Using a multi-omic consensus representation, padma quantifies and characterizes individualized pathway-specific multi-omic deviations and their underlying drivers, with respect to the sampled population. We demonstrate the utility of padma to correlate patient outcomes with complex genetic, epigenetic, and transcriptomic perturbations in clinically actionable pathways in breast and lung cancer.

