Model-Integrated Estimation of Normal Tissue Contamination for Cancer SNP Allelic Copy Number Data

Cancer Informatics ◽

10.4137/cin.s6873 ◽

2011 ◽

Vol 10 ◽

pp. CIN.S6873 ◽

Cited By ~ 3

Author(s):

Susann Stjernqvist ◽

Tobias Rydén ◽

Chris D. Greenman

Keyword(s):

Copy Number ◽

Normal Tissue ◽

Index Structure ◽

Copy Number Data ◽

Cancer Data ◽

Unequally Spaced ◽

Intensity Measurements ◽

Tissue Contamination ◽

Continuous Index ◽

Allelic Copy Number

SNP allelic copy number data provides intensity measurements for the two different alleles separately. We present a method that estimates the number of copies of each allele at each SNP position, using a continuous-index hidden Markov model. The method is especially suited for cancer data, since it incorporates into the model the fraction of normal tissue contamination that is often present in tumor samples. The continuous-index structure takes into account the distances between the SNPs, and is therefore also appropriate when SNPs are unequally spaced. In a simulation study we show that the method performs favorably compared to previous methods even with as much as 70% normal contamination. We also provide results from applications to clinical data produced using the Affymetrix genome-wide SNP 6.0 platform.
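
The two ideas in this abstract, normal tissue contamination and distance-aware state changes, can be illustrated with a minimal Python sketch. It is not the authors' model. It assumes that normal cells carry one copy of each allele at a heterozygous SNP, that the observed signal is a linear mixture of tumor and normal cells, and that the hidden copy number state jumps to every other state at one common rate. The function names `mixed_allele_copies` and `stay_probability` are hypothetical.

```python
import math


def mixed_allele_copies(tumor_copies, normal_fraction, normal_copies=1.0):
    """Expected copies of one allele in a tumor/normal mixture.

    Assumption: a linear mixture, with normal cells contributing
    `normal_copies` copies of the allele (1 at a heterozygous SNP).
    """
    if not 0.0 <= normal_fraction <= 1.0:
        raise ValueError("normal_fraction must lie in [0, 1]")
    return normal_fraction * normal_copies + (1.0 - normal_fraction) * tumor_copies


def stay_probability(distance, rate, n_states):
    """Probability that the hidden state is unchanged after `distance`.

    Assumption: a continuous-index Markov jump process that moves to each
    of the other states at the same rate `rate` per unit distance, which
    gives a closed form for the transition probability.
    """
    k = n_states
    return 1.0 / k + (k - 1) / k * math.exp(-k * rate * distance)


if __name__ == "__main__":
    # Three tumor copies of an allele, diluted by 70% normal tissue.
    print(mixed_allele_copies(3, 0.7))
    # Nearby SNPs almost surely share a state; distant SNPs much less so.
    print(stay_probability(1e3, 1e-6, 5), stay_probability(1e6, 1e-6, 5))
```

Under these assumptions, heavy contamination pulls every allelic signal toward the normal value of one copy, and the probability that two SNPs share a hidden state falls as the distance between them grows, which is why unequal spacing matters.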

Continuous-index hidden Markov modelling of array CGH copy number data

Bioinformatics ◽

10.1093/bioinformatics/btm059 ◽

2007 ◽

Vol 23 (8) ◽

pp. 1006-1014 ◽

Cited By ~ 34

Author(s):

Susann Stjernqvist ◽

Tobias Rydén ◽

Martin Sköld ◽

Johan Staaf

Keyword(s):

Copy Number ◽

Array Cgh ◽

Hidden Markov ◽

Copy Number Data ◽

Markov Modelling ◽

Continuous Index

A continuous-index hidden Markov jump process for modeling DNA copy number data

Biostatistics ◽

10.1093/biostatistics/kxp030 ◽

2009 ◽

Vol 10 (4) ◽

pp. 773-778 ◽

Cited By ~ 2

Author(s):

S. Stjernqvist ◽

T. Ryden

Keyword(s):

Copy Number ◽

Hidden Markov ◽

Jump Process ◽

Copy Number Data ◽

Dna Copy Number ◽

Markov Jump ◽

Markov Jump Process ◽

Continuous Index

PICNIC: an algorithm to predict absolute allelic copy number variation with microarray cancer data

Biostatistics ◽

10.1093/biostatistics/kxp045 ◽

2009 ◽

Vol 11 (1) ◽

pp. 164-175 ◽

Cited By ~ 143

Author(s):

C. D. Greenman ◽

G. Bignell ◽

A. Butler ◽

S. Edkins ◽

J. Hinton ◽

...

Keyword(s):

Copy Number Variation ◽

Copy Number ◽

Cancer Data ◽

Number Variation ◽

Allelic Copy Number

HiCancer: accurate and complete cancer genome phasing with Hi-C reads

Scientific Reports ◽

10.1038/s41598-021-86104-6 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Weihua Pan ◽

Desheng Gong ◽

Da Sun ◽

Haohui Luo

Keyword(s):

Copy Number ◽

Somatic Mutations ◽

Copy Number Variations ◽

Cancer Genome ◽

Structural Variations ◽

Genome Map ◽

Linkage Information ◽

Suboptimal Solution ◽

Allelic Copy Number ◽

Very High

Due to the high complexity of the cancer genome, it has so far been too difficult to generate a complete cancer genome map that contains the sequence of every DNA molecule. Nevertheless, phasing each chromosome in the cancer genome into two haplotypes according to germline mutations provides a suboptimal solution for understanding the cancer genome. However, phasing the cancer genome is also a challenging problem, due to the limits of experimental and computational technologies. Hi-C data has been widely used for phasing in recent years because of its long-range linkage information, and it provides an opportunity to solve the problem of phasing the cancer genome. The existing Hi-C based phasing methods cannot be applied to the cancer genome directly, because somatic mutations in the cancer genome, such as somatic SNPs, copy number variations and structural variations, greatly reduce correctness and completeness. Here, we propose a new Hi-C based pipeline for phasing the cancer genome called HiCancer. HiCancer handles different kinds of somatic mutations and variations, and takes advantage of allelic copy number imbalance and linkage disequilibrium to improve the correctness and completeness of phasing. According to our experiments in the K562 and KBM-7 cell lines, HiCancer is able to generate very high-quality chromosome-level haplotypes for the cancer genome with only Hi-C data.

BACOM: in silico detection of genomic deletion types and correction of normal cell contamination in copy number data

Bioinformatics ◽

10.1093/bioinformatics/btr183 ◽

2011 ◽

Vol 27 (11) ◽

pp. 1473-1480 ◽

Cited By ~ 24

Author(s):

Guoqiang Yu ◽

Bai Zhang ◽

G. Steven Bova ◽

Jianfeng Xu ◽

Ie-Ming Shih ◽

...

Keyword(s):

In Silico ◽

Copy Number ◽

Normal Cell ◽

Copy Number Data ◽

Genomic Deletion ◽

Normal Cell Contamination

A 12 kb multi-allelic copy number variation encompassing a GC gene enhancer is associated with mastitis resistance in dairy cattle

PLoS Genetics ◽

10.1371/journal.pgen.1009331 ◽

2021 ◽

Vol 17 (7) ◽

pp. e1009331

Author(s):

Young-Lim Lee ◽

Haruko Takeda ◽

Gabriel Costa Monteiro Moreira ◽

Latifa Karim ◽

Erik Mullaart ◽

...

Keyword(s):

Vitamin D ◽

Dairy Cattle ◽

Copy Number ◽

Regulatory Element ◽

Copy Number Variant ◽

Causal Variant ◽

Clinical Mastitis ◽

Ample Evidence ◽

Resistance Qtl ◽

Allelic Copy Number

Clinical mastitis (CM) is an inflammatory disease occurring in the mammary glands of lactating cows. CM is under genetic control, and a prominent CM resistance QTL located on chromosome 6 was reported in various dairy cattle breeds. Nevertheless, the biological mechanism underpinning this QTL has remained unknown. Herein, we mapped, fine-mapped, and discovered the putative causal variant underlying this CM resistance QTL in the Dutch dairy cattle population. We identified a ~12 kb multi-allelic copy number variant (CNV) that is in perfect linkage disequilibrium with a lead SNP as a promising candidate variant. Through fine-mapping and expression QTL mapping, we showed that the group-specific component gene (GC), a gene encoding a vitamin D binding protein, is an excellent candidate causal gene for the QTL. The multiplicated alleles are associated with increased GC expression and low CM resistance. Ample evidence from functional genomics data supports the presence of an enhancer within this CNV, which would exert a cis-regulatory effect on GC. We observed that strong positive selection swept the region near the CNV, and haplotypes associated with the multiplicated allele were strongly selected for. Moreover, the multiplicated allele showed pleiotropic effects for increased milk yield and reduced fertility, hinting that a shared underlying biology for these effects may revolve around the vitamin D pathway. These findings together suggest a putative causal variant of a CM resistance QTL, where a cis-regulatory element located within a CNV can alter gene expression and affect multiple economically important traits.

KCNQ gene family members act as both tumor suppressors and oncogenes in gastrointestinal cancers

10.1101/2020.03.10.984039 ◽

2020 ◽

Cited By ~ 1

Author(s):

David Shorthouse ◽

Eric Rahrmann ◽

Cassandra Kosmidou ◽

Benedict Greenwood ◽

Michael W J Hall ◽

...

Keyword(s):

Tumor Suppressors ◽

Copy Number ◽

Copy Number Alteration ◽

Human Cancer ◽

Gastrointestinal Cancers ◽

Alter Patient ◽

Cancer Data ◽

Gi Cancers ◽

Patient Prognosis

We present evidence that KCNQ genes are drivers and suppressors of gastrointestinal (GI) cancer in humans. The KCNQ family of genes encodes subunits of a potassium channel complex involved in membrane polarisation, and little is known about their role in cancer. We use human cancer data and a multidisciplinary computational approach, including structural modelling and simulation, coupled with in vitro experiments, to show that KCNQ1 is a tumor suppressor, and KCNQ3 and KCNQ5 are oncogenic across human GI cancers. We link the expression of KCNQ genes to WNT signalling, EMT, and survival, and propose that mutation or copy number alteration of KCNQ genes can significantly alter patient prognosis in GI cancers.

TEGS-CN: A Statistical Method for Pathway Analysis of Genome-wide Copy Number Profile

Cancer Informatics ◽

10.4137/cin.s13978 ◽

2014 ◽

Vol 13s4 ◽

pp. CIN.S13978

Author(s):

Yen-Tsung Huang ◽

Thomas Hsu ◽

David C. Christiani

Keyword(s):

Copy Number ◽

Copy Number Data ◽

Copy Number Profile ◽

Test Statistic ◽

Gene Set ◽

Bonferroni Adjustment ◽

Gene Sets ◽

Genome Wide ◽

A Genome ◽

Pathway Analyses

The effects of copy number alterations make up a significant part of the tumor genome profile, but pathway analyses of these alterations are still not well established. We proposed a novel method to analyze multiple copy numbers of genes within a pathway, termed Test for the Effect of a Gene Set with Copy Number data (TEGS-CN). TEGS-CN was adapted from TEGS, a method that we previously developed for gene expression data using a variance component score test. With additional development, we extend the method to analyze DNA copy number data, accounting for different gene sizes and thus varying numbers of copy number probes in genes. The test statistic follows a mixture of χ² distributions that can be obtained using permutation with a scaled χ² approximation. We conducted simulation studies to evaluate the size and the power of TEGS-CN and to compare its performance with TEGS. We analyzed genome-wide copy number data from 264 patients with non-small-cell lung cancer. With the Molecular Signatures Database (MSigDB) pathway database, the genome-wide copy number data can be classified into 1814 biological pathways or gene sets. We investigated associations of the copy number profile of the 1814 gene sets with pack-years of cigarette smoking. Our analysis revealed five pathways with significant P values after Bonferroni adjustment (< 2.8 × 10⁻⁵), including the PTEN pathway (7.8 × 10⁻⁷), the gene set up-regulated under heat shock (3.6 × 10⁻⁶), the gene sets involved in the immune profile for rejection of kidney transplantation (9.2 × 10⁻⁶) and for transcriptional control of leukocytes (2.2 × 10⁻⁵), and the ganglioside biosynthesis pathway (2.7 × 10⁻⁵). In conclusion, we present a new method for pathway analyses of copy number data, and causal mechanisms of the five pathways require further study.
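
The Bonferroni cutoff quoted in this abstract can be checked with a short Python sketch. The family-wise significance level of 0.05 is an assumption, since the abstract does not state it. The number of gene sets and the five P values are taken from the text, and the variable names are illustrative.

```python
# Assumed family-wise significance level (not stated in the abstract).
ALPHA = 0.05
# Number of MSigDB pathways or gene sets tested, from the abstract.
N_GENE_SETS = 1814

# Bonferroni adjustment: divide the significance level by the number of tests.
threshold = ALPHA / N_GENE_SETS

# P values reported in the abstract for the five significant pathways.
reported_p_values = {
    "PTEN pathway": 7.8e-7,
    "up-regulated under heat shock": 3.6e-6,
    "kidney transplant rejection immune profile": 9.2e-6,
    "transcriptional control of leukocytes": 2.2e-5,
    "ganglioside biosynthesis": 2.7e-5,
}

# Each reported P value should fall below the per-test threshold.
significant = {name: p < threshold for name, p in reported_p_values.items()}

if __name__ == "__main__":
    print(f"Bonferroni threshold: {threshold:.3e}")
    for name, passed in significant.items():
        print(f"{name}: {'significant' if passed else 'not significant'}")
```

With the assumed level of 0.05, the threshold is about 2.76 × 10⁻⁵, which agrees with the reported cutoff of < 2.8 × 10⁻⁵, and all five reported P values fall below it.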

Abstract LB-270: Exploration of TCGA generated DNA copy number data in ovarian cancer points to interesting proapaptotic gene for survival prediction

10.1158/1538-7445.am2011-lb-270 ◽

2011 ◽

Author(s):

Raja Keshavan ◽

Soheil Shams

Keyword(s):

Ovarian Cancer ◽

Copy Number ◽

Survival Prediction ◽

Copy Number Data ◽

Dna Copy Number

Individualized multi-omic pathway deviation scores using multiple factor analysis

Biostatistics ◽

10.1093/biostatistics/kxaa029 ◽

2020 ◽

Author(s):

Andrea Rau ◽

Regina Manansala ◽

Michael J Flister ◽

Hallgeir Rui ◽

Florence Jaffrézic ◽

...

Keyword(s):

Factor Analysis ◽

Patient Outcomes ◽

Copy Number ◽

Normal Tissue ◽

Genetic Mutations ◽

Epigenetic Changes ◽

Analysis Framework ◽

Multiple Factor Analysis ◽

Transcriptional Reprogramming ◽

Multiple Factor

Malignant progression of normal tissue is typically driven by complex networks of somatic changes, including genetic mutations, copy number aberrations, epigenetic changes, and transcriptional reprogramming. To delineate aberrant multi-omic tumor features that correlate with clinical outcomes, we present a novel pathway-centric tool based on the multiple factor analysis framework called padma. Using a multi-omic consensus representation, padma quantifies and characterizes individualized pathway-specific multi-omic deviations and their underlying drivers, with respect to the sampled population. We demonstrate the utility of padma to correlate patient outcomes with complex genetic, epigenetic, and transcriptomic perturbations in clinically actionable pathways in breast and lung cancer.
