BCIP: a gene-centered platform for identifying potential regulatory genes in breast cancer

2017 ◽  
Vol 7 (1) ◽  
Author(s):  
Jiaqi Wu ◽  
Shuofeng Hu ◽  
Yaowen Chen ◽  
Zongcheng Li ◽  
Jian Zhang ◽  
...  

Abstract Breast cancer is a highly heterogeneous disease, and many questions about its tumorigenesis and progression remain unanswered. It is critical to identify genes that play important roles in the progression of tumors, especially for tumors with poor prognosis such as basal-like breast cancer and tumors in very young women. To facilitate the identification of potential regulatory or driver genes, we present the Breast Cancer Integrative Platform (BCIP, http://www.omicsnet.org/bcancer/). BCIP maintains multi-omics data selected with strict quality control and processed with uniform normalization methods, including gene expression profiles from 9,005 tumor and 376 normal tissue samples, copy number variation information from 3,035 tumor samples, microRNA-target interactions, co-expressed genes, KEGG pathways, and mammary tissue-specific gene functional networks. The platform provides a user-friendly interface that integrates comprehensive and flexible tools for differential gene expression, copy number variation, and survival analysis. The distinguishing feature of BCIP is that users can perform analyses on subgroups customized by single or combined clinical features, including subtypes, histological grades, pathologic stages, metastasis status, lymph node status, ER/PR/HER2 status, TP53 mutation status, menopause status, age, tumor size, therapy responses, and prognosis. BCIP will help to identify regulatory or driver genes and candidate biomarkers for further research in breast cancer.

2020 ◽  
Author(s):  
Christopher W. Whelan ◽  
Robert E. Handsaker ◽  
Giulio Genovese ◽  
Seva Kashin ◽  
Monkol Lek ◽  
...  

Abstract Two intriguing forms of genome structural variation (SV) – dispersed duplications and de novo rearrangements of complex, multi-allelic loci – have long escaped genomic analysis. We describe a new way to find and characterize such variation by using identity-by-descent (IBD) relationships between siblings together with high-precision measurements of segmental copy number. Analyzing whole-genome sequence data from 706 families, we find hundreds of “IBD-discordant” (IBDD) CNVs: loci at which siblings’ CNV measurements and IBD states are mathematically inconsistent. We found that commonly-IBDD CNVs identify dispersed duplications; we mapped 95 of these common dispersed duplications to their true genomic locations through family-based linkage and population linkage disequilibrium (LD), and found several to be in strong LD with genome-wide association (GWAS) signals for common diseases or gene expression variation at their revealed genomic locations. Other CNVs that were IBDD in a single family appear to involve de novo mutations in complex and multi-allelic loci; we identified 26 de novo structural mutations that had not been detected in earlier analyses of the same families by diverse SV analysis methods. These included a de novo mutation of the amylase gene locus and multiple de novo mutations at chromosome 15q14. Combining these complex mutations with more-conventional CNVs, we estimate that segmental mutations larger than 1 kb arise in about one in 22 human meioses. These methods are complementary to previous techniques in that they interrogate genomic regions that are home to segmental duplication, high CNV allele frequencies, and multi-allelic CNVs.

Author Summary Copy number variation is an important form of genetic variation in which individuals differ in the number of copies of segments of their genomes. Certain aspects of copy number variation have traditionally been difficult to study using short-read sequencing data.
For example, standard analyses often cannot tell whether the duplicated copies of a segment are located near the original copy or are dispersed to other regions of the genome. Another aspect of copy number variation that has been difficult to study is the detection of mutations in the copy number of DNA segments passed down from parents to their children, particularly when the mutations affect genome segments which already display common copy number variation in the population. We develop an analytical approach to solving these problems when sequencing data is available for all members of families with at least two children. This method is based on determining the number of parental haplotypes the two siblings share at each location in their genome, and using that information to determine the possible inheritance patterns that might explain the copy numbers we observe in each family member. We show that dispersed duplications and mutations can be identified by looking for copy number variants that do not follow these expected inheritance patterns. We use this approach to determine the location of 95 common duplications which are dispersed to distant regions of the genome, and demonstrate that these duplications are linked to genetic variants that affect disease risk or gene expression levels. We also identify a set of copy number mutations not detected by previous analyses of sequencing data from a large cohort of families, and show that repetitive and complex regions of the genome undergo frequent mutations in copy number.
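The inheritance check described in the summary above can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified model that assumes exact integer copy numbers, no de novo mutation, and that every copy sits on the parental haplotype at the locus being tested. The function names `haplotype_splits` and `is_ibd_consistent` are hypothetical. Under those assumptions, a family is "IBD-consistent" if some split of each parent's copy number across two haplotypes, combined with a transmission pattern that matches the siblings' IBD state, reproduces both siblings' copy numbers.

```python
from itertools import product


def haplotype_splits(total):
    """All ways to split a diploid copy number across two haplotypes."""
    return [(a, total - a) for a in range(total + 1)]


def is_ibd_consistent(father_cn, mother_cn, sib1_cn, sib2_cn, ibd_state):
    """Return True if some inheritance pattern explains the observed copy numbers.

    ibd_state is the number of parental haplotypes (0, 1 or 2) that the two
    siblings share at this locus. Simplifying assumptions (not from the source):
    integer copy numbers, no de novo mutation, all copies linked to the locus.
    """
    for f_haps, m_haps in product(haplotype_splits(father_cn),
                                  haplotype_splits(mother_cn)):
        # i, j: paternal and maternal haplotypes inherited by sibling 1
        # k, l: paternal and maternal haplotypes inherited by sibling 2
        for i, j, k, l in product((0, 1), repeat=4):
            shared = (i == k) + (j == l)
            if shared != ibd_state:
                continue
            if (f_haps[i] + m_haps[j] == sib1_cn
                    and f_haps[k] + m_haps[l] == sib2_cn):
                return True
    return False


# Siblings sharing both parental haplotypes must have equal copy number.
print(is_ibd_consistent(2, 2, 2, 2, ibd_state=2))
print(is_ibd_consistent(2, 2, 2, 3, ibd_state=2))
```

In this toy model, a locus where no pattern fits is a candidate for the two explanations the summary names: the extra copies are inherited independently of the local IBD state (a dispersed duplication), or a copy number mutation occurred in one child.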


2019 ◽  
Author(s):  
Virginia Valori ◽  
Katalin Tus ◽  
Christina Laukaitis ◽  
David T. Harris ◽  
Lauren LeBeau ◽  
...  

Abstract Epigenetic silencing, including the formation of heterochromatin, silent chromosome territories, and repressed gene promoters, acts to stabilize patterns of gene regulation and the physical structure of the genome. Reduction of epigenetic silencing can result in genome rearrangements, particularly at intrinsically unstable regions of the genome such as transposons, satellite repeats, and repetitive gene clusters including the rRNA gene clusters (rDNA). It is thus expected that mutational or environmental conditions that compromise heterochromatin function might cause genome instability, and diseases associated with decreased epigenetic stability might exhibit genome changes as part of their etiology. We find support for this hypothesis in invasive ductal breast carcinoma, in which reduced epigenetic silencing has been previously described, by using a facile method to quantify rDNA copy number in biopsied breast tumors and pair-matched healthy tissue. We found that rDNA and satellite DNA sequences had significant copy number variation – both losses and gains of copies – compared to healthy tissue, arguing that these genome rearrangements are common in developing breast cancer. Thus, any proposed etiology of the onset or progression of breast cancer should consider alterations to the epigenome, but must also accommodate concomitant changes to genome sequence at heterochromatic loci.

Authors’ Statement One of the common hallmarks of cancer is genome instability, including hypermutation and changes to chromosome structure. Using tumor tissues obtained from women with invasive ductal carcinoma, we find that a sensitive area of the genome – the ribosomal DNA gene repeat cluster – shows hypervariability in copy number. The patterns we observe are not consistent with an adaptive loss leading to increased tumor growth; rather, we conclude that copy number variation at repeat DNA is a general consequence of reduced heterochromatin function in cancer progression.


2020 ◽  
Vol 95 (8) ◽  
pp. 634-640
Author(s):  
Fulvio Celsi ◽  
Luisa Zupin ◽  
Emmanouil Athanasakis ◽  
Eva Orzan ◽  
Domenico Leonardo Grasso ◽  
...  

2010 ◽  
Author(s):  
Kyoung-Mu Lee ◽  
Miey Park ◽  
Sang-Hoon Moon ◽  
Hyung-Chol Kim ◽  
Ji-Young Lee ◽  
...  

2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Xin Shao ◽  
Ning Lv ◽  
Jie Liao ◽  
Jinbo Long ◽  
Rui Xue ◽  
...  

Abstract Background: Cancer is a heterogeneous disease with many genetic variations. Several lines of evidence show that copy number variations (CNVs) of certain genes are involved in the development and progression of many cancers through altered expression of those genes in individual or multiple cancer types. However, it is unclear whether this correlation is a general phenomenon across multiple cancer types. Methods: In this study we applied a bioinformatics approach that mathematically integrates CNV and differential gene expression across 1025 cell lines and 9159 patient samples to detect their potential relationship. Results: Our results showed a close correlation between CNV and differential gene expression, and copy number displayed a positive linear influence on gene expression for the majority of genes, indicating that genetic variation has a direct effect on the level of gene transcription. An independent dataset was used to validate the relationship between copy number and expression level. Further analysis shows that genes with a generally positive linear influence of copy number on expression are clustered in certain disease-related pathways, which suggests the involvement of CNV in the pathophysiology of diseases. Conclusions: This study shows the close correlation between CNV and differential gene expression, revealing the qualitative relationship between genetic variation and its downstream effect, especially for oncogenes and tumor suppressor genes. It is of critical importance to elucidate the relationship between copy number variation and gene expression for the prevention, diagnosis and treatment of cancer.
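As a rough illustration of what a "positive linear influence" of copy number on expression means for a single gene, the sketch below fits an ordinary least-squares line and computes a Pearson correlation across samples. The abstract does not say which statistics the authors used, so both choices are assumptions, the function name `fit_copy_number_vs_expression` is hypothetical, and the sample values are invented toy numbers rather than data from the study.

```python
from math import sqrt


def fit_copy_number_vs_expression(copy_numbers, expression):
    """Return (slope, pearson_r) for expression regressed on copy number.

    Assumed method for illustration only: ordinary least squares and
    Pearson correlation, computed per gene across samples.
    """
    if len(copy_numbers) != len(expression) or len(copy_numbers) < 2:
        raise ValueError("need two equal-length series with at least 2 samples")
    n = len(copy_numbers)
    mean_cn = sum(copy_numbers) / n
    mean_ex = sum(expression) / n
    s_xx = sum((x - mean_cn) ** 2 for x in copy_numbers)
    s_yy = sum((y - mean_ex) ** 2 for y in expression)
    s_xy = sum((x - mean_cn) * (y - mean_ex)
               for x, y in zip(copy_numbers, expression))
    if s_xx == 0 or s_yy == 0:
        raise ValueError("copy number and expression must both vary")
    slope = s_xy / s_xx              # change in expression per extra copy
    pearson_r = s_xy / sqrt(s_xx * s_yy)
    return slope, pearson_r


# Invented toy values for one gene across five samples.
toy_copy_numbers = [1, 2, 2, 3, 4]
toy_expression = [0.9, 2.1, 1.9, 3.0, 4.1]
slope, pearson_r = fit_copy_number_vs_expression(toy_copy_numbers, toy_expression)
print(f"slope={slope:.3f}, r={pearson_r:.3f}")
```

A gene would count as showing a positive linear relationship in this sketch when the slope is positive and the correlation is strong; any threshold for "strong" would be a further assumption.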

