sigQC: A procedural approach for standardising the evaluation of gene signatures

2017 ◽  
Author(s):  
Andrew Dhawan ◽  
Alessandro Barberis ◽  
Wei-Chen Cheng ◽  
Enric Domingo ◽  
Catharine West ◽  
...  

Abstract With the increase in next generation sequencing generating large amounts of genomic data, gene expression signatures are becoming critically important tools, poised to make a large impact on the diagnosis, management and prognosis for a number of diseases. Increasingly, it is becoming necessary to determine whether a gene expression signature may apply to a dataset, but no standard quality control methodology exists. In this work, we introduce the first protocol, implemented in an R package sigQC, enabling a streamlined methodological and standardised approach for the quality control validation of gene signatures on independent data sets. The emphasis in this work is on showing the critical quality control steps involved in the generation of a clinically and biologically useful, transportable gene signature, including ensuring sufficient expression, variability, and autocorrelation of a signature. We demonstrate the application of the protocol in this work, showing how the outputs created from sigQC may be used for the evaluation of gene signatures on large-scale gene expression data in cancer.
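
The three checks named in the abstract (expression, variability, and autocorrelation) can be illustrated with a minimal Python sketch. This is not the sigQC API: the function names, the expression threshold, the use of the coefficient of variation as the variability measure, and the reading of "autocorrelation" as the mean pairwise Pearson correlation among signature genes are all assumptions made for illustration.

```python
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def signature_qc(expr, signature, expr_threshold):
    """Hypothetical QC summary for a gene signature on one dataset.

    expr: dict mapping gene name -> list of expression values (one per sample).
    signature: list of gene names; genes absent from expr are skipped.
    expr_threshold: assumed cut-off above which a gene counts as expressed.
    """
    genes = [g for g in signature if g in expr]
    per_gene = {}
    for g in genes:
        vals = expr[g]
        m = mean(vals)
        per_gene[g] = {
            # Fraction of samples in which the gene exceeds the threshold.
            "frac_expressed": sum(v > expr_threshold for v in vals) / len(vals),
            # Coefficient of variation as a simple variability measure.
            "cv": stdev(vals) / m if m else float("nan"),
        }
    # Mean pairwise correlation among signature genes (assumed meaning of
    # "autocorrelation" for this sketch).
    pairs = [(a, b) for i, a in enumerate(genes) for b in genes[i + 1:]]
    mean_corr = (
        mean(pearson(expr[a], expr[b]) for a, b in pairs) if pairs else float("nan")
    )
    return per_gene, mean_corr
```

A signature whose genes are rarely expressed, barely vary, or do not correlate with one another in a new dataset would be flagged by such a summary before any downstream scoring is attempted.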

Author(s):  
Ekaterina Bourova-Flin ◽  
Samira Derakhshan ◽  
Afsaneh Goudarzi ◽  
Tao Wang ◽  
Anne-Laure Vitte ◽  
...  

Abstract Background Large-scale genetic and epigenetic deregulations enable cancer cells to ectopically activate tissue-specific expression programmes. A specifically designed strategy was applied to oral squamous cell carcinomas (OSCC) in order to detect ectopic gene activations and develop a prognostic stratification test. Methods A dedicated original prognosis biomarker discovery approach was implemented using genome-wide transcriptomic data of OSCC, including training and validation cohorts. Abnormal expressions of silent genes were systematically detected, correlated with survival probabilities and evaluated as predictive biomarkers. The resulting stratification test was confirmed in an independent cohort using immunohistochemistry. Results A specific gene expression signature, including a combination of three genes, AREG, CCNA1 and DDX20, was found associated with high-risk OSCC in univariate and multivariate analyses. It was translated into an immunohistochemistry-based test, which successfully stratified patients of our own independent cohort. Discussion The exploration of the whole gene expression profile characterising aggressive OSCC tumours highlights their enhanced proliferative and poorly differentiated intrinsic nature. Experimental targeting of CCNA1 in OSCC cells is associated with a shift of transcriptomic signature towards the less aggressive form of OSCC, suggesting that CCNA1 could be a good target for therapeutic approaches.


Neurology ◽  
2017 ◽  
Vol 89 (16) ◽  
pp. 1676-1683 ◽  
Author(s):  
Ron Shamir ◽  
Christine Klein ◽  
David Amar ◽  
Eva-Juliane Vollstedt ◽  
Michael Bonin ◽  
...  

Objective: To examine whether gene expression analysis of a large-scale Parkinson disease (PD) patient cohort produces a robust blood-based PD gene signature compared to previous studies that have used relatively small cohorts (≤220 samples). Methods: Whole-blood gene expression profiles were collected from a total of 523 individuals. After preprocessing, the data contained 486 gene profiles (n = 205 PD, n = 233 controls, n = 48 other neurodegenerative diseases) that were partitioned into training, validation, and independent test cohorts to identify and validate a gene signature. Batch-effect reduction and cross-validation were performed to ensure signature reliability. Finally, functional and pathway enrichment analyses were applied to the signature to identify PD-associated gene networks. Results: A gene signature of 100 probes that mapped to 87 genes, corresponding to 64 upregulated and 23 downregulated genes differentiating between patients with idiopathic PD and controls, was identified with the training cohort and successfully replicated in both an independent validation cohort (area under the curve [AUC] = 0.79, p = 7.13E–6) and a subsequent independent test cohort (AUC = 0.74, p = 4.2E–4). Network analysis of the signature revealed gene enrichment in pathways, including metabolism, oxidation, and ubiquitination/proteasomal activity, and misregulation of mitochondria-localized genes, including downregulation of COX4I1, ATP5A1, and VDAC3. Conclusions: We present a large-scale study of PD gene expression profiling. This work identifies a reliable blood-based PD signature and highlights the importance of large-scale patient cohorts in developing potential PD biomarkers.
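
The area under the curve (AUC) reported above has a direct probabilistic reading: it is the chance that a randomly chosen case receives a higher signature score than a randomly chosen control. The following Python sketch computes it that way. The function name and the example scores are hypothetical, and the study's own scoring and evaluation code is not described in the abstract.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties count as half a correct pair, which matches the usual
    Mann-Whitney formulation of the AUC.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Illustrative scores only; not data from the study.
example = auc([0.9, 0.8, 0.4], [0.5, 0.3])
```

In this toy example five of the six case-control pairs are ordered correctly, so `example` is 5/6. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.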


2019 ◽  
Vol 15 ◽  
pp. 117693431983849 ◽  
Author(s):  
Mengying Sheng ◽  
Xueying Xie ◽  
Jun Wang ◽  
Wanjun Gu

Current research has identified several potential biomarkers for lung cancer diagnosis or prognosis. However, most of these biomarkers are derived from a relatively small number of samples using algorithms at the gene level. Hence, gene expression signatures discovered in these studies have little overlap. In this study, we proposed a new strategy to identify biomarkers from multiple datasets at the pathway level. We integrated the genome-wide expression data of lung cancer tissues from 13 published studies and applied our strategy to identify lung cancer diagnostic and prognostic biomarkers. We identified a 32-gene signature that differentiates lung adenocarcinomas from other lung cancer subtypes. We also discovered a 43-gene signature that can predict the outcome of human lung cancers. We tested their performance in several independent cohorts, which confirmed their robust prognostic and diagnostic power. Furthermore, we showed that the proposed gene expression signatures were independent of several traditional clinical indicators in lung cancer management. Our results suggest that the pathway-based strategy is useful to identify transcriptomic biomarkers from large-scale gene expression datasets that were collected from multiple sources.


Neurosurgery ◽  
2020 ◽  
Vol 88 (1) ◽  
pp. 202-210 ◽  
Author(s):  
William C Chen ◽  
Harish N Vasudevan ◽  
Abrar Choudhury ◽  
Melike Pekmezci ◽  
Calixto-Hope G Lucas ◽  
...  

Abstract BACKGROUND Prognostic markers for meningioma are needed to risk-stratify patients and guide postoperative surveillance and adjuvant therapy. OBJECTIVE To identify a prognostic gene signature for meningioma recurrence and mortality after resection using targeted gene-expression analysis. METHODS Targeted gene-expression analysis was used to interrogate a discovery cohort of 96 meningiomas and an independent validation cohort of 56 meningiomas with comprehensive clinical follow-up data from separate institutions. Bioinformatic analysis was used to identify prognostic genes and generate a gene-signature risk score between 0 and 1 for local recurrence. RESULTS We identified a 36-gene signature of meningioma recurrence after resection that achieved an area under the curve of 0.86 in identifying tumors at risk for adverse clinical outcomes. The gene-signature risk score compared favorably to World Health Organization (WHO) grade in stratifying cases by local freedom from recurrence (LFFR, P < .001 vs .09, log-rank test), shorter time to failure (TTF, F-test, P < .0001), and overall survival (OS, P < .0001 vs .07) and was independently associated with worse LFFR (relative risk [RR] 1.56, 95% CI 1.30-1.90) and OS (RR 1.32, 95% CI 1.07-1.64), after adjusting for clinical covariates. When tested on an independent validation cohort, the gene-signature risk score remained associated with shorter TTF (F-test, P = .002), compared favorably to WHO grade in stratifying cases by OS (P = .003 vs P = .10), and was significantly associated with worse OS (RR 1.86, 95% CI 1.19-2.88) on multivariate analysis. CONCLUSION The prognostic meningioma gene-expression signature and risk score presented may be useful for identifying patients at risk for recurrence.


2011 ◽  
Vol 43 (3) ◽  
pp. 110-120 ◽  
Author(s):  
Nicky Konstantopoulos ◽  
Victoria C. Foletta ◽  
David H. Segal ◽  
Katherine A. Shields ◽  
Andrew Sanigorski ◽  
...  

Insulin resistance is a heterogeneous disorder caused by a range of genetic and environmental factors, and we hypothesize that its etiology varies considerably between individuals. This heterogeneity provides significant challenges to the development of effective therapeutic regimes for long-term management of type 2 diabetes. We describe a novel strategy, using large-scale gene expression profiling, to develop a gene expression signature (GES) that reflects the overall state of insulin resistance in cells and patients. The GES was developed from 3T3-L1 adipocytes that were made “insulin resistant” by treatment with tumor necrosis factor-α (TNF-α) and then reversed with aspirin and troglitazone (“resensitized”). The GES consisted of five genes whose expression levels best discriminated between the insulin-resistant and insulin-resensitized states. We then used this GES to screen a compound library for agents that affected the GES genes in 3T3-L1 adipocytes in a way that most closely resembled the changes seen when insulin resistance was successfully reversed with aspirin and troglitazone. This screen identified both known and new insulin-sensitizing compounds including nonsteroidal anti-inflammatory agents, β-adrenergic antagonists, β-lactams, and sodium channel blockers. We tested the biological relevance of this GES in participants in the San Antonio Family Heart Study (n = 1,240) and showed that patients with the lowest GES scores were more insulin resistant (according to HOMA_IR and fasting plasma insulin levels; P < 0.001). These findings show that GES technology can be used for both the discovery of insulin-sensitizing compounds and the characterization of patients into subtypes of insulin resistance according to GES scores, opening the possibility of developing a personalized medicine approach to type 2 diabetes.


Blood ◽  
2011 ◽  
Vol 118 (21) ◽  
pp. 805-805
Author(s):  
Carolina Terragna ◽  
Daniel Remondini ◽  
Sandra Durante ◽  
Marina Martello ◽  
Francesca Patriarca ◽  
...  

Abstract 805 Background. Achievement of CR is generally associated with improved clinical outcomes for patients (pts) with MM and represents a primary endpoint of current clinical trials. The GIMEMA Italian Myeloma Network designed a phase 3 study to demonstrate that the triplet VTD regimen was superior over a doublet such as thalidomide-dexamethasone (TD) as induction therapy prior to double ASCT for newly diagnosed MM. On an intention-to-treat basis, the rate of complete or near complete response (CR/nCR) was 31% for the 236 pts on VTD induction therapy, while it was 11% (p<0.0001) for the 238 pts on TD induction therapy. Since enhanced rates of CR/nCR affected by VTD incorporated into ASCT resulted in extended progression-free survival, prediction of CR by pharmacogenomic tools is likely to be an important goal to prospectively select those pts who are more likely to benefit from a given therapy. Methods. For this purpose, in a molecular substudy to the main clinical study we assessed the ability of gene expression profile (GEP) to predict attainment of CR/nCR in 122 pts enrolled in the VTD arm of the study. Their characteristics at baseline, including cytogenetic abnormalities, were comparable with those of the whole population of 236 pts. Highly purified CD138+ plasma cells were obtained at diagnosis from each of these pts and were profiled for gene expression using the Affymetrix U133 Plus2.0 platform. In order to build a low-dimensional signature with optimal performance, genomic data were analyzed with an original algorithm that exploits quadratic discriminant analysis with a bottom-up approach that builds N-gene signatures starting from two-dimensional signatures. Gene models were applied to test datasets to predict achievement of either CR/nCR or less than nCR, and classification performances were validated by a leave-one-out crossvalidation procedure. Results. 
Thirty-four pts out of the 122 (28%) who were included in the present analysis achieved a CR/nCR, while the remaining 88 patients failed this objective. The molecular approach described above identified several gene signatures, among which we chose a 163-gene signature that provided a predictive capability of 79% sensitivity, 87% specificity, 71% positive predictive value (PPV) and 92% negative predictive value (NPV). These expression values were used in an unsupervised hierarchical clustering to stratify the population of 122 profiled pts into 3 well defined subgroups. Seventy-nine pts were included in subgroup A, while the remaining 43 pts were included in either subgroup B (n=22) or subgroup C (n=21). Notably, 19 out of the 34 CR/nCR pts (56%) clustered in subgroup B, whereas the remaining 15 pts were randomly distributed within subgroup A. Analysis of demographic and disease characteristics of the pts belonging to the 3 major subgroups revealed that in subgroup B the frequencies of pts carrying del(13q) (78%) or del(17p) (22%) or with an IgA isotype (54%) were significantly higher in comparison with the corresponding values found in subgroup A (47%, 4%, and 10%, respectively) and subgroup C (38%, 10%, and 5%, respectively). In order to obtain a more feasible set of genes predictive of CR/nCR, several smaller signatures originating from the 163-gene signature were further analyzed by means of the same algorithm described above. The best predictive capability was obtained with a 41-gene signature that provided 88% sensitivity, 97% specificity, 91% PPV and 95% NPV. A GeneGo® network analysis of genes included in the signatures showed that the most relevant network nodes included tumour suppressor genes (FBXW7 and MAD), genes involved in inflammatory response (TREM1 and TLR4) and genes involved in B cell development (IKZF1, IL10 and NFAM1). 
Genes included in the signatures do not gather in specific chromosomes, thus confirming the absence of bias in the selection of signature genes, potentially due to prevalence of MM typical chromosomal aberrations. Conclusions. GEP analysis of a subgroup of pts who received VTD induction therapy provided a 41-gene signature that was able to predict attainment of CR/nCR and, conversely, failure to achieve at least nCR in 91% and 95% of cases, respectively. These favorable results might represent a first step towards the possible application of a tailored therapy based on the single patient's genetic background. Supported by: Fondazione Del Monte di Bologna e Ravenna, Ateneo RFO grants (M.C.) BolognAIL. Disclosures: Bringhen: Celgene: Honoraria; Janssen-Cilag: Honoraria; Novartis: Honoraria; Merck Sharp & Dohme: Membership on an entity's Board of Directors or advisory committees. Offidani: Janssen: Honoraria; Celgene: Honoraria.
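
The four performance figures quoted for each signature (sensitivity, specificity, PPV and NPV) all derive from the same 2x2 confusion matrix of predicted versus observed response. A minimal Python sketch of those definitions follows. The function name is hypothetical and the counts in the example are illustrative only; the abstract reports percentages, not the underlying counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts.

    tp: responders correctly predicted as responders
    fp: non-responders wrongly predicted as responders
    tn: non-responders correctly predicted as non-responders
    fn: responders wrongly predicted as non-responders
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true responders detected
        "specificity": tn / (tn + fp),  # share of true non-responders detected
        "ppv": tp / (tp + fp),          # share of positive calls that are correct
        "npv": tn / (tn + fn),          # share of negative calls that are correct
    }


# Illustrative counts only (not taken from the study).
metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
```

Sensitivity and specificity describe the classifier relative to the true outcome, whereas PPV and NPV describe it relative to the prediction and therefore depend on the proportion of responders in the cohort.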


2013 ◽  
Vol 31 (4_suppl) ◽  
pp. 403-403
Author(s):  
Loredana Vecchione ◽  
Valentina Gambino ◽  
Giovanni d'Ario ◽  
Sun Tian ◽  
Iris Simon ◽  
...  

403 Background: Approximately 8-15% of colorectal cancer (CRC) patients carry an activating mutation in BRAF. This CRC subtype is associated with poor outcome and with resistance, both to chemotherapeutic treatments and to tailored drugs. We recently showed that BRAF (V600E) colon cancers (CCs) have a characteristic gene expression signature (1, 2) which is found also in subsets of KRAS mutant and KRAS-BRAF wild type (WT2) tumors. Tumors having this gene signature, referred to as “BRAF-like”, have a similar poor prognosis irrespective of the presence of the BRAF (V600E) mutation. By using a shRNA-based genetic screen in BRAF mutant CC cell lines we aimed to identify genes and pathways necessary for survival and growth of BRAF mutant CC. Such studies may reveal additional targets for therapy and potentially provide new biomarkers for patient stratification. Methods: We identified 363 genes that are selectively overexpressed in BRAF mutant tumors as compared to WT2 type tumors, based on gene expression profiles of the PETACC3 (1) and Agendia (2) datasets. The TRC human genome-wide shRNA collection (TRC-Hs1.0) was used to generate a sub-library of 1815 hairpins targeting those identified genes (BRAF library). BRAF(V600E) CC cell lines were infected with the BRAF library and screened for shRNAs that cause lethality. LIM1215 CC cell line (WT2) was used as a control. Cells stably expressing the shRNA library were cultured for 13 days, after which shRNAs were recovered by PCR. Deep sequencing was applied to determine the specific depletion of shRNA in BRAF(V600E) cells as compared to LIM1215 cells. Results: Candidate genes were identified using the following filtering criteria: depletion in BRAF(V600E) cells by at least 50% and depletion in BRAF(V600E) cells 1.5-fold higher than in control cells, with a corresponding p-value ≤ 0.1. 
A total of 34 genes met our criteria of which 6 genes were presented with more than one hairpin and were concordant across the cell lines selected for validation. Conclusions: We identified candidate synthetic lethal genes in BRAF mutant CC cell lines. Functional analysis is ongoing. Data will be presented. References 1. J Clin Oncol 2012 Apr 20;30(12):1288-9 2. Gut (2012). doi:10.1136/gutjnl-2012-302423
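
The filtering criteria stated in the Results can be expressed as a short predicate. The Python sketch below assumes that depletion is given as a fraction between 0 and 1 and reads the fold criterion as 1.5-fold (the source prints it as "1, 5-fold"). The class, field and function names are hypothetical, and the handling of a control with no depletion is a choice made for this sketch, not something the abstract specifies.

```python
from dataclasses import dataclass


@dataclass
class HairpinResult:
    """Hypothetical per-hairpin summary from the sequencing readout."""
    gene: str
    depletion_mutant: float   # fraction depleted in BRAF(V600E) cells, 0..1
    depletion_control: float  # fraction depleted in control (LIM1215) cells, 0..1
    p_value: float


def passes_filter(r, min_depletion=0.5, min_fold=1.5, max_p=0.1):
    """Apply the three stated criteria to one hairpin result."""
    if r.depletion_mutant < min_depletion:
        return False  # less than 50% depletion in mutant cells
    if r.p_value > max_p:
        return False  # p-value above the 0.1 cut-off
    if r.depletion_control <= 0:
        # Assumption: no depletion in control makes the fold change unbounded.
        return True
    return r.depletion_mutant / r.depletion_control >= min_fold


def candidate_genes(results):
    """Genes with at least one hairpin meeting all criteria, sorted by name."""
    return sorted({r.gene for r in results if passes_filter(r)})
```

Requiring agreement across more than one hairpin per gene, as the abstract describes for 6 of the 34 genes, would be a further step on top of this per-hairpin filter.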


2017 ◽  
Vol 114 (52) ◽  
pp. 13792-13797 ◽  
Author(s):  
Mary R. Doherty ◽  
HyeonJoo Cheon ◽  
Damian J. Junk ◽  
Shaveta Vinayak ◽  
Vinay Varadan ◽  
...  

Triple-negative breast cancer (TNBC), the deadliest form of this disease, lacks a targeted therapy. TNBC tumors that fail to respond to chemotherapy are characterized by a repressed IFN/signal transducer and activator of transcription (IFN/STAT) gene signature and are often enriched for cancer stem cells (CSCs). We have found that human mammary epithelial cells that undergo an epithelial-to-mesenchymal transition (EMT) following transformation acquire CSC properties. These mesenchymal/CSCs have a significantly repressed IFN/STAT gene expression signature and an enhanced ability to migrate and form tumor spheres. Treatment with IFN-beta (IFN-β) led to a less aggressive epithelial/non–CSC-like state, with repressed expression of mesenchymal proteins (VIMENTIN, SLUG), reduced migration and tumor sphere formation, and reexpression of CD24 (a surface marker for non-CSCs), concomitant with an epithelium-like morphology. The CSC-like properties were correlated with high levels of unphosphorylated IFN-stimulated gene factor 3 (U-ISGF3), which was previously linked to resistance to DNA damage. Inhibiting the expression of IRF9 (the DNA-binding component of U-ISGF3) reduced the migration of mesenchymal/CSCs. Here we report a positive translational role for IFN-β, as gene expression profiling of patient-derived TNBC tumors demonstrates that an IFN-β metagene signature correlates with improved patient survival, an immune response linked with tumor-infiltrating lymphocytes (TILs), and a repressed CSC metagene signature. Taken together, our findings indicate that repressed IFN signaling in TNBCs with CSC-like properties is due to high levels of U-ISGF3 and that treatment with IFN-β reduces CSC properties, suggesting a therapeutic strategy to treat drug-resistant, highly aggressive TNBC tumors.


Cancers ◽  
2021 ◽  
Vol 13 (21) ◽  
pp. 5523
Author(s):  
Daugrois Camille ◽  
Bessiere Chloé ◽  
Dejean Sébastien ◽  
Anton Leberre Véronique ◽  
Commes Thérèse ◽  
...  

Anaplastic large cell lymphomas associated with ALK translocation have a good outcome after CHOP treatment; however, the 2-year relapse rate remains at 30%. Microarray gene-expression profiling of 48 samples obtained at diagnosis was used to identify 47 genes that were differentially expressed between patients with early relapse/progression and no relapse. In the relapsing group, the most significant overrepresented genes were related to the regulation of the immune response and T-cell activation while those in the non-relapsing group were involved in the extracellular matrix. Fluidigm technology gave concordant results for 29 genes, of which FN1, FAM179A, and SLC40A1 had the strongest predictive power after logistic regression and two classification algorithms. In parallel with 39 samples, we used a Kallisto/Sleuth pipeline to analyze RNA sequencing data and identified 20 genes common to the 28 genes validated by Fluidigm technology—notably, the FAM179A and FN1 genes. Interestingly, FN1 also belongs to the gene signature predicting longer survival in diffuse large B-cell lymphomas treated with CHOP. Thus, our molecular signatures indicate that the FN1 gene, a matrix key regulator, might also be involved in the prognosis and the therapeutic response in anaplastic lymphomas.


2019 ◽  
Author(s):  
L Cao ◽  
C Clish ◽  
FB Hu ◽  
MA Martínez-González ◽  
C Razquin ◽  
...  

Abstract Motivation: Large-scale untargeted metabolomics experiments lead to detection of thousands of novel metabolic features as well as false positive artifacts. With the incorporation of pooled QC samples and corresponding bioinformatics algorithms, those measurement artifacts can be effectively controlled. However, it is impracticable for all studies to apply such an experimental design. Results: We introduce a post-alignment quality control method called genuMet, which is solely based on injection order of biological samples to identify potential false metabolic features. In terms of the missing pattern of metabolic signals, genuMet can reach over 95% true negative rate and 85% true positive rate with suitable parameters, compared with the algorithm utilizing pooled QC samples. genuMet makes it possible for studies without pooled QC samples to reduce false metabolic signals and perform robust statistical analysis. Availability and implementation: genuMet is implemented in an R package and available at https://github.com/liucaomics/genuMet under the GPL-v2 license. Contact: Liming Liang: [email protected] Supplementary information: Supplementary data are available at ….

