A Novel Approach Searching for Discriminative Gene Sets

GeneSetCluster: a tool for summarizing and integrating gene-set analysis results

BMC Bioinformatics ◽

10.1186/s12859-020-03784-z ◽

2020 ◽

Vol 21 (1) ◽

Cited By ~ 1

Author(s):

Ewoud Ewing ◽

Nuria Planell-Picola ◽

Maja Jagodic ◽

David Gomez-Cabrero

Keyword(s):

Gene Content ◽

Gene Set Analysis ◽

Gene Set ◽

Overlapping Gene ◽

Analysis Tools ◽

Novel Approach ◽

Gene Sets ◽

Distance Score ◽

Significant Gene ◽

Similar Gene

Abstract Background Gene-set analysis tools, which make use of curated sets of molecules grouped based on their shared functions, aim to identify which gene-sets are over-represented in the set of features that have been associated with a given trait of interest. Such tools are frequently used in gene-centric approaches derived from RNA-sequencing or microarrays such as Ingenuity or GSEA, but they have also been adapted for interval-based analysis derived from DNA methylation or ChIP/ATAC-sequencing. Gene-set analysis tools return, as a result, a list of significant gene-sets. However, while these results are useful for the researcher in the identification of major biological insights, they may be complex to interpret because many gene-sets have largely overlapping gene contents. Additionally, in many cases the result of gene-set analysis consists of a large number of gene-sets making it complicated to identify the major biological insights. Results We present GeneSetCluster, a novel approach which allows clustering of identified gene-sets, from one or multiple experiments and/or tools, based on shared genes. GeneSetCluster calculates a distance score based on overlapping gene content, which is then used to cluster them together and as a result, GeneSetCluster identifies groups of gene-sets with similar gene-set definitions (i.e. gene content). These groups of gene-sets can aid the researcher to focus on such groups for biological interpretations. Conclusions GeneSetCluster is a novel approach for grouping together post gene-set analysis results based on overlapping gene content. GeneSetCluster is implemented as a package in R. The package and the vignette can be downloaded at https://github.com/TranslationalBioinformaticsUnit
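The grouping idea in the abstract above (a distance score from overlapping gene content, followed by clustering) can be sketched in a few lines. This is an illustrative sketch only, not the GeneSetCluster implementation: the Jaccard distance, the average-linkage merging rule, the `threshold` parameter, and the function names are all assumptions chosen for the example, and the package itself is written in R.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A & B| / |A | B|: 0 for identical gene content, 1 for no overlap."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_gene_sets(gene_sets, threshold):
    """Average-linkage agglomerative clustering of named gene sets.

    Assumption for illustration: merging stops once the two closest
    clusters are farther apart than `threshold`.
    """
    names = list(gene_sets)
    # Pairwise distances between all gene sets, keyed by the unordered pair.
    dist = {
        frozenset((x, y)): jaccard_distance(gene_sets[x], gene_sets[y])
        for x, y in combinations(names, 2)
    }
    clusters = [[n] for n in names]

    def linkage(c1, c2):
        # Mean distance over all cross-cluster pairs.
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical gene sets, invented for the example.
example_sets = {
    "interferon_signaling": {"STAT1", "IRF7", "ISG15", "MX1"},
    "antiviral_response": {"STAT1", "IRF7", "ISG15", "OAS1"},
    "cell_cycle": {"CDK1", "CCNB1", "PLK1"},
    "mitosis": {"CDK1", "CCNB1", "AURKA"},
}
groups = cluster_gene_sets(example_sets, threshold=0.6)
```

With these toy inputs, the two immune-related sets share three of five genes and the two proliferation-related sets share two of four, so each pair merges while the two groups stay apart.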


Active Mining Discriminative Gene Sets

Artificial Intelligence and Soft Computing – ICAISC 2006 - Lecture Notes in Computer Science ◽

10.1007/11785231_92 ◽

2006 ◽

pp. 880-889

Author(s):

Feng Chu ◽

Lipo Wang

Keyword(s):

Gene Sets ◽

Discriminative Gene


ClusterMine: a Knowledge-integrated Clustering Approach based on Expression Profiles of Gene Sets

10.1101/255711 ◽

2018 ◽

Author(s):

Hong-Dong Li ◽

Yunpei Xu ◽

Xiaoshu Zhu ◽

Quan Liu ◽

Gilbert S. Omenn ◽

...

Keyword(s):

Expression Profiles ◽

R Package ◽

Biological Data ◽

Supplementary Information ◽

Consensus Clustering ◽

Cluster Membership ◽

Link Type ◽

Novel Approach ◽

Gene Sets ◽

Biological Interpretation

Abstract Motivation Clustering analysis is essential for understanding complex biological data. In widely used methods such as hierarchical clustering (HC) and consensus clustering (CC), expression profiles of all genes are often used to assess similarity between samples for clustering. These methods output sample clusters, but are not able to provide information about which gene sets (functions) contribute most to the clustering. So interpretability of their results is limited. We hypothesized that integrating prior knowledge of annotated biological processes would not only achieve satisfying clustering performance but also, more importantly, enable potential biological interpretation of clusters. Results Here we report ClusterMine, a novel approach that identifies clusters by assessing functional similarity between samples through integrating known annotated gene sets, e.g., in Gene Ontology. In addition to outputting cluster membership of each sample as conventional approaches do, it outputs gene sets that are most likely to contribute to the clustering, a feature facilitating biological interpretation. Using three cancer datasets, two single cell RNA-sequencing based cell differentiation datasets, one cell cycle dataset and two datasets of cells of different tissue origins, we found that ClusterMine achieved similar or better clustering performance and that top-scored gene sets prioritized by ClusterMine are biologically relevant. Implementation and availability ClusterMine is implemented as an R package and is freely available at: www.genemine.org/[email protected] Supplementary Information Supplementary data are available at Bioinformatics online.


An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms

BioMed Research International ◽

10.1155/2016/7639397 ◽

2016 ◽

Vol 2016 ◽

pp. 1-9 ◽

Cited By ~ 5

Author(s):

Hong-Li Hua ◽

Fa-Zhan Zhang ◽

Abraham Alemayehu Labena ◽

Chuan Dong ◽

Yan-Ting Jin ◽

...

Keyword(s):

Machine Learning ◽

Drug Targets ◽

Essential Genes ◽

Machine Learning Algorithms ◽

Biological Knowledge ◽

Independent Dataset ◽

Novel Approach ◽

Gene Sets ◽

Tenfold Cross Validation ◽

Potential Drug Targets

Investigation of essential genes is important for understanding the minimal gene set of a cell and for discovering potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning was introduced to predict essential genes. We focused on 25 bacteria which have characterized essential genes. The predictions yielded a highest area under the receiver operating characteristic (ROC) curve (AUC) of 0.9716 in a tenfold cross-validation test. Proper features were utilized to construct models to make predictions in distantly related bacteria. The accuracy of predictions was evaluated via the consistency between the predictions and the known essential genes of the target species. The highest AUC of 0.9552 and an average AUC of 0.8314 were achieved when making predictions across organisms. An independent dataset from Synechococcus elongatus, which was released recently, was obtained for further assessment of the performance of our model. The AUC score of the predictions is 0.7855, which is higher than that of other methods. This research shows that features obtained by homology mapping alone can achieve results as good as or even better than those of integrated features. Meanwhile, the work indicates that a machine learning-based method can assign more efficient weight coefficients than an empirical formula based on biological knowledge.
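For readers unfamiliar with the evaluation metric quoted above, the following minimal sketch shows how an AUC value and a tenfold split can be computed. It is not the authors' pipeline: the function names, the interleaved fold assignment, and the toy labels and scores are assumptions made purely for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=10):
    """Yield (train, test) index lists; fold i holds out every k-th item from i."""
    for i in range(k):
        test = list(range(i, n, k))
        held_out = set(test)
        train = [j for j in range(n) if j not in held_out]
        yield train, test


# Toy data: 1 marks an essential gene, scores are hypothetical model outputs.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 positive/negative pairs are ordered correctly
```

In a cross-validation run, a model would be fitted on each `train` index list and scored on the matching `test` list, with the per-fold AUC values then summarized.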


NetCore: a network propagation approach using node coreness

Nucleic Acids Research ◽

10.1093/nar/gkaa639 ◽

2020 ◽

Vol 48 (17) ◽

pp. e98-e98

Author(s):

Gal Barel ◽

Ralf Herwig

Keyword(s):

Node Degree ◽

Disease Genes ◽

The Novel ◽

Ppi Networks ◽

Novel Approach ◽

Gene Sets ◽

Network Modules ◽

Network Propagation ◽

Improved Performance ◽

Pan Cancer

Abstract We present NetCore, a novel network propagation approach based on node coreness, for phenotype–genotype associations and module identification. NetCore addresses the node degree bias in PPI networks by using node coreness in the random walk with restart procedure, and achieves improved re-ranking of genes after propagation. Furthermore, NetCore implements a semi-supervised approach to identify phenotype-associated network modules, which anchors the identification of novel candidate genes at known genes associated with the phenotype. We evaluated NetCore on gene sets from 11 different GWAS traits and showed improved performance compared to the standard degree-based network propagation using cross-validation. Furthermore, we applied NetCore to identify disease genes and modules for Schizophrenia GWAS data and pan-cancer mutation data. We compared the novel approach to existing network propagation approaches and showed the benefits of using NetCore in comparison to those. We provide an easy-to-use implementation, together with a high confidence PPI network extracted from ConsensusPathDB, which can be applied to various types of genomics data in order to obtain a re-ranking of genes and functionally relevant network modules.
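The two ingredients named in the abstract above, node coreness and random walk with restart, can be illustrated on a toy graph. The sketch below is not the NetCore implementation: weighting each step by the core number of the destination node is an assumption made for this example, as are the function names, the `restart` value, and the four-node graph.

```python
def core_numbers(adj):
    """k-core decomposition by repeatedly peeling the minimum-degree node."""
    degree = {u: len(vs) for u, vs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        u = min(remaining, key=lambda x: (degree[x], x))
        k = max(k, degree[u])  # core numbers never decrease during peeling
        core[u] = k
        remaining.remove(u)
        for v in adj[u]:
            if v in remaining:
                degree[v] -= 1
    return core


def random_walk_with_restart(adj, seeds, restart=0.5, weight=None,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    `weight` maps each node to the attractiveness of stepping onto it.
    Uniform weights give the usual degree-normalized walk; passing core
    numbers is an illustrative assumption, not NetCore's exact scheme.
    """
    nodes = sorted(adj)
    weight = weight or {u: 1.0 for u in nodes}
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {u: restart * p0[u] for u in nodes}
        for u in nodes:
            total = sum(weight[v] for v in adj[u])
            if total == 0:
                nxt[u] += (1 - restart) * p[u]  # isolated node: walker stays put
                continue
            for v in adj[u]:
                nxt[v] += (1 - restart) * p[u] * weight[v] / total
        diff = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Toy network: a triangle (a, b, c) with a pendant node d attached to c.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
toy_core = core_numbers(toy_graph)
uniform_scores = random_walk_with_restart(toy_graph, seeds={"a"})
core_scores = random_walk_with_restart(toy_graph, seeds={"a"}, weight=toy_core)
```

On this graph the pendant node has a lower core number than the triangle nodes, so the coreness-weighted walk sends less probability mass to it than the uniform walk does.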


Copper-dependent ATP7B up-regulation drives the resistance of TMEM16A-overexpressing head-and-neck cancer models to platinum toxicity

Biochemical Journal ◽

10.1042/bcj20190591 ◽

2019 ◽

Vol 476 (24) ◽

pp. 3705-3719 ◽

Cited By ~ 2

Author(s):

Avani Vyas ◽

Umamaheswar Duvvuri ◽

Kirill Kiselyov

Keyword(s):

Oxidative Stress ◽

Head And Neck ◽

Transcriptional Activation ◽

Platinum Resistance ◽

Mrna Levels ◽

Strong Positive Correlation ◽

Platinum Compounds ◽

Copper Chelation ◽

Novel Approach ◽

Cancer Models

Platinum-containing drugs such as cisplatin and carboplatin are routinely used for the treatment of many solid tumors including squamous cell carcinoma of the head and neck (SCCHN). However, SCCHN resistance to platinum compounds is well documented. The resistance to platinum has been linked to the activity of divalent transporter ATP7B, which pumps platinum from the cytoplasm into lysosomes, decreasing its concentration in the cytoplasm. Several cancer models show increased expression of ATP7B; however, the reason for such an increase is not known. Here we show a strong positive correlation between mRNA levels of TMEM16A and ATP7B in human SCCHN tumors. TMEM16A overexpression and depletion in SCCHN cell lines caused parallel changes in the ATP7B mRNA levels. The ATP7B increase in TMEM16A-overexpressing cells was reversed by suppression of NADPH oxidase 2 (NOX2), by the antioxidant N-Acetyl-Cysteine (NAC) and by copper chelation using cuprizone and bathocuproine sulphonate (BCS). Pretreatment with either chelator significantly increased sensitivity to cisplatin, particularly in the context of TMEM16A overexpression. We propose that increased oxidative stress in TMEM16A-overexpressing cells liberates the chelated copper in the cytoplasm, leading to the transcriptional activation of ATP7B expression. This, in turn, decreases the efficacy of platinum compounds by promoting their vesicular sequestration. We think that this new explanation of the mechanism of platinum resistance in SCCHN tumors identifies a novel approach to treating these tumors.


Collecting Words: A Clinical Example of a Morphology-Focused Orthographic Intervention

Language Speech and Hearing Services in Schools ◽

10.1044/2020_lshss-19-00050 ◽

2020 ◽

Vol 51 (3) ◽

pp. 544-560 ◽

Cited By ~ 3

Author(s):

Kimberly A. Murphy ◽

Emily A. Diehm

Keyword(s):

Written Language ◽

Morphological Knowledge ◽

English Orthography ◽

Word Level ◽

Novel Approach ◽

Language Pathology ◽

The One ◽

Critical Intervention ◽

Reading And Spelling ◽

Reading And Spelling Difficulties

Purpose Morphological interventions promote gains in morphological knowledge and in other oral and written language skills (e.g., phonological awareness, vocabulary, reading, and spelling), yet we have a limited understanding of critical intervention features. In this clinical focus article, we describe a relatively novel approach to teaching morphology that considers its role as the key organizing principle of English orthography. We also present a clinical example of such an intervention delivered during a summer camp at a university speech and hearing clinic. Method Graduate speech-language pathology students provided a 6-week morphology-focused orthographic intervention to children in first through fourth grade (n = 10) who demonstrated word-level reading and spelling difficulties. The intervention focused children's attention on morphological families, teaching how morphology is interrelated with phonology and etymology in English orthography. Results Comparing pre- and posttest scores, children demonstrated improvement in reading and/or spelling abilities, with the largest gains observed in spelling affixes within polymorphemic words. Children and their caregivers reacted positively to the intervention. Therefore, data from the camp offer preliminary support for teaching morphology within the context of written words, and the intervention appears to be a feasible approach for simultaneously increasing morphological knowledge, reading, and spelling. Conclusion Children with word-level reading and spelling difficulties may benefit from a morphology-focused orthographic intervention, such as the one described here. Research on the approach is warranted, and clinicians are encouraged to explore its possible effectiveness in their practice. Supplemental Material https://doi.org/10.23641/asha.12290687
