Identification of Candidate Drugs for Heart Failure using Tensor Decomposition-Based Unsupervised Feature Extraction Applied to Integrated Analysis of Gene Expression between Heart Failure and DrugMatrix Datasets

2017 ◽  
Author(s):  
Y-h. Taguchi

Abstract Identifying drug target genes in gene expression profiles is not straightforward. Because a drug targets proteins rather than mRNAs, the mRNA expression of drug target genes is not always altered. In addition, the interaction between a drug and a protein can be context dependent; this means that simple drug incubation experiments on cell lines do not always reflect the real situation during active disease. In this paper, I apply tensor decomposition-based unsupervised feature extraction to the integrated analysis of gene expression between heart failure and the DrugMatrix dataset, which reports comprehensive data on gene expression in rats during various drug treatments. I found that this strategy, in a fully unsupervised manner, enables us to identify a combined set of genes and compounds for which various associations with heart failure were reported.

Author(s):  
Y-h. Taguchi ◽  
Turki Turki

ABSTRACT Gene expression profiles of tissues treated with drugs have recently been used to infer clinical outcomes. Although this method is often successful from the application point of view, gene expression altered by drugs is rarely analyzed in detail because of the extremely large number of genes involved. Here, we applied tensor decomposition (TD)-based unsupervised feature extraction (FE) to the gene expression profiles of 24 mouse tissues treated with 15 drugs. TD-based unsupervised FE enabled identification of the common effects of the 15 drugs, including an interesting universal feature: these drugs affect genes in a gene-group-wide manner that depends on three tissue types (neuronal, muscular, and gastroenterological). For each tissue group, TD-based unsupervised FE enabled identification of a few tens to a few hundreds of genes affected by the drug treatment. These genes are distinctly expressed between drug treatments and controls as well as between tissues in individual tissue groups and other tissues. We also validated the assignment of genes to individual tissue groups using multiple enrichment analyses. We conclude that TD-based unsupervised FE is a promising method for integrated analysis of gene expression profiles from multiple tissues treated with multiple drugs in a completely unsupervised manner.
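
As a rough illustration of the decomposition step named in this abstract, the sketch below applies a higher-order singular value decomposition (HOSVD), one common form of tensor decomposition, to a gene × tissue × drug array. The tissue and drug counts echo the abstract (24 and 15), but the data are random, the gene count of 300 is arbitrary, and the helper names (`unfold`, `mode_product`, `hosvd`, `reconstruct`) are hypothetical. Whether the paper used exactly this variant is an assumption.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize: rows index the chosen mode, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `matrix` into `tensor` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    return np.moveaxis(np.tensordot(matrix, moved, axes=1), 0, mode)


def hosvd(tensor):
    """Return (core, factors): one orthonormal factor matrix per mode."""
    factors = []
    for mode in range(tensor.ndim):
        # Left singular vectors of each unfolding span that mode's space.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Invert the decomposition by multiplying each factor back in."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out


# Synthetic stand-in data (assumption): 300 genes, 24 tissues, 15 drugs.
rng = np.random.default_rng(0)
x = rng.normal(size=(300, 24, 15))
core, factors = hosvd(x)
print(core.shape, [u.shape for u in factors])
```

In this family of methods, genes would then be chosen from the gene-mode singular vectors that pair with tissue- and drug-mode vectors of interest; that selection step is not shown here.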


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Kota Fujisawa ◽  
Mamoru Shimo ◽  
Y.-H. Taguchi ◽  
Shinya Ikematsu ◽  
Ryota Miyata

Abstract Coronavirus disease 2019 (COVID-19) is raging worldwide. This potentially fatal infectious disease is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). However, the complete mechanism of COVID-19 is not well understood. Therefore, we analyzed gene expression profiles of COVID-19 patients to identify disease-related genes through an innovative machine learning method that enables a data-driven strategy for gene selection from a data set with a small number of samples and many candidates. Principal-component-analysis-based unsupervised feature extraction (PCAUFE) was applied to the RNA expression profiles of 16 COVID-19 patients and 18 healthy control subjects. The results identified 123 genes as critical for COVID-19 progression from 60,683 candidate probes, including immune-related genes. The 123 genes were enriched in binding sites for transcription factors NFKB1 and RELA, which are involved in various biological phenomena such as immune response and cell survival: the primary mediator of canonical nuclear factor-kappa B (NF-κB) activity is the heterodimer RelA-p50. The genes were also enriched in histone modification H3K36me3, and they largely overlapped the target genes of NFKB1 and RELA. We found that the overlapping genes were downregulated in COVID-19 patients. These results suggest that canonical NF-κB activity was suppressed by H3K36me3 in COVID-19 patient blood.
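
To make the idea of gene selection from few samples and many candidates concrete, here is a minimal sketch of a PCA-based selection on synthetic data. The steps are an assumption about how such a method can be set up, not the paper's exact procedure. They are: center each sample, take gene scores on one principal component, convert standardized scores to P-values under a Gaussian null (equivalent to a chi-squared test with one degree of freedom), and apply Benjamini-Hochberg correction. Only the group sizes (16 and 18) come from the abstract; the 2,000 genes, the 20 planted genes, the 0.01 threshold, and all function names are hypothetical. Class labels are used only to decide which component to read, not to score genes.

```python
import math
import numpy as np


def benjamini_hochberg(pvals):
    """Return BH-adjusted P-values in the original order."""
    n = len(pvals)
    order = np.argsort(pvals)
    scaled = pvals[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity from the largest P-value downwards.
    monotone = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(monotone, 1.0)
    return adjusted


def pca_unsupervised_fe(expr, is_case, alpha=0.01):
    """expr is genes x samples; returns (selected gene indices, component)."""
    centered = expr - expr.mean(axis=0, keepdims=True)  # center each sample
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Pick the component whose sample loadings differ most between groups.
    gap = np.abs(vt[:, is_case].mean(axis=1) - vt[:, ~is_case].mean(axis=1))
    k = int(np.argmax(gap))
    scores = u[:, k] * s[k]
    z = scores / scores.std()
    # Two-sided Gaussian tail probability, equal to chi-squared (1 dof) on z**2.
    pvals = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    return np.flatnonzero(benjamini_hochberg(pvals) < alpha), k


# Synthetic data (assumption): 20 planted genes shifted in the case group.
rng = np.random.default_rng(0)
n_genes, n_case, n_ctrl = 2000, 16, 18
is_case = np.arange(n_case + n_ctrl) < n_case
expr = rng.normal(size=(n_genes, n_case + n_ctrl))
planted = np.arange(20)
expr[np.ix_(planted, np.flatnonzero(is_case))] += 8.0

selected, k = pca_unsupervised_fe(expr, is_case)
print(len(selected), "genes selected using component", k)
```

Because genes are scored against the spread of all gene scores rather than by per-gene tests between groups, the selection does not need many samples, which is the property the abstract emphasizes.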


2019 ◽  
Author(s):  
Ka-Lok Ng ◽  
Y-h Taguchi

Abstract Cancer is a highly complex disease caused by multiple genetic factors. MicroRNA (miRNA) and mRNA expression profiles are useful for identifying prognostic biomarkers for cancer. Kidney renal clear cell carcinoma (KIRC), which accounts for more than 70% of all renal malignant tumour cases, was selected for our analysis. Traditional methods of identifying cancer prognostic markers may not be accurate. Tensor decomposition (TD) is a useful method for uncovering the underlying low-dimensional structures in a tensor. The TD-based unsupervised feature extraction method was applied to analyse mRNA and miRNA expression profiles. Biological annotations of the prognostic miRNAs and mRNAs were examined utilizing the pathway and oncogenic signature databases DIANA-miRPath and MSigDB. TD identified the miRNA signatures and the associated genes. These genes were found to be involved in cancer-related pathways, and 23 genes were significantly correlated with the survival of KIRC patients. We demonstrated that the results are robust and not highly dependent upon the databases we selected. Compared with the traditional supervised methods tested, TD achieves much better performance in selecting prognostic miRNAs and mRNAs. These results suggest that integrated analysis using the TD-based unsupervised feature extraction technique is an effective strategy for identifying prognostic signatures in cancer studies.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Ka-Lok Ng ◽  
Y.-H. Taguchi

Abstract Cancer is a highly complex disease caused by multiple genetic factors. MicroRNA (miRNA) and mRNA expression profiles are useful for identifying prognostic biomarkers for cancer. Kidney renal clear cell carcinoma (KIRC), which accounts for more than 70% of all renal malignant tumour cases, was selected for our analysis. Traditional methods of identifying cancer prognostic markers may not be accurate. Tensor decomposition (TD) is a useful method for uncovering the underlying low-dimensional structures in a tensor. The TD-based unsupervised feature extraction method was applied to analyse mRNA and miRNA expression profiles. Biological annotations of the prognostic miRNAs and mRNAs were examined utilizing the pathway and oncogenic signature databases DIANA-miRPath and MSigDB. TD identified the miRNA signatures and the associated genes. These genes were found to be involved in cancer-related pathways, and 23 genes were significantly correlated with the survival of KIRC patients. We demonstrated that the results are robust and not highly dependent upon the databases we selected. Compared with the traditional supervised methods tested, TD achieves much better performance in selecting prognostic miRNAs and mRNAs. These results suggest that integrated analysis using the TD-based unsupervised feature extraction technique is an effective strategy for identifying prognostic signatures in cancer studies.


2019 ◽  
Author(s):  
Y-h. Taguchi ◽  
Turki Turki

ABSTRACT Although single-cell RNA sequencing (scRNA-seq) is a new and promising technology, the gene expression obtained for each cell is hard to interpret because too little information is available to label individual cells. Because of this lack of information, unsupervised clustering, e.g., t-Distributed Stochastic Neighbor Embedding and Uniform Manifold Approximation and Projection, is usually employed to obtain a low-dimensional embedding that helps in understanding cell-cell relationships. One possible drawback of this strategy is that the outcome depends heavily on the genes selected for clustering. To address this, many methods for unsupervised gene selection have been proposed. In this study, tensor decomposition (TD)-based unsupervised feature extraction (FE) was applied to the integration of two scRNA-seq expression profiles that measure human and mouse midbrain development. TD-based unsupervised FE selected genes that were not only coincident between human and mouse but also biologically reliable. Both the coincidence between the two species and the biological reliability of the selected genes increased compared with principal component analysis (PCA)-based FE applied to the same data set in a previous study. Since PCA-based unsupervised FE outperformed three other popular unsupervised gene selection methods (highly variable genes, bimodal genes, and dpFeature), TD-based unsupervised FE can be expected to do so as well. In addition, ten transcription factors (TFs) that might regulate the selected genes and might contribute to midbrain development were identified. These ten TFs, BHLHE40, EGR1, GABPA, IRF3, PPARG, REST, RFX5, STAT3, TCF7L2, and ZBTB33, were previously reported to be related to brain functions and diseases. TD-based unsupervised FE is a promising method for integrating two scRNA-seq profiles effectively.


2021 ◽  
Author(s):  
Taguchi Y-h. ◽  
Turki Turki

Abstract The integrated analysis of multiple gene expression profiles measured in distinct studies is always problematic. In particular, the lack of sample matching and of common labeling between distinct studies prevents the integration of multiple studies in a fully data-driven and unsupervised manner. In this study, we propose a strategy enabling the integration of multiple gene expression profiles among multiple independent studies without either labeling or sample matching, using tensor decomposition-based unsupervised feature extraction. As an example, we applied this strategy to Alzheimer’s disease (AD)-related gene expression profiles that lack exact correspondence among samples as well as AD single-cell RNA-seq (scRNA-seq) data. We found that we could select biologically reasonable genes with integrated analysis. Overall, integrated gene expression profiles can function analogously to prior learning and/or transfer learning strategies in other machine learning applications. For scRNA-seq, the proposed approach was able to drastically reduce the required computational memory.


Polymers ◽  
2021 ◽  
Vol 13 (23) ◽  
pp. 4117
Author(s):  
Y-h. Taguchi ◽  
Turki Turki

The development of the medical applications for substances or materials that contact cells is important. Hence, it is necessary to elucidate how substances that surround cells affect gene expression during incubation. In the current study, we compared the gene expression profiles of cell lines that were in contact with collagen–glycosaminoglycan mesh and control cells. Principal component analysis-based unsupervised feature extraction was applied to identify genes with altered expression during incubation in the treated cell lines but not in the controls. The identified genes were enriched in various biological terms. Our method also outperformed a conventional methodology, namely, gene selection based on linear regression with time course.


2021 ◽  
Author(s):  
Y-h. Taguchi ◽  
Turki Turki

Abstract Development of medical applications for substances or materials that contact cells is important. Hence, it is necessary to elucidate how substances that surround cells affect gene expression during incubation. Here, we compared the gene expression profiles of cell lines that were in contact with the collagen–glycosaminoglycan mesh and control cells. Principal component analysis-based unsupervised feature extraction was applied to identify genes with altered expression during incubation in the treated cell lines but not in the controls. The identified genes were enriched in various biological terms. Our method also outperformed a conventional methodology, namely, gene selection based on linear regression with time course.

