Gene selection and cancer classification using Monte Carlo and nonnegative matrix factorization

RSC Advances ◽  
2016 ◽  
Vol 6 (46) ◽  
pp. 39652-39656 ◽  
Author(s):  
Jing Chen ◽  
Qin Ma ◽  
Xiaoyan Hu ◽  
Miao Zhang ◽  
Dongdong Qin ◽  
...  

Cancer classification is a key problem for identifying the genomic biomarkers and treating cancerous tumors in clinical research.

2020 ◽  
Vol 34 (04) ◽  
pp. 5420-5427
Author(s):  
Qiao Maoying ◽  
Yu Jun ◽  
Liu Tongliang ◽  
Wang Xinchao ◽  
Tao Dacheng

Nonnegative matrix factorization (NMF) has been widely employed in a variety of scenarios because it can induce semantic, part-based representations. However, because its objective is non-convex, the factorization is generally not unique and may fail to recover the intrinsic “parts” of the data. In this paper, we approach this issue with a Bayesian framework. We assign a diversity prior to the parts of the factorization, on the assumption that useful parts should be distinct and thus well spread. The resulting framework favors factorizations that combine a good fit to the data, from maximizing the likelihood, with large separability, from the diversity prior. Specifically, the diversity prior is formulated with determinantal point processes (DPP) and embedded into a Bayesian NMF framework. To carry out inference, a Markov chain Monte Carlo (MCMC) procedure is derived. Experiments on a synthetic dataset and a real-world MULAN dataset for the multi-label learning (MLL) task demonstrate the superiority of the proposed method.
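
As a rough illustration of why a determinantal prior rewards well-spread parts, the sketch below scores a basis matrix by the log-determinant of a similarity kernel built from its columns. The cosine-similarity kernel, the function name `dpp_log_diversity`, and the toy matrices are assumptions made for this example; the abstract does not specify the kernel used in the paper, and the MCMC inference step is not shown.

```python
import numpy as np

def dpp_log_diversity(W, eps=1e-10):
    """Unnormalized DPP log-score, log det(L), for the columns ("parts") of W.

    Assumption: L is the cosine-similarity Gram matrix of the columns,
    one common kernel choice and not necessarily the one used in the paper.
    """
    W = np.asarray(W, dtype=float)
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    Wn = W / np.maximum(norms, eps)      # scale every column to unit length
    L = Wn.T @ Wn                        # k x k similarity kernel
    L += eps * np.eye(L.shape[0])        # small jitter keeps L positive definite
    _, logdet = np.linalg.slogdet(L)
    return logdet

# Two hypothetical nonnegative bases with three parts (columns) each.
spread = np.eye(3)                       # mutually orthogonal parts
clumped = np.array([[1.0, 0.9, 0.8],
                    [0.1, 0.2, 0.3],
                    [0.1, 0.1, 0.3]])    # nearly parallel parts

score_spread = dpp_log_diversity(spread)
score_clumped = dpp_log_diversity(clumped)
print(score_spread, score_clumped)
```

Orthogonal parts give a kernel close to the identity, so the score is close to zero, its maximum for unit-norm columns. Overlapping parts shrink the determinant and push the score toward large negative values, which is the behaviour a diversity prior penalizes.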


Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-12 ◽  
Author(s):  
Yong-Jing Hao ◽  
Ying-Lian Gao ◽  
Mi-Xiao Hou ◽  
Ling-Yun Dai ◽  
Jin-Xing Liu

Nonnegative Matrix Factorization (NMF) is an important technique for big data analysis. However, standard NMF regularized by a simple graph is not discriminative, and traditional graph models cannot accurately capture the multiple geometric relationships among the data. To address these problems, this paper proposes a new method called Hypergraph Regularized Discriminative Nonnegative Matrix Factorization (HDNMF), which captures intrinsic geometry by constructing hypergraphs rather than simple graphs. The hypergraph allows high-order relationships between samples to be considered, and the label information makes the method discriminative. The hypergraph Laplacian and the discriminative label information are used together to learn the projection matrix in the standard method. In addition, we give a corresponding multiplicative update solution for the optimization. Experiments indicate that the proposed method is more effective than earlier methods.
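
The hypergraph-regularized part of such a model can be sketched with the well-known graph-regularized NMF multiplicative updates, using an unnormalized hypergraph Laplacian in place of a simple-graph Laplacian. This is a minimal sketch under stated assumptions, not the HDNMF algorithm itself: the discriminative label term is omitted, and the function names, the weight `lam`, and the toy incidence matrix are all hypothetical.

```python
import numpy as np

def hypergraph_laplacian(H, w=None):
    """Return (S, Dv) such that L = Dv - S, for an incidence matrix H
    of shape (n_samples, n_hyperedges). Every hyperedge must be non-empty."""
    H = np.asarray(H, dtype=float)
    w = np.ones(H.shape[1]) if w is None else np.asarray(w, dtype=float)
    de = H.sum(axis=0)                   # hyperedge degrees
    S = (H * (w / de)) @ H.T             # H W De^{-1} H^T, nonnegative affinity
    Dv = np.diag(H @ w)                  # vertex degrees on the diagonal
    return S, Dv

def objective(X, U, V, S, Dv, lam):
    """Squared reconstruction error plus the Laplacian smoothness penalty."""
    fit = np.linalg.norm(X - U @ V.T) ** 2
    smooth = np.trace(V.T @ (Dv - S) @ V)
    return fit + lam * smooth

def hypergraph_nmf(X, H, k, lam=0.1, n_iter=200, seed=0, eps=1e-10):
    """Factor X (features x samples) as U @ V.T with nonnegative U and V."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k)) + 0.1
    V = rng.random((n, k)) + 0.1
    S, Dv = hypergraph_laplacian(H)
    history = [objective(X, U, V, S, Dv, lam)]
    for _ in range(n_iter):
        # Multiplicative updates keep U and V nonnegative by construction.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (S @ V)) / (V @ (U.T @ U) + lam * (Dv @ V) + eps)
        history.append(objective(X, U, V, S, Dv, lam))
    return U, V, history

# Toy data: 6 features, 8 samples, 3 hyperedges that group related samples.
rng = np.random.default_rng(1)
X = rng.random((6, 8))
H = np.zeros((8, 3))
H[[0, 1, 2], 0] = 1
H[[3, 4, 5], 1] = 1
H[[5, 6, 7], 2] = 1

U, V, history = hypergraph_nmf(X, H, k=2)
S, Dv = hypergraph_laplacian(H)
print(history[0], history[-1])
```

Because a hyperedge can contain more than two samples, a single column of `H` ties a whole group together, which is how the regularizer encodes high-order relationships that a pairwise graph cannot.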


Complexity ◽  
2017 ◽  
Vol 2017 ◽  
pp. 1-11 ◽  
Author(s):  
Ling-Yun Dai ◽  
Chun-Mei Feng ◽  
Jin-Xing Liu ◽  
Chun-Hou Zheng ◽  
Jiguo Yu ◽  
...  

Differential expression plays an important role in cancer diagnosis and classification. In recent years, many methods have been used to identify differentially expressed genes. However, the recognition rate and reliability of gene selection still need to be improved. In this paper, a novel constrained method named robust nonnegative matrix factorization via joint graph Laplacian and discriminative information (GLD-RNMF) is proposed for identifying differentially expressed genes, in which manifold learning and discriminative label information are incorporated into the traditional nonnegative matrix factorization model to train the objective matrix. Specifically, L2,1-norm minimization is enforced on both the error function and the regularization term, which makes the model robust to outliers and noise in gene data. Furthermore, the multiplicative update rules and the details of the convergence proof are given for the new model. The experimental results on two publicly available cancer datasets demonstrate that GLD-RNMF is an effective method for identifying differentially expressed genes.
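
To see why an L2,1-norm error term is less sensitive to outliers than the usual squared Frobenius error, the short sketch below compares the two on a toy residual matrix. It assumes the row-wise convention (the sum of the Euclidean norms of the rows); some papers sum over columns instead, and the abstract does not say which one GLD-RNMF uses. The function names and numbers are illustrative only.

```python
import numpy as np

def l21_norm(A):
    """Sum of the Euclidean norms of the rows of A (assumed row-wise convention)."""
    A = np.asarray(A, dtype=float)
    return float(np.sum(np.linalg.norm(A, axis=1)))

def squared_frobenius(A):
    """Sum of squared entries, the error term of standard NMF."""
    A = np.asarray(A, dtype=float)
    return float(np.sum(A ** 2))

# Hypothetical residual matrix: one row per observation.
residual = np.array([[3.0, 4.0],
                     [0.0, 0.0],
                     [0.6, 0.8]])

# Same residuals, except that the second row is now a gross outlier.
with_outlier = residual.copy()
with_outlier[1] = [30.0, 40.0]

clean_l21 = l21_norm(residual)                # 5 + 0 + 1
noisy_l21 = l21_norm(with_outlier)            # 5 + 50 + 1
clean_fro = squared_frobenius(residual)       # 25 + 0 + 1
noisy_fro = squared_frobenius(with_outlier)   # 25 + 2500 + 1
print(clean_l21, noisy_l21, clean_fro, noisy_fro)
```

The outlier row adds its norm (50) to the L2,1 error but its squared norm (2500) to the Frobenius error, so a single corrupted observation dominates the squared loss far more strongly.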

