Manifold Learning for Visualizing and Analyzing High-dimensional Data

Author(s): Junping Zhang, Hua Huang, Jue Wang
Mathematics, 2021, Vol 9 (4), pp. 406
Author(s): Harold A. Hernández-Roig, M. Carmen Aguilera-Morillo, Rosa E. Lillo

This paper introduces stringing via Manifold Learning (ML-stringing), an alternative to the original stringing based on Unidimensional Scaling (UDS). Our proposal is framed within a wider class of methods that map high-dimensional observations to the infinite-dimensional space of functions, allowing the use of Functional Data Analysis (FDA). Stringing handles general high-dimensional data as scrambled realizations of an unknown stochastic process; the essential feature of the method is therefore a rearrangement of the observed values. Motivated by the linear nature of UDS and the increasing number of applications in the biosciences (e.g., functional modeling of gene expression arrays and single nucleotide polymorphisms, or the classification of neuroimages), we aim to recover more complex relations between predictors through ML. Simulation studies show that ML-stringing achieves higher-quality orderings and that, in general, this leads to improvements in the functional representation and modeling of the data. The versatility of our method is also illustrated with an application to a colon cancer study that deals with high-dimensional gene expression arrays. This paper shows that ML-stringing is a feasible alternative to the UDS-based version. It also opens the door to new contributions to the field of FDA and the study of high-dimensional data.
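The rearrangement at the heart of stringing can be sketched with a linear, UDS-like step: place the features on a one-dimensional scale via classical multidimensional scaling of their pairwise distances, then reorder them by that coordinate. This is a minimal numpy illustration, not the paper's implementation; ML-stringing would substitute a nonlinear manifold learner for the MDS step, and the function name `stringing_order` and the toy data are assumptions for the demo.

```python
import numpy as np

def stringing_order(X):
    """Order the p features (columns) of X along a 1-D embedding.

    Uses classical MDS on inter-feature distances -- a linear stand-in
    for UDS. ML-stringing would replace this embedding step with a
    nonlinear manifold learner restricted to one dimension.
    """
    # Squared Euclidean distances between feature columns
    G = X.T @ X
    sq = np.diag(G)
    D2 = sq[:, None] + sq[None, :] - 2.0 * G
    # Double-centre the squared distances (classical MDS)
    p = D2.shape[0]
    J = np.eye(p) - np.ones((p, p)) / p
    B = -0.5 * J @ D2 @ J
    vals, vecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    coord = vecs[:, -1]                      # leading 1-D MDS coordinate
    return np.argsort(coord)

# Toy demo: each feature is a multiple of its position on a line, the
# columns are scrambled, and stringing recovers the order (up to a flip).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 30)
slopes = rng.uniform(0.5, 1.5, size=50)
X = np.outer(slopes, t)                      # 50 samples x 30 features
perm = rng.permutation(30)                   # scramble the features
order = stringing_order(X[:, perm])
```

Applying `perm[order]` then yields the features back in their original smooth order, possibly reversed, since a 1-D embedding is only defined up to a flip.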


2021, Vol 7, pp. e477
Author(s): Amalia Villa, Abhijith Mundanad Narayanan, Sabine Van Huffel, Alexander Bertrand, Carolina Varon

Feature selection techniques are useful approaches for dimensionality reduction in data analysis. They provide interpretable results by reducing the data to a subset of the original set of features. When the data lack annotations, unsupervised feature selectors are required. Several algorithms for this aim exist in the literature, but despite their wide applicability they can be inaccessible or cumbersome to use, mainly because of the need to tune non-intuitive parameters and their high computational demands. In this work, a publicly available, ready-to-use unsupervised feature selector is proposed, with results comparable to the state-of-the-art at a much lower computational cost. The suggested approach belongs to the family of spectral feature selectors. These methods generally consist of two stages: manifold learning and subset selection. In the first stage, the underlying structures in the high-dimensional data are extracted, while in the second stage a subset of the features is selected to replicate these structures. This paper makes two contributions to this field, one for each stage. In the manifold learning stage, the effect of non-linearities in the data is explored using a radial basis function (RBF) kernel, for which an alternative estimator of the kernel parameter is presented for high-dimensional data. For the subset selection stage, a backward greedy approach based on the least-squares utility metric is proposed. The combination of these ingredients results in the Utility metric for Unsupervised Feature Selection (U2FS) algorithm. The proposed U2FS algorithm succeeds in selecting the correct features in a simulation environment. In addition, its performance on benchmark datasets is comparable to the state-of-the-art while requiring less computational time. Moreover, unlike the state-of-the-art, U2FS does not require any parameter tuning.
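The two-stage pipeline described above can be sketched roughly as follows. This is a hedged illustration, not the released U2FS code: the kernel width uses the common median-distance heuristic rather than the paper's estimator, and the backward greedy step naively recomputes the least-squares error instead of using the fast utility-metric updates; all function names and the toy data are invented for the demo.

```python
import numpy as np

def rbf_embedding(X, k=2):
    """Stage 1: spectral embedding of the samples with an RBF kernel.
    The kernel width is the median pairwise squared distance -- a common
    heuristic standing in for the paper's estimator."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    sigma2 = np.median(d2[d2 > 0])
    K = np.exp(-d2 / sigma2)
    L = np.diag(K.sum(axis=1)) - K           # unnormalised graph Laplacian
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, 1:k + 1]                  # skip the constant eigenvector

def backward_greedy_select(X, E, n_keep):
    """Stage 2: backward greedy selection. Repeatedly drop the feature
    whose removal least increases the least-squares error of
    reconstructing the embedding E from the remaining features."""
    def lsq_err(cols):
        A = np.column_stack([X[:, cols], np.ones(X.shape[0])])  # intercept
        resid = E - A @ np.linalg.lstsq(A, E, rcond=None)[0]
        return np.sum(resid ** 2)
    keep = list(range(X.shape[1]))
    while len(keep) > n_keep:
        errs = [lsq_err([c for c in keep if c != j]) for j in keep]
        # Naive recomputation for clarity; the least-squares utility
        # metric makes this update cheap in the actual method.
        keep.pop(int(np.argmin(errs)))
    return keep

# Toy demo: features 0 and 1 carry a four-cluster structure, the other
# four features are pure noise.
rng = np.random.default_rng(1)
centers = rng.integers(0, 2, size=(80, 2)) * 4.0
X = np.hstack([centers + 0.1 * rng.normal(size=(80, 2)),
               rng.normal(size=(80, 4))])
selected = backward_greedy_select(X, rbf_embedding(X), n_keep=2)
```

Because the Laplacian eigenvectors encode the cluster structure, which only the first two features can reconstruct linearly, the greedy pass discards the noise features first.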


2021, pp. 1-19
Author(s): Guo Niu, Zhengming Ma, Haoqing Chen, Xue Su

Manifold learning plays an important role in nonlinear dimensionality reduction, but many manifold learning algorithms offer no explicit expression for handling out-of-sample (new) data. Recently, many improved algorithms have introduced a fixed function into the objective function of manifold learning to learn such an expression. In manifold learning, however, the relationship between the high-dimensional data and its low-dimensional representation is a local homeomorphic mapping, so these improved algorithms actually change or damage the intrinsic structure of manifold learning and are no longer true manifold learning. In this paper, a novel polynomial-approximation-based manifold learning method (PAML) is proposed, which learns a polynomial approximation of manifold learning from the dimensionality-reduction results of manifold learning and the original high-dimensional data. In particular, we establish a polynomial representation of the high-dimensional data using the Kronecker product and learn an optimal transformation matrix for this representation. This matrix gives an explicit, optimal nonlinear mapping between the high-dimensional data and its low-dimensional representation, and can be applied directly to new data. Compared with substituting a fixed linear or nonlinear relationship for the manifold relationship, our method learns the optimal polynomial approximation of manifold learning without changing its objective function (i.e., it keeps the intrinsic structure of manifold learning). We conduct experiments on eight data sets against advanced algorithms published in recent years to demonstrate the benefits of our algorithm.
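The idea of learning an explicit out-of-sample mapping can be sketched as follows: build a degree-2 polynomial representation of each point via a Kronecker product, then fit a transformation matrix that maps this representation onto a given low-dimensional embedding. This is an assumption-laden toy, not PAML itself; the "embedding" here is a known quadratic of the inputs so that the learned mapping can be checked on unseen points, and both function names are invented for the demo.

```python
import numpy as np

def poly_features(X):
    """Degree-2 polynomial representation of each row x via the
    Kronecker product [1, x] (x) [1, x], i.e. all monomials up to
    degree 2 (with duplicates, which the least-squares fit tolerates)."""
    Z = np.hstack([np.ones((X.shape[0], 1)), X])
    return np.einsum('ni,nj->nij', Z, Z).reshape(X.shape[0], -1)

def learn_mapping(X, Y):
    """Fit a transformation matrix W with poly_features(X) @ W ~= Y,
    where Y is a precomputed low-dimensional embedding of X. W then
    maps new points without re-running manifold learning."""
    W, *_ = np.linalg.lstsq(poly_features(X), Y, rcond=None)
    return W

# Toy check: the "embedding" is a known quadratic of the inputs, so the
# learned mapping can be verified on points it never saw.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
Y = np.column_stack([X[:, 0] ** 2 - X[:, 1], X[:, 1] * X[:, 2]])
W = learn_mapping(X, Y)
X_new = rng.normal(size=(20, 3))
Y_new = poly_features(X_new) @ W             # out-of-sample mapping
```

In PAML the target Y would instead be the output of a manifold learning algorithm, so W approximates that (nonlinear) reduction rather than a hand-chosen quadratic.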


2009, Vol 35 (7), pp. 859-866
Author(s): Ming LIU, Xiao-Long WANG, Yuan-Chao LIU
