scholarly journals Utility-Aware Graph Dimensionality Reduction Approach

10.29007/2p7l ◽  
2020 ◽  
Author(s):  
Lamyaa Al-Omairi ◽  
Jemal Abawajy ◽  
Morshed Chowdhury

In recent years graphs with massive nodes and edges have become widely used in various application fields, for example, social networks, web mining, traffic on transport, and more. Several researchers have shown that reducing the dimensions is very important in analyzing extensive graph data. They applied a variety of dimensionality reduction strategies, including linear methods or nonlinear methods. However, it is still not clear to what extent the information is lost or preserved when these techniques are applied to reduce the dimensions of large networks. In this study, we measured the utility of graph dimensionality reduction, and we proved when using the very recently suggested method, which is HDR to reduce dimensional for graph, the utility loss will be small compared with popular linear techniques, such as PCA, LDA, FA, and MDS. We measured the utility based on three essential network metrics: Average Clustering Coefficient (ACC), Average Path Length (APL), and Average Betweenness (ABW). The results showed that HDR achieved a lower rate of utility loss compared to other dimensionality reduction methods. We performed our experiments on the three undirected and unweighted graph datasets.

10.29007/h232 ◽  
2019 ◽  
Author(s):  
Lamyaa Al-Omairi ◽  
Jemal Abawajy ◽  
Morshed Chowdhury ◽  
Tahsien Al-Quraishi

In recent years, graph data analysis has become very important in modeling data distribution or structure in many applications, for example, social science, astronomy, computational biology or social networks with a massive number of nodes and edges. However, high-dimensionality of the graph data remains a difficult task, mainly because the analysis system is not used to dealing with large graph data. Therefore, graph-based dimensionality reduction approaches have been widely used in many machine learning and pattern recognition applications. This paper offers a novel dimensionality reduction approach based on the recent graph data. In particular, we focus on combining two linear methods: Neighborhood Preserving Embedding (NPE) method with the aim of preserving the local neighborhood information of a given dataset, and Principal Component Analysis (PCA) method with aims of maximizing the mutual information between the original high-dimensional data sets. The combination of NPE and PCA contributes to proposing a new Hybrid dimensionality reduction technique (HDR). We propose HDR to create a transformation matrix, based on formulating a generalized eigenvalue problem and solving it with Rayleigh Quotient solution. Consequently, therefore, a massive reduction is achieved compared to the use of PCA and NPE separately. We compared the results with the conventional PCA, NPE, and other linear dimension reduction methods. The proposed method HDR was found to perform better than other techniques. Experimental results have been based on two real datasets.


2013 ◽  
Vol 38 (4) ◽  
pp. 465-470 ◽  
Author(s):  
Jingjie Yan ◽  
Xiaolan Wang ◽  
Weiyi Gu ◽  
LiLi Ma

Abstract Speech emotion recognition is deemed to be a meaningful and intractable issue among a number of do- mains comprising sentiment analysis, computer science, pedagogy, and so on. In this study, we investigate speech emotion recognition based on sparse partial least squares regression (SPLSR) approach in depth. We make use of the sparse partial least squares regression method to implement the feature selection and dimensionality reduction on the whole acquired speech emotion features. By the means of exploiting the SPLSR method, the component parts of those redundant and meaningless speech emotion features are lessened to zero while those serviceable and informative speech emotion features are maintained and selected to the following classification step. A number of tests on Berlin database reveal that the recogni- tion rate of the SPLSR method can reach up to 79.23% and is superior to other compared dimensionality reduction methods.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Joshua T. Vogelstein ◽  
Eric W. Bridgeford ◽  
Minh Tang ◽  
Da Zheng ◽  
Christopher Douville ◽  
...  

AbstractTo solve key biomedical problems, experimentalists now routinely measure millions or billions of features (dimensions) per sample, with the hope that data science techniques will be able to build accurate data-driven inferences. Because sample sizes are typically orders of magnitude smaller than the dimensionality of these data, valid inferences require finding a low-dimensional representation that preserves the discriminating information (e.g., whether the individual suffers from a particular disease). There is a lack of interpretable supervised dimensionality reduction methods that scale to millions of dimensions with strong statistical theoretical guarantees. We introduce an approach to extending principal components analysis by incorporating class-conditional moment estimates into the low-dimensional projection. The simplest version, Linear Optimal Low-rank projection, incorporates the class-conditional means. We prove, and substantiate with both synthetic and real data benchmarks, that Linear Optimal Low-Rank Projection and its generalizations lead to improved data representations for subsequent classification, while maintaining computational efficiency and scalability. Using multiple brain imaging datasets consisting of more than 150 million features, and several genomics datasets with more than 500,000 features, Linear Optimal Low-Rank Projection outperforms other scalable linear dimensionality reduction techniques in terms of accuracy, while only requiring a few minutes on a standard desktop computer.


2010 ◽  
Vol 31 (12) ◽  
pp. 1720-1727 ◽  
Author(s):  
Michał Lewandowski ◽  
Dimitrios Makris ◽  
Jean-Christophe Nebel

2013 ◽  
Vol 27 (3-4) ◽  
pp. 281-301 ◽  
Author(s):  
Jesús González-Rubio ◽  
J. Ramón Navarro-Cerdán ◽  
Francisco Casacuberta

2020 ◽  
Vol 49 (3) ◽  
pp. 421-437
Author(s):  
Genggeng Liu ◽  
Lin Xie ◽  
Chi-Hua Chen

Dimensionality reduction plays an important role in the data processing of machine learning and data mining, which makes the processing of high-dimensional data more efficient. Dimensionality reduction can extract the low-dimensional feature representation of high-dimensional data, and an effective dimensionality reduction method can not only extract most of the useful information of the original data, but also realize the function of removing useless noise. The dimensionality reduction methods can be applied to all types of data, especially image data. Although the supervised learning method has achieved good results in the application of dimensionality reduction, its performance depends on the number of labeled training samples. With the growing of information from internet, marking the data requires more resources and is more difficult. Therefore, using unsupervised learning to learn the feature of data has extremely important research value. In this paper, an unsupervised multilayered variational auto-encoder model is studied in the text data, so that the high-dimensional feature to the low-dimensional feature becomes efficient and the low-dimensional feature can retain mainly information as much as possible. Low-dimensional feature obtained by different dimensionality reduction methods are used to compare with the dimensionality reduction results of variational auto-encoder (VAE), and the method can be significantly improved over other comparison methods.


Sign in / Sign up

Export Citation Format

Share Document