Comparative Study of Dimensionality Reduction Techniques for Spectral–Temporal Data

Information ◽  
2020 ◽  
Vol 12 (1) ◽  
pp. 1
Author(s):  
Shingchern D. You ◽  
Ming-Jen Hung

This paper studies the use of three different approaches to reduce the dimensionality of a type of spectral–temporal features, called Moving Picture Experts Group (MPEG)-7 audio signature descriptors (ASD). The studied approaches include principal component analysis (PCA), independent component analysis (ICA), and factor analysis (FA). These approaches are applied to ASD features obtained from audio items with or without distortion. The resulting low-dimensional features are used as queries against a dataset of low-dimensional features extracted from undistorted items, which allows the distortion resistance of each approach to be investigated. The experimental results show that features obtained with the ICA or FA reduction approaches yield higher identification accuracy than the PCA approach for moderately distorted items. Therefore, when extracting features from distorted items, ICA and FA should be considered in addition to PCA.
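As a rough illustration of how such a comparison can be set up (a minimal sketch with synthetic placeholder features, not the authors' MPEG-7 ASD pipeline), the three reductions can be fitted on undistorted reference features and distorted queries matched by nearest neighbour in the reduced space:

```python
# Minimal sketch: PCA, ICA, and FA reductions fitted on undistorted reference
# features, with mildly distorted copies used as queries. Feature matrices are
# synthetic placeholders, not actual MPEG-7 ASD descriptors.
import numpy as np
from sklearn.decomposition import PCA, FastICA, FactorAnalysis
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
reference = rng.normal(size=(500, 128))                        # undistorted item features
queries = reference + 0.1 * rng.normal(size=reference.shape)   # mildly distorted copies

for name, reducer in [("PCA", PCA(n_components=20)),
                      ("ICA", FastICA(n_components=20, max_iter=1000)),
                      ("FA", FactorAnalysis(n_components=20))]:
    low_ref = reducer.fit_transform(reference)   # fit on undistorted features
    low_qry = reducer.transform(queries)         # project distorted queries
    nn = NearestNeighbors(n_neighbors=1).fit(low_ref)
    _, idx = nn.kneighbors(low_qry)
    accuracy = np.mean(idx.ravel() == np.arange(len(queries)))
    print(f"{name}: identification accuracy = {accuracy:.3f}")
```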

2019 ◽  
Vol 8 (S3) ◽  
pp. 66-71
Author(s):  
T. Sudha ◽  
P. Nagendra Kumar

Data mining is one of the major areas of research, and clustering is one of its main functionalities. High dimensionality is one of the main issues in clustering, and dimensionality reduction can be used as a solution to this problem. The present work makes a comparative study of two dimensionality reduction techniques, t-distributed stochastic neighbour embedding and probabilistic principal component analysis, in the context of clustering. High-dimensional data have been reduced to low-dimensional data using each of these techniques, and cluster analysis has been performed on the high-dimensional data as well as on the two reduced datasets, with varying numbers of clusters. Mean squared error, time, and space have been considered as parameters for comparison. The results show that the time taken to reduce the high-dimensional data with probabilistic principal component analysis is higher than the time taken with t-distributed stochastic neighbour embedding, whereas the storage space required by the dataset reduced through probabilistic principal component analysis is less than that required by the dataset reduced through t-distributed stochastic neighbour embedding.
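A minimal sketch of this kind of comparison on synthetic data is given below; scikit-learn's TSNE class stands in for t-distributed stochastic neighbour embedding and its PCA class is used as a stand-in for probabilistic principal component analysis, with reduction time and storage recorded for each:

```python
# Sketch only: reduce synthetic high-dimensional data with two techniques,
# cluster the reduced data with k-means, and record time and storage.
import time
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
high_dim = rng.normal(size=(1000, 50))           # placeholder high-dimensional data

for name, reducer in [("t-SNE", TSNE(n_components=2, init="pca", random_state=0)),
                      ("PPCA (PCA stand-in)", PCA(n_components=2))]:
    start = time.perf_counter()
    low_dim = reducer.fit_transform(high_dim)
    elapsed = time.perf_counter() - start
    labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(low_dim)
    print(f"{name}: reduction time {elapsed:.2f}s, "
          f"storage {low_dim.nbytes} bytes, clusters found {len(set(labels))}")
```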


2021 ◽  
Vol 54 (4) ◽  
pp. 1-34
Author(s):  
Felipe L. Gewers ◽  
Gustavo R. Ferreira ◽  
Henrique F. De Arruda ◽  
Filipi N. Silva ◽  
Cesar H. Comin ◽  
...  

Principal component analysis (PCA) is often applied for analyzing data in the most diverse areas. This work reports, in an accessible and integrated manner, several theoretical and practical aspects of PCA. The basic principles underlying PCA, data standardization, possible visualizations of the PCA results, and outlier detection are addressed in turn. Next, the potential of using PCA for dimensionality reduction is illustrated on several real-world datasets. Finally, we summarize PCA-related approaches and other dimensionality reduction techniques. All in all, the objective of this work is to assist researchers from the most diverse areas in using and interpreting PCA.
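A compact sketch of the practical workflow the survey covers, standardization, projection, explained variance, and a simple distance-based outlier check, might look as follows (synthetic data and an illustrative threshold only):

```python
# Sketch of a basic PCA workflow: standardize, project, inspect explained
# variance, and flag points far from the origin in the projected space.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))                   # placeholder data matrix

X_std = StandardScaler().fit_transform(X)        # zero mean, unit variance per feature
pca = PCA(n_components=2).fit(X_std)
scores = pca.transform(X_std)

print("explained variance ratio:", pca.explained_variance_ratio_)
dist = np.linalg.norm(scores, axis=1)            # distance from origin in PC space
outliers = np.where(dist > dist.mean() + 3 * dist.std())[0]   # illustrative rule
print("candidate outliers:", outliers)
```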


2020 ◽  
Vol 0 (0) ◽  
Author(s):  
Alexandra-Maria Tăuţan ◽  
Alessandro C. Rossi ◽  
Ruben de Francisco ◽  
Bogdan Ionescu

Methods developed for automatic sleep stage detection make use of large amounts of data in the form of polysomnographic (PSG) recordings to build predictive models. In this study, we investigate the effect of several dimensionality reduction techniques, i.e., principal component analysis (PCA), factor analysis (FA), and autoencoders (AE), on common classifiers, e.g., random forests (RF), multilayer perceptron (MLP), and long short-term memory (LSTM) networks, for automated sleep stage detection. Experimental testing is carried out on the MGH Dataset provided in the “You Snooze, You Win: The PhysioNet/Computing in Cardiology Challenge 2018”. The signals used as input are the six available electroencephalographic (EEG) channels and their combinations with the other PSG signals provided: ECG (electrocardiogram), EMG (electromyogram), and respiration-based signals (respiratory effort and airflow). We observe that similar or improved accuracy is obtained in most cases when using any of the dimensionality reduction techniques, a promising result as it reduces the computational load while maintaining, and in some cases improving, the accuracy of automated sleep stage detection. In our study, using autoencoders for dimensionality reduction maintains the performance of the model, while PCA and FA improve the accuracy of the models in most cases.
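The general pattern, reducing epoch-level features before feeding a classifier, can be sketched as below; the feature matrix, labels, and component counts are placeholders, not the actual PSG features or the authors' configuration:

```python
# Sketch: apply PCA or FA before a random forest / MLP sleep-stage classifier.
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 120))                 # placeholder epoch-level features
y = rng.integers(0, 5, size=2000)                # 5 sleep stages (dummy labels)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for red_name, reducer in [("PCA", PCA(n_components=30)),
                          ("FA", FactorAnalysis(n_components=30))]:
    for clf_name, clf in [("RF", RandomForestClassifier(random_state=0)),
                          ("MLP", MLPClassifier(max_iter=500, random_state=0))]:
        model = make_pipeline(reducer, clf).fit(X_tr, y_tr)
        acc = accuracy_score(y_te, model.predict(X_te))
        print(f"{red_name} + {clf_name}: accuracy = {acc:.3f}")
```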


2020 ◽  
Author(s):  
Gregory Kiar ◽  
Yohan Chatelain ◽  
Ali Salari ◽  
Alan C. Evans ◽  
Tristan Glatard

Machine learning models are commonly applied to human brain imaging datasets in an effort to associate function or structure with behaviour, health, or other individual phenotypes. Such models often rely on low-dimensional maps generated by complex processing pipelines. However, the numerical instabilities inherent to pipelines limit the fidelity of these maps and introduce computational bias. Monte Carlo Arithmetic, a technique for introducing controlled amounts of numerical noise, was used to perturb a structural connectome estimation pipeline, ultimately producing a range of plausible networks for each sample. The variability in the perturbed networks was captured in an augmented dataset, which was then used for an age classification task. We found that resampling brain networks across a series of such numerically perturbed outcomes led to improved performance in all tested classifiers, preprocessing strategies, and dimensionality reduction techniques. Importantly, we find that this benefit does not hinge on a large number of perturbations, suggesting that even minimally perturbing a dataset adds meaningful variance which can be captured in the subsequently designed models.
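Conceptually, the augmentation step can be sketched as below; real Monte Carlo Arithmetic perturbs the pipeline's floating-point operations themselves, so the small random jitter here is only a stand-in used to show how perturbed copies of each training sample enlarge the dataset:

```python
# Conceptual sketch: augment a training set with perturbed copies of each
# sample (jitter standing in for pipeline-level numerical perturbation) and
# compare a classifier trained with and without the augmentation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 300))                  # placeholder connectome features
y = rng.integers(0, 2, size=200)                 # binary age group (dummy labels)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

n_perturbations = 5
X_aug = np.vstack([X_tr] + [X_tr + 1e-3 * rng.normal(size=X_tr.shape)
                            for _ in range(n_perturbations)])
y_aug = np.tile(y_tr, n_perturbations + 1)

baseline = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
augmented = RandomForestClassifier(random_state=0).fit(X_aug, y_aug)
print("baseline accuracy: ", accuracy_score(y_te, baseline.predict(X_te)))
print("augmented accuracy:", accuracy_score(y_te, augmented.predict(X_te)))
```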


2014 ◽  
Vol 578-579 ◽  
pp. 1020-1023
Author(s):  
Jing Zhou Lu ◽  
Jia Chen Wang ◽  
Xu Zhu

In this paper, we introduce a set of techniques for time series analysis based on principal component analysis (PCA). Firstly, an autoregressive (AR) model is established using acceleration response data, and the root mean squared error (RMSE) of the AR model is calculated based on PCA. Then a new damage-sensitive feature (DSF) based on the AR coefficients is presented. To test the efficacy of the damage detection and localization methodologies, the algorithm has been applied to analytical and experimental results for the three-story frame structure model of the Los Alamos National Laboratory. The damage detection results indicate that the algorithm is able to identify and localize minor to severe damage as defined for the structure, and that the suggested method requires less computing time while achieving high suitability and identification accuracy.
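A simplified sketch of the feature-extraction step (leaving out the PCA-based localization) is given below: an AR model is fitted to an acceleration response and the RMSE of its residuals is used as a damage-sensitive quantity; the signals are synthetic placeholders:

```python
# Sketch: AR-model residual RMSE as a simple damage-sensitive quantity,
# compared between a baseline and a (noisier) "damaged" response.
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

def ar_rmse(signal, order=10):
    """Fit an AR(order) model and return the RMSE of its residuals."""
    result = AutoReg(signal, lags=order).fit()
    return np.sqrt(np.mean(result.resid ** 2))

rng = np.random.default_rng(0)
baseline = np.sin(0.1 * np.arange(2000)) + 0.05 * rng.normal(size=2000)
damaged = np.sin(0.1 * np.arange(2000)) + 0.20 * rng.normal(size=2000)

print("baseline RMSE:", ar_rmse(baseline))
print("damaged RMSE: ", ar_rmse(damaged))
```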


Animals ◽  
2020 ◽  
Vol 10 (8) ◽  
pp. 1406
Author(s):  
Ronald M. Parra-Hernández ◽  
Jorge I. Posada-Quintero ◽  
Orlando Acevedo-Charry ◽  
Hugo F. Posada-Quintero

Vocalizations from birds are a fruitful source of information for the classification of species. However, currently used analyses are ineffective for determining the taxonomic status of some groups. To provide a clearer grouping of taxa for such bird species from the analysis of vocalizations, more sensitive techniques are required. In this study, we have evaluated the sensitivity of the Uniform Manifold Approximation and Projection (UMAP) technique for grouping the vocalizations of individuals of the Rough-legged Tyrannulet Phyllomyias burmeisteri complex. Although the existence of two taxonomic groups has been suggested by some studies, the species has presented taxonomic difficulties in classification. UMAP exhibited a clearer separation of groups than previously used dimensionality-reduction techniques (i.e., principal component analysis), as it was able to effectively identify the two taxa. The results achieved with UMAP in this study suggest that the technique can be useful for analyzing species with complex taxonomy through vocalization data, as a complementary tool that incorporates behavioral traits such as acoustic communication.
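A minimal sketch of such an embedding, using the umap-learn package on placeholder acoustic features rather than the actual vocalization measurements, is:

```python
# Sketch: project acoustic feature vectors into two dimensions with UMAP and,
# for comparison, with PCA.
import numpy as np
import umap  # pip install umap-learn
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
features = rng.normal(size=(150, 40))            # placeholder vocalization features

embedding_umap = umap.UMAP(n_components=2, random_state=0).fit_transform(features)
embedding_pca = PCA(n_components=2).fit_transform(features)
print("UMAP embedding shape:", embedding_umap.shape)
print("PCA embedding shape: ", embedding_pca.shape)
```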


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-13 ◽  
Author(s):  
Ireneusz Czarnowski ◽  
Piotr Jędrzejowicz

In the paper, several data reduction techniques for machine learning from big datasets are discussed and evaluated. The discussed approach focuses on combining several techniques, including stacking, rotation, and data reduction, aimed at improving the performance of machine classification. Stacking is used to take advantage of multiple classification models. The rotation-based techniques are used to increase the heterogeneity of the stacking ensembles. Data reduction makes it possible to classify instances belonging to big datasets. We propose to use an agent-based population learning algorithm for data reduction in the feature and instance dimensions. To diversify the classifier ensembles, either principal component analysis or, alternatively, independent component analysis is used within the rotation step. The research question addressed in the paper is formulated as follows: can the performance of a classifier using the reduced dataset be improved by integrating the data reduction mechanism with the rotation-based technique and stacking?
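One way to sketch the stacking-with-rotation idea (without the agent-based population learning algorithm for data reduction) is a scikit-learn stacking ensemble whose base learners see PCA- or ICA-rotated views of an already reduced dataset; the data here are synthetic:

```python
# Sketch: stacking ensemble with PCA- and ICA-rotated base learners and a
# logistic-regression meta-learner.
import numpy as np
from sklearn.decomposition import PCA, FastICA
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 40))                  # stand-in for a reduced big dataset
y = rng.integers(0, 2, size=1000)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

estimators = [
    ("pca_tree", make_pipeline(PCA(n_components=10),
                               DecisionTreeClassifier(random_state=0))),
    ("ica_tree", make_pipeline(FastICA(n_components=10, max_iter=1000),
                               DecisionTreeClassifier(random_state=0))),
]
stack = StackingClassifier(estimators=estimators,
                           final_estimator=LogisticRegression()).fit(X_tr, y_tr)
print("stacked accuracy:", accuracy_score(y_te, stack.predict(X_te)))
```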


2011 ◽  
Vol 341-342 ◽  
pp. 790-797 ◽  
Author(s):  
Zhi Yan Xiang ◽  
Tie Yong Cao ◽  
Peng Zhang ◽  
Tao Zhu ◽  
Jing Feng Pan

In this paper, an object tracking approach is introduced for color video sequences. The approach integrates color distributions and probabilistic principal component analysis (PPCA) into a particle filtering framework. Color distributions are robust to partial occlusion, are rotation and scale invariant, and can be calculated efficiently. Principal component analysis (PCA) is used to update the eigenbasis and the mean, which reflect the appearance changes of the tracked object, and the low-dimensional subspace representation of PPCA efficiently adapts to these changes in the appearance of the target object. At the same time, a forgetting factor is incorporated into the updating process, which can be used to economize on processing time and enhance the efficiency of object tracking. Computer simulation experiments demonstrate the effectiveness and robustness of the proposed tracking algorithm when the target object undergoes pose and scale changes, occlusion, and complex backgrounds.
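A toy sketch of the appearance-model update only (not the full particle-filter tracker, and with a weighting scheme chosen purely for illustration) is given below: a buffer of vectorized object patches is weighted by a forgetting factor, and the mean and eigenbasis are recomputed from the weighted data:

```python
# Toy sketch: forgetting-factor-weighted mean and eigenbasis of recent patches.
import numpy as np

def update_subspace(patches, n_components=5, forgetting=0.95):
    """patches: (n_frames, n_pixels), newest last. Returns (mean, eigenbasis)."""
    n = len(patches)
    weights = forgetting ** np.arange(n - 1, -1, -1)     # newest frame weighted 1.0
    weights = weights / weights.sum()
    mean = weights @ patches
    centered = patches - mean
    cov = (centered * weights[:, None]).T @ centered     # weighted covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    basis = eigvecs[:, ::-1][:, :n_components]           # top components
    return mean, basis

rng = np.random.default_rng(0)
patch_buffer = rng.normal(size=(30, 32 * 32))            # 30 frames of 32x32 patches
mean, basis = update_subspace(patch_buffer)
coeffs = (patch_buffer[-1] - mean) @ basis               # low-dimensional representation
print("subspace coefficients of latest patch:", coeffs)
```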


2014 ◽  
Vol 571-572 ◽  
pp. 753-756
Author(s):  
Wei Li Li ◽  
Xiao Qing Yin ◽  
Bin Wang ◽  
Mao Jun Zhang ◽  
Ke Tan

Denoising is an important issue for laser active images. This paper attempts to process laser active images in a low-dimensional subspace. We adopted the principal component analysis with local pixel grouping (LPG-PCA) denoising method proposed by Zhang [1] and compared it with conventional denoising methods for laser active images, such as wavelet filtering, wavelet soft-threshold filtering, and median filtering. Experimental results show that the image denoised by LPG-PCA has a higher BIQI value than the other images; most of the speckle noise is removed and the detailed structure information is well preserved. The low-dimensional subspace idea is a new direction for laser active image denoising.
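A greatly simplified, single-block illustration of the LPG-PCA idea (not the full two-stage method of Zhang [1]) is sketched below: the most similar blocks in a local window are grouped, PCA is applied to the group, weak components are discarded, and the block is reconstructed; all sizes and thresholds are illustrative:

```python
# Sketch: denoise one image block via local pixel grouping and PCA truncation.
import numpy as np
from sklearn.decomposition import PCA

def denoise_block(image, row, col, block=5, window=15, n_keep=4, n_similar=20):
    half_w = window // 2
    target = image[row:row + block, col:col + block].ravel()
    candidates = []
    for r in range(max(0, row - half_w), min(image.shape[0] - block, row + half_w)):
        for c in range(max(0, col - half_w), min(image.shape[1] - block, col + half_w)):
            candidates.append(image[r:r + block, c:c + block].ravel())
    candidates = np.array(candidates)
    dists = np.linalg.norm(candidates - target, axis=1)
    group = candidates[np.argsort(dists)[:n_similar]]     # local pixel grouping
    pca = PCA(n_components=n_keep).fit(group)
    denoised_group = pca.inverse_transform(pca.transform(group))
    return denoised_group[0].reshape(block, block)        # closest block is the target

rng = np.random.default_rng(0)
noisy = np.clip(rng.normal(0.5, 0.1, size=(64, 64)), 0, 1)  # placeholder noisy image
print(denoise_block(noisy, 30, 30))
```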

