On Simultaneous Data-Based Dimension Reduction and Hidden Phase Identification

2008 ◽  
Vol 65 (6) ◽  
pp. 1941-1954 ◽  
Author(s):  
Illia Horenko

Abstract A problem of simultaneous dimension reduction and identification of hidden attractive manifolds in multidimensional data with noise is considered. The problem is approached in two consecutive steps: (i) embedding the original data in a sufficiently high-dimensional extended space in a way proposed by Takens in his embedding theorem, followed by (ii) a minimization of the residual functional. The residual functional is constructed to measure the distance between the original data in extended space and their reconstruction based on a low-dimensional description. The reduced representation of the analyzed data results from projection onto a fixed number of unknown low-dimensional manifolds. Two specific forms of the residual functional are proposed, defining two different types of essential coordinates: (i) localized essential orthogonal functions (EOFs) and (ii) localized functions called principal original components (POCs). The application of the framework is exemplified both on a Lorenz attractor model with measurement noise and on historical air temperature data. It is demonstrated how the new method can be used for the elimination of noise and identification of the seasonal low-frequency components in meteorological data. An application of the proposed POCs to the construction of low-dimensional predictive models is presented.
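The Takens-style embedding in step (i) can be sketched as follows; the lag, embedding dimension, and the toy noisy signal are illustrative assumptions, not values from the paper:

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Embed a scalar time series into `dim`-dimensional delay coordinates
    with lag `tau` (Takens-style embedding)."""
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

# toy signal: noisy sine standing in for a measured observable
t = np.linspace(0, 20 * np.pi, 2000)
x = np.sin(t) + 0.05 * np.random.default_rng(0).normal(size=t.size)
X = delay_embed(x, dim=3, tau=25)   # (1950, 3) point cloud in extended space
```

The residual minimization of step (ii) would then operate on the rows of `X`, comparing them against their reconstruction from a low-dimensional projection.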

2013 ◽  
Vol 380-384 ◽  
pp. 4035-4038 ◽  
Author(s):  
Nan Yao ◽  
Feng Qian ◽  
Zuo Lei Sun

Dimensionality reduction (DR) of image features plays an important role in image retrieval and classification tasks. Recently, two types of methods have been proposed to improve both the accuracy and efficiency of dimensionality reduction. One uses non-negative matrix factorization (NMF) to describe the image distribution in the space of the basis matrix. The other trains a subspace projection matrix with a deep architecture to project the original data space into low-dimensional subspaces, so that low-dimensional codes can be learned. In parallel, graph-based similarity learning algorithms, which exploit contextual information to improve the effectiveness of image rankings, have been proposed for image classification and retrieval. In this paper, after the two methods mentioned above are applied to reduce the high-dimensional image features respectively, we learn the graph-based similarity for the image classification problem. This paper compares the proposed approach with other approaches on an image database.
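A minimal sketch of the NMF route, using standard Lee-Seung multiplicative updates on a random non-negative matrix; the update rule, rank, and data here are generic choices, not the authors' configuration:

```python
import numpy as np

def nmf(V, r, iters=200, seed=0):
    """Basic NMF via multiplicative updates: V ~ W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + 1e-3
    H = rng.random((r, n)) + 1e-3
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)   # update codes
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)   # update basis matrix
    return W, H

V = np.abs(np.random.default_rng(1).random((50, 30)))   # 50 "image features"
W, H = nmf(V, r=5)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)     # relative fit error
```

Each column of `H` is a low-dimensional code for the corresponding column of `V` in the space spanned by the basis matrix `W`.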


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Chuanlei Zhang ◽  
Jiangtao Liu ◽  
Wei Chen ◽  
Jinyuan Shi ◽  
Minda Yao ◽  
...  

The unsupervised anomaly detection task based on high-dimensional or multidimensional data occupies a very important position in the field of machine learning and industrial applications; in network security in particular, the anomaly detection of network data is especially important. The key to anomaly detection is density estimation. Although methods of dimension reduction and density estimation have made great progress in recent years, most dimension reduction methods struggle to retain the key information of the original data or multidimensional data. Recent studies have shown that the deep autoencoder (DAE) can solve this problem well. In order to improve the performance of unsupervised anomaly detection, we propose an anomaly detection scheme based on a deep autoencoder (DAE) and clustering methods. The deep autoencoder is trained to learn a compressed representation of the input data, which is then fed to a clustering approach. This scheme makes full use of the advantages of the DAE to generate low-dimensional representations and reconstruction errors for the input high-dimensional or multidimensional data and uses them to reconstruct the input samples. The proposed scheme can eliminate redundant information contained in the data, improve the performance of clustering methods in identifying abnormal samples, and reduce the amount of calculation. To verify the effectiveness of the proposed scheme, extensive comparison experiments have been conducted with traditional dimension reduction algorithms and clustering methods. The results demonstrate that, in most cases, the proposed scheme outperforms the traditional dimension reduction algorithms combined with different clustering methods.
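The pipeline can be illustrated with a linear autoencoder computed in closed form via SVD standing in for the trained DAE; the synthetic data, code dimension, and flagging rule below are assumptions for illustration, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
# inliers lie near a 2-D subspace of 10-D space; anomalies do not
inliers = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 10))
outliers = rng.normal(size=(10, 10)) * 3.0
X = np.vstack([inliers, outliers])

# linear encoder/decoder via SVD -- a closed-form stand-in for the DAE
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
Z = Xc @ Vt[:k].T                           # low-dimensional codes
recon = Z @ Vt[:k]                          # reconstruction from the codes
err = np.linalg.norm(Xc - recon, axis=1)    # per-sample reconstruction error

# codes and errors would feed a clustering step; here a simple
# percentile threshold on the error flags candidate anomalies
flagged = err > np.percentile(err, 95)
```

The key point mirrored from the abstract: both the low-dimensional representation `Z` and the reconstruction error `err` carry information, and off-manifold samples stand out in the latter.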


2021 ◽  
Vol 13 (6) ◽  
pp. 1098
Author(s):  
Egor Prikaziuk ◽  
Peiqi Yang ◽  
Christiaan van der Tol

In this study, we demonstrate that the Google Earth Engine (GEE) dataset of Sentinel-3 Ocean and Land Color Instrument (OLCI) level-1 deviates from the original Copernicus Open Access Data Hub Service (DHUS) data by 10–20 W m−2 sr−1 μm−1 per pixel per band. We compared GEE and DHUS single pixel time series for the period from April 2016 to September 2020 and identified two sources of this discrepancy: the ground pixel position and reprojection. The ground pixel position of the OLCI product can be determined in two ways: from geo-coordinates (DHUS) or from tie-point coordinates (GEE). We recommend using geo-coordinates for pixel extraction from the original data. When the Sentinel Application Platform (SNAP) Pixel Extraction Tool is used, an additional distance check has to be conducted to exclude pixels that lie further than 212 m from the point of interest. Even geo-coordinates-based pixel extraction requires the homogeneity of the target area at a 700 m diameter (49 ha) footprint (double the pixel resolution). The GEE OLCI dataset can be safely used if the homogeneity assumption holds at 2700 m diameter (9-by-9 OLCI pixels) or if an uncertainty in the radiance of 10% is not critical for the application. Further analysis showed that the scaling factors reported in the GEE dataset description must not be used. Finally, observation geometry and meteorological data are not present in the GEE OLCI dataset, but they are crucial for most applications. Therefore, we propose to calculate angles and extraterrestrial solar fluxes and to use an alternative data source, the Copernicus Atmosphere Monitoring Service (CAMS) dataset, for meteorological data.
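The recommended 212 m distance check can be sketched with a plain haversine distance; the coordinates below are hypothetical:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    R = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))

def keep_pixel(poi, pixel, max_dist_m=212.0):
    """Distance check to apply after SNAP Pixel Extraction Tool output:
    drop extracted pixels lying further than max_dist_m from the point."""
    return haversine_m(*poi, *pixel) <= max_dist_m

poi = (52.0, 4.0)                 # hypothetical point of interest
near = keep_pixel(poi, (52.0, 4.0022))   # ~150 m away -> kept
far = keep_pixel(poi, (52.0, 4.0044))    # ~300 m away -> excluded
```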


2020 ◽  
Vol 49 (3) ◽  
pp. 421-437
Author(s):  
Genggeng Liu ◽  
Lin Xie ◽  
Chi-Hua Chen

Dimensionality reduction plays an important role in the data processing of machine learning and data mining, making the processing of high-dimensional data more efficient. Dimensionality reduction can extract a low-dimensional feature representation of high-dimensional data, and an effective dimensionality reduction method can not only extract most of the useful information of the original data, but also remove useless noise. Dimensionality reduction methods can be applied to all types of data, especially image data. Although supervised learning methods have achieved good results in dimensionality reduction, their performance depends on the number of labeled training samples. With the growth of information from the internet, labeling data requires more resources and becomes more difficult. Therefore, using unsupervised learning to learn the features of data has extremely important research value. In this paper, an unsupervised multilayered variational auto-encoder model is studied on text data, so that the mapping from high-dimensional features to low-dimensional features becomes efficient and the low-dimensional features retain as much of the main information as possible. Low-dimensional features obtained by different dimensionality reduction methods are compared with the dimensionality reduction results of the variational auto-encoder (VAE), and the method shows significant improvement over the other comparison methods.
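Two core pieces of any VAE, the reparameterization trick and the KL term of the objective, can be sketched as follows; the shapes and zero-valued parameters are illustrative, not the paper's multilayered architecture:

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """VAE reparameterization: z = mu + sigma * eps with eps ~ N(0, I),
    which keeps sampling differentiable w.r.t. mu and log_var."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_divergence(mu, log_var):
    """Per-sample KL(q(z|x) || N(0, I)) term of the VAE objective."""
    return -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=1)

rng = np.random.default_rng(0)
mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))
z = reparameterize(mu, log_var, rng)       # low-dimensional latent codes
kl = kl_divergence(mu, log_var)            # zero when q matches the prior
```

In a full model, `mu` and `log_var` are produced by the encoder network, `z` feeds the decoder, and the loss adds the reconstruction error to `kl`.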


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Taylor Miller ◽  
Keval Patel ◽  
Coralis Rodriguez ◽  
Eric V. Stabb ◽  
Stephen J. Hagen

Abstract Many pheromone sensing bacteria produce and detect more than one chemically distinct signal, or autoinducer. The pathways that detect these signals are typically noisy and interlocked through crosstalk and feedback. As a result, the sensing response of individual cells is described by statistical distributions that change under different combinations of signal inputs. Here we examine how signal crosstalk reshapes this response. We measure how combinations of two homoserine lactone (HSL) input signals alter the statistical distributions of individual cell responses in the AinS/R- and LuxI/R-controlled branches of the Vibrio fischeri bioluminescence pathway. We find that, while the distributions of pathway activation in individual cells vary in complex fashion with environmental conditions, these changes have a low-dimensional representation. For both the AinS/R and LuxI/R branches, the distribution of individual cell responses to mixtures of the two HSLs is effectively one-dimensional, so that a single tuning parameter can capture the full range of variability in the distributions. Combinations of crosstalking HSL signals extend the range of responses for each branch of the circuit, so that signals in combination allow population-wide distributions that are not available under a single HSL input. Dimension reduction also simplifies the problem of identifying the HSL conditions to which the pathways and their outputs are most sensitive. A comparison of the maximum sensitivity HSL conditions to actual HSL levels measured during culture growth indicates that the AinS/R and LuxI/R branches lack sensitivity to population density except during the very earliest and latest stages of growth, respectively.


Geophysics ◽  
2009 ◽  
Vol 74 (6) ◽  
pp. V123-V132 ◽  
Author(s):  
Daniel Trad

Although 3D seismic data are being acquired in larger volumes than ever before, the spatial sampling of these volumes is not always adequate for certain seismic processes. This is especially true of marine and land wide-azimuth acquisitions, leading to the development of multidimensional data interpolation techniques. Simultaneous interpolation in all five seismic data dimensions (inline, crossline, offset, azimuth, and frequency) has great utility in predicting missing data with correct amplitude and phase variations. Although there are many techniques that can be implemented in five dimensions, this study focused on sparse Fourier reconstruction. The success of Fourier interpolation methods depends largely on two factors: (1) having efficient Fourier transform operators that permit the use of large multidimensional data windows and (2) constraining the spatial spectrum along dimensions where seismic amplitudes change slowly so that the sparseness and band limitation assumptions remain valid. Fourier reconstruction can be performed when enforcing a sparseness constraint on the 4D spatial spectrum obtained from frequency slices of five-dimensional windows. Binning spatial positions into a fine 4D grid facilitates the use of the FFT, which helps the convergence of the inversion algorithm and improves both the results and computational efficiency. The 5D interpolation can successfully interpolate sparse data, improve AVO analysis, and reduce migration artifacts. Target geometries for optimal interpolation and regularization of land data can be classified in terms of whether they preserve the original data and whether they are designed to achieve surface or subsurface consistency.
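The sparseness-constrained Fourier reconstruction can be illustrated in one dimension with a POCS-style iteration: threshold the spectrum, then re-insert the known samples. The toy signal, decimation rate, and threshold are assumptions for illustration, not a 5D implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
k = np.arange(n)
x = np.cos(2 * np.pi * 5 * k / n) + 0.5 * np.sin(2 * np.pi * 12 * k / n)
mask = rng.random(n) > 0.5           # roughly half the traces are missing
y = x * mask                         # decimated data

est = y.copy()
for _ in range(50):
    spec = np.fft.fft(est)
    keep = np.abs(spec) >= 0.2 * np.abs(spec).max()   # sparseness constraint
    est = np.real(np.fft.ifft(spec * keep))           # band-limited estimate
    est[mask] = y[mask]              # data-consistency: restore known samples

err = np.linalg.norm(est - x) / np.linalg.norm(x)     # relative recovery error
```

The same two ingredients, a fast transform on a regular grid and a sparsity constraint on the spectrum, carry over to the 4D spatial spectrum of binned 5D windows.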


2019 ◽  
Vol 30 (3) ◽  
pp. 559-570
Author(s):  
Jukka Sirén ◽  
Samuel Kaski

Abstract Approximate Bayesian computation (ABC) and other likelihood-free inference methods have gained popularity in the last decade, as they allow rigorous statistical inference for complex models without analytically tractable likelihood functions. A key component for accurate inference with ABC is the choice of summary statistics, which summarize the information in the data, but at the same time should be low-dimensional for efficiency. Several dimension reduction techniques have been introduced to automatically construct informative and low-dimensional summaries from a possibly large pool of candidate summaries. Projection-based methods, which are based on learning simple functional relationships from the summaries to parameters, are widely used and usually perform well, but might fail when the assumptions behind the transformation are not satisfied. We introduce a localization strategy for any projection-based dimension reduction method, in which the transformation is estimated in the neighborhood of the observed data instead of the whole space. Localization strategies have been suggested before, but the performance of the transformed summaries outside the local neighborhood has not been guaranteed. In our localization approach the transformation is validated and optimized over validation datasets, ensuring reliable performance. We demonstrate the improvement in the estimation accuracy for localized versions of linear regression and partial least squares, for three different models of varying complexity.
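The projection-based construction via linear regression, the simplest method the localization strategy applies to, can be sketched as follows; the toy simulator, prior, and raw summaries are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(theta, rng, n=50):
    """Toy model: n draws from N(theta, 1); raw summaries are sample statistics."""
    x = rng.normal(theta, 1.0, size=n)
    return np.array([x.mean(), x.var(), np.median(x), x.max()])

# training set of (parameter, raw-summary) pairs drawn from the prior
thetas = rng.uniform(-3, 3, size=500)
S = np.array([simulate(t, rng) for t in thetas])

# projection step: regress the parameter on the raw summaries;
# the fitted prediction becomes the single low-dimensional summary
A = np.column_stack([np.ones(len(S)), S])
beta, *_ = np.linalg.lstsq(A, thetas, rcond=None)
summary = lambda s: beta[0] + s @ beta[1:]

obs = simulate(1.5, rng)      # "observed" data with true theta = 1.5
est = summary(obs)            # low-dimensional summary, close to 1.5
```

The localization idea of the paper would fit `beta` only on training pairs whose raw summaries fall near `obs`, and validate the transformation on held-out datasets.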


Author(s):  
Akira Imakura ◽  
Momo Matsuda ◽  
Xiucai Ye ◽  
Tetsuya Sakurai

Dimensionality reduction methods that project high-dimensional data to a low-dimensional space by matrix trace optimization are widely used for clustering and classification. The matrix trace optimization problem leads to an eigenvalue problem for a low-dimensional subspace construction, preserving certain properties of the original data. However, most of the existing methods use only a few eigenvectors to construct the low-dimensional space, which may lead to a loss of useful information for achieving successful classification. Herein, to overcome this information loss, we propose a novel complex moment-based supervised eigenmap including multiple eigenvectors for dimensionality reduction. Furthermore, the proposed method provides a general formulation for matrix trace optimization methods to incorporate ridge regression, which models the linear dependency between covariate variables and univariate labels. To reduce the computational complexity, we also propose an efficient and parallel implementation of the proposed method. Numerical experiments indicate that the proposed method is competitive with existing dimensionality reduction methods in recognition performance. Additionally, the proposed method exhibits high parallel efficiency.
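The reduction of a matrix trace optimization to a generalized eigenvalue problem can be illustrated with Fisher-discriminant-style scatter matrices; this is a standard instance, not the authors' complex moment-based method:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
# two labeled classes in 5-D
X0 = rng.normal(loc=0.0, size=(100, 5))
X1 = rng.normal(loc=2.0, size=(100, 5))
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

# within- and between-class scatter matrices; maximizing the trace ratio
# tr(V^T Sb V) / tr(V^T Sw V) preserves class separation after projection
m = X.mean(axis=0)
Sw = sum(np.cov(X[y == c].T) for c in (0, 1))
Sb = sum(len(X[y == c]) * np.outer(X[y == c].mean(0) - m,
                                   X[y == c].mean(0) - m) for c in (0, 1))

# the trace optimization reduces to the generalized eigenproblem Sb v = lam Sw v
vals, vecs = eigh(Sb, Sw)
V = vecs[:, ::-1][:, :2]     # eigenvectors with the largest eigenvalues
Z = X @ V                    # low-dimensional projection of the data
```

Keeping more columns of `vecs` is the kind of "multiple eigenvectors" extension the abstract argues for, at the cost of solving for more of the spectrum.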


2004 ◽  
Vol 14 (02) ◽  
pp. 653-666 ◽  
Author(s):  
AXEL HUTT

The present work briefly reviews a segmentation method and a modeling approach for multivariate quasi-stationary data. The combination of both parts allows the extraction of low-dimensional models from multidimensional data. The segmentation method is applied to event-related potentials and fields and to early auditory evoked potentials, and extracts ERP- and ERF-components and early auditory waves objectively and independently of the number of segments. Additionally, the early auditory wave Pa is modeled by a two-dimensional system of ordinary differential equations. We find a common topology of wave Pa, which lets us conjecture intrinsic low-dimensional underlying attractors in the corresponding neuronal dynamics.
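A two-dimensional ODE system of the kind mentioned can be sketched with a damped oscillator; the functional form and parameters here are illustrative assumptions, not the fitted model of wave Pa:

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, u, gamma=2.0, omega=30.0):
    """Two-dimensional state (x, v): a damped oscillator, a simple
    illustrative instance of a planar ODE model for an evoked wave."""
    x, v = u
    return [v, -2 * gamma * v - omega**2 * x]

sol = solve_ivp(rhs, (0.0, 1.0), [1.0, 0.0], dense_output=True, max_step=1e-3)
t = np.linspace(0, 1, 500)
x = sol.sol(t)[0]       # damped oscillation decaying toward the fixed point
```

Fitting such a system to a measured wave would amount to estimating the two parameters from the segmented component.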


2009 ◽  
Vol 180 (10) ◽  
pp. 2002-2012 ◽  
Author(s):  
Sergei Manzhos ◽  
Koichi Yamashita ◽  
Tucker Carrington
