Sparse and Low-Rank Subspace Data Clustering with Manifold Regularization Learned by Local Linear Embedding

2018 ◽  
Vol 8 (11) ◽  
pp. 2175 ◽  
Author(s):  
Ye Yang ◽  
Yongli Hu ◽  
Fei Wu

Data clustering is an important research topic in the data mining and signal processing communities. Among data clustering methods, subspace spectral clustering methods based on the self-expression model, e.g., Sparse Subspace Clustering (SSC) and Low Rank Representation (LRR), have attracted a lot of attention and shown good performance. The key step of SSC and LRR is to construct a proper affinity or similarity matrix of the data for spectral clustering. Recently, a Laplacian graph constraint was introduced into the basic SSC and LRR and yielded considerable improvement. However, current graph construction methods do not fully exploit and reveal the non-linear properties of the data to be clustered, which are common in high-dimensional data. In this paper, we introduce the classic manifold learning method, Local Linear Embedding (LLE), to learn the non-linear structure underlying the data and use the learned local geometry of the manifold as a regularization for SSC and LRR, which results in the proposed LLE-SSC and LLE-LRR clustering methods. Additionally, an efficient algorithm is proposed to solve the complex optimization problem involved in the proposed models. We test the proposed data clustering methods on several types of public databases. The experimental results show that our methods outperform typical subspace clustering methods with a Laplacian graph constraint.
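As an illustrative sketch only (not the authors' implementation), the local geometry that LLE learns can be summarized by reconstruction weights: each point is approximated by a weighted combination of its neighbors, with the weights constrained to sum to one. The Python snippet below computes these weights for a single point in the standard LLE way. The function names `solve_linear` and `lle_weights` and the regularization parameter `reg` are assumptions made for illustration, and the way the paper plugs such weights into SSC and LRR is not shown here.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] without modifying the inputs.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def lle_weights(x, neighbors, reg=1e-3):
    """Return LLE reconstruction weights of point x from its neighbors.

    Minimizes ||x - sum_j w_j * neighbors[j]||^2 subject to sum_j w_j = 1.
    `reg` (an assumed default) stabilizes the local Gram matrix.
    """
    k = len(neighbors)
    # Differences between each neighbor and the point being reconstructed.
    diffs = [[nj - xj for nj, xj in zip(nb, x)] for nb in neighbors]
    # Local Gram matrix G[i][j] = <diffs[i], diffs[j]>.
    gram = [[sum(p * q for p, q in zip(diffs[i], diffs[j])) for j in range(k)]
            for i in range(k)]
    trace = sum(gram[i][i] for i in range(k))
    eps = reg * trace if trace > 0 else reg
    for i in range(k):
        gram[i][i] += eps
    # Solve G w = 1, then rescale so the weights sum to one.
    w = solve_linear(gram, [1.0] * k)
    total = sum(w)
    return [wi / total for wi in w]


if __name__ == "__main__":
    # A point midway between two neighbors gets equal weights by symmetry.
    print(lle_weights([0.5, 0.0], [[0.0, 0.0], [1.0, 0.0]]))
```

In a full pipeline the weights of all points would be collected into a sparse matrix; here only the per-point computation is sketched.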

Author(s):  
Jun Li ◽  
Handong Zhao ◽  
Zhiqiang Tao ◽  
Yun Fu

Large-Scale Subspace Clustering (LSSC) is an interesting and important problem in the big data era. However, most existing methods (e.g., sparse or low-rank subspace clustering) cannot be directly used to solve LSSC because they suffer from high time complexity, quadratic or cubic in n (the number of data points). To overcome this limitation, we propose Fast Regression Coding (FRC) to optimize regression codes and simultaneously train a non-linear function to approximate the codes. Using FRC, we develop an efficient Regression Coding Clustering (RCC) framework to solve the LSSC problem. It consists of sampling, FRC, and clustering. RCC randomly samples a small number of data points, quickly calculates the codes of all data points using the non-linear function learned from FRC, and employs a large-scale spectral clustering method to cluster the codes. In addition, we provide a theoretical guarantee that the non-linear function has a first-order approximation ability and a group effect. The theorem shows that the codes can easily be used to construct a dividable similarity graph. Compared with state-of-the-art LSSC methods, our model achieves better clustering results on large-scale datasets.


Data clustering is an active topic of research, with applications in fields such as biology, management, statistics, and pattern recognition. Spectral Clustering (SC) has gained popularity in recent times due to its ability to handle complex data and its ease of implementation. A crucial step in spectral clustering is the construction of the affinity matrix, which is based on a pairwise similarity measure. The varied characteristics of datasets affect the performance of a spectral clustering technique. In this paper, we propose an affinity measure based on Topological Node Features (TNFs), viz., Clustering Coefficient (CC) and Summation index (SI), to define the notion of density and local structure. We show that these features improve the performance of SC in clustering the data. The experiments were conducted on synthetic datasets, UCI datasets, and the MNIST handwritten datasets. The results show that the proposed affinity metric outperforms several recent spectral clustering methods in terms of accuracy.
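For readers unfamiliar with the Clustering Coefficient named above, the following Python sketch computes the standard local clustering coefficient of a node in an undirected graph: the fraction of pairs of its neighbors that are themselves connected. The function name `clustering_coefficient` and the adjacency-set representation are assumptions for illustration. How the paper combines this quantity with the Summation index into an affinity measure is not stated in the abstract and is not reproduced here.

```python
from itertools import combinations


def clustering_coefficient(adj, node):
    """Local clustering coefficient of `node` in an undirected graph.

    `adj` maps each node to the set of its neighbors (hypothetical
    representation chosen for this sketch).
    """
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        # With fewer than two neighbors no neighbor pair exists.
        return 0.0
    # Count edges that connect two neighbors of `node`.
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


if __name__ == "__main__":
    # A triangle (0, 1, 2) with one pendant node 3 attached to node 0.
    graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
    for n in sorted(graph):
        print(n, clustering_coefficient(graph, n))
```

Nodes in densely connected neighborhoods score close to 1, which is why such a feature can serve as a proxy for local density.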


2021 ◽  
pp. 1-18
Author(s):  
Ting Gao ◽  
Zhengming Ma ◽  
Wenxu Gao ◽  
Shuyu Liu

There are three contributions in this paper. (1) A tensor version of LLE (short for Local Linear Embedding algorithm) is derived and presented. LLE is one of the best-known manifold learning algorithms. Since it was proposed, improvements to LLE have continued to emerge. However, all of these improvements are suitable only for vector data, not tensor data. The proposed tensor LLE can also be used as a bridge for transferring various improvements to LLE from vector data to tensor data. (2) A framework of tensor dimensionality reduction based on the tensor mode product is proposed, in which the mode matrices can be determined according to specific criteria. (3) A novel dimensionality reduction algorithm for tensor data based on LLE and mode product (LLEMP-TDR) is proposed, in which LLE is used as a criterion to determine the mode matrices. Benefiting from local LLE and the global mode product, the proposed LLEMP-TDR can preserve both local and global features of high-dimensional tensor data during dimensionality reduction. The experimental results on data clustering and classification tasks demonstrate that our method performs better than five other related algorithms published recently in top academic journals.


Sensors ◽  
2020 ◽  
Vol 20 (3) ◽  
pp. 767 ◽  
Author(s):  
Yepeng Ni ◽  
Jianping Chai ◽  
Yan Wang ◽  
Weidong Fang

Indoor WLAN fingerprint localization systems have been widely applied due to the simplicity of implementation on various mobile devices, including smartphones. However, collecting received signal strength indication (RSSI) samples for the fingerprint database, called a radio map, is labor-intensive and time-consuming. To solve this problem, this paper proposes a semi-supervised self-adaptive local linear embedding algorithm to build the radio map. First, this method uses the self-adaptive local linear embedding (SLLE) algorithm based on manifold learning to reduce the dimension of the high-dimensional RSSI samples and to extract a neighbor weight matrix. Second, a graph-based label propagation (GLP) algorithm is employed to build the radio map by semi-supervised learning from a few labeled RSSI samples and a large number of unlabeled RSSI samples. Finally, we propose a k self-adaptive neighbor weight (kSNW) algorithm, used for radio map construction in this paper, to realize online localization. The results of the experiments conducted in a real indoor environment show that the proposed method reduces the demand for large quantities of labeled samples and achieves good positioning accuracy. With only 25% labeled RSSI samples, our system can obtain a positioning accuracy of more than 88% within a localization error of 3 m.
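The idea of spreading a few known labels over a neighbor graph can be sketched with a generic label propagation loop. This is a minimal illustration under assumptions, not the paper's GLP algorithm: the function name `propagate_labels`, the dense weight matrix, and the fixed iteration count are all hypothetical. Labeled nodes are held fixed, and each unlabeled node repeatedly takes the weighted average of its neighbors' current values (for example, 2-D positions).

```python
def propagate_labels(weights, labels, iterations=200):
    """Propagate known label vectors over a weighted graph.

    weights: n x n list of non-negative neighbor weights.
    labels:  dict mapping labeled node index -> label vector (list of floats).
    Returns a list of n label vectors; labeled nodes keep their values.
    """
    n = len(weights)
    dim = len(next(iter(labels.values())))
    # Unlabeled nodes start at zero; labeled nodes start at their known value.
    values = [list(labels.get(i, [0.0] * dim)) for i in range(n)]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if i in labels:
                # Clamp labeled nodes to their known label.
                updated.append(list(labels[i]))
                continue
            total = sum(weights[i])
            if total == 0:
                # Isolated node: nothing to average, keep the current value.
                updated.append(values[i])
                continue
            updated.append([
                sum(weights[i][j] * values[j][d] for j in range(n)) / total
                for d in range(dim)
            ])
        values = updated
    return values


if __name__ == "__main__":
    # Chain 0 - 1 - 2 - 3 with known positions only at both ends.
    chain = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
    print(propagate_labels(chain, {0: [0.0], 3: [3.0]}))
```

In the radio-map setting, the weight matrix would come from the neighbor weights extracted in the embedding step, and the label vectors would be the known sample locations.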


2012 ◽  
Vol 246-247 ◽  
pp. 1289-1293
Author(s):  
Zheng Qiang Li ◽  
Peng Nie ◽  
Shu Guo Zhao

To address the nonlinear characteristics of the tool wear acoustic emission signal, a tool wear state identification method based on local linear embedding and a support vector machine is proposed. The local linear embedding algorithm maps high-dimensional information into a low-dimensional feature space, compressing the data and highlighting signal features. This algorithm compensates for the inability of linear dimensionality reduction to find the nonlinear structure of datasets. In this paper, the acoustic emission signal is first processed by phase space reconstruction. Using the local linear embedding method, the data points in the high-dimensional space are mapped to corresponding data points in a low-dimensional space; tool wear state features are then extracted, and a support vector machine classifier is used to classify the tool wear conditions. Experimental results show that this method can accurately recognize the tool wear state and has broad application prospects.
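Phase space reconstruction is commonly done by time-delay embedding, which turns a one-dimensional signal into a set of higher-dimensional points that a method such as local linear embedding can then reduce. The Python sketch below shows that standard construction. The function name `delay_embed` and the particular embedding dimension and delay used in the example are assumptions, since the abstract does not state which values the authors chose.

```python
def delay_embed(signal, dim, delay):
    """Time-delay embedding of a 1-D signal.

    Each output vector is [s[i], s[i + delay], ..., s[i + (dim - 1) * delay]].
    """
    count = len(signal) - (dim - 1) * delay
    if count <= 0:
        raise ValueError("signal too short for the requested dim and delay")
    return [[signal[i + j * delay] for j in range(dim)] for i in range(count)]


if __name__ == "__main__":
    samples = [1, 2, 3, 4, 5]
    # Three-dimensional embedding with unit delay (illustrative values).
    print(delay_embed(samples, dim=3, delay=1))
```

The resulting vectors play the role of the high-dimensional data points that are subsequently mapped to a low-dimensional space and passed to a classifier.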

