Novel Algorithms for Graph Clustering Applied to Human Activities

Mathematics ◽

10.3390/math9101089 ◽

2021 ◽

Vol 9 (10) ◽

pp. 1089

Author(s):

Nebojsa Budimirovic ◽

Nebojsa Bacanin

Keyword(s):

Human Activities ◽

Clustering Algorithms ◽

Weighted Graph ◽

Graph Clustering ◽

Recall Rate ◽

Number Of Clusters ◽

Starting Point ◽

Measurement Units ◽

Novel Algorithms ◽

Precision Rate

In this paper, a novel algorithm (IBC1) for graph clustering that makes no prior assumption about the number of clusters is introduced. A second algorithm (IBC2) is presented for graph clustering when the number of clusters is given beforehand. A new measure for evaluating clustering results is also given: the accuracy of formed clusters (T). For the purpose of clustering human activities, a procedure for forming string sequences is presented. String symbols are obtained by modeling spatiotemporal signals from inertial measurement units. The string sequences provide the starting point for forming a complete weighted graph. Using this graph, the proposed algorithms, as well as other well-known clustering algorithms, are tested. The best results are obtained using the novel IBC2 algorithm: T = 96.43%, Rand Index (RI) 0.966, precision rate (P) 0.918, recall rate (R) 0.929, and balanced F-measure (F) 0.923.
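The Rand Index, precision rate, recall rate and balanced F-measure quoted above are often computed by counting pairs of items. The sketch below assumes that pair-counting convention, which the abstract does not confirm; the function name `pair_counting_scores` and the toy labels are hypothetical, and the paper's own measure T is not reproduced because the abstract does not define it.

```python
from itertools import combinations


def pair_counting_scores(truth, pred):
    """Return (rand_index, precision, recall, f_measure) from pair counts.

    Assumes the common pair-counting definitions: a pair of items is a
    true positive when both labelings put the two items in the same cluster.
    """
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = pred[i] == pred[j]
        if same_pred and same_truth:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_truth:
            fn += 1
        else:
            tn += 1
    rand_index = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return rand_index, precision, recall, f_measure


# Hypothetical toy labels: six items, two reference clusters.
truth = [0, 0, 0, 1, 1, 1]
pred = [0, 0, 1, 1, 1, 1]
ri, p, r, f = pair_counting_scores(truth, pred)
```

The balanced F-measure is the harmonic mean of P and R, so the reported values are mutually consistent: 2 × 0.918 × 0.929 / (0.918 + 0.929) ≈ 0.923.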


Mobile Anomaly Detection Based on Improved Self-Organizing Maps

Mobile Information Systems ◽

10.1155/2017/5674086 ◽

2017 ◽

Vol 2017 ◽

pp. 1-9 ◽

Cited By ~ 3

Author(s):

Chunyong Yin ◽

Sun Zhang ◽

Kwang-jun Kim

Keyword(s):

Anomaly Detection ◽

Mobile Devices ◽

Clustering Algorithms ◽

Recall Rate ◽

Accuracy Rate ◽

Optimal Method ◽

Self Organizing Maps ◽

Different Characteristics ◽

Precision Rate ◽

Self Organizing

Anomaly detection has long been a focus of research, and the growth of mobile devices raises new challenges for it. For example, mobile devices can stay connected to the Internet and are rarely turned off, even at night. This means mobile devices can attack nodes, or be attacked, at night without users noticing, and that they have characteristics different from Internet behaviors. The introduction of data mining has brought major advances in this field. Self-organizing maps (SOM), a well-known clustering algorithm, are sensitive to their initial weight vectors, so the clustering result is unstable. The optimal method of selecting initial clustering centers is transplanted from K-means to SOM. To evaluate the performance of the improved SOM, we use diverse datasets and the KDD Cup99 dataset to compare it with the traditional SOM. The experimental results show that the improved SOM achieves a higher accuracy rate on universal datasets. On the KDD Cup99 dataset, it achieves higher recall and precision rates.


A novel bidirectional clustering algorithm based on local density

Scientific Reports ◽

10.1038/s41598-021-93244-2 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Baicheng Lyu ◽

Wenhua Wu ◽

Zhiqiang Hu

Keyword(s):

Clustering Algorithm ◽

Local Density ◽

Clustering Algorithms ◽

Cluster Number ◽

Denoising Method ◽

Number Of Clusters ◽

Data Points ◽

Cutoff Distance ◽

Large Clusters ◽

Small Clusters

With the wide application of cluster analysis, the number of clusters is gradually increasing, as is the difficulty of selecting indicators for judging the number of clusters. Also, small clusters are crucial to discovering the extreme characteristics of data samples, but current clustering algorithms focus mainly on analyzing large clusters. In this paper, a bidirectional clustering algorithm based on local density (BCALoD) is proposed. BCALoD establishes the connection between data points based on local density, can automatically determine the number of clusters, is more sensitive to small clusters, and reduces the number of parameters to be adjusted to a minimum. On the basis of the robustness of the cluster number to noise, a denoising method suitable for BCALoD is proposed. A different cutoff distance and cutoff density are assigned to each data cluster, which results in improved clustering performance. The clustering ability of BCALoD is verified on randomly generated datasets and city light satellite images.
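To make "local density" with a "cutoff distance" concrete, here is a minimal sketch of one widely used definition: the density of a point is the number of other points closer than the cutoff. This is an assumption for illustration only; the abstract does not state how BCALoD computes density, and the function name `local_density` and the sample points are hypothetical.

```python
import math


def local_density(points, cutoff):
    """Count, for each point, how many other points lie within `cutoff`.

    Generic cutoff-kernel density, not the BCALoD definition itself.
    """
    densities = []
    for i, p in enumerate(points):
        count = sum(
            1
            for j, q in enumerate(points)
            if i != j and math.dist(p, q) < cutoff
        )
        densities.append(count)
    return densities


# Hypothetical data: three nearby points and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
densities = local_density(points, cutoff=0.5)
```

With a single global cutoff, the isolated point gets density 0 and would look like noise. A small but genuine cluster can suffer the same fate, which is one plausible motivation for assigning a separate cutoff to each cluster.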


Early Prediction of Medication Refractoriness in Children with Idiopathic Epilepsy Based on Scalp EEG Analysis

International Journal of Neural Systems ◽

10.1142/s0129065714500233 ◽

2014 ◽

Vol 24 (07) ◽

pp. 1450023 ◽

Cited By ~ 14

Author(s):

Lung-Chang Lin ◽

Chen-Sen Ouyang ◽

Ching-Tai Chiang ◽

Rei-Cheng Yang ◽

Rong-Ching Wu ◽

...

Keyword(s):

Refractory Epilepsy ◽

Recall Rate ◽

Idiopathic Epilepsy ◽

Spectral Edge Frequency ◽

Scalp EEG ◽

Feature Descriptors ◽

Antiepileptic Drug Treatment ◽

EEG Recordings ◽

Precision Rate

Refractory epilepsy often has deleterious effects on an individual's health and quality of life. Early identification of patients whose seizures are refractory to antiepileptic drugs is important in considering the use of alternative treatments. Although idiopathic epilepsy is regarded as carrying a significantly lower risk of developing refractory epilepsy, a subset of patients with idiopathic epilepsy may still be refractory to medical treatment. In this study, we developed an effective method to predict the refractoriness of idiopathic epilepsy. Sixteen EEG segments from 12 well-controlled patients and 14 EEG segments from 11 refractory patients were analyzed at the time of the first EEG recordings, before antiepileptic drug treatment. Ten crucial EEG feature descriptors were selected for classification. Three of the 10 were related to decorrelation time, and four were related to the relative power of delta/gamma. These seven feature descriptors had significantly higher values in the well-controlled group than in the refractory group. In contrast, the remaining three feature descriptors, related to spectral edge frequency, kurtosis, and energy of wavelet coefficients, had significantly lower values in the well-controlled group than in the refractory group. The analyses yielded a weighted precision rate of 94.2% and a recall rate of 93.3%. Therefore, the developed method is a useful tool for identifying the possibility of developing refractory epilepsy in patients with idiopathic epilepsy.


On Caching for Local Graph Clustering Algorithms

Lecture Notes in Computer Science - AI 2013: Advances in Artificial Intelligence ◽

10.1007/978-3-319-03680-9_6 ◽

2013 ◽

pp. 56-67

Author(s):

René Speck ◽

Axel-Cyrille Ngonga Ngomo

Keyword(s):

Clustering Algorithms ◽

Graph Clustering ◽

Local Graph


Towards Expert-Inspired Automatic Criterion to Cut a Dendrogram for Real-Industrial Applications

10.3233/faia210140 ◽

2021 ◽

Author(s):

Shikha Suman ◽

Ashutosh Karna ◽

Karina Gibert

Keyword(s):

Hierarchical Clustering ◽

Clustering Algorithms ◽

Computational Cost ◽

Real Life ◽

Ground Truth ◽

Industrial Applications ◽

Underlying Structure ◽

Cluster Validity ◽

Cluster Validity Index ◽

Number Of Clusters

Hierarchical clustering is one of the most preferred choices for understanding the underlying structure of a dataset and defining typologies, with multiple applications in real life. Among existing clustering algorithms, the hierarchical family is one of the most popular, as it allows the inner structure of the dataset to be understood and yields the number of clusters as an output, unlike popular methods such as k-means. One can adjust the granularity of the final clustering to the goals of the analysis. The number of clusters in a hierarchical method relies on analysis of the resulting dendrogram. Experts have criteria for visually inspecting the dendrogram and determining the number of clusters. Finding automatic criteria that imitate experts in this task is still an open problem. However, dependence on an expert to cut the tree is a limitation in real applications such as Industry 4.0 and additive manufacturing. This paper analyses several cluster validity indexes in the context of determining a suitable number of clusters in hierarchical clustering. A new Cluster Validity Index (CVI) is proposed that captures the implicit criteria used by experts when analyzing dendrograms. The proposal has been applied to a range of datasets and validated against expert ground truth, outperforming the state of the art while also significantly reducing the computational cost.


Exploring high reliable substructures in auto-reconstructions of a neuron

10.21203/rs.3.rs-615483/v1 ◽

2021 ◽

Author(s):

Yishan He ◽

Jiajin Huang ◽

Gaowei Wu ◽

Jian Yang

Keyword(s):

State Of The Art ◽

Recall Rate ◽

Local Alignment ◽

Alignment Algorithm ◽

Neuron Tracing ◽

Digital Reconstruction ◽

High Recall Rate ◽

Multiple Species ◽

Multiple Reference ◽

Precision Rate

The digital reconstruction of a neuron is the most direct and effective way to investigate its morphology. Many automatic neuron tracing methods have been proposed, but without a manual check it is difficult to know whether a reconstruction, or which substructure in a reconstruction, is accurate. For reconstructions of a neuron generated by multiple automatic tracing methods with different principles or models, the common substructures are highly reliable and are named individual motifs. In this work, we propose a Vaa3D-based method called Lamotif to explore individual motifs in automatic reconstructions of a neuron. Lamotif uses the local alignment algorithm in BlastNeuron to extract local alignment pairs between a specified objective reconstruction and multiple reference reconstructions, and combines these pairs to generate individual motifs on the objective reconstruction. Lamotif is evaluated on reconstructions of 163 neurons from multiple species, generated by four state-of-the-art tracing methods. Experimental results show that individual motifs lie almost on the corresponding gold standard reconstructions and have a much higher precision rate than the objective reconstructions themselves. Furthermore, an objective reconstruction is mostly quite accurate if its individual motifs have a high recall rate. Individual motifs contain common geometric substructures in multiple reconstructions, and can be used to select accurate substructures from a reconstruction or accurate reconstructions from an automatic reconstruction dataset of different neurons.


A Meta-learning approach for recommending the number of clusters for clustering algorithms

Knowledge-Based Systems ◽

10.1016/j.knosys.2020.105682 ◽

2020 ◽

Vol 195 ◽

pp. 105682

Author(s):

Bruno Almeida Pimentel ◽

André C.P.L.F. de Carvalho

Keyword(s):

Clustering Algorithms ◽

Learning Approach ◽

Number Of Clusters ◽

Meta Learning


Models for Internal Clustering Validation Indexes Based on Hadoop-MapReduce

International Journal of Distributed Systems and Technologies ◽

10.4018/ijdst.2020070103 ◽

2020 ◽

Vol 11 (3) ◽

pp. 42-67

Author(s):

Soumeya Zerabi ◽

Souham Meshoul ◽

Samia Chikhi Boucherkha

Keyword(s):

Clustering Algorithms ◽

Large Data ◽

Optimal Number ◽

Data Sets ◽

Data Set ◽

Number Of Clusters ◽

Distributed Models ◽

Hadoop MapReduce ◽

Distributed Solutions ◽

Clustering Validation

Cluster validation aims both to evaluate the results of clustering algorithms and to predict the number of clusters. It is usually achieved using several indexes. Traditional internal clustering validation indexes (CVIs) are mainly based on computing pairwise distances, which results in quadratic complexity of the related algorithms. The existing CVIs cannot handle large data sets properly and need to be revisited to take account of the ever-increasing data set volume. Therefore, the design of parallel and distributed solutions to implement these indexes is required. To cope with this issue, the authors propose two parallel and distributed models for internal CVIs, namely the Silhouette and Dunn indexes, using the MapReduce framework under Hadoop. The proposed models, termed MR_Silhouette and MR_Dunn, have been tested on both evaluating clustering results and identifying the optimal number of clusters. The results of the experimental study are very promising and show that the proposed parallel and distributed models perform the expected tasks successfully.
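The quadratic cost mentioned above is easy to see in a plain sequential version of the two indexes. The sketch below uses the textbook definitions of the Dunn index and the mean silhouette width with Euclidean distance; it is not the MR_Silhouette or MR_Dunn model, and the function names and sample clusters are hypothetical. It assumes at least two clusters and at least one cluster with two distinct points.

```python
import math


def dunn_index(clusters):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    min_separation = min(
        math.dist(p, q)
        for a in range(len(clusters))
        for b in range(a + 1, len(clusters))
        for p in clusters[a]
        for q in clusters[b]
    )
    max_diameter = max(
        math.dist(p, q) for c in clusters for p in c for q in c
    )
    return min_separation / max_diameter


def mean_silhouette(clusters):
    """Average silhouette width over all points (0 for singleton clusters)."""
    scores = []
    for a, cluster in enumerate(clusters):
        for p in cluster:
            if len(cluster) == 1:
                scores.append(0.0)
                continue
            # a(p): mean distance to the other members of p's own cluster
            # (the distance from p to itself contributes 0 to the sum).
            intra = sum(math.dist(p, q) for q in cluster) / (len(cluster) - 1)
            # b(p): smallest mean distance from p to any other cluster.
            nearest = min(
                sum(math.dist(p, q) for q in other) / len(other)
                for b, other in enumerate(clusters)
                if b != a
            )
            denom = max(intra, nearest)
            scores.append((nearest - intra) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical data: two well-separated clusters of two points each.
clusters = [[(0.0, 0.0), (0.0, 1.0)], [(4.0, 0.0), (4.0, 1.0)]]
dunn = dunn_index(clusters)
silhouette = mean_silhouette(clusters)
```

Both functions visit every pair of points, so their running time grows with the square of the data set size. That is the bottleneck a distributed implementation would need to split across workers.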


Method for Retrieving Digital Agricultural Text Information Based on Local Matching

Symmetry ◽

10.3390/sym12071103 ◽

2020 ◽

Vol 12 (7) ◽

pp. 1103

Author(s):

Yue Song ◽

Minjuan Wang ◽

Wanlin Gao

Keyword(s):

Recall Rate ◽

Retrieval Method ◽

Retrieval Time ◽

Search Results ◽

Retrieval Efficiency ◽

Query Tree ◽

Text Information ◽

Relationship Of ◽

The Relationship ◽

Precision Rate

To improve the retrieval results for digital agricultural text information and the efficiency of retrieval, a method for searching digital agricultural text information based on local matching is proposed. The agricultural text tree and the query tree are constructed to generate the ancestor–descendant relationship in the query and map it to the agricultural text. Following the local matching retrieval method, the vector retrieval method is used to calculate the similarity between the digital agricultural text and the submitted queries. The similarities are sorted in descending order so that the agricultural text tree can output digital agricultural text information in turn. With interference information added, the recall rate and precision rate of the proposed method are above 99.5%; the average retrieval time is between 4 s and 6 s, and the average retrieval efficiency is above 99%. The proposed method is more efficient in information retrieval and can obtain comprehensive and accurate search results, which can be used for the rapid retrieval of digital agricultural text information.
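The ranking step (compute a similarity per text, then sort from large to small) can be sketched as follows. The abstract does not specify the vector representation or the similarity function, so term-frequency vectors and cosine similarity are assumptions here; the tree construction and local matching stages are not modeled, and all names and sample texts are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between term-frequency vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[term] * b[term] for term in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_documents(query, documents):
    """Return (similarity, document) pairs sorted from most to least similar."""
    scored = [(cosine_similarity(query, doc), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical mini-corpus.
documents = [
    "wheat rust disease control",
    "rice irrigation schedule",
    "wheat irrigation",
]
ranking = rank_documents("wheat irrigation", documents)
```

In this toy corpus the text that shares both query terms ranks first, followed by the shorter of the two texts that share only one term.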


An Adaptive Multiobjective Genetic Algorithm with Fuzzy c-Means for Automatic Data Clustering

Mathematical Problems in Engineering ◽

10.1155/2018/6123874 ◽

2018 ◽

Vol 2018 ◽

pp. 1-13 ◽

Cited By ~ 2

Author(s):

Ze Dong ◽

Hao Jia ◽

Miao Liu

Keyword(s):

Genetic Algorithm ◽

Fuzzy Clustering ◽

Clustering Algorithm ◽

Majority Vote ◽

Clustering Algorithms ◽

NSGA-II ◽

Number Of Clusters ◽

Automatic Data ◽

Multiobjective Genetic Algorithm ◽

Fuzzy Clustering Method

This paper presents a fuzzy clustering method based on a multiobjective genetic algorithm. The ADNSGA2-FCM algorithm was developed to solve the clustering problem by combining the fuzzy clustering algorithm (FCM) with the multiobjective genetic algorithm (NSGA-II) and introducing an adaptive mechanism. The algorithm does not require the number of clusters to be given in advance. After the initial number of clusters and the center coordinates are chosen randomly, the optimal solution set is found by the multiobjective evolutionary algorithm. After the optimal number of clusters is determined by the majority vote method, the Jm value is continuously optimized through the combination of the Canonical Genetic Algorithm and FCM, and finally the best clustering result is obtained. Verification on standard UCI datasets and comparison with existing single-objective and multiobjective clustering algorithms demonstrate the effectiveness of this method.
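For readers unfamiliar with the Jm value, the sketch below shows the standard fuzzy c-means membership update and objective for fixed centers, with Euclidean distance and fuzzifier m. It illustrates plain FCM only; the genetic-algorithm and adaptive components of ADNSGA2-FCM are not modeled, and the function names and sample data are hypothetical.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update for fixed centers (requires m > 1)."""
    memberships = []
    power = 2.0 / (m - 1.0)
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point coincides with one or more centers: split full
            # membership among those centers to avoid dividing by zero.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [u / total for u in row]
        else:
            row = [
                1.0 / sum((d_k / d_j) ** power for d_j in dists)
                for d_k in dists
            ]
        memberships.append(row)
    return memberships


def fcm_objective(points, centers, memberships, m=2.0):
    """J_m: sum over points and centers of u**m times squared distance."""
    return sum(
        u ** m * math.dist(p, c) ** 2
        for p, row in zip(points, memberships)
        for c, u in zip(centers, row)
    )


# Hypothetical data: one point at distance 1 and 2 from two centers.
points = [(1.0, 0.0)]
centers = [(0.0, 0.0), (3.0, 0.0)]
memberships = fcm_memberships(points, centers)
jm = fcm_objective(points, centers, memberships)
```

With m = 2 the point receives membership 0.8 in the nearer center and 0.2 in the farther one, giving Jm = 0.8² × 1 + 0.2² × 4 = 0.8. Lower Jm means points sit closer to the centers they mostly belong to.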
