semisupervised clustering Latest Research Papers

SCMAG: A Semisupervised Single-Cell Clustering Method Based on Matrix Aggregation Graph Convolutional Neural Network

Computational and Mathematical Methods in Medicine ◽

10.1155/2021/6842752 ◽

2021 ◽

Vol 2021 ◽

pp. 1-6

Author(s):

Wenliang Gao ◽

Yuanyuan Li ◽

Chujie Fang ◽

Wei Fan ◽

Haonan Peng

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Single Cell ◽

Prior Information ◽

Clustering Algorithm ◽

Cell Types ◽

Clustering Method ◽

Cell Clustering ◽

Semisupervised Clustering ◽

Cell Data

Clustering analysis is one of the most important techniques for single-cell data mining. It is widely used to partition gene sequences, identify functional genes, and detect new cell types. Although traditional unsupervised clustering does not require labeled data, its effectiveness depends on the distribution of the original data, the hyperparameter settings, and other factors. In some cases the types of some cells are already known, and making full use of this prior information can be expected to improve accuracy. In this study, we propose SCMAG (a semisupervised single-cell clustering method based on a matrix aggregation graph convolutional neural network), which takes the prior information in single-cell data fully into account. To evaluate the proposed method, we test it on several real scRNA-seq datasets and compare it with a current semisupervised clustering algorithm at recognizing cell types; the results show that SCMAG is the more accurate model.


An Exhaustive Research on the Application of Intrusion Detection Technology in Computer Network Security in Sensor Networks

Journal of Sensors ◽

10.1155/2021/5558860 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Yajing Wang ◽

Juan Ma ◽

Ashutosh Sharma ◽

Pradeep Kumar Singh ◽

Gurjot Singh Gaba ◽

...

Keyword(s):

Network Security ◽

Intrusion Detection ◽

Anomaly Detection ◽

Outlier Detection ◽

Computer Network ◽

Clustering Algorithms ◽

Nearest Neighbors ◽

Detection Technology ◽

Computer Network Security ◽

Semisupervised Clustering

Intrusion detection is crucial to computer network security; this work therefore aims to strengthen network security protection by proposing several preventive techniques. Outlier detection and semisupervised clustering algorithms based on shared nearest neighbors are proposed to address intrusion detection by converting it into a problem of mining outliers in a network behavior dataset. The algorithm uses shared nearest neighbors as the similarity measure, judges whether a data point is an outlier according to its number of nearest neighbors, and performs semisupervised clustering on the dataset after the outliers are deleted. During semisupervised clustering, extensive prior knowledge is added, and the dataset is clustered according to the principle of graph segmentation. The novelty of the proposed algorithm lies in detecting outliers while effectively avoiding dependence on parameters, thus eliminating the influence of outliers on clustering. This article uses two real datasets, lymphography and glass, for simulation. The simulation results show that the proposed algorithm can effectively detect outliers and has a good clustering effect. Furthermore, the experiments reveal that the outlier detection-based SCA-SNN algorithm has the best practical effect on the dataset without outliers, validating its clustering performance. Finally, compared with other state-of-the-art anomaly detection methods, anomaly detection based on outlier mining does not require a training process. It thus overcomes the anomaly detection problems caused by incomplete normal patterns in training samples.
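The shared-nearest-neighbor idea in this abstract can be illustrated with a minimal sketch. It is not the paper's SCA-SNN algorithm: the mutual-neighbor rule, the function names, and the thresholds `min_shared` and `min_links` are all assumptions made for illustration.

```python
from math import dist


def knn_lists(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        order = sorted(others, key=lambda j: dist(p, points[j]))
        neighbors.append(set(order[:k]))
    return neighbors


def snn_similarity(neighbors, i, j):
    """Shared-nearest-neighbor similarity: overlap size of the two k-NN sets."""
    return len(neighbors[i] & neighbors[j])


def snn_outliers(points, k, min_shared, min_links):
    """Flag points that have fewer than min_links strong SNN links.

    Assumed rule (hypothetical): a link i-j is strong when i and j are in
    each other's k-NN sets and share at least min_shared neighbors.
    """
    neighbors = knn_lists(points, k)
    outliers = []
    for i in range(len(points)):
        strong = sum(
            1
            for j in neighbors[i]
            if i in neighbors[j]
            and snn_similarity(neighbors, i, j) >= min_shared
        )
        if strong < min_links:
            outliers.append(i)
    return outliers
```

A point far from a dense group ends up in nobody else's neighbor list, so it has no strong links and is flagged; the remaining points could then be passed to a clustering step.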


Network‐based semisupervised clustering

Applied Stochastic Models in Business and Industry ◽

10.1002/asmb.2618 ◽

2021 ◽

Author(s):

Luca Frigau ◽

Giulia Contu ◽

Francesco Mola ◽

Claudio Conversano

Keyword(s):

Semisupervised Clustering


Data-driven Derivation and Validation of Novel Phenotypes for Acute Kidney Transplant Rejection using Semi-supervised Clustering

Journal of the American Society of Nephrology ◽

10.1681/asn.2020101418 ◽

2021 ◽

pp. ASN.2020101418

Author(s):

Thibaut Vaulet ◽

Gillian Divard ◽

Olivier Thaunat ◽

Evelyne Lerut ◽

Aleksandar Senev ◽

...

Keyword(s):

Kidney Transplant ◽

Graft Failure ◽

Transplant Rejection ◽

Data Driven ◽

Adjusted Rand Index ◽

Additional Information ◽

Banff Classification ◽

Data Driven Approach ◽

Consensus Classification ◽

Semisupervised Clustering

Background: Over the past decades, an international group of experts iteratively developed a consensus classification of kidney transplant rejection phenotypes, known as the Banff classification. Data-driven clustering of kidney transplant histologic data could simplify the complex and discretionary rules of the Banff classification, while improving the association with graft failure. Methods: The data consisted of a training set of 3510 kidney-transplant biopsies from an observational cohort of 936 recipients. Independent validation of the results was performed on an external set of 3835 biopsies from 1989 patients. On the basis of acute histologic lesion scores and the presence of donor-specific HLA antibodies, stable clustering was achieved on the basis of a consensus of 400 different clustering partitions. Additional information on kidney-transplant failure was introduced with a weighted Euclidean distance. Results: Based on the proportion of ambiguous clustering, six clinically meaningful cluster phenotypes were identified. There was significant overlap with the existing Banff classification (adjusted Rand index, 0.48). However, the data-driven approach eliminated intermediate and mixed phenotypes and created acute rejection clusters that are each significantly associated with graft failure. Finally, a novel visualization tool presents disease phenotypes and severity in a continuous manner, as a complement to the discrete clusters. Conclusions: A semisupervised clustering approach for the identification of clinically meaningful novel phenotypes of kidney transplant rejection has been developed and validated. The approach has the potential to offer a more quantitative evaluation of rejection subtypes and severity, especially in situations in which the current histologic categorization is ambiguous.
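The adjusted Rand index quoted in this abstract measures agreement between two partitions, corrected for chance. The following is a minimal sketch of the standard pair-counting formula; the function name and the convention of returning 1.0 for degenerate inputs are choices made here, not details taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same samples."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: treated as trivial agreement

    # Pairs of samples that fall in the same cell of the contingency table.
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs that fall in the same cluster within each labeling separately.
    same_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = same_a * same_b / total_pairs
    maximum = (same_a + same_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (same_both - expected) / (maximum - expected)
```

A value of 1 means the two partitions are identical up to relabeling, values near 0 mean agreement no better than chance, and negative values mean less agreement than chance.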


Classification from Pairwise Similarities/Dissimilarities and Unlabeled Data via Empirical Risk Minimization

Neural Computation ◽

10.1162/neco_a_01373 ◽

2021 ◽

pp. 1-35

Author(s):

Takuya Shimada ◽

Han Bao ◽

Issei Sato ◽

Masashi Sugiyama

Keyword(s):

Unbiased Estimator ◽

Estimation Error ◽

Unlabeled Data ◽

Empirical Risk Minimization ◽

Risk Minimization ◽

Clustering Methods ◽

Classification Problems ◽

Minimization Method ◽

Empirical Risk ◽

Semisupervised Clustering

Pairwise similarities and dissimilarities between data points are often obtained more easily than full labels of data in real-world classification problems. To make use of such pairwise information, an empirical risk minimization approach has been proposed, where an unbiased estimator of the classification risk is computed from only pairwise similarities and unlabeled data. However, this approach has not yet been able to handle pairwise dissimilarities. Semisupervised clustering methods can incorporate both similarities and dissimilarities into their framework; however, they typically require strong geometrical assumptions on the data distribution such as the manifold assumption, which may cause severe performance deterioration. In this letter, we derive an unbiased estimator of the classification risk based on similarities, dissimilarities, and unlabeled data together. We theoretically establish an estimation error bound and experimentally demonstrate the practical usefulness of our empirical risk minimization method.


An Efficient Automatic Gait Anomaly Detection Method Based on Semisupervised Clustering

Computational Intelligence and Neuroscience ◽

10.1155/2021/8840156 ◽

2021 ◽

Vol 2021 ◽

pp. 1-17

Author(s):

Zhenlun Yang

Keyword(s):

Anomaly Detection ◽

Clustering Algorithm ◽

Detection Method ◽

Binary Classification ◽

Simple Method ◽

Gait Patterns ◽

Abnormal Gait ◽

Gait Features ◽

Efficient Gait ◽

Semisupervised Clustering

The aim of this work is to develop a general automatic computer method to distinguish individuals with abnormal gait patterns from those with normal gait patterns. As long as silhouette gait images of the subjects are obtainable, the proposed method can provide online gait anomaly detection results without prior analysis of the gait features of the target subjects. Moreover, the proposed method needs no parameter settings from users and can start producing detection results after collecting only a very small number of gait samples, even if none of those samples is abnormal. Therefore, the proposed method can provide fast and simple deployment for various gait anomaly detection application scenarios. The proposed method is composed of two main modules: (1) feature extraction from gait images and (2) anomaly detection via binary classification. In the first module, a new representation of the most frequently involved area of the silhouette gait images, called the full gait energy image (F-GEI), is proposed. Furthermore, based on the F-GEI, a novel and simple method characterizing individual walking properties is developed to extract gait features from individual subjects. In the second module, based on the very limited prior knowledge of the target dataset, a semisupervised clustering algorithm is proposed to perform the binary classification for detecting the gait anomaly of each subject. The performance of the proposed gait anomaly detection method was evaluated on the human gaits dataset in comparison with three state-of-the-art methods. The experimental results show that the proposed method is an effective and efficient gait anomaly detection method in terms of accuracy, robustness, and computational efficiency.


Semisupervised Deep Embedded Clustering with Adaptive Labels

Scientific Programming ◽

10.1155/2021/6613452 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Zhikui Chen ◽

Chaojie Li ◽

Jing Gao ◽

Jianing Zhang ◽

Peng Li

Keyword(s):

Clustering Algorithm ◽

State Of The Art ◽

A Priori ◽

Back Propagation ◽

Normalized Mutual Information ◽

Deep Embedding ◽

End To End ◽

Semisupervised Clustering ◽

Fine Tune ◽

Priori Knowledge

Deep embedding clustering (DEC) attracts much attention because of its strong performance, which is attributed to end-to-end clustering. However, DEC cannot make use of the small amount of a priori knowledge contained in data of increasing volume. To tackle this challenge, a semisupervised deep embedded clustering algorithm with adaptive labels is proposed to cluster such data in a semisupervised end-to-end manner on the basis of a small amount of a priori knowledge. Specifically, a deep semisupervised clustering network is designed based on the autoencoder paradigm and deep clustering, which mines the clustering representation and clustering assignment well by preventing the shift of labels in DEC. Then, to train the parameters of the deep semisupervised clustering network, a back-propagation-based algorithm with adaptive labels is introduced based on the pretrain and fine-tune strategies. Finally, extensive experiments on representative datasets are conducted to evaluate the performance of the proposed method in terms of clustering accuracy and normalized mutual information. The results show that the proposed method outperforms state-of-the-art DEC methods.
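For readers unfamiliar with DEC, the sketch below shows the two quantities at the core of the original DEC formulation: soft cluster assignments computed with a Student's t kernel, and the sharpened target distribution they are trained toward. It covers only this baseline step on already-embedded points; the adaptive-label mechanism proposed in this paper is not described in the abstract and is not reproduced here. Function names are illustrative.

```python
def soft_assignments(embeddings, centroids, alpha=1.0):
    """DEC-style soft assignment q[i][j] of embedded point i to centroid j."""
    q = []
    for z in embeddings:
        row = []
        for mu in centroids:
            sq_dist = sum((a - b) ** 2 for a, b in zip(z, mu))
            # Student's t kernel with alpha degrees of freedom.
            row.append((1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])
    return q


def target_distribution(q):
    """Sharpened targets p[i][j] proportional to q[i][j]**2 / cluster frequency."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p
```

Squaring the assignments pushes confident points further toward their cluster, while dividing by the cluster frequency keeps large clusters from dominating; training then minimizes the KL divergence between the targets and the soft assignments.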


Nonsmooth Optimization-Based Model and Algorithm for Semisupervised Clustering

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2021.3129370 ◽

2021 ◽

pp. 1-14

Author(s):

Adil M. Bagirov ◽

Sona Taheri ◽

Fusheng Bai ◽

Fangying Zheng

Keyword(s):

Nonsmooth Optimization ◽

Semisupervised Clustering


Preprocessing Method for Encrypted Traffic Based on Semisupervised Clustering

Security and Communication Networks ◽

10.1155/2020/8824659 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13

Author(s):

Rongfeng Zheng ◽

Jiayong Liu ◽

Weina Niu ◽

Liang Liu ◽

Kai Li ◽

...

Keyword(s):

Network Traffic ◽

Clustering Algorithm ◽

Network Flows ◽

Spatial Clustering ◽

Clustering Algorithms ◽

Communication Channels ◽

Transport Layer ◽

Clustering Model ◽

Network Intrusion ◽

Semisupervised Clustering

The explosive growth in network traffic in recent times has resulted in increased processing pressure on network intrusion detection systems. In addition, there is a lack of reliable methods for preprocessing network traffic generated by benign applications that do not steal users’ data from their devices. To alleviate these problems, this study analyzed the differences between benign and malicious traffic produced by benign applications and malware, respectively. To fully express these differences, this study proposed a new set of statistical features for training a clustering model. Furthermore, to mine the communication channels generated by benign applications in batches, a semisupervised clustering method was adopted. Using a small number of labeled samples, our method aggregated historical network traffic into two types of clusters. The cluster that did not contain labeled malicious samples was regarded as a benign traffic cluster. The experimental results were compared using four types of clustering algorithms. The density-based spatial clustering of applications with noise (DBSCAN) clustering algorithm was selected to mine benign communication channels. We also compared our method with two other methods, and the results demonstrated that the benign channels mined through our method were more reliable. Finally, using our method, 1,811 benign transport layer security (TLS) channels were mined from 18,357 TLS communication channels. The number of flows carried by these benign channels comprised 65.37% of the entire network flows, and no malicious flow was included in our results, which proves the effectiveness of our method.
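The cluster-labeling rule stated in this abstract (a cluster with no labeled malicious samples is regarded as benign) can be sketched as follows. The function name, the label strings, and the use of -1 as a noise marker are assumptions for illustration; -1 follows the convention scikit-learn's DBSCAN uses for noise points, and the sketch takes cluster assignments as given rather than running DBSCAN itself.

```python
def benign_clusters(cluster_ids, labels, noise_id=-1):
    """Return the ids of clusters that contain no labeled malicious sample.

    cluster_ids: cluster assignment per sample (noise_id marks noise points).
    labels: "malicious", "benign", or None (unlabeled) per sample.
    """
    if len(cluster_ids) != len(labels):
        raise ValueError("cluster_ids and labels must have the same length")
    # Any cluster holding at least one labeled malicious sample is excluded.
    tainted = {c for c, y in zip(cluster_ids, labels) if y == "malicious"}
    return {c for c in cluster_ids if c != noise_id and c not in tainted}
```

For example, with assignments `[0, 0, 1, 1, 2, -1]` and a single malicious label on a sample in cluster 1, the function returns clusters 0 and 2; noise points are never counted as benign.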


Semisupervised Clustering by Queries and Locally Encodable Source Coding

IEEE Transactions on Information Theory ◽

10.1109/tit.2020.3037533 ◽

2020 ◽

pp. 1-1

Author(s):

Arya Mazumdar ◽

Soumyabrata Pal

Keyword(s):

Source Coding ◽

Semisupervised Clustering
