Cooperative Hybrid Semi-Supervised Learning for Text Sentiment Classification

Yang Li; Ying Lv; Suge Wang; Jiye Liang; Juanzi Li; Xiaoli Li

doi:10.3390/sym11020133

Cooperative Hybrid Semi-Supervised Learning for Text Sentiment Classification

Symmetry ◽

10.3390/sym11020133 ◽

2019 ◽

Vol 11 (2) ◽

pp. 133 ◽

Cited By ~ 2

Author(s):

Yang Li ◽

Ying Lv ◽

Suge Wang ◽

Jiye Liang ◽

Juanzi Li ◽

...

Keyword(s):

Supervised Learning ◽

Large Scale ◽

Ensemble Classifier ◽

Sentiment Classification ◽

Training Dataset ◽

Support Vector ◽

Seed Selection ◽

Training Strategy ◽

Whole Process ◽

Self Learning

A large-scale, high-quality training dataset is essential for learning a good classifier for text sentiment classification. However, manually constructing such a dataset with sentiment labels is labor-intensive and time-consuming. Therefore, to make effective use of unlabeled samples, this paper proposes a comprehensive framework for text sentiment classification that covers the whole process of semi-supervised learning, from seed selection and iterative modification of the training text set to the co-training strategy of the classifier. To provide a basis for selecting the seed texts and modifying the training text set, three measures are defined: the cluster similarity degree of an unlabeled text, the cluster uncertainty degree of a pseudo-label text to a learner, and the reliability degree of a pseudo-label text to a learner. With these measures, a seed selection method based on Random Swap clustering, a hybrid modification method of the training text set based on active learning and self-learning, and an alternating co-training strategy for an ensemble classifier of Maximum Entropy and Support Vector Machine are proposed and combined into the framework. Experimental results on three real-world Chinese datasets (COAE2014, COAE2015, and a Hotel review dataset) and five real-world English datasets (Books, DVD, Electronics, Kitchen, and MR) verify the effectiveness of the proposed framework.
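The idea of modifying the training set through both self-learning and active learning can be illustrated with a minimal sketch. It does not reproduce the three measures defined in the paper: plain classifier confidence stands in for them, and the function name `route_unlabeled`, the two thresholds, and the example probabilities are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def route_unlabeled(
    probs: Dict[str, Tuple[float, float]],
    confident: float = 0.9,   # assumed threshold for trusting a pseudo-label
    uncertain: float = 0.6,   # assumed threshold for asking a human
) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Split unlabeled texts into a pseudo-labeled group and a query group.

    probs maps a text id to (P(negative), P(positive)) from the current classifier.
    """
    pseudo_labeled: List[Tuple[str, int]] = []  # self-learning: keep the predicted label
    ask_annotator: List[str] = []               # active learning: request a human label
    for text_id, (p_neg, p_pos) in probs.items():
        top = max(p_neg, p_pos)
        if top >= confident:
            pseudo_labeled.append((text_id, int(p_pos > p_neg)))
        elif top <= uncertain:
            ask_annotator.append(text_id)
        # Anything in between stays in the unlabeled pool for a later round.
    return pseudo_labeled, ask_annotator


# Made-up classifier outputs for four unlabeled reviews.
example = {
    "t1": (0.05, 0.95),
    "t2": (0.48, 0.52),
    "t3": (0.25, 0.75),
    "t4": (0.93, 0.07),
}
pseudo, queries = route_unlabeled(example)
print(pseudo)   # texts added with their predicted label
print(queries)  # texts sent for manual annotation
```

In an iterative setup, both groups would be added to the training set before the classifier is retrained, so confident predictions grow the set cheaply while annotation effort goes to the texts the classifier finds hardest.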


Combining Self-supervised Learning and Active Learning for Disfluency Detection

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3487290 ◽

2022 ◽

Vol 21 (3) ◽

pp. 1-25

Author(s):

Shaolei Wang ◽

Zhongyuan Wang ◽

Wanxiang Che ◽

Sendong Zhao ◽

Ting Liu

Keyword(s):

Neural Network ◽

Active Learning ◽

Supervised Learning ◽

Large Scale ◽

Training Data ◽

Fine Tuning ◽

Training Dataset ◽

Performance Gap ◽

Annotation Costs ◽

Trained Neural Network

Spoken language is fundamentally different from written language in that it contains frequent disfluencies, or parts of an utterance that are corrected by the speaker. Disfluency detection (removing these disfluencies) is desirable to clean the input for use in downstream NLP tasks. Most existing approaches to disfluency detection rely heavily on human-annotated data, which is scarce and expensive to obtain in practice. To tackle the training data bottleneck, in this work we investigate methods for combining self-supervised learning and active learning for disfluency detection. First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled data and propose two self-supervised pre-training tasks: (i) a tagging task to detect the added noisy words and (ii) sentence classification to distinguish original sentences from grammatically incorrect sentences. We then combine these two tasks to jointly pre-train a neural network. The pre-trained neural network is then fine-tuned using human-annotated disfluency detection training data. The self-supervised learning method can capture task-specific knowledge for disfluency detection and achieves better performance than other supervised methods when fine-tuned on a small annotated dataset. However, because the pseudo training data are generated with simple heuristics and cannot cover all disfluency patterns, there is still a performance gap compared to supervised models trained on the full training dataset. We further explore how to bridge the performance gap by integrating active learning during the fine-tuning process. Active learning strives to reduce annotation costs by choosing the most critical examples to label and can address the weakness of self-supervised learning with a small annotated dataset. We show that by combining self-supervised learning with active learning, our model is able to match state-of-the-art performance with just about 10% of the original training data on both the commonly used English Switchboard test set and a set of in-house annotated Chinese data.
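The construction of pseudo training data by randomly adding or deleting words can be sketched as follows. This is only an illustration of the general idea under stated assumptions: the noise probabilities, the filler vocabulary, and the helper names `add_noise`, `delete_noise`, and `make_examples` are hypothetical and not taken from the paper.

```python
import random
from typing import Dict, List, Tuple


def add_noise(tokens: List[str], vocab: List[str], rng: random.Random,
              p_add: float = 0.15) -> Tuple[List[str], List[int]]:
    """Insert random words; tag each inserted word 1 and each original word 0."""
    noisy: List[str] = []
    tags: List[int] = []
    for tok in tokens:
        if rng.random() < p_add:
            noisy.append(rng.choice(vocab))
            tags.append(1)
        noisy.append(tok)
        tags.append(0)
    return noisy, tags


def delete_noise(tokens: List[str], rng: random.Random,
                 p_del: float = 0.15) -> List[str]:
    """Drop words at random, keeping at least one token."""
    kept = [tok for tok in tokens if rng.random() >= p_del]
    return kept if kept else list(tokens[:1])


def make_examples(sentence: str, vocab: List[str], seed: int = 0) -> Dict[str, object]:
    """Build one tagging example and two sentence-classification examples."""
    rng = random.Random(seed)
    tokens = sentence.split()
    noisy, tags = add_noise(tokens, vocab, rng)
    shortened = delete_noise(tokens, rng)
    return {
        # Task (i): detect which words were added.
        "tagging": (noisy, tags),
        # Task (ii): original sentence (0) versus corrupted sentence (1).
        "classification": [(tokens, 0), (shortened, int(shortened != tokens))],
    }


examples = make_examples("i want a flight to boston", ["uh", "well", "the"])
print(examples["tagging"])
print(examples["classification"])
```

Because the labels come for free from the corruption step, this kind of data can be generated at any scale from unlabeled text, which is what makes it usable for pre-training.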


Repetitive Reprediction Deep Decipher for Semi-Supervised Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6082 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6170-6177

Author(s):

Guo-Hua Wang ◽

Jianxin Wu

Keyword(s):

Deep Learning ◽

Supervised Learning ◽

Large Scale ◽

State Of The Art ◽

Link Function ◽

Training Strategy ◽

Network Parameters ◽

Theoretical Support ◽

Percentage Points ◽

End To End

Most recent semi-supervised deep learning (deep SSL) methods use a similar paradigm: use network predictions to update pseudo-labels and use pseudo-labels to update network parameters iteratively. However, they lack theoretical support and cannot explain why predictions are good candidates for pseudo-labels. In this paper, we propose a principled end-to-end framework named deep decipher (D2) for SSL. Within the D2 framework, we prove that pseudo-labels are related to network predictions by an exponential link function, which provides theoretical support for using predictions as pseudo-labels. Furthermore, we demonstrate that updating pseudo-labels by network predictions will make them uncertain. To mitigate this problem, we propose a training strategy called repetitive reprediction (R2). Finally, the proposed R2-D2 method is tested on the large-scale ImageNet dataset and outperforms state-of-the-art methods by 5 percentage points.


“In-Network Ensemble”: Deep Ensemble Learning with Diversified Knowledge Distillation

ACM Transactions on Intelligent Systems and Technology ◽

10.1145/3473464 ◽

2021 ◽

Vol 12 (5) ◽

pp. 1-19

Author(s):

Xingjian Li ◽

Haoyi Xiong ◽

Zeyu Chen ◽

Jun Huan ◽

Cheng-Zhong Xu ◽

...

Keyword(s):

Supervised Learning ◽

Transfer Learning ◽

Ensemble Learning ◽

Large Scale ◽

Ensemble Classifier ◽

Network Architectures ◽

Deep Convolutional Neural Networks ◽

Knowledge Distillation ◽

Real World Datasets ◽

Improved Accuracy

Ensemble learning is a widely used technique to train deep convolutional neural networks (CNNs) for improved robustness and accuracy. While existing algorithms usually first train multiple diversified networks and then assemble these networks as an aggregated classifier, we propose a novel learning paradigm, namely, “In-Network Ensemble” (INE), that incorporates the diversity of multiple models through training a single deep neural network. Specifically, INE segments the outputs of the CNN into multiple independent classifiers, where each classifier is further fine-tuned for better accuracy through a so-called diversified knowledge distillation process. We then aggregate the fine-tuned independent classifiers using an Averaging-and-Softmax operator to obtain the final ensemble classifier. Note that, in the supervised learning setting, INE starts the CNN training from random initialization, while, in the transfer learning setting, it can also start with a pre-trained model to incorporate the knowledge learned from additional datasets. Extensive experiments were conducted using eight large-scale real-world datasets, including CIFAR, ImageNet, and Stanford Cars, among others, as well as common deep network architectures such as VGG, ResNet, and Wide ResNet. We evaluated the method on two tasks: supervised learning and transfer learning. The results show that INE outperforms the state-of-the-art algorithms for deep ensemble learning with improved accuracy.
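One plausible reading of the aggregation step is sketched below: the flat output vector of a single network is cut into equal segments, one per classifier, the per-class scores are averaged across segments, and a softmax turns the averages into probabilities. The abstract does not spell out the operator, so the segment layout, the function name `averaging_and_softmax`, and the example scores are assumptions.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def averaging_and_softmax(outputs: List[float], num_heads: int) -> List[float]:
    """Split outputs into num_heads equal segments, average per class, apply softmax."""
    if num_heads <= 0 or len(outputs) % num_heads != 0:
        raise ValueError("output length must be a positive multiple of num_heads")
    num_classes = len(outputs) // num_heads
    heads = [outputs[i * num_classes:(i + 1) * num_classes] for i in range(num_heads)]
    mean_scores = [sum(head[c] for head in heads) / num_heads for c in range(num_classes)]
    return softmax(mean_scores)


# Made-up scores: three classifiers (segments), three classes each.
raw = [2.0, 0.0, 1.0,
       1.0, 0.0, 3.0,
       0.0, 1.0, 2.0]
probs = averaging_and_softmax(raw, num_heads=3)
print(probs)
```

Averaging before the softmax keeps the result a single probability distribution, and it lets one segment's strong disagreement be outvoted by the others.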


A Self-Adaptive Hidden Markov Model for Emotion Classification in Chinese Microblogs

Mathematical Problems in Engineering ◽

10.1155/2015/987189 ◽

2015 ◽

Vol 2015 ◽

pp. 1-8 ◽

Cited By ~ 7

Author(s):

Li Liu ◽

Dashi Luo ◽

Ming Liu ◽

Jun Zhong ◽

Ye Wei ◽

...

Keyword(s):

Markov Model ◽

Hidden Markov Model ◽

Large Scale ◽

Hidden Markov ◽

Training Dataset ◽

Support Vector ◽

Emotion Classification ◽

Online Social Media ◽

Hidden Knowledge ◽

Self Adaptive

Microblogging is increasingly becoming one of the most popular online social media for people to express ideas and emotions. The amount of socially generated content from this medium is enormous. Text mining techniques have been intensively applied to discover the hidden knowledge and emotions in this huge dataset. In this paper, we propose a modified version of the hidden Markov model (HMM) classifier, called self-adaptive HMM, whose parameters are optimized by Particle Swarm Optimization algorithms. Since manually labeling a large-scale dataset is difficult, we also employ entropy to decide whether a new unlabeled tweet should be added to the training dataset after being assigned an emotion using our HMM-based approach. In the experiment, we collected about 200,000 Chinese tweets from Sina Weibo. The results show that the F-score of our approach reaches 76% on happiness and fear and 65% on anger, surprise, and sadness. In addition, the self-adaptive HMM classifier outperforms Naive Bayes and Support Vector Machine on recognition of happiness, anger, and sadness.
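An entropy-based admission rule of this kind can be sketched as follows. The abstract does not give the criterion, so the direction of the test (low entropy means the tweet is accepted), the threshold of 1.0 bit, and the function names are assumptions; the probabilities are made up.

```python
import math
from typing import List


def entropy(probs: List[float]) -> float:
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def accept_for_training(probs: List[float], max_entropy: float = 1.0) -> bool:
    """Accept a pseudo-labeled tweet only if the classifier's posterior is peaked."""
    return entropy(probs) <= max_entropy


# Posterior over five emotions: happiness, fear, anger, surprise, sadness.
confident = [0.90, 0.025, 0.025, 0.025, 0.025]
undecided = [0.20, 0.20, 0.20, 0.20, 0.20]

print(accept_for_training(confident))  # peaked distribution, low entropy
print(accept_for_training(undecided))  # uniform distribution, entropy log2(5)
```

A uniform posterior over five classes has the maximum entropy of log2(5), about 2.32 bits, so any threshold below that rejects tweets the classifier cannot distinguish at all.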


Sentiment classification of social media reviews using an ensemble classifier

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v16.i1.pp355-363 ◽

2019 ◽

Vol 16 (1) ◽

pp. 355 ◽

Cited By ~ 1

Author(s):

Savita Sangam ◽

Subhash Shinde

Keyword(s):

Social Media ◽

Opinion Mining ◽

Ensemble Classifier ◽

Sentiment Classification ◽

Support Vector ◽

Business Organizations ◽

Text Data ◽

Proposed Model ◽

Show Business ◽

Use Of Social Media

These days it has become common practice for business organizations and individuals to use social media for sharing opinions about products or services. Consumers are also ready to share their views on certain products or commodities. Thus, a huge amount of unstructured social media data is generated day by day. Gradually, a heap of text data will form in many areas such as automated business, education, health care, and show business. Opinion mining, also referred to as sentiment analysis or sentiment classification, deals with mining review text and classifying the opinions or sentiments of that text as positive or negative. In this paper, we propose an ensemble classifier model consisting of a Support Vector Machine and an Artificial Neural Network. It combines the knowledge from two feature sets for sentiment classification. The proposed model shows acceptable performance in terms of accuracy when compared with the baseline model.


Severity Assessment of COVID-19 Using a CT-Based Radiomics Model

Stem Cells International ◽

10.1155/2021/2263469 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Zhigao Xu ◽

Lili Zhao ◽

Guoqiang Yang ◽

Ying Ren ◽

Jinlong Wu ◽

...

Keyword(s):

Large Scale ◽

Operating Characteristic ◽

Fixed Ratio ◽

Roc Curves ◽

Classification Model ◽

Training Dataset ◽

Support Vector ◽

Svm Classifier ◽

Test Dataset ◽

Ct Features

The coronavirus disease of 2019 (COVID-19) has evolved into a worldwide pandemic. Although CT is sensitive in detecting lesions and assessing their severity, this work mainly depends on radiologists’ subjective judgment, which is inefficient in case of a large-scale outbreak. This work focuses on developing a CT-based radiomics model to assess whether COVID-19 patients are in the early, progressive, severe, or absorption stages of the disease. We retrospectively analyzed the CT images of 284 COVID-19 patients. All of the patients were divided into four groups (0-3): early (n = 75), progressive (n = 58), severe (n = 75), and absorption (n = 76) groups, according to the progression of the disease and the CT features. They were then split randomly into training and test datasets with a fixed ratio of 7:3 in each category. Thirty-eight radiomic features were selected from 1688 radiomic features using the select K-best method and the ElasticNet algorithm. On this basis, a support vector machine (SVM) classifier was trained to build the model. Receiver operating characteristic (ROC) curves were generated to determine the diagnostic performance of various models. The precision, recall, and F1-score of the classification model of macro- and microaverage were 0.82, 0.82, 0.81, 0.81, 0.81, and 0.81 for the training dataset and 0.75, 0.73, 0.73, 0.72, 0.72, and 0.72 for the test dataset. The AUCs for groups 0, 1, 2, and 3 on the training dataset were 0.99, 0.97, 0.96, and 0.93, and the microaverage AUC was 0.97 with a macroaverage AUC of 0.97. On the test dataset, AUCs for each group were 0.97, 0.86, 0.83, and 0.89 and the microaverage AUC was 0.89 with a macroaverage AUC of 0.90. The CT-based radiomics model proved efficacious in assessing the severity of COVID-19.
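The difference between macro- and micro-averaged metrics reported for a multi-class model can be shown with a short sketch. The confusion matrix below is made up for illustration and is not the study's data; the function name `macro_micro` is likewise an assumption.

```python
from typing import Dict, List


def macro_micro(conf: List[List[int]]) -> Dict[str, float]:
    """Macro- and micro-averaged precision/recall from a confusion matrix.

    conf[i][j] counts samples whose true class is i and predicted class is j.
    """
    k = len(conf)
    tp = [conf[i][i] for i in range(k)]
    fp = [sum(conf[r][i] for r in range(k)) - tp[i] for i in range(k)]  # column sum minus diagonal
    fn = [sum(conf[i]) - tp[i] for i in range(k)]                       # row sum minus diagonal

    def safe_div(num: float, den: float) -> float:
        return num / den if den else 0.0

    precision = [safe_div(tp[i], tp[i] + fp[i]) for i in range(k)]
    recall = [safe_div(tp[i], tp[i] + fn[i]) for i in range(k)]
    return {
        # Macro: average the per-class scores, so every class counts equally.
        "macro_precision": sum(precision) / k,
        "macro_recall": sum(recall) / k,
        # Micro: pool the counts first, so every sample counts equally.
        "micro_precision": safe_div(sum(tp), sum(tp) + sum(fp)),
        "micro_recall": safe_div(sum(tp), sum(tp) + sum(fn)),
    }


# Made-up three-class confusion matrix (rows: true class, columns: predicted class).
matrix = [
    [5, 1, 0],
    [2, 3, 1],
    [0, 0, 4],
]
scores = macro_micro(matrix)
print(scores)
```

When every sample has exactly one label, micro-averaged precision and recall both reduce to overall accuracy, which is why the two micro figures in such reports usually coincide.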


Non-Blind Image Deconvolution Based on “Ringing” Removal Using Convolutional Neural Network

Electronic Imaging ◽

10.2352/issn.2470-1173.2020.10.ipas-180 ◽

2020 ◽

Vol 2020 (10) ◽

pp. 181-1-181-7

Author(s):

Takahiro Kudo ◽

Takanori Fujisawa ◽

Takuro Yamaguchi ◽

Masaaki Ikehara

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Network Architecture ◽

Large Scale ◽

Blind Deconvolution ◽

Training Dataset ◽

Image Deconvolution ◽

Classic Problem ◽

Key Points ◽

Blind Image

Image deconvolution has been an important issue recently. It has two kinds of approaches: non-blind and blind. Non-blind deconvolution is a classic problem of image deblurring, which assumes that the PSF is known and spatially invariant. Recently, Convolutional Neural Networks (CNNs) have been used for non-blind deconvolution. Though CNNs can deal with complex changes for unknown images, some conventional CNN-based methods can only handle small PSFs and do not consider the use of large PSFs in the real world. In this paper, we propose a non-blind deconvolution framework based on a CNN that can remove large-scale ringing in a deblurred image. Our method has three key points. The first is that our network architecture is able to preserve both large and small features in the image. The second is that the training dataset is created to preserve the details. The third is that we extend the images to minimize the effects of large ringing on the image borders. In our experiments, we used three kinds of large PSFs and were able to observe high-precision results from our method both quantitatively and qualitatively.


DeepSSPred: A Deep Learning Based Sulfenylation site predictor via a novel n-segmented optimize federated feature encoder

Protein and Peptide Letters ◽

10.2174/0929866527666201202103411 ◽

2020 ◽

Vol 27 ◽

Author(s):

Zaheer Ullah Khan ◽

Dechang Pi

Keyword(s):

Large Scale ◽

Computational Models ◽

Research Work ◽

Training Data ◽

Training Dataset ◽

Validation Dataset ◽

Cytokine Signaling ◽

Minority Class ◽

Independent Dataset ◽

Feature Encoding

Background: S-sulfenylation (S-sulphenylation, or sulfenic acid) of proteins is a special kind of post-translational modification that plays an important role in various physiological and pathological processes such as cytokine signaling, transcriptional regulation, and apoptosis. Given this significance, and to complement existing wet-lab methods, several computational models have been developed for predicting sulfenylation cysteine sites. However, the performance of these models was not satisfactory due to inefficient feature schemes, severe imbalance issues, and the lack of an intelligent learning engine. Objective: In this study, our motivation is to establish a strong and novel computational predictor for discriminating sulfenylation from non-sulfenylation sites. Methods: We report an innovative bioinformatics feature encoding tool, named DeepSSPred, in which the encoded features are obtained via an n-segmented hybrid feature, and the resampling technique called synthetic minority oversampling is then employed to cope with the severe imbalance between SC-sites (minority class) and non-SC sites (majority class). A state-of-the-art 2D Convolutional Neural Network was employed with a rigorous 10-fold jackknife cross-validation technique for model validation and authentication. Results: Following the proposed framework, a strong discrete presentation of the feature space, the machine learning engine, and an unbiased presentation of the underlying training data yielded an excellent model that outperforms all existing established studies. The proposed approach is 6% higher in terms of MCC than the first best. On an independent dataset, the existing first best study failed to provide sufficient details. The model obtained an increase of 7.5% in accuracy, 1.22% in Sn, 12.91% in Sp, and 13.12% in MCC on the training data and 12.13% in ACC, 27.25% in Sn, 2.25% in Sp, and 30.37% in MCC on an independent dataset in comparison with the second-best method. These empirical analyses show the superior performance of the proposed model over both the training and independent datasets in comparison with existing studies in the literature. Conclusion: In this research, we have developed a novel sequence-based automated predictor for SC-sites, called DeepSSPred. The empirical simulation outcomes with a training dataset and an independent validation dataset have revealed the efficacy of the proposed theoretical model. The good performance of DeepSSPred is due to several reasons, such as novel discriminative feature encoding schemes, the SMOTE technique, and careful construction of the prediction model through the tuned 2D-CNN classifier. We believe that our research work will provide potential insight into further prediction of S-sulfenylation characteristics and functionalities. Thus, we hope that our developed predictor will be significantly helpful for large-scale discrimination of unknown SC-sites in particular and for designing new pharmaceutical drugs in general.
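The core step of synthetic minority oversampling (SMOTE) is to create new minority samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below shows only that step in plain Python; it is not the authors' pipeline, and the function name `smote_like`, the neighbour count, and the toy feature vectors are assumptions.

```python
import random
from typing import List


def smote_like(minority: List[List[float]], n_new: int, k: int = 2,
               seed: int = 0) -> List[List[float]]:
    """Generate n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)

    def dist2(a: List[float], b: List[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic: List[List[float]] = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the chosen sample, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist2(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy two-dimensional minority class (made-up feature vectors).
minority_points = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_like(minority_points, n_new=6)
print(new_points)
```

Each synthetic point lies on a line segment between two real minority samples, so the oversampled class fills in its own region of feature space instead of simply duplicating existing points.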


ABC-Gly: identifying protein lysine glycation sites with artificial bee colony algorithm

Current Proteomics ◽

10.2174/1570164617666191227120136 ◽

2019 ◽

Vol 17 ◽

Author(s):

Yanqiu Yao ◽

Xiaosa Zhao ◽

Qiao Ning ◽

Junping Zhou

Keyword(s):

Support Vector Machine ◽

Amino Acid ◽

Artificial Bee Colony Algorithm ◽

Artificial Bee Colony ◽

Training Dataset ◽

Support Vector ◽

Supplementary File ◽

Feature Subset ◽

Lipid Molecule ◽

Bee Colony

Background: Glycation is a nonenzymatic post-translational modification in which a sugar molecule attaches to a protein or lipid molecule. It may impair the function and change the characteristics of proteins, which may lead to some metabolic diseases. To understand the underlying molecular mechanisms of glycation, computational prediction methods have been developed because of their convenience and high speed. However, developing a more effective computational tool remains a challenging task in computational biology. Methods: In this study, we present an accurate identification tool named ABC-Gly for predicting lysine glycation sites. First, we used three informative features (position-specific amino acid propensity, secondary structure, and the composition of k-spaced amino acid pairs) to encode the peptides. Moreover, to exploit discriminative features sufficiently and thereby improve the prediction and generalization ability of the model, we developed a two-step feature selection that combines the Fisher score with an improved binary artificial bee colony algorithm based on a support vector machine. Finally, based on the optimal feature subset, we constructed the model using a Support Vector Machine on the training dataset. Results: In 10-fold cross-validation on the training dataset, the proposed predictor ABC-Gly achieved a sensitivity of 76.43%, a specificity of 91.10%, a balanced accuracy of 83.76%, an area under the receiver operating characteristic curve (AUC) of 0.9313, and a Matthews correlation coefficient (MCC) of 0.6861; on the independent dataset, it achieved a balanced accuracy of 59.05%. Compared to the state-of-the-art predictors on the training dataset, the proposed predictor achieved significant improvements of 0.156 in AUC and 0.336 in MCC. Conclusion: The detailed analysis results indicate that our predictor may serve as a powerful complementary tool to other existing methods for predicting protein lysine glycation. The source code and datasets of ABC-Gly are provided in Supplementary File 1.
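The reported metrics are related by standard formulas, which a short sketch makes concrete. Balanced accuracy is the mean of sensitivity and specificity, and the mean of the reported 76.43% and 91.10% is 83.765%, which agrees with the reported 83.76% to within rounding. The MCC formula is the standard one over confusion-matrix counts; the counts in the test calls are made up, since the abstract does not report them.

```python
import math


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Mean of sensitivity and specificity (same units as the inputs)."""
    return (sensitivity + specificity) / 2


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Sensitivity and specificity as reported in the abstract, in percent.
bal_acc = balanced_accuracy(76.43, 91.10)
print(round(bal_acc, 3))

# Made-up counts: a perfect predictor and a perfectly wrong one.
print(mcc(tp=50, tn=50, fp=0, fn=0))
print(mcc(tp=0, tn=0, fp=5, fn=5))
```

Balanced accuracy and MCC are both preferred over plain accuracy when classes are imbalanced, because neither can be inflated by always predicting the majority class.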


Application of Machine Learning in Animal Disease Analysis and Prediction

Current Bioinformatics ◽

10.2174/1574893615999200728195613 ◽

2020 ◽

Vol 15 ◽

Author(s):

Shuwen Zhang ◽

Qiang Su ◽

Qin Chen

Keyword(s):

Machine Learning ◽

Unsupervised Learning ◽

Supervised Learning ◽

Clustering Algorithm ◽

Principal Component ◽

Support Vector ◽

Animal Disease ◽

Human Beings ◽

Animal Diseases ◽

Disease Analysis

Abstract: Major animal diseases pose a great threat to animal husbandry and human beings. With the deepening of globalization and the abundance of data resources, the prediction and analysis of animal diseases using big data are becoming more and more important. The focus of machine learning is to make computers learn from data and use the learned experience to analyze and predict. This paper first introduces the animal epidemic situation and machine learning. It then briefly introduces the application of machine learning in animal disease analysis and prediction. Machine learning is mainly divided into supervised learning and unsupervised learning. Supervised learning includes support vector machines, naive Bayes, decision trees, random forests, logistic regression, artificial neural networks, deep learning, and AdaBoost. Unsupervised learning includes the maximum expectation algorithm, principal component analysis, hierarchical clustering algorithms, and maxent. Through the discussion in this paper, readers can gain a clearer concept of machine learning and understand its application prospects in animal diseases.
