Application of the Polyhedral Conic Functions Method in the Text Classification and Comparative Analysis

Scientific Programming ◽

10.1155/2018/5349284 ◽

2018 ◽

Vol 2018 ◽

pp. 1-11

Author(s):

Nur Uylaş Satı ◽

Burak Ordin

Keyword(s):

Text Classification ◽

Numerical Experiments ◽

Text Categorization ◽

Cross Validation ◽

State Of The Art ◽

Inequality Constraints ◽

Online Information ◽

Direct Proportion ◽

Art Methods ◽

Tenfold Cross Validation

In direct proportion to the rapid growth of online information, interest in text categorization (classification) has also increased. In the text categorization problem, also called text classification, the goal is to assign documents to predefined classes (categories or labels). Many data mining methods have recently been applied to text classification in the literature, but polyhedral conic function (PCF) methods have not. In this paper, PCFs are used to classify documents. Separation algorithms based on PCFs, which involve linear programming subproblems with inequality constraints, are presented. Numerical experiments are carried out on real-world text datasets. The PCF methods are compared with state-of-the-art methods, with tenfold cross-validation results, accuracy values, and running times reported in tables. The results indicate that, in text classification, PCF methods are as effective as state-of-the-art methods in terms of accuracy.
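The abstract does not state what a PCF looks like. The following is a minimal sketch, assuming the form commonly used in the PCF literature, g(x) = w·(x − a) + ξ‖x − a‖₁ − γ, where a is a center point and w, ξ, γ are the parameters that the linear programming subproblems would fit. The function names, the parameter layout, and the decision rule (a point is assigned to the target class when at least one PCF is non-positive) are illustrative assumptions, not the authors' implementation, and the LP fitting step is not shown.

```python
def pcf_value(x, w, xi, gamma, center):
    """Evaluate an assumed PCF: w.(x - a) + xi * ||x - a||_1 - gamma."""
    diff = [xj - cj for xj, cj in zip(x, center)]
    linear_part = sum(wj * dj for wj, dj in zip(w, diff))
    l1_part = sum(abs(dj) for dj in diff)
    return linear_part + xi * l1_part - gamma


def in_target_class(x, pcfs):
    """Assumed decision rule: x is in the target class if any PCF is <= 0.

    `pcfs` is a list of (w, xi, gamma, center) tuples (hypothetical layout).
    """
    return any(pcf_value(x, w, xi, gamma, center) <= 0
               for (w, xi, gamma, center) in pcfs)


# Hypothetical parameters: one PCF centered at (1, 2) with w = 0.
example_pcfs = [([0.0, 0.0], 1.0, 1.0, [1.0, 2.0])]
near = in_target_class([1.0, 2.0], example_pcfs)
far = in_target_class([3.0, 2.0], example_pcfs)
```

With w = 0 the sublevel set is an L1 ball of radius γ/ξ around the center, which is why the point at the center is accepted and the point two units away is rejected.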

State-of-the-art methods in healthcare text classification system AI paradigm

Frontiers in Bioscience ◽

10.2741/4826 ◽

2020 ◽

Vol 25 (4) ◽

pp. 646-672 ◽

Cited By ~ 1

Author(s):

Jasjit S Suri

Keyword(s):

Text Classification ◽

Classification System ◽

State Of The Art ◽

Art Methods

A Novel Approach Based on Point Cut Set to Predict Associations of Diseases and LncRNAs

Current Bioinformatics ◽

10.2174/1574893613666181026122045 ◽

2019 ◽

Vol 14 (4) ◽

pp. 333-343 ◽

Cited By ~ 3

Author(s):

Linai Kuang ◽

Haochen Zhao ◽

Lei Wang ◽

Zhanwei Xuan ◽

Tingrui Pei

Keyword(s):

Cross Validation ◽

State Of The Art ◽

Interaction Network ◽

Research Field ◽

Computational Method ◽

Difference Matrix ◽

Art Methods ◽

Disease Associations ◽

Cut Set ◽

Fold Cross Validation

Background: In recent years, growing evidence has indicated that long non-coding RNAs (lncRNAs) play vital roles in a wide range of human diseases and can serve as potential biomarkers and drug targets. Although a vast number of lncRNAs have been found, the relationships between lncRNAs and diseases remain largely unknown. Objective: The prediction of novel and potential associations between lncRNAs and diseases would help to dissect the complex mechanisms of disease pathogenesis. Method: In this paper, a new computational method based on Point Cut Set is proposed to predict LncRNA-Disease Associations (PCSLDA) based on known lncRNA-disease associations. Compared with the existing state-of-the-art methods, the major novelty of PCSLDA lies in the incorporation of the distance difference matrix and the point cut set to set the distance correlation coefficient of nodes in the lncRNA-disease interaction network. Hence, PCSLDA can be applied to forecast potential lncRNA-disease associations while requiring only known disease-lncRNA associations. Results: Simulation results show that PCSLDA can significantly outperform previous state-of-the-art methods, with a reliable AUC of 0.8902 in leave-one-out cross-validation and AUCs of 0.7634 and 0.8317 in 5-fold and 10-fold cross-validation, respectively. In addition, 70% of the top 10 predicted cancer-lncRNA associations can be confirmed. Conclusion: It is anticipated that our proposed model can be a great addition to the biomedical research field.
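The AUC values quoted above summarize how well predicted association scores rank known associations above unknown ones. As a minimal sketch of how such a number can be computed, the snippet below uses the pairwise-ranking definition of AUC (the fraction of positive/negative pairs ordered correctly, with ties counted as one half). The function name and the toy labels and scores are hypothetical and are not taken from the paper.

```python
def auc_score(labels, scores):
    """AUC as the probability that a positive outranks a negative.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of predicted scores, same length as labels.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # pair ranked correctly
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical data): 3 of the 4 positive/negative pairs
# are ordered correctly.
toy_auc = auc_score([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.5])
```

In a leave-one-out or k-fold setting, the scores passed in would be those produced for held-out associations; the quadratic pair loop is fine for illustration but a rank-based formula is preferable for large inputs.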

Temporal Feature Aggregation for Text Classification Based on Ensembled Deep-Learning Models

International Journal of Future Computer and Communication ◽

10.18178/ijfcc.2021.10.2.575 ◽

2021 ◽

pp. 23-28

Author(s):

Jiali Yu ◽

Zhiliang Qin ◽

Linghao Lin ◽

Yu Qin ◽

...

Keyword(s):

Deep Learning ◽

Text Classification ◽

Cross Validation ◽

State Of The Art ◽

The State ◽

Learning Models ◽

Feature Aggregation ◽

Validation Score ◽

The Cross ◽

Temporal Feature

In this paper, we focus on the text classification task, which is one of the most important tasks in the area of Natural Language Processing (NLP). We propose an innovative convolutional neural network (CNN) model to perform temporal feature aggregation (TFA) effectively, which has a highly representative capacity to extract sequential features from vectorized numerical embeddings. First, we feed embedded vectors into a bi-directional LSTM (Bi-LSTM) model to capture the contextual information of each word. Afterwards, we propose to use state-of-the-art deep-learning models as key components of the architecture, i.e., the Xception model and the WaveNet model, to extract temporal features from deep convolutional layers concurrently. To facilitate effective feature fusion, we concatenate the outputs of the two component models before forwarding them to a drop-out layer to alleviate over-fitting and subsequently to a fully-connected dense layer to perform the final classification of input texts. Experiments demonstrate that the proposed method achieves performance comparable to the state-of-the-art models with significantly lower computational complexity. Our approach obtains a cross-validation score of 95.83% for the Quora Insincere Question Classification (QIQC) dataset and a cross-validation score of 83.10% for the Spooky Author Identification (SAI) dataset, which are among the best published results. The proposed method can be readily generalized to signal processing tasks, e.g., environmental sound classification (ESC) and machine fault analysis (MFA).

A Topical Word Embeddings for Text Classification

10.5753/eniac.2018.4401 ◽

2018 ◽

Author(s):

João Marcos Carvalho Lima ◽

José Everardo Bessa Maia

Keyword(s):

Text Classification ◽

Text Categorization ◽

Topic Model ◽

State Of The Art ◽

Bag Of Words ◽

Word Embeddings ◽

Document Representation ◽

Linear Classifier ◽

Low Dimensionality ◽

Nonlinear Classifier

This paper presents an approach that uses topic models based on LDA to represent documents in text categorization problems. The document representation is achieved through the cosine similarity between document embeddings and embeddings of topic words, creating a Bag-of-Topics (BoT) variant. The performance of this approach is compared against those of two other representations: BoW (Bag-of-Words) and Topic Model, both based on standard tf-idf. Also, to reveal the effect of the classifier, we compared the performance of the nonlinear classifier SVM against that of the linear classifier Naive Bayes, taken as baseline. To evaluate the approach we use two datasets, one multi-label (RCV-1) and another single-label (20 Newsgroup). The model presents significant results with low dimensionality when compared to the state of the art.
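The Bag-of-Topics idea described above can be illustrated with a short sketch: each document is represented by its cosine similarity to one embedding per topic, so the feature vector has as many dimensions as there are topics. How the paper builds the document and topic-word embeddings is not stated in the abstract, so the vectors below are hypothetical placeholders and the function names are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors (assumption)
    return dot / (norm_u * norm_v)


def bag_of_topics(doc_embedding, topic_embeddings):
    """One feature per topic: similarity of the document to that topic."""
    return [cosine_similarity(doc_embedding, t) for t in topic_embeddings]


# Hypothetical 2-dimensional embeddings for one document and two topics.
features = bag_of_topics([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]])
```

The resulting vector is low-dimensional compared with a tf-idf Bag-of-Words representation, since its length depends on the number of topics rather than on the vocabulary size.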

MEDA: Meta-Learning with Data Augmentation for Few-Shot Text Classification

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/541 ◽

2021 ◽

Author(s):

Pengfei Sun ◽

Yawen Ouyang ◽

Wenming Zhang ◽

Xin-yu Dai

Keyword(s):

Text Classification ◽

Data Augmentation ◽

State Of The Art ◽

Promising Technique ◽

Learning Methods ◽

Text Data ◽

Art Methods ◽

Visual Tasks ◽

Meta Learning

Meta-learning has recently emerged as a promising technique to address the challenge of few-shot learning. However, standard meta-learning methods mainly focus on visual tasks, which makes it hard for them to deal with diverse text data directly. In this paper, we introduce a novel framework for few-shot text classification, named MEta-learning with Data Augmentation (MEDA). MEDA is composed of two modules, a ball generator and a meta-learner, which are learned jointly. The ball generator increases the number of shots per class by generating more samples, so that the meta-learner can be trained with both original and augmented samples. It is worth noting that the ball generator is agnostic to the choice of meta-learning method. Experimental results show that on both datasets, MEDA outperforms existing state-of-the-art methods and significantly improves the performance of meta-learning on few-shot text classification.

A Brief Survey on Text Classification Using Various Machine Learning Techniques

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v8i1.521 ◽

2018 ◽

Vol 8 (1) ◽

pp. 14

Author(s):

Padmavathi .S ◽

M. Chidambaram

Keyword(s):

Machine Learning ◽

Text Classification ◽

Fixed Number ◽

Machine Learning Techniques ◽

Online Information ◽

Rule Based ◽

Learning Techniques ◽

Machine Learning Approach ◽

Rule Based Approach

Text classification has become more significant in managing and organizing text data due to the tremendous growth of online information. It classifies documents into a fixed number of predefined categories. The rule-based approach and the machine learning approach are the two ways of performing text classification. In the rule-based approach, documents are classified based on manually defined rules. In the machine learning approach, the classification rules, or the classifier, are defined automatically using example documents. It has higher recall and is a quick process. This paper presents an investigation of text classification using different machine learning techniques.

Deep Learning for text in limited data settings

10.36227/techrxiv.12100692 ◽

2020 ◽

Author(s):

Pathikkumar Patel ◽

Bhargav Lad ◽

Jinan Fiaidhi

Keyword(s):

Machine Learning ◽

Time Series ◽

Deep Learning ◽

Sentiment Analysis ◽

Transfer Learning ◽

Text Classification ◽

State Of The Art ◽

Time Series Forecasting ◽

Text Data ◽

Performance Levels

During the last few years, RNN models have been extensively used and they have proven to be better for sequence and text data. RNNs have achieved state-of-the-art performance levels in several applications such as text classification, sequence to sequence modelling and time series forecasting. In this article we will review different Machine Learning and Deep Learning based approaches for text data and look at the results obtained from these methods. This work also explores the use of transfer learning in NLP and how it affects the performance of models on a specific application of sentiment analysis.

Multi-hop assortativities for network classification

Journal of Complex Networks ◽

10.1093/comnet/cny034 ◽

2018 ◽

Vol 7 (4) ◽

pp. 603-622 ◽

Cited By ~ 1

Author(s):

Leonardo Gutiérrez-Gómez ◽

Jean-Charles Delvenne

Keyword(s):

Machine Learning ◽

Scientific Collaboration ◽

State Of The Art ◽

Medical Engineering ◽

Research Field ◽

Classification Task ◽

Collaboration Network ◽

Structural Patterns ◽

Art Methods

Several social, medical, engineering and biological challenges rely on discovering the functionality of networks from their structure and node metadata, when it is available. For example, in chemoinformatics one might want to detect whether a molecule is toxic based on structure and atomic types, or discover the research field of a scientific collaboration network. Existing techniques rely on counting or measuring structural patterns that are known to show large variations from network to network, such as the number of triangles, or the assortativity of node metadata. We introduce the concept of multi-hop assortativity, which captures the similarity of the nodes situated at the extremities of a randomly selected path of a given length. We show that multi-hop assortativity unifies various existing concepts and offers a versatile family of ‘fingerprints’ to characterize networks. These fingerprints in turn allow the functionalities of a network to be recovered with the help of the machine learning toolbox. Our method is evaluated empirically on established social and chemoinformatic network benchmarks. Results reveal that our assortativity-based features are competitive, providing highly accurate results and often outperforming state-of-the-art methods for the network classification task.

Automatic Detection of Discrimination Actions from Social Images

Electronics ◽

10.3390/electronics10030325 ◽

2021 ◽

Vol 10 (3) ◽

pp. 325

Author(s):

Zhihao Wu ◽

Baopeng Zhang ◽

Tianchen Zhou ◽

Yan Li ◽

Jianping Fan

Keyword(s):

Action Recognition ◽

State Of The Art ◽

Automatic Detection ◽

Experimental Results ◽

Practical Approach ◽

Detection And Identification ◽

Art Methods ◽

Image Set ◽

Social Images ◽

Relationship Identification

In this paper, we developed a practical approach for automatic detection of discrimination actions from social images. Firstly, an image set is established, in which various discrimination actions and relations are manually labeled. To the best of our knowledge, this is the first work to create a dataset for discrimination action recognition and relationship identification. Secondly, a practical approach is developed to achieve automatic detection and identification of discrimination actions and relationships from social images. Thirdly, the task of relationship identification is seamlessly integrated with the task of discrimination action recognition into one single network called the Co-operative Visual Translation Embedding++ network (CVTransE++). We also compared our proposed method with numerous state-of-the-art methods, and our experimental results demonstrated that our proposed methods can significantly outperform state-of-the-art approaches.

RLC-GNN: An Improved Deep Architecture for Spatial-Based Graph Neural Network with Application to Fraud Detection

Applied Sciences ◽

10.3390/app11125656 ◽

2021 ◽

Vol 11 (12) ◽

pp. 5656

Author(s):

Yufan Zeng ◽

Jiashan Tang

Keyword(s):

Numerical Experiments ◽

State Of The Art ◽

Single Layer ◽

Fraud Detection ◽

Layer By Layer ◽

Residual Structure ◽

Detection Algorithms ◽

Deep Architecture ◽

Graph Neural Networks ◽

Node Embeddings

Graph neural networks (GNNs) have been very successful at solving fraud detection tasks. GNN-based detection algorithms learn node embeddings by aggregating neighboring information. Recently, CAmouflage-REsistant GNN (CARE-GNN) was proposed, and this algorithm achieves state-of-the-art results on fraud detection tasks by dealing with relation camouflages and feature camouflages. However, stacking multiple layers in the traditional way defined by hop leads to a rapid performance drop. As the single-layer CARE-GNN cannot extract more information to fix potential mistakes, its performance relies heavily on that single layer. To avoid single-layer learning, in this paper we consider a multi-layer architecture that can form a complementary relationship with a residual structure. We propose an improved algorithm named Residual Layered CARE-GNN (RLC-GNN). The new algorithm learns progressively, layer by layer, and corrects mistakes continuously. We choose three metrics (recall, AUC, and F1-score) to evaluate the proposed algorithm. Numerical experiments are conducted. We obtain up to 5.66%, 7.72%, and 9.09% improvements in recall, AUC, and F1-score, respectively, on the Yelp dataset. Moreover, we also obtain up to 3.66%, 4.27%, and 3.25% improvements in the same three metrics on the Amazon dataset.
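Two of the three metrics named above, recall and F1-score, follow directly from counts of true positives, false positives, and false negatives. The snippet below is a minimal sketch of those standard definitions for a binary fraud / non-fraud labeling; the function name and the toy labels are hypothetical and unrelated to the Yelp or Amazon results.

```python
def recall_and_f1(y_true, y_pred):
    """Return (recall, F1) for binary labels where 1 marks the fraud class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard against empty denominators by returning 0.0 (a convention).
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall == 0.0:
        return recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return recall, f1


# Toy example (hypothetical labels): one true positive, one false
# positive, one false negative.
toy_recall, toy_f1 = recall_and_f1([1, 1, 0, 0], [1, 0, 1, 0])
```

Because fraud datasets are typically imbalanced, recall and F1 on the fraud class are more informative than plain accuracy, which is presumably why they are reported alongside AUC.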