A Hybrid BERT Model That Incorporates Label Semantics via Adjustive Attention for Multi-Label Text Classification

GILE: A Generalized Input-Label Embedding for Text Classification

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00259 ◽

2019 ◽

Vol 7 ◽

pp. 139-155 ◽

Cited By ~ 1

Author(s):

Nikolaos Pappas ◽

James Henderson

Keyword(s):

Text Classification ◽

Joint Space ◽

Classification Performance ◽

Cross Entropy ◽

Categorical Variables ◽

Classification Models ◽

Set Size ◽

Nonlinear Input ◽

Label Semantics

Neural text classification models typically treat output labels as categorical variables that lack description and semantics. This forces their parametrization to be dependent on the label set size, and, hence, they are unable to scale to large label sets and generalize to unseen ones. Existing joint input-label text models overcome these issues by exploiting label descriptions, but they are unable to capture complex label relationships, have rigid parametrization, and their gains on unseen labels happen often at the expense of weak performance on the labels seen during training. In this paper, we propose a new input-label model that generalizes over previous such models, addresses their limitations, and does not compromise performance on seen labels. The model consists of a joint nonlinear input-label embedding with controllable capacity and a joint-space-dependent classification unit that is trained with cross-entropy loss to optimize classification performance. We evaluate models on full-resource and low- or zero-resource text classification of multilingual news and biomedical text with a large label set. Our model outperforms monolingual and multilingual models that do not leverage label semantics and previous joint input-label space models in both scenarios.
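The abstract above gives no equations, so the following is only an illustrative sketch of the general idea: project the encoded input and the label-description embeddings into a shared joint space, combine them, and score each label with a classification unit whose parameters do not depend on the number of labels. The class name `JointInputLabelScorer`, the `tanh` projections, the element-wise product, and all dimensions are assumptions chosen for illustration, not the authors' exact formulation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class JointInputLabelScorer:
    """Hypothetical joint input-label scorer (illustration only)."""

    def __init__(self, d_input, d_label, d_joint, rng):
        # d_joint controls the capacity of the joint space.
        self.U = rng.normal(scale=0.1, size=(d_input, d_joint))  # input projection
        self.V = rng.normal(scale=0.1, size=(d_label, d_joint))  # label projection
        self.w = rng.normal(scale=0.1, size=d_joint)             # classification unit
        self.b = 0.0

    def n_params(self):
        # No term here depends on how many labels are scored.
        return self.U.size + self.V.size + self.w.size + 1

    def score(self, h, E):
        """h: (d_input,) encoded text; E: (n_labels, d_label) label descriptions."""
        h_joint = np.tanh(h @ self.U)        # (d_joint,)
        e_joint = np.tanh(E @ self.V)        # (n_labels, d_joint)
        joint = e_joint * h_joint            # broadcast over labels
        return sigmoid(joint @ self.w + self.b)  # (n_labels,) probabilities


def binary_cross_entropy(p, y, eps=1e-9):
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))


rng = np.random.default_rng(0)
model = JointInputLabelScorer(d_input=16, d_label=8, d_joint=4, rng=rng)

h = rng.normal(size=16)              # stand-in for an encoded document
E_seen = rng.normal(size=(5, 8))     # stand-in embeddings of 5 training labels
E_unseen = rng.normal(size=(3, 8))   # 3 labels never seen in training

seen_probs = model.score(h, E_seen)
unseen_probs = model.score(h, E_unseen)  # same parameters, new labels
loss = binary_cross_entropy(seen_probs, np.array([1, 0, 0, 1, 0]))
```

Because labels enter only through their description embeddings, the same weights can score a label set of any size, which is the property the abstract contrasts with models that treat labels as categorical variables.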


Correlation-Guided Representation for Multi-Label Text Classification

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/463 ◽

2021 ◽

Author(s):

Qian-Wen Zhang ◽

Ximing Zhang ◽

Zhao Yan ◽

Ruifang Liu ◽

Yunbo Cao ◽

...

Keyword(s):

Language Processing ◽

Text Classification ◽

Low Frequency ◽

Classification Performance ◽

Categorical Variables ◽

Text Representation ◽

Label Semantics ◽

Higher Weights ◽

Label Correlations ◽

Text Information

Multi-label text classification is an essential task in natural language processing. Existing multi-label classification models generally consider labels as categorical variables and ignore the exploitation of label semantics. In this paper, we view the task as a correlation-guided text representation problem: an attention-based two-step framework is proposed to integrate text information and label semantics by jointly learning words and labels in the same space. In this way, we aim to capture high-order label-label correlations as well as context-label correlations. Specifically, the proposed approach works by learning token-level representations of words and labels globally through a multi-layer Transformer and constructing an attention vector through a word-label correlation matrix to generate the text representation. It ensures that relevant words receive higher weights than irrelevant words and thus directly optimizes the classification performance. Extensive experiments over benchmark multi-label datasets clearly validate the effectiveness of the proposed approach, and further analysis demonstrates that it is competitive in both predicting low-frequency labels and convergence speed.
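The second step described in the abstract (a word-label correlation matrix turned into an attention vector over words) can be sketched as follows. This is a minimal illustration under stated assumptions: the word and label vectors are hand-written stand-ins for Transformer outputs, and max-pooling over labels followed by a softmax is one plausible way to obtain the attention vector, not necessarily the one the paper uses. The function name `label_guided_text_representation` is hypothetical.

```python
import numpy as np


def softmax(x):
    z = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return z / z.sum()


def label_guided_text_representation(word_vecs, label_vecs):
    """word_vecs: (n_words, d); label_vecs: (n_labels, d), in the same space."""
    corr = word_vecs @ label_vecs.T   # (n_words, n_labels) word-label correlations
    word_scores = corr.max(axis=1)    # assumption: strongest label match per word
    attn = softmax(word_scores)       # attention vector over words
    text_repr = attn @ word_vecs      # (d,) attention-weighted sum of word vectors
    return attn, text_repr


# Toy vectors: word 0 aligns with the single label, words 1 and 2 barely do.
word_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [0.1, 0.1]])
label_vecs = np.array([[2.0, 0.0]])

attn, text_repr = label_guided_text_representation(word_vecs, label_vecs)
```

In this toy case the word most correlated with the label receives the largest attention weight, which mirrors the abstract's claim that relevant words receive higher weights than irrelevant ones.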


Hierarchy-aware Label Semantics Matching Network for Hierarchical Text Classification

10.18653/v1/2021.acl-long.337 ◽

2021 ◽

Author(s):

Haibin Chen ◽

Qianli Ma ◽

Zhenxi Lin ◽

Jiangyue Yan

Keyword(s):

Text Classification ◽

Matching Network ◽

Label Semantics ◽

Hierarchical Text Classification


Leveraging Accident Investigation Reports as Leading Indicators of Construction Safety Using Text Classification

Construction Research Congress 2020 ◽

10.1061/9780784482872.053 ◽

2020 ◽

Author(s):

Shraddha Shrestha ◽

Syed Ahnaf Morshed ◽

Nipesh Pradhananga ◽

Xuan Lv

Keyword(s):

Text Classification ◽

Construction Safety ◽

Leading Indicators ◽

Accident Investigation


A Brief Survey on Text Classification Using Various Machine Learning Techniques

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v8i1.521 ◽

2018 ◽

Vol 8 (1) ◽

pp. 14

Author(s):

Padmavathi S. ◽

M. Chidambaram

Keyword(s):

Machine Learning ◽

Text Classification ◽

Fixed Number ◽

Machine Learning Techniques ◽

Online Information ◽

Rule Based ◽

Learning Techniques ◽

Machine Learning Approach ◽

Rule Based Approach

Text classification has become increasingly important for managing and organizing text data, owing to the tremendous growth of online information. It assigns documents to a fixed number of predefined categories. There are two approaches to text classification: rule-based and machine learning. In the rule-based approach, documents are classified according to manually defined rules. In the machine learning approach, the classification rules (the classifier) are learned automatically from example documents. The machine learning approach offers higher recall and faster processing. This paper surveys text classification using different machine learning techniques.
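The contrast between the two approaches can be made concrete with a small sketch. The keyword rules, the four training sentences, and the choice of a multinomial Naive Bayes classifier with add-one smoothing are all assumptions made for illustration; the abstract does not name a specific rule set or learning algorithm.

```python
import math
from collections import Counter, defaultdict

# Rule-based: a person writes the rules by hand (hypothetical keyword lists).
RULES = {"sports": {"match", "goal"}, "finance": {"stock", "bank"}}


def rule_based_classify(text, rules=RULES, default="unknown"):
    tokens = set(text.lower().split())
    for label, keywords in rules.items():
        if tokens & keywords:
            return label
    return default


# Machine learning: the classifier is estimated from labeled example documents.
def train_naive_bayes(examples):
    word_counts = defaultdict(Counter)  # label -> token counts
    doc_counts = Counter()              # label -> number of documents
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def naive_bayes_classify(text, model):
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_logp = None, -math.inf
    for label in doc_counts:
        logp = math.log(doc_counts[label] / total_docs)  # class prior
        total = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok in vocab:  # tokens never seen in training are skipped
                # add-one (Laplace) smoothing
                logp += math.log((word_counts[label][tok] + 1) / (total + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


examples = [
    ("the team scored a late goal", "sports"),
    ("the striker won the match", "sports"),
    ("the bank raised interest rates", "finance"),
    ("stock prices fell sharply", "finance"),
]
nb_model = train_naive_bayes(examples)

rule_label = rule_based_classify("the weather is nice")
learned_label = naive_bayes_classify("interest rates rose again", nb_model)
```

The rule-based classifier returns the default for any text that matches none of its hand-written keywords, while the learned classifier can still use words such as "interest" and "rates" that appeared only in the training examples.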


Improved text classification algorithm for spam filtering based on CABSOFV

Future Computer and Information Technology ◽

10.2495/icfcit131301 ◽

2013 ◽

Cited By ~ 1

Author(s):

G. Y. Wei ◽

L. Zou ◽

J. Pan

Keyword(s):

Text Classification ◽

Classification Algorithm ◽

Spam Filtering


Survey of Feature Selection and Text Classification Methods for Genetic Mutation Classification

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i4.933937 ◽

2019 ◽

Vol 7 (4) ◽

pp. 933-937

Author(s):

Varun Saproo ◽

Rujuta Upadhyay ◽

Manisha Valera

Keyword(s):

Feature Selection ◽

Text Classification ◽

Genetic Mutation ◽

Classification Methods


Deep Learning for text in limited data settings

10.36227/techrxiv.12100692 ◽

2020 ◽

Author(s):

Pathikkumar Patel ◽

Bhargav Lad ◽

Jinan Fiaidhi

Keyword(s):

Machine Learning ◽

Time Series ◽

Deep Learning ◽

Sentiment Analysis ◽

Transfer Learning ◽

Text Classification ◽

State Of The Art ◽

Time Series Forecasting ◽

Text Data ◽

Performance Levels

During the last few years, RNN models have been extensively used and they have proven to be better for sequence and text data. RNNs have achieved state-of-the-art performance levels in several applications such as text classification, sequence to sequence modelling and time series forecasting. In this article we will review different Machine Learning and Deep Learning based approaches for text data and look at the results obtained from these methods. This work also explores the use of transfer learning in NLP and how it affects the performance of models on a specific application of sentiment analysis.
