Zero-Shot Learning for Cross-Lingual News Sentiment Classification

Andraž Pelicon; Marko Pranjić; Dragana Miljković; Blaž Škrlj; Senja Pollak

doi:10.3390/app10175993

Applied Sciences ◽

10.3390/app10175993 ◽

2020 ◽

Vol 10 (17) ◽

pp. 5993

Author(s):

Andraž Pelicon ◽

Marko Pranjić ◽

Dragana Miljković ◽

Blaž Škrlj ◽

Senja Pollak

Keyword(s):

Classification System ◽

State Of The Art ◽

Sentiment Classification ◽

Training Data ◽

Test Set ◽

Novel Technique ◽

Analysis Task ◽

Negative News ◽

Cross Lingual ◽

News Sentiment

In this paper, we address the task of zero-shot cross-lingual news sentiment classification. Given the annotated dataset of positive, neutral, and negative news in Slovene, the aim is to develop a news classification system that assigns the sentiment category not only to Slovene news, but also to news in another language without any training data required. Our system is based on the multilingual BERT model, while we test different approaches for handling long documents and propose a novel technique for sentiment enrichment of the BERT model as an intermediate training step. With the proposed approach, we achieve state-of-the-art performance on the sentiment analysis task on Slovenian news. We evaluate the zero-shot cross-lingual capabilities of our system on a novel news sentiment test set in Croatian. The results show that the cross-lingual approach also largely outperforms the majority classifier, as well as all settings without sentiment enrichment in pre-training.
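The majority classifier used as a reference point above can be sketched in a few lines. This is a minimal illustration, assuming the majority label is taken from the training labels and scored by plain accuracy; the toy labels and the function name `majority_baseline_accuracy` are hypothetical and not taken from the paper.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy obtained by always predicting the most frequent training label."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for y in test_labels if y == majority_label)
    return majority_label, correct / len(test_labels)


# Hypothetical toy labels, not the Slovene or Croatian data.
train = ["neutral", "neutral", "negative", "positive", "neutral"]
test = ["neutral", "negative", "neutral", "positive"]

label, acc = majority_baseline_accuracy(train, test)
print(label, acc)  # neutral 0.5
```

Any trained classifier should beat this number before its cross-lingual results are considered meaningful.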

Distributional Correspondence Indexing for Cross-Lingual and Cross-Domain Sentiment Classification (Extended Abstract)

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/802 ◽

2018 ◽

Author(s):

Alejandro Moreo Fernández ◽

Andrea Esuli ◽

Fabrizio Sebastiani

Keyword(s):

Domain Adaptation ◽

State Of The Art ◽

Sentiment Classification ◽

Training Data ◽

Target Domain ◽

Source Domain ◽

Machine Learning Methods ◽

Cross Domain ◽

Current State ◽

Cross Lingual

Domain Adaptation (DA) techniques aim at enabling machine learning methods to learn effective classifiers for a “target” domain when the only available training data belongs to a different “source” domain. In this extended abstract, we briefly describe our new DA method called Distributional Correspondence Indexing (DCI) for sentiment classification. DCI derives term representations in a vector space common to both domains where each dimension reflects its distributional correspondence to a pivot, i.e., to a highly predictive term that behaves similarly across domains. The experiments we have conducted show that DCI obtains better performance than current state-of-the-art techniques for cross-lingual and cross-domain sentiment classification.

Distributional Correspondence Indexing for Cross-Lingual and Cross-Domain Sentiment Classification.

Journal of Artificial Intelligence Research ◽

10.1613/jair.4762 ◽

2016 ◽

Vol 55 ◽

pp. 131-163 ◽

Cited By ~ 13

Author(s):

Alejandro Moreo Fernández ◽

Andrea Esuli ◽

Fabrizio Sebastiani

Keyword(s):

Domain Adaptation ◽

State Of The Art ◽

Computational Cost ◽

Sentiment Classification ◽

Training Data ◽

Target Domain ◽

Machine Learning Methods ◽

Cross Domain ◽

Current State ◽

Cross Lingual

Domain Adaptation (DA) techniques aim at enabling machine learning methods to learn effective classifiers for a "target" domain when the only available training data belongs to a different "source" domain. In this paper we present the Distributional Correspondence Indexing (DCI) method for domain adaptation in sentiment classification. DCI derives term representations in a vector space common to both domains where each dimension reflects its distributional correspondence to a pivot, i.e., to a highly predictive term that behaves similarly across domains. Term correspondence is quantified by means of a distributional correspondence function (DCF). We propose a number of efficient DCFs that are motivated by the distributional hypothesis, i.e., the hypothesis according to which terms with similar meaning tend to have similar distributions in text. Experiments show that DCI obtains better performance than current state-of-the-art techniques for cross-lingual and cross-domain sentiment classification. DCI also brings about a significantly reduced computational cost, and requires a smaller amount of human intervention. As a final contribution, we discuss a more challenging formulation of the domain adaptation problem, in which both the cross-domain and cross-lingual dimensions are tackled simultaneously.
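The idea of representing each term by its correspondence to a set of pivots can be illustrated with a small sketch. It assumes cosine similarity between binary document-occurrence vectors as the DCF and hand-picked pivots; the abstract does not state which DCFs or which pivot selection procedure the authors use, so the function names, the toy documents, and the pivot list below are all hypothetical.

```python
import math


def occurrence_sets(docs):
    """Map each term to the set of document indices in which it appears."""
    index = {}
    for i, doc in enumerate(docs):
        for term in set(doc.split()):
            index.setdefault(term, set()).add(i)
    return index


def cosine_dcf(docs_a, docs_b):
    """Cosine between two binary document-occurrence vectors (assumed DCF)."""
    if not docs_a or not docs_b:
        return 0.0
    return len(docs_a & docs_b) / math.sqrt(len(docs_a) * len(docs_b))


def term_profile(term, pivots, index):
    """One dimension per pivot: the term's correspondence to that pivot."""
    term_docs = index.get(term, set())
    return [cosine_dcf(term_docs, index.get(p, set())) for p in pivots]


# Hypothetical toy domains; pivots are chosen by hand for illustration.
books = ["excellent gripping story", "excellent gripping plot",
         "awful dull story", "awful dull plot"]
kitchen = ["excellent sturdy blender", "excellent sturdy kettle",
           "awful flimsy blender", "awful flimsy kettle"]
pivots = ["excellent", "awful"]

books_index = occurrence_sets(books)
kitchen_index = occurrence_sets(kitchen)

# Profiles are computed within each domain but live in the same pivot space.
gripping = term_profile("gripping", pivots, books_index)
sturdy = term_profile("sturdy", pivots, kitchen_index)
print(gripping, sturdy)  # [1.0, 0.0] [1.0, 0.0]
```

In this toy setting the domain-specific terms "gripping" and "sturdy" never co-occur, yet they receive the same profile because they behave alike with respect to the pivots, which is what lets a classifier trained on one domain be applied to the other.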

Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00039 ◽

2018 ◽

Vol 6 ◽

pp. 557-570 ◽

Cited By ~ 23

Author(s):

Xilun Chen ◽

Yu Sun ◽

Ben Athiwaratkun ◽

Claire Cardie ◽

Kilian Weinberger

Keyword(s):

State Of The Art ◽

Classification Problem ◽

Sentiment Classification ◽

Great Success ◽

Source Language ◽

Shared Feature ◽

Low Resource ◽

Feature Extractor ◽

Cross Lingual ◽

Averaging Network

In recent years great success has been achieved in sentiment classification for English, thanks in part to the availability of copious annotated resources. Unfortunately, most languages do not enjoy such an abundance of labeled data. To tackle the sentiment classification problem in low-resource languages without adequate annotated data, we propose an Adversarial Deep Averaging Network (ADAN) to transfer the knowledge learned from labeled data on a resource-rich source language to low-resource languages where only unlabeled data exist. ADAN has two discriminative branches: a sentiment classifier and an adversarial language discriminator. Both branches take input from a shared feature extractor to learn hidden representations that are simultaneously indicative for the classification task and invariant across languages. Experiments on Chinese and Arabic sentiment classification demonstrate that ADAN significantly outperforms state-of-the-art systems.

Reinforced Transformer with Cross-Lingual Distillation for Cross-Lingual Aspect Sentiment Classification

Electronics ◽

10.3390/electronics10030270 ◽

2021 ◽

Vol 10 (3) ◽

pp. 270

Author(s):

Hanqian Wu ◽

Zhike Wang ◽

Feng Qing ◽

Shoushan Li

Keyword(s):

General Purpose ◽

Sentiment Classification ◽

Training Data ◽

Target Language ◽

Source Language ◽

Domain Specific ◽

Novel Approach ◽

The Rich ◽

Target Languages ◽

Cross Lingual

Though great progress has been made in the Aspect-Based Sentiment Analysis (ABSA) task through research, most of the previous work focuses on English-based ABSA problems, and there are few efforts on other languages, mainly due to the lack of training data. In this paper, we propose an approach for performing a Cross-Lingual Aspect Sentiment Classification (CLASC) task which leverages the rich resources in one language (source language) for aspect sentiment classification in an under-resourced language (target language). Specifically, we first build a bilingual lexicon for domain-specific training data to translate the aspect category annotated in the source-language corpus and then translate sentences from the source language to the target language via Machine Translation (MT) tools. However, because most MT systems are general-purpose, they inevitably introduce translation ambiguities which would degrade the performance of CLASC. In this context, we propose a novel approach called Reinforced Transformer with Cross-Lingual Distillation (RTCLD) combined with target-sensitive adversarial learning to minimize the undesirable effects of translation ambiguities in sentence translation. We conduct experiments on different language combinations, treating English as the source language and Chinese, Russian, and Spanish as target languages. The experimental results show that our proposed approach outperforms the state-of-the-art methods on different target languages.

CORE: Automatic Molecule Optimization Using Copy & Refine Strategy

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i01.5404 ◽

2020 ◽

Vol 34 (01) ◽

pp. 638-645

Author(s):

Tianfan Fu ◽

Cao Xiao ◽

Jimeng Sun

Keyword(s):

Success Rate ◽

State Of The Art ◽

Optimization Methods ◽

Training Data ◽

Large Set ◽

Test Set ◽

Zinc Database ◽

Tree Generation ◽

Complete Test ◽

Adversarial Training

Molecule optimization is about generating molecule Y with more desirable properties based on an input molecule X. The state-of-the-art approaches partition the molecules into a large set of substructures S and grow the new molecule structure by iteratively predicting which substructure from S to add. However, since the set of available substructures S is large, such an iterative prediction task is often inaccurate, especially for substructures that are infrequent in the training data. To address this challenge, we propose a new generating strategy called “Copy&Refine” (CORE), where at each step the generator first decides whether to copy an existing substructure from input X or to generate a new substructure, then the most promising substructure will be added to the new molecule. Combined with scaffolding tree generation and adversarial training, CORE can significantly improve several recent molecule optimization methods in various measures including drug likeness (QED), dopamine receptor (DRD2) and penalized LogP. We tested CORE and baselines using the ZINC database, and CORE obtained up to 11% and 21% relative improvement over the baselines on success rate on the complete test set and the subset with infrequent substructures, respectively.

Cross-lingual sentiment classification: Similarity discovery plus training data adjustment

Knowledge-Based Systems ◽

10.1016/j.knosys.2016.06.004 ◽

2016 ◽

Vol 107 ◽

pp. 129-141 ◽

Cited By ~ 7

Author(s):

Peng Zhang ◽

Suge Wang ◽

Deyu Li

Keyword(s):

Sentiment Classification ◽

Training Data ◽

Cross Lingual ◽

Data Adjustment

A Distributed Representation-Based Framework for Cross-Lingual Transfer Parsing

Journal of Artificial Intelligence Research ◽

10.1613/jair.4955 ◽

2016 ◽

Vol 55 ◽

pp. 995-1023 ◽

Cited By ~ 1

Author(s):

Jiang Guo ◽

Wanxiang Che ◽

David Yarowsky ◽

Haifeng Wang ◽

Ting Liu

Keyword(s):

Vector Space ◽

State Of The Art ◽

Error Reduction ◽

Training Data ◽

Distributed Representation ◽

Feature Representations ◽

Lexical Feature ◽

Model Transfer ◽

Target Languages ◽

Cross Lingual

This paper investigates the problem of cross-lingual transfer parsing, aiming at inducing dependency parsers for low-resource languages while using only training data from a resource-rich language (e.g., English). Existing model transfer approaches typically don't include lexical features, which are not transferable across languages. In this paper, we bridge the lexical feature gap by using distributed feature representations and their composition. We provide two algorithms for inducing cross-lingual distributed representations of words, which map vocabularies from two different languages into a common vector space. Consequently, both lexical features and non-lexical features can be used in our model for cross-lingual transfer. Furthermore, our framework is flexible enough to incorporate additional useful features such as cross-lingual word clusters. Our combined contributions achieve an average relative error reduction of 10.9% in labeled attachment score as compared with the delexicalized parser, trained on English universal treebank and transferred to three other languages. It also significantly outperforms state-of-the-art delexicalized models augmented with projected cluster features on identical data. Finally, we demonstrate that our models can be further boosted with minimal supervision (e.g., 100 annotated sentences) from target languages, which is of great significance for practical usage.
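The "relative error reduction" figure quoted above is usually read as the fraction of the baseline's error that the new model removes. The sketch below assumes the error is one minus the labeled attachment score; the abstract does not spell out the computation, and the scores used here are hypothetical, not the paper's.

```python
def relative_error_reduction(baseline_score, new_score):
    """Fraction of the baseline error removed, for scores in [0, 1]."""
    baseline_error = 1.0 - baseline_score
    new_error = 1.0 - new_score
    return (baseline_error - new_error) / baseline_error


# Hypothetical scores: a baseline at 0.60 and a new model at 0.65.
rer = relative_error_reduction(0.60, 0.65)
print(f"{rer:.1%}")  # 12.5%
```

A five-point absolute gain therefore counts as a 12.5% relative error reduction when the baseline error is 40%, which is why relative figures look larger than the absolute score differences behind them.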

Improving Semi-Supervised Learning for Audio Classification with FixMatch

Electronics ◽

10.3390/electronics10151807 ◽

2021 ◽

Vol 10 (15) ◽

pp. 1807

Author(s):

Sascha Grollmisch ◽

Estefanía Cano

Keyword(s):

Neural Networks ◽

Supervised Learning ◽

Transfer Learning ◽

Data Transfer ◽

State Of The Art ◽

Training Data ◽

Audio Classification ◽

Image Domain ◽

Full Dataset ◽

Audio Data

Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. The commonality between recent SSL methods is that they strongly rely on the augmentation of unannotated data. This is largely unexplored for audio data. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks, including music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNN) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the CNN baseline performance using the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only for the most challenging dataset from acoustic scene classification, showing that there is still room for improvement.
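The reliance on augmentation mentioned above can be made concrete with the unlabeled-data loss from the general FixMatch formulation: a prediction on a weakly augmented input becomes a pseudo-label only if it is confident enough, and the model is then penalised on the strongly augmented view of the same input. The sketch works on already computed class probabilities; the 0.95 threshold is the default of the original FixMatch method, and whether this audio study uses the same value is an assumption. The function name and toy probabilities are hypothetical.

```python
import math


def fixmatch_unlabeled_loss(weak_probs, strong_probs, threshold=0.95):
    """Mean cross-entropy on strongly augmented views, counted only where the
    weakly augmented prediction is confident enough to act as a pseudo-label."""
    total = 0.0
    for weak, strong in zip(weak_probs, strong_probs):
        confidence = max(weak)
        if confidence >= threshold:
            pseudo_label = weak.index(confidence)
            total += -math.log(strong[pseudo_label])
    # Average over the whole batch, so masked-out examples contribute zero.
    return total / len(weak_probs)


# Hypothetical class probabilities for two unlabeled clips and three classes.
weak = [[0.97, 0.02, 0.01], [0.50, 0.30, 0.20]]
strong = [[0.50, 0.25, 0.25], [0.10, 0.80, 0.10]]

loss = fixmatch_unlabeled_loss(weak, strong)
print(round(loss, 4))  # 0.3466
```

Only the first clip passes the threshold, so the second contributes nothing; this masking is what keeps low-confidence pseudo-labels from steering training early on.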

Transcription Alignment of Historical Vietnamese Manuscripts without Human-Annotated Learning Samples

Applied Sciences ◽

10.3390/app11114894 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4894

Author(s):

Anna Scius-Bertrand ◽

Michael Jungo ◽

Beat Wolf ◽

Andreas Fischer ◽

Marc Bui

Keyword(s):

Object Detection ◽

State Of The Art ◽

Positive Impact ◽

Detection System ◽

Training Data ◽

Detection Accuracy ◽

Current State ◽

Alignment Task ◽

Scanned Image ◽

Automatic Transcription

The current state of the art for automatic transcription of historical manuscripts is typically limited by the requirement of human-annotated learning samples, which are necessary to train specific machine learning models for specific languages and scripts. Transcription alignment is a simpler task that aims to find a correspondence between text in the scanned image and its existing Unicode counterpart, a correspondence which can then be used as training data. The alignment task can be approached with heuristic methods dedicated to certain types of manuscripts, or with weakly trained systems reducing the required amount of annotations. In this article, we propose a novel learning-based alignment method based on fully convolutional object detection that does not require any human annotation at all. Instead, the object detection system is initially trained on synthetic printed pages using a font and then adapted to the real manuscripts by means of self-training. On a dataset of historical Vietnamese handwriting, we demonstrate the feasibility of annotation-free alignment as well as the positive impact of self-training on the character detection accuracy, reaching a detection accuracy of 96.4% with a YOLOv5m model without using any human annotation.

New polyp image classification technique using transfer learning of network-in-network structure in endoscopic images

Scientific Reports ◽

10.1038/s41598-021-83199-9 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Young Jae Kim ◽

Jang Pyo Bae ◽

Jun-Won Chung ◽

Dong Kyun Park ◽

Kwang Gi Kim ◽

...

Keyword(s):

Colorectal Cancer ◽

Transfer Learning ◽

Test Data ◽

State Of The Art ◽

Early Stage ◽

Statistical Significance ◽

Recall Rate ◽

Training Data ◽

Fine Tuning ◽

Accuracy Evaluation

Colorectal cancer, which occurs in the gastrointestinal tract, is the third most common of 27 major types of cancer in South Korea and worldwide. Colorectal polyps are known to increase the potential of developing colorectal cancer. Detected polyps need to be resected to reduce the risk of developing cancer. This research improved the performance of polyp classification through the fine-tuning of Network-in-Network (NIN) after applying a pre-trained model of the ImageNet database. Random shuffling is performed 20 times on 1000 colonoscopy images. Each set of data is divided into 800 images of training data and 200 images of test data. An accuracy evaluation is performed on 200 images of test data in 20 experiments. Three compared methods were constructed from AlexNet by transferring the weights trained on three different state-of-the-art databases. A normal AlexNet based method without transfer learning was also compared. The accuracy of the proposed method was higher, with statistical significance, than the accuracy of four other state-of-the-art methods, and showed an 18.9% improvement over the normal AlexNet based method. The area under the curve was approximately 0.930 ± 0.020, and the recall rate was 0.929 ± 0.029. An automatic algorithm can assist endoscopists in identifying polyps that are adenomatous by considering a high recall rate and accuracy. This system can enable the timely resection of polyps at an early stage.
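The evaluation protocol described above (20 random shuffles of 1000 images, each split into 800 training and 200 test images) can be sketched as follows. This is an illustration only: the fixed seed, the function name `repeated_splits`, and the use of integer indices in place of images are assumptions, and the abstract does not say how the shuffling was implemented.

```python
import random


def repeated_splits(n_items=1000, n_train=800, n_repeats=20, seed=0):
    """Return n_repeats (train, test) index splits, each from a fresh shuffle."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    indices = list(range(n_items))
    splits = []
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        splits.append((shuffled[:n_train], shuffled[n_train:]))
    return splits


splits = repeated_splits()
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 20 800 200
```

Training and scoring a model once per split yields 20 accuracy values, from which a mean and a spread such as the "±" figures reported above can be computed.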
