AraSenCorpus: A Semi-Supervised Approach for Sentiment Annotation of a Large Arabic Text Corpus

Ali Al-Laith; Muhammad Shahbaz; Hind F. Alaskar; Asim Rehmat

doi:10.3390/app11052434

Applied Sciences ◽

10.3390/app11052434 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2434

Author(s):

Ali Al-Laith ◽

Muhammad Shahbaz ◽

Hind F. Alaskar ◽

Asim Rehmat

Keyword(s):

Short Term Memory ◽

State Of The Art ◽

Arabic Text ◽

Short Term ◽

Learning Classifier ◽

Learning Technique ◽

Benchmark Datasets ◽

Long Short Term Memory ◽

Self Learning ◽

Modern Standard

At a time when research in sentiment analysis tends to study advanced topics in languages such as English, other languages such as Arabic still suffer from basic problems and challenges, most notably the lack of large corpora. Furthermore, manual annotation is time-consuming and difficult when the corpus is very large. This paper presents a semi-supervised self-learning technique to extend an Arabic sentiment-annotated corpus with unlabeled data. We use a neural network to train a set of models on a manually labeled dataset containing 15,000 tweets. We then use these models to extend the corpus into a large Arabic sentiment corpus called “AraSenCorpus”. AraSenCorpus contains 4.5 million tweets and covers both Modern Standard Arabic and some of the Arabic dialects. A long short-term memory (LSTM) deep learning classifier is trained and tested on the final corpus. We evaluate our proposed framework on two external benchmark datasets to confirm the improvement in Arabic sentiment classification. The experimental results show that our corpus outperforms the existing state-of-the-art systems.
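The self-learning idea in this abstract (train on a small labeled seed, label the unlabeled pool, keep only confident predictions, retrain) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the paper trains neural models, whereas this sketch swaps in a tiny naive Bayes classifier to stay self-contained, and the names `train`, `predict`, `self_train` and the confidence `threshold` are all hypothetical.

```python
from collections import Counter
import math

LABELS = ("pos", "neg")

def train(examples):
    """Fit a tiny naive Bayes model on (tokens, label) pairs (stand-in for the paper's neural models)."""
    counts = {label: Counter() for label in LABELS}
    docs = Counter()
    for tokens, label in examples:
        counts[label].update(tokens)
        docs[label] += 1
    vocab = set().union(*counts.values())
    return counts, docs, vocab

def predict(model, tokens):
    """Return (best_label, posterior probability of that label)."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    log_scores = {}
    for label in LABELS:
        total = sum(counts[label].values())
        score = math.log((docs[label] + 1) / (total_docs + len(LABELS)))
        for tok in tokens:
            # Add-one smoothing; the extra +1 reserves mass for unseen tokens.
            score += math.log((counts[label][tok] + 1) / (total + len(vocab) + 1))
        log_scores[label] = score
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())

def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Grow the labeled set with confident pseudo-labels; threshold is an assumed hyperparameter."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        confident, remaining = [], []
        for tokens in pool:
            label, conf = predict(model, tokens)
            if conf >= threshold:
                confident.append((tokens, label))
            else:
                remaining.append(tokens)
        if not confident:
            break  # nothing new to add, so further rounds would not change the model
        labeled.extend(confident)
        pool = remaining
    return labeled, pool

seed = [(["good", "great"], "pos"), (["bad", "awful"], "neg")]
unlabeled = [["great", "movie"], ["awful", "movie"], ["movie"]]
corpus, leftover = self_train(seed, unlabeled, threshold=0.6)
```

The threshold trades corpus size against label noise: a lower value admits more pseudo-labeled items per round but risks reinforcing early mistakes.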

Learning Object Context for Dense Captioning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33018650 ◽

2019 ◽

Vol 33 ◽

pp. 8650-8657 ◽

Cited By ~ 1

Author(s):

Xiangyang Li ◽

Shuqiang Jiang ◽

Jungong Han

Keyword(s):

Short Term Memory ◽

State Of The Art ◽

Short Term ◽

Visual Elements ◽

Context Learning ◽

Learning Procedure ◽

Benchmark Datasets ◽

Long Short Term Memory ◽

Lstm Network ◽

Context Features

Dense captioning is a challenging task which not only detects visual elements in images but also generates natural language sentences to describe them. Previous approaches do not leverage object information in images for this task. However, objects provide valuable cues to help predict the locations of caption regions as caption regions often highly overlap with objects (i.e. caption regions are usually parts of objects or combinations of them). Meanwhile, objects also provide important information for describing a target caption region as the corresponding description not only depicts its properties, but also involves its interactions with objects in the image. In this work, we propose a novel scheme with an object context encoding Long Short-Term Memory (LSTM) network to automatically learn complementary object context for each caption region, transferring knowledge from objects to caption regions. All contextual objects are arranged as a sequence and progressively fed into the context encoding module to obtain context features. Then both the learned object context features and region features are used to predict the bounding box offsets and generate the descriptions. The context learning procedure is in conjunction with the optimization of both location prediction and caption generation, thus enabling the object context encoding LSTM to capture and aggregate useful object context. Experiments on benchmark datasets demonstrate the superiority of our proposed approach over the state-of-the-art methods.
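The observation that caption regions often overlap heavily with objects can be made concrete with intersection over union (IoU). The sketch below is only an illustration under stated assumptions: boxes are `(x1, y1, x2, y2)` tuples, and ordering the contextual objects by increasing overlap is a hypothetical choice, since the abstract does not say how the object sequence is arranged.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def context_sequence(region, objects):
    """Arrange object boxes as a sequence, least to most overlapping (assumed ordering)."""
    return sorted(objects, key=lambda box: iou(region, box))

region = (0, 0, 2, 2)
objects = [(1, 1, 3, 3), (0, 0, 2, 2), (5, 5, 6, 6)]
ordered = context_sequence(region, objects)
```

A sequence like `ordered` is the kind of input that could be fed step by step into a context-encoding LSTM, with the most relevant object seen last.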

A Semi-supervised Approach for Sentiment Analysis of Arab(ic+izi) Messages: Application to the Algerian Dialect

SN Computer Science ◽

10.1007/s42979-021-00510-1 ◽

2021 ◽

Vol 2 (2) ◽

Author(s):

Imane Guellil ◽

Ahsan Adeel ◽

Faical Azouaou ◽

Fodil Benali ◽

Ala-Eddine Hachani ◽

...

Keyword(s):

Deep Learning ◽

Sentiment Analysis ◽

Short Term Memory ◽

Native Speakers ◽

State Of The Art ◽

Research Literature ◽

Short Term ◽

Long Short Term Memory ◽

The One ◽

Modern Standard

In this paper, we propose a semi-supervised approach for sentiment analysis of Arabic and its dialects. This approach is based on a sentiment corpus, constructed automatically and reviewed manually by native speakers of the Algerian dialect. It consists of constructing and applying a set of deep learning algorithms to classify the sentiment of Arabic messages as positive or negative. It was applied to Facebook messages written in Modern Standard Arabic (MSA) as well as in the Algerian dialect (DALG, a low-resource dialect spoken by more than 40 million people) in both scripts, Arabic and Arabizi. To handle Arabizi, we consider both options: transliteration (largely used in the research literature for handling Arabizi) and translation (never used in the research literature for handling Arabizi). To highlight the effectiveness of a semi-supervised approach, we carried out different experiments using both corpora for training (i.e. the corpus constructed automatically and the one that was reviewed manually). The experiments were done on many test corpora dedicated to MSA/DALG, which were proposed and evaluated in the research literature. Both shallow and deep learning classifiers are used, such as Random Forest (RF), Logistic Regression (LR), Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM). These classifiers are combined with word embedding models such as Word2vec and fastText for sentiment classification. Experimental results (F1 score up to 95% for intrinsic experiments and up to 89% for extrinsic experiments) showed that the proposed system outperforms the existing state-of-the-art methodologies (the best improvement is up to 25%).
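For readers unfamiliar with the metric quoted above, the F1 score is the harmonic mean of precision and recall. The following is a minimal sketch for a single positive class; the function name and the choice of `"pos"` as the positive label are assumptions, and the abstract does not state whether its figures are per-class or averaged.

```python
def f1_score(gold, predicted, positive="pos"):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

gold = ["pos", "pos", "neg", "neg"]
pred = ["pos", "neg", "pos", "neg"]
score = f1_score(gold, pred)
```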

A Two-Layer Long Short-Term Memory Network for Bottleneck Prediction in Multi-Job Manufacturing Systems

Volume 3: Manufacturing Equipment and Systems ◽

10.1115/msec2018-6678 ◽

2018 ◽

Cited By ~ 1

Author(s):

Xingjian Lai ◽

Huanyi Shui ◽

Jun Ni

Keyword(s):

Manufacturing Systems ◽

Short Term Memory ◽

Complex Dynamics ◽

State Of The Art ◽

Short Term ◽

Term Memory ◽

Future Production ◽

Effective Manner ◽

Long Short Term Memory ◽

Factory Floor

Throughput bottlenecks define and constrain the productivity of a production line. Prediction of future bottlenecks provides strong support for decision-making on the factory floor, helping to foresee and formulate appropriate actions before production to improve system throughput in a cost-effective manner. Bottleneck prediction remains a challenging task in the literature. The difficulty lies in the complex dynamics of manufacturing systems. Multiple factors jointly affect bottleneck conditions, such as machine performance, machine degradation, line structure, operator skill level, and product release schedules. These factors affect one another in a nonlinear manner and exhibit long-term temporal dependencies. State-of-the-art research utilizes various assumptions to simplify the modeling by reducing the input dimensionality. As a result, those models cannot accurately reflect the complex dynamics of the bottleneck in a manufacturing system. To tackle this problem, this paper proposes a systematic framework to design a two-layer Long Short-Term Memory (LSTM) network tailored to the dynamic bottleneck prediction problem in multi-job manufacturing systems. This neural-network-based approach takes advantage of historical high-dimensional factory floor data to predict system bottlenecks dynamically, considering future production planning inputs. The model is demonstrated with data from an automotive underbody assembly line. The results show that the proposed method can achieve higher prediction accuracy than current state-of-the-art approaches.

Arabic dialect sentiment analysis with ZERO effort. Case study: Algerian dialect

INTELIGENCIA ARTIFICIAL ◽

10.4114/intartif.vol23iss65pp124-135 ◽

2020 ◽

Vol 23 (65) ◽

pp. 124-135

Author(s):

Imane Guellil ◽

Marcelo Mendoza ◽

Faical Azouaou

Keyword(s):

Sentiment Analysis ◽

Short Term Memory ◽

State Of The Art ◽

Short Term ◽

Term Memory ◽

Ongoing Work ◽

Long Short Term Memory ◽

Large Corpus ◽

Unique Condition

This paper presents an analytic study showing that it is entirely possible to analyze the sentiment of an Arabic dialect without constructing any resources. The idea of this work is to use the resources dedicated to a given dialect X for analyzing the sentiment of another dialect Y. The only condition is that X and Y belong to the same category of dialects. We apply this idea to the Algerian dialect, a Maghrebi Arabic dialect that suffers from a shortage of the tools and other resources required for automatic sentiment analysis. To do this analysis, we rely on Maghrebi dialect resources and two manually annotated sentiment corpora, for the Tunisian and Moroccan dialects respectively. We also use a large corpus for the Maghrebi dialect. We use a state-of-the-art system and propose a new deep learning architecture for automatically classifying the sentiment of an Arabic dialect (the Algerian dialect). Experimental results show that the F1-score reaches up to 83%, achieved by a Multilayer Perceptron (MLP) with the Tunisian corpus and by Long Short-Term Memory (LSTM) with the combination of the Tunisian and Moroccan corpora. An improvement of 15% over the closest competitor was observed in this study. Ongoing work is aimed at manually constructing an annotated sentiment corpus for the Algerian dialect and comparing the results.

Natural language description of images using hybrid recurrent neural network

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v9i4.pp2932-2940 ◽

2019 ◽

Vol 9 (4) ◽

pp. 2932

Author(s):

Md. Asifuzzaman Jishan ◽

Khan Raqib Mahmud ◽

Abul Kalam Al Azad

Keyword(s):

Neural Network ◽

Natural Language ◽

Recurrent Neural Network ◽

Short Term Memory ◽

Text Line ◽

Short Term ◽

Word Representation ◽

Benchmark Datasets ◽

Long Short Term Memory ◽

Language Description

We present a learning model that generates natural language descriptions of images. The model utilizes the connections between natural language and visual data by producing text-line-based content from a given image. Our Hybrid Recurrent Neural Network model is based on the intricacies of Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Bi-directional Recurrent Neural Network (BRNN) models. We conducted experiments on three benchmark datasets: Flickr8K, Flickr30K, and MS COCO. Our hybrid model uses an LSTM to encode text lines or sentences independently of the object location and a BRNN for word representation; this reduced the computational complexity without compromising the accuracy of the descriptor. The model produced better accuracy in retrieving natural-language-based descriptions on these datasets.

An Optimized Abstractive Text Summarization Model Using Peephole Convolutional LSTM

Symmetry ◽

10.3390/sym11101290 ◽

2019 ◽

Vol 11 (10) ◽

pp. 1290 ◽

Cited By ~ 2

Author(s):

Rahman ◽

Siddiqui

Keyword(s):

Language Processing ◽

Short Term Memory ◽

State Of The Art ◽

Text Summarization ◽

Short Term ◽

Term Memory ◽

Semantic Coherence ◽

Long Short Term Memory ◽

Central Composite ◽

Convolutional Lstm

Abstractive text summarization, which generates a summary by paraphrasing a long text, remains a significant open problem in natural language processing. In this paper, we present an abstractive text summarization model, multi-layered attentional peephole convolutional LSTM (long short-term memory) (MAPCoL), that automatically generates a summary from a long text. We optimize the parameters of MAPCoL using central composite design (CCD) in combination with the response surface methodology (RSM), which gives the highest accuracy in terms of summary generation. We record the accuracy of our model (MAPCoL) on the CNN/DailyMail dataset. We perform a comparative analysis of the accuracy of MAPCoL with that of the state-of-the-art models in different experimental settings. MAPCoL also outperforms the traditional LSTM-based models with respect to semantic coherence in the output summary.

Predictive Analysis of Cryptocurrency Price Using Deep Learning

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i3.27.17889 ◽

2018 ◽

Vol 7 (3.27) ◽

pp. 258 ◽

Cited By ~ 4

Author(s):

Yecheng Yao ◽

Jungho Yi ◽

Shengjun Zhai ◽

Yuwen Lin ◽

Taekseung Kim ◽

...

Keyword(s):

Deep Learning ◽

International Relations ◽

Short Term Memory ◽

Training Data ◽

Short Term ◽

Effective Learning ◽

Learning Techniques ◽

Benchmark Datasets ◽

Novel Method ◽

Long Short Term Memory

The decentralization of cryptocurrencies has greatly reduced the level of central control over them, impacting international relations and trade. Further, wide fluctuations in cryptocurrency prices indicate an urgent need for an accurate way to forecast them. This paper proposes a novel method to predict cryptocurrency price by considering various factors such as market cap, volume, circulating supply, and maximum supply, based on deep learning techniques such as the recurrent neural network (RNN) and the long short-term memory (LSTM), which are effective learning models for training data, with the LSTM being better at recognizing longer-term associations. The proposed approach is implemented in Python and validated on benchmark datasets. The results verify the applicability of the proposed approach for the accurate prediction of cryptocurrency price.

JAZZ MELODY GENERATION USING RECURRENT NETWORKS AND REINFORCEMENT LEARNING

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213006002849 ◽

2006 ◽

Vol 15 (04) ◽

pp. 623-650

Author(s):

JUDY A. FRANKLIN

Keyword(s):

Reinforcement Learning ◽

Dynamic Systems ◽

Recurrent Neural Networks ◽

Short Term Memory ◽

State Of The Art ◽

Recurrent Network ◽

Recurrent Networks ◽

Short Term ◽

Long Short Term Memory ◽

Lstm Network

Recurrent (neural) networks have been deployed as models for learning musical processes, by computational scientists who study processes such as dynamic systems. Over time, more intricate music has been learned as the state of the art in recurrent networks improves. One particular recurrent network, the Long Short-Term Memory (LSTM) network shows promise for learning long songs, and generating new songs. We are experimenting with a module containing two inter-recurrent LSTM networks to cooperatively learn several human melodies, based on the songs' harmonic structures, and on the feedback inherent in the network. We show that these networks can learn to reproduce four human melodies. We then present as input new harmonizations, so as to generate new songs. We describe the reharmonizations, and show the new melodies that result. We also present a hierarchical structure for using reinforcement learning to choose LSTM modules during the course of melody generation.

Multi‐dimensional long short‐term memory networks for artificial Arabic text recognition in news video

IET Computer Vision ◽

10.1049/iet-cvi.2017.0468 ◽

2018 ◽

Vol 12 (5) ◽

pp. 710-719 ◽

Cited By ~ 13

Author(s):

Oussama Zayene ◽

Sameh Masmoudi Touj ◽

Jean Hennebert ◽

Rolf Ingold ◽

Najoua Essoukri Ben Amara

Keyword(s):

Short Term Memory ◽

Text Recognition ◽

Arabic Text ◽

Short Term ◽

Term Memory ◽

News Video ◽

Long Short Term Memory

DIANet: Dense-and-Implicit Attention Network

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5842 ◽

2020 ◽

Vol 34 (04) ◽

pp. 4206-4214

Author(s):

Zhongzhan Huang ◽

Senwei Liang ◽

Mingfu Liang ◽

Haizhao Yang

Keyword(s):

Classification Accuracy ◽

Short Term Memory ◽

Residual Network ◽

Short Term ◽

Long Distance ◽

Term Memory ◽

Attention Networks ◽

Network Layers ◽

Benchmark Datasets ◽

Long Short Term Memory

Attention networks have successfully boosted performance in various vision problems. Previous works emphasize designing new attention modules and plugging them individually into networks. Our paper proposes a novel and simple framework that shares an attention module across different network layers to encourage the integration of layer-wise information; this parameter-sharing module is referred to as the Dense-and-Implicit-Attention (DIA) unit. Many choices of modules can be used in the DIA unit. Since Long Short-Term Memory (LSTM) is capable of capturing long-distance dependencies, we focus on the case where the DIA unit is a modified LSTM (called DIA-LSTM). Experiments on benchmark datasets show that the DIA-LSTM unit is capable of emphasizing layer-wise feature interrelation and leads to a significant improvement in image classification accuracy. We further show empirically that the DIA-LSTM has a strong regularization ability for stabilizing the training of deep networks, through experiments that remove skip connections (He et al. 2016a) or Batch Normalization (Ioffe and Szegedy 2015) from the whole residual network.
