Multitask Learning with Knowledge Base for Joint Intent Detection and Slot Filling

Ting He; Xiaohong Xu; Yating Wu; Huazhen Wang; Jian Chen

doi:10.3390/app11114887


Applied Sciences ◽

10.3390/app11114887 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4887

Author(s):

Ting He ◽

Xiaohong Xu ◽

Yating Wu ◽

Huazhen Wang ◽

Jian Chen

Keyword(s):

Knowledge Base ◽

Resource Sharing ◽

Short Term Memory ◽

State Of The Art ◽

Detection System ◽

Joint Model ◽

Detection Accuracy ◽

Dialog Systems ◽

Task Oriented ◽

Slot Filling

Intent detection and slot filling are important modules in task-oriented dialog systems. To exploit the relationship between the two modules, share resources between them, and address their lack of semantic information, this paper proposes a multitask learning model that performs intent detection and slot filling jointly with the help of a knowledge base. The approach shares information between the intent and slot modules and enriches them with external knowledge in a three-part process. First, the model obtains shared parameters and features between the two modules based on long short-term memory and convolutional neural networks. Second, a knowledge base is introduced into the model to improve its performance. Finally, a weighted loss function is built to optimize the joint model. Experimental results demonstrate that the model outperforms state-of-the-art algorithms on the benchmark Airline Travel Information System (ATIS) dataset and the Snips dataset. On ATIS, the joint model improves intent-detection accuracy by 1.33% and the slot-filling F value by 0.94%; on Snips, the corresponding improvements are 0.19% and 0.31%.
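The abstract does not give the form of the weighted loss, so the following is only a minimal sketch of one common way to combine two task losses. It assumes a convex combination of an intent cross-entropy and a token-averaged slot cross-entropy; the function names and the weight `alpha` are hypothetical and are not taken from the paper.

```python
import math


def cross_entropy(probs, gold):
    """Negative log-likelihood of the gold class index under `probs`."""
    return -math.log(probs[gold])


def joint_loss(intent_probs, intent_gold, slot_probs, slot_gold, alpha=0.5):
    """Weighted sum of an intent loss and a token-averaged slot loss.

    `alpha` is an assumed hyperparameter: 1.0 trains intent only,
    0.0 trains slots only.
    """
    intent_loss = cross_entropy(intent_probs, intent_gold)
    slot_loss = sum(
        cross_entropy(p, g) for p, g in zip(slot_probs, slot_gold)
    ) / len(slot_gold)
    return alpha * intent_loss + (1.0 - alpha) * slot_loss


# Toy utterance with three intent classes and two tokens, each with two slot labels.
loss = joint_loss(
    intent_probs=[0.7, 0.2, 0.1],
    intent_gold=0,
    slot_probs=[[0.9, 0.1], [0.2, 0.8]],
    slot_gold=[0, 1],
    alpha=0.6,
)
print(f"joint loss: {loss:.4f}")
```

Under this assumption, tuning `alpha` trades off how strongly the shared parameters are driven by each task.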


Deep Cascade Multi-Task Learning for Slot Filling in Online Shopping Assistant

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016465 ◽

2019 ◽

Vol 33 ◽

pp. 6465-6472 ◽

Cited By ~ 3

Author(s):

Yu Gong ◽

Xusheng Luo ◽

Yu Zhu ◽

Wenwu Ou ◽

Zhao Li ◽

...

Keyword(s):

Natural Language ◽

Knowledge Base ◽

Online Shopping ◽

State Of The Art ◽

Language Understanding ◽

Dialog Systems ◽

Named Entity ◽

Online Test ◽

Benchmark Datasets ◽

Slot Filling

Slot filling is a critical task in natural language understanding (NLU) for dialog systems. State-of-the-art approaches treat it as a sequence labeling problem and adopt models such as BiLSTM-CRF. While these models work relatively well on standard benchmark datasets, they face challenges in the context of E-commerce, where the slot labels are more informative and carry richer expressions. In this work, inspired by the unique structure of the E-commerce knowledge base, we propose a novel multi-task model with cascade and residual connections, which jointly learns segment tagging, named entity tagging and slot filling. Experiments show the effectiveness of the proposed cascade and residual structures. Our model has a 14.6% advantage in F1 score over the strong baseline methods on a new Chinese E-commerce shopping assistant dataset, while achieving competitive accuracies on a standard dataset. Furthermore, an online test deployed on a dominant E-commerce platform shows a 130% improvement in the accuracy of understanding user utterances. Our model has already gone into production on the E-commerce platform.


Transcription Alignment of Historical Vietnamese Manuscripts without Human-Annotated Learning Samples

Applied Sciences ◽

10.3390/app11114894 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4894

Author(s):

Anna Scius-Bertrand ◽

Michael Jungo ◽

Beat Wolf ◽

Andreas Fischer ◽

Marc Bui

Keyword(s):

Object Detection ◽

State Of The Art ◽

Positive Impact ◽

Detection System ◽

Training Data ◽

Detection Accuracy ◽

Current State ◽

Alignment Task ◽

Scanned Image ◽

Automatic Transcription

The current state of the art for automatic transcription of historical manuscripts is typically limited by the requirement of human-annotated learning samples, which are necessary to train specific machine learning models for specific languages and scripts. Transcription alignment is a simpler task that aims to find a correspondence between text in the scanned image and its existing Unicode counterpart, a correspondence which can then be used as training data. The alignment task can be approached with heuristic methods dedicated to certain types of manuscripts, or with weakly trained systems reducing the required amount of annotations. In this article, we propose a novel learning-based alignment method based on fully convolutional object detection that does not require any human annotation at all. Instead, the object detection system is initially trained on synthetic printed pages using a font and then adapted to the real manuscripts by means of self-training. On a dataset of historical Vietnamese handwriting, we demonstrate the feasibility of annotation-free alignment as well as the positive impact of self-training on the character detection accuracy, reaching a detection accuracy of 96.4% with a YOLOv5m model without using any human annotation.


Hybrid-Based Analysis Impact on Ransomware Detection for Android Systems

Applied Sciences ◽

10.3390/app112210976 ◽

2021 ◽

Vol 11 (22) ◽

pp. 10976

Author(s):

Rana Almohaini ◽

Iman Almomani ◽

Aala AlKhayer

Keyword(s):

Dynamic Analysis ◽

Static Analysis ◽

Hybrid System ◽

State Of The Art ◽

Detection System ◽

Detection Accuracy ◽

Accuracy Rate ◽

Hybrid Detection ◽

Dynamic Analyses ◽

Data Files

Android ransomware is one of the most threatening attacks, and it is increasing at an alarming rate. Ransomware attacks usually target Android users by either locking their devices or encrypting their data files and then demanding payment to unlock the devices or recover the files. Existing solutions for detecting ransomware mainly use static analysis, whereas few approaches apply dynamic analysis specifically for ransomware detection. Furthermore, these approaches either perform poorly or often fail in the presence of code obfuscation techniques or benign applications that use cryptographic methods in their API usage. Additionally, most of them are unable to detect ransomware attacks at early stages. Therefore, this paper proposes a hybrid detection system that effectively utilizes both static and dynamic analyses to detect ransomware with high accuracy. For the static analysis, the proposed hybrid system considered more than 70 state-of-the-art antivirus engines. For the dynamic analysis, this research explored the existing dynamic tools and conducted an in-depth comparative study to find a suitable tool to integrate for detecting ransomware whenever needed. To evaluate the performance of the proposed hybrid system, we analyzed over one hundred ransomware samples statically and dynamically. These samples originated from 10 different ransomware families. The experimental results revealed that static analysis achieved a detection accuracy of around 40–55%, roughly half that of the dynamic analysis, which reached a 100% accuracy rate. Moreover, this research reports some of the most used API classes, methods, and permissions in these ransomware apps. Finally, some case studies are highlighted, including apps that failed to run and crypto-ransomware patterns.


Unsupervised Learning of KB Queries in Task-Oriented Dialogs

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00372 ◽

2021 ◽

Vol 9 ◽

pp. 374-390

Author(s):

Dinesh Raghu ◽

Nikhil Gupta ◽

Mausam

Keyword(s):

Reinforcement Learning ◽

Knowledge Base ◽

State Of The Art ◽

The Novel ◽

Generate System ◽

User Intent ◽

Research Challenges ◽

Policy Optimization ◽

Task Oriented ◽

And Training

Task-oriented dialog (TOD) systems often need to formulate knowledge base (KB) queries corresponding to the user intent and use the query results to generate system responses. Existing approaches require dialog datasets to explicitly annotate these KB queries, and these annotations can be time-consuming and expensive. In response, we define the novel problems of predicting the KB query and training the dialog agent without explicit KB query annotation. For query prediction, we propose a reinforcement learning (RL) baseline, which rewards the generation of those queries whose KB results cover the entities mentioned in subsequent dialog. Further analysis reveals that correlation among query attributes in the KB can significantly confuse memory augmented policy optimization (MAPO), an existing state-of-the-art RL agent. To address this, we improve the MAPO baseline with simple but important modifications suited to our task. To train the full TOD system for our setting, we propose a pipelined approach: it independently predicts when to make a KB query (query position predictor), then predicts a KB query at the predicted position (query predictor), and uses the results of the predicted query in subsequent dialog (next response predictor). Overall, our work proposes first solutions to our novel problem, and our analysis highlights the research challenges in training TOD systems without query annotation.
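The reward idea described above (favor queries whose KB results cover the entities mentioned later in the dialog) can be illustrated with a toy sketch. The abstract does not specify the reward formula, so the coverage fraction below is an assumption, and the KB layout, `run_query`, and `coverage_reward` are hypothetical names used only for illustration.

```python
def run_query(kb, query):
    """Return KB rows whose fields match every attribute-value pair in `query`."""
    return [row for row in kb if all(row.get(k) == v for k, v in query.items())]


def coverage_reward(kb, query, future_entities):
    """Fraction of entities from the subsequent dialog found in the query results.

    This coverage fraction is an assumed reward shape, not the paper's formula.
    """
    if not future_entities:
        return 0.0
    results = run_query(kb, query)
    seen = {value for row in results for value in row.values()}
    covered = sum(1 for entity in future_entities if entity in seen)
    return covered / len(future_entities)


# Toy KB and the entities the system mentions after the query point.
kb = [
    {"name": "Pizza Hut", "cuisine": "italian", "area": "north"},
    {"name": "Golden Wok", "cuisine": "chinese", "area": "south"},
]
future_entities = ["Pizza Hut", "north"]

print(coverage_reward(kb, {"cuisine": "italian"}, future_entities))
print(coverage_reward(kb, {"cuisine": "chinese"}, future_entities))
```

In this toy version an empty query matches every row and so trivially maximizes coverage; a usable reward would also need to discourage overly broad queries.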


FastText-Based Intent Detection for Inflected Languages

Information ◽

10.3390/info10050161 ◽

2019 ◽

Vol 10 (5) ◽

pp. 161 ◽

Cited By ~ 4

Author(s):

Kaspars Balodis ◽

Daiga Deksne

Keyword(s):

Neural Network ◽

State Of The Art ◽

Detection System ◽

Detection Accuracy ◽

Word Embeddings ◽

Neural Network Classifier ◽

Dialogue System ◽

Baltic Countries

Intent detection is one of the main tasks of a dialogue system. In this paper, we present our intent detection system that is based on fastText word embeddings and a neural network classifier. We find an improvement in fastText sentence vectorization, which, in some cases, shows a significant increase in intent detection accuracy. We evaluate the system on languages commonly spoken in Baltic countries—Estonian, Latvian, Lithuanian, English, and Russian. The results show that our intent detection system provides state-of-the-art results on three previously published datasets, outperforming many popular services. In addition to this, for Latvian, we explore how the accuracy of intent detection is affected if we normalize the text in advance.


Malicious Traffic classification using Long Short-Term Memory (LSTM) model

10.21203/rs.3.rs-159180/v1 ◽

2021 ◽

Author(s):

Naresh Kumar Thapa K ◽

N. Duraipandian

Keyword(s):

Short Term Memory ◽

State Of The Art ◽

Detection System ◽

Classification Systems ◽

Traffic Classification ◽

Short Term ◽

Fixed Sequence ◽

Term Memory ◽

Proposed Model ◽

Long Short Term Memory

Malicious traffic classification is the initial and primary step for any network-based security system. Such traffic classification systems include behavior-based anomaly detection systems and intrusion detection systems. Existing methods rely on conventional techniques and process the data in a fixed sequence, which may lead to performance issues. Furthermore, conventional techniques require proper annotation to process the volumetric data. Relying on data annotation for efficient traffic classification may lead to network loops and bandwidth issues within the network. To address these issues, this paper presents a novel solution based on artificial intelligence. The key idea of this paper is to propose a novel malicious traffic classification system using a Long Short-Term Memory (LSTM) model. To validate the efficiency of the proposed model, an experimental setup along with experimental validation is carried out. The experimental results show that the proposed model is better in terms of accuracy and throughput than the state-of-the-art models. Further, the proposed model outperforms the existing state-of-the-art models with a 5% increase in accuracy, reaching 99.5% accuracy overall.


Elastic CRFs for Open-Ontology Slot Filling

Applied Sciences ◽

10.3390/app112210675 ◽

2021 ◽

Vol 11 (22) ◽

pp. 10675

Author(s):

Yinpei Dai ◽

Yichi Zhang ◽

Hong Liu ◽

Zhijian Ou ◽

Yi Huang ◽

...

Keyword(s):

Random Field ◽

Conditional Random Field ◽

Dialog Systems ◽

Cross Domain ◽

Crucial Component ◽

Semantic Concepts ◽

The One ◽

Task Oriented ◽

Slot Filling ◽

Language Description

Slot filling is a crucial component in task-oriented dialog systems that is used to parse (user) utterances into semantic concepts called slots. An ontology is defined by the collection of slots and the values that each slot can take. The most widely used practice of treating slot filling as a sequence labeling task suffers from two main drawbacks. First, the ontology is usually pre-defined and fixed and therefore is not able to detect new labels for unseen slots. Second, the one-hot encoding of slot labels ignores the correlations between slots with similar semantics, which makes it difficult to share knowledge learned across different domains. To address these problems, we propose a new model called elastic conditional random field (eCRF), where each slot is represented by the embedding of its natural language description and modeled by a CRF layer. New slot values can be detected by eCRF whenever a language description is available for the slot. In our experiment, we show that eCRFs outperform existing models in both in-domain and cross-domain tasks, especially in predicting unseen slots and values.
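To make the contrast with one-hot slot labels concrete, here is a toy sketch of scoring a token against slots that are represented by embeddings of their natural language descriptions. It is not the eCRF model: the CRF layer is omitted entirely, the averaged word vectors and dot-product score are assumptions, and the tiny vocabulary, `embed`, and `emission_scores` are hypothetical.

```python
def embed(text, word_vectors, dim):
    """Average the vectors of known words; return a zero vector if none are known."""
    vecs = [word_vectors[w] for w in text.lower().split() if w in word_vectors]
    if not vecs:
        return [0.0] * dim
    return [sum(column) / len(vecs) for column in zip(*vecs)]


def emission_scores(token, slot_descriptions, word_vectors, dim):
    """Dot-product score between a token and each slot's description embedding."""
    token_vec = embed(token, word_vectors, dim)
    return {
        slot: sum(a * b for a, b in zip(token_vec, embed(desc, word_vectors, dim)))
        for slot, desc in slot_descriptions.items()
    }


# Hypothetical 2-dimensional word vectors, for illustration only.
word_vectors = {
    "city": [1.0, 0.0],
    "boston": [0.9, 0.1],
    "time": [0.0, 1.0],
    "noon": [0.1, 0.9],
}
slots = {"destination": "city", "depart_time": "time"}

print(emission_scores("boston", slots, word_vectors, dim=2))

# Supporting a new slot only requires a description, not a new output class.
slots["arrival_time"] = "time"
print(emission_scores("noon", slots, word_vectors, dim=2))
```

Because slots with similar descriptions receive similar embeddings, a score of this kind can reflect correlations between slots that a one-hot label encoding discards.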


3D Object Detection and Instance Segmentation from 3D Range and 2D Color Images

Sensors ◽

10.3390/s21041213 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1213

Author(s):

Xiaoke Shen ◽

Ioannis Stamos

Keyword(s):

Object Detection ◽

State Of The Art ◽

Detection System ◽

Significant Loss ◽

Detection Accuracy ◽

Lighting Conditions ◽

Rgb Images ◽

3D Object Detection ◽

And Robotics ◽

Instance Segmentation

Instance segmentation and object detection are significant problems in the fields of computer vision and robotics. We address those problems by proposing a novel object segmentation and detection system. First, we detect 2D objects based on RGB, depth-only, or RGB-D images. A 3D convolutional-based system, named Frustum VoxNet, is proposed. This system generates frustums from 2D detection results, proposes 3D candidate voxelized images for each frustum, and uses a 3D convolutional neural network (CNN) based on these candidate voxelized images to perform the 3D instance segmentation and object detection. Results on the SUN RGB-D dataset show that our RGB-D-based system’s 3D inference is much faster than state-of-the-art methods, without a significant loss of accuracy. At the same time, we can provide segmentation and detection results using depth-only images, with accuracy comparable to RGB-D-based systems. This is important since our methods can also work well in low lighting conditions, or with sensors that do not acquire RGB images. Finally, the use of segmentation as part of our pipeline increases detection accuracy, while at the same time providing 3D instance segmentation.


Task-Oriented Dialog Systems That Consider Multiple Appropriate Responses under the Same Context

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6507 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9604-9611

Author(s):

Yichi Zhang ◽

Zhijian Ou ◽

Zhou Yu

Keyword(s):

Data Augmentation ◽

State Of The Art ◽

Task Completion ◽

Dialog Systems ◽

State Action ◽

Response Diversity ◽

Dialog System ◽

Additional State ◽

The One ◽

Task Oriented

Conversations have an intrinsic one-to-many property, which means that multiple responses can be appropriate for the same dialog context. In task-oriented dialogs, this property leads to different valid dialog policies towards task completion. However, none of the existing task-oriented dialog generation approaches takes this property into account. We propose a Multi-Action Data Augmentation (MADA) framework to utilize the one-to-many property to generate diverse appropriate dialog responses. Specifically, we first use dialog states to summarize the dialog history, and then discover all possible mappings from every dialog state to its different valid system actions. During dialog system training, we enable the current dialog state to map to all valid system actions discovered in the previous process to create additional state-action pairs. By incorporating these additional pairs, the dialog policy learns a balanced action distribution, which further guides the dialog model to generate diverse responses. Experimental results show that the proposed framework consistently improves dialog policy diversity, and results in improved response diversity and appropriateness. Our model obtains state-of-the-art results on MultiWOZ.
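The state-to-action mapping step described above can be sketched in a few lines. This is a simplified illustration rather than the MADA implementation: dialog states and system actions are reduced to plain strings, and `build_state_action_map` and `augment` are hypothetical names.

```python
from collections import defaultdict


def build_state_action_map(dialog_turns):
    """Collect every system action observed for each dialog state."""
    mapping = defaultdict(set)
    for state, action in dialog_turns:
        mapping[state].add(action)
    return mapping


def augment(dialog_turns):
    """Pair each turn's state with all valid actions seen for that state."""
    mapping = build_state_action_map(dialog_turns)
    augmented = []
    for state, _ in dialog_turns:
        for action in sorted(mapping[state]):  # sorted for a deterministic order
            augmented.append((state, action))
    return augmented


# Two turns share the same summarized state but took different system actions.
turns = [
    ("restaurant:area=?", "request_area"),
    ("restaurant:area=?", "recommend"),
    ("restaurant:area=north", "book"),
]
for pair in augment(turns):
    print(pair)
```

In this sketch every turn whose state was seen with several actions contributes one training pair per action, which is one simple way to expose the one-to-many property to the policy.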


Towards Automatic Depression Detection: A BiLSTM/1D CNN-Based Model

Applied Sciences ◽

10.3390/app10238701 ◽

2020 ◽

Vol 10 (23) ◽

pp. 8701

Author(s):

Lin Lin ◽

Xuri Chen ◽

Ying Shen ◽

Lin Zhang

Keyword(s):

Short Term Memory ◽

Detection System ◽

Global Mental Health ◽

Speech Signals ◽

Detection Accuracy ◽

Patient Interviews ◽

Depression Detection ◽

Text Features ◽

Fully Connected ◽

Linguistic Content

Depression is a global mental health problem, the worst cases of which can lead to self-injury or suicide. An automatic depression detection system is of great help in facilitating clinical diagnosis and early intervention of depression. In this work, we propose a new automatic depression detection method utilizing speech signals and linguistic content from patient interviews. Specifically, the proposed method consists of three components: a Bidirectional Long Short-Term Memory (BiLSTM) network with an attention layer to deal with linguistic content, a One-Dimensional Convolutional Neural Network (1D CNN) to deal with speech signals, and a fully connected network integrating the outputs of the previous two models to assess the depressive state. Evaluated on two publicly available datasets, our method achieves state-of-the-art performance compared with the existing methods. In addition, our method utilizes audio and text features simultaneously and can therefore discount misleading information provided by the patients. In conclusion, our method can automatically evaluate the depression state and does not require an expert to conduct the psychological evaluation on site. Our method greatly improves the detection accuracy, as well as the efficiency.
