Dialogue Act Classification, Instance-Based Learning, and Higher Order Dialogue Structure

Barbara Di Eugenio; Zhuli Xie; Riccardo Serafin

doi:10.5087/dad.2010.002

Dialogue & Discourse ◽

10.5087/dad.2010.002 ◽

2010 ◽

Vol 1 (2) ◽

pp. 1-24 ◽

Cited By ~ 7

Author(s):

Barbara Di Eugenio ◽

Zhuli Xie ◽

Riccardo Serafin

Keyword(s):

Latent Semantic Analysis ◽

Semantic Analysis ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Linguistic Features ◽

Instance Based Learning ◽

Dialogue Structure ◽

K Nearest Neighbor Algorithm ◽

Semantic Spaces ◽

Better Than

In this paper, we explore instance-based learning methods for dialogue act classification on two corpora, MapTask and CallHome Spanish. We start with Latent Semantic Analysis (LSA), and extend it as Feature Latent Semantic Analysis (FLSA). FLSA adds richer linguistic features to LSA, which only uses words. In particular, we explore the extended dialogue context, both linearly (the previous dialogue act) and hierarchically (conversational games). We show how the k-Nearest Neighbor algorithm obtains its best results when applied to the reduced semantic spaces generated by FLSA. Empirically, our results are better than previously published results on these two corpora; linguistically, we confirm and extend previous observations that the hierarchical dialogue structure encoded via the notion of Game is of primary importance for dialogue act recognition.
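The pipeline described in this abstract (a reduced semantic space followed by k-Nearest Neighbor) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy utterances, the labels, and the `GAME=` pseudo-term used to inject the game feature into the term-document matrix are all assumptions made for the example, and FLSA as published may build its matrix differently.

```python
import numpy as np
from collections import Counter

# Toy training data: (utterance, game, dialogue act). All values are invented.
TRAIN = [
    ("go left at the mill", "instructing", "instruct"),
    ("turn right past the bridge", "instructing", "instruct"),
    ("do you see the mill", "querying", "query"),
    ("is the bridge on your map", "querying", "query"),
    ("okay", "instructing", "acknowledge"),
    ("right okay", "instructing", "acknowledge"),
]

def tokens(utterance, game):
    # Words plus one pseudo-term carrying the game feature (assumed encoding).
    return utterance.split() + ["GAME=" + game]

def fit(train, dims):
    # Build a term-by-document count matrix and reduce it with a truncated SVD.
    vocab = sorted({t for u, g, _ in train for t in tokens(u, g)})
    index = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(train)))
    for j, (u, g, _) in enumerate(train):
        for t in tokens(u, g):
            A[index[t], j] += 1.0
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :dims]          # keep the first `dims` left singular vectors
    docs = A.T @ U_k           # training documents in the reduced space
    labels = [act for _, _, act in train]
    return index, U_k, docs, labels

def classify(model, utterance, game, k=3):
    # Fold the query into the reduced space, then vote among the k most
    # cosine-similar training documents. Unknown terms are ignored.
    index, U_k, docs, labels = model
    q = np.zeros(U_k.shape[0])
    for t in tokens(utterance, game):
        if t in index:
            q[index[t]] += 1.0
    q_hat = q @ U_k
    norms = np.linalg.norm(docs, axis=1) * np.linalg.norm(q_hat) + 1e-12
    sims = (docs @ q_hat) / norms
    nearest = np.argsort(-sims)[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

model = fit(TRAIN, dims=3)
print(classify(model, "go right at the bridge", "instructing"))
```

The number of retained dimensions (`dims`) and the neighborhood size (`k`) are free parameters here; in practice both would be tuned on held-out data.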

The K-nearest neighbor algorithm predicted rehabilitation potential better than current Clinical Assessment Protocol

Journal of Clinical Epidemiology ◽

10.1016/j.jclinepi.2007.06.001 ◽

2007 ◽

Vol 60 (10) ◽

pp. 1015-1021 ◽

Cited By ~ 22

Author(s):

Mu Zhu ◽

Wenhong Chen ◽

John P. Hirdes ◽

Paul Stolee

Keyword(s):

Clinical Assessment ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

Assessment Protocol ◽

Rehabilitation Potential ◽

K Nearest Neighbor Algorithm ◽

Better Than

Evaluating the compliance of modern electronic banking and digital cryptocurrency systems with the information society's requirements

Finance and Credit ◽

10.24891/fc.27.1.88 ◽

2021 ◽

Vol 27 (1) ◽

pp. 88-112

Author(s):

Ekaterina I. DYUDIKOVA ◽

Natal'ya N. KUNITSYNA

Keyword(s):

Big Data ◽

Text Mining ◽

Latent Semantic Analysis ◽

Semantic Analysis ◽

Nearest Neighbor ◽

Systems Approach ◽

Social Needs ◽

K Nearest Neighbor ◽

Electronic Banking ◽

Digital Platforms

Subject. The digital economy emerged as a new generation of financial instruments, such as cryptocurrencies, was invented and proliferated; these instruments were able to counteract global challenges. Those who oppose the legitimization of digital assets and their integration into the payment infrastructure do not point out material advantages and support drastic transformations of the existing financial system. However, even though digital payments are regarded as very risky, the scope of cryptocurrency continues to grow. The article presents the outcome of an intelligent text analysis of feedback left by users of electronic banking and digital cryptocurrency systems. In doing so, we determined the extent to which users are satisfied with various systems. Objectives. The study is intended to provide the theoretical and methodological rationale for, and to test in practice, a model that determines key themes in unstructured big data and allows automatic evaluation of user satisfaction with various payment systems. Methods. We used formal logic, a systems approach, comparative analysis, text mining, and latent semantic analysis. Results. We analyzed reviews uploaded to www.banki.ru and www.otzovik.ru through parsing, stop word elimination, stemming, and probabilistic topic modeling based on latent semantic analysis. We assessed the extent to which users are satisfied with various systems by examining their reviews through text tone analysis, the k-nearest neighbor algorithm, and automated scoring of unrated reviews. Conclusions and Relevance. Text mining of unstructured big data shows that digital platforms, notwithstanding their infancy and high risks, already largely satisfy social needs as compared to electronic banking systems, which makes it reasonable to integrate them into the payment system to unlock their potential.

A hybrid machine learning approach of fuzzy-rough-k-nearest neighbor, latent semantic analysis, and ranker search for efficient disease diagnosis

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-211820 ◽

2021 ◽

pp. 1-16

Author(s):

Sunil Kumar Jha ◽

Ninoslav Marina ◽

Jinwei Wang ◽

Zulfiqar Ahmad

Keyword(s):

Machine Learning ◽

Latent Semantic Analysis ◽

Semantic Analysis ◽

Nearest Neighbor ◽

Disease Diagnosis ◽

Learning Approach ◽

Learning Approaches ◽

K Nearest Neighbor ◽

Machine Learning Approach ◽

Hybrid Machine

Machine learning approaches make a valuable contribution to improving the competency of automated decision systems. Several machine learning approaches have been developed in past studies for the diagnosis prediction of individual diseases. The present study aims to develop a hybrid machine learning approach for the diagnosis prediction of multiple diseases, based on a combination of efficient feature generation, selection, and classification methods. Specifically, a combination of latent semantic analysis, ranker search, and fuzzy-rough-k-nearest neighbor is proposed and validated in the diagnosis prediction of primary tumor, post-operative, breast cancer, lymphography, audiology, fertility, immunotherapy, and COVID-19, among others. The performance of the proposed approach is compared with single and other hybrid machine learning approaches in terms of accuracy, analysis time, precision, recall, F-measure, area under the ROC curve, and the Kappa coefficient. The proposed hybrid approach performs better than single and other hybrid approaches in the diagnosis prediction of each of the selected diseases. In particular, the suggested approach achieved a maximum recognition accuracy of 99.12% for primary tumor, 96.45% for breast cancer Wisconsin, 94.44% for cryotherapy, and 93.81% for audiology, along with a significant improvement in classification accuracy and other evaluation metrics for the rest of the selected diseases. In addition, it handles missing values in the dataset effectively.

Klasifikasi Topik Multi Label pada Hadis Shahih Bukhari Menggunakan K-Nearest Neighbor dan Latent Semantic Analysis

JURIKOM (Jurnal Riset Komputer) ◽

10.30865/jurikom.v7i1.2013 ◽

2020 ◽

Vol 7 (1) ◽

pp. 140

Author(s):

Dian Chusnul Hidayati ◽

Said Al Faraby ◽

Adiwijaya Adiwijaya

Keyword(s):

Latent Semantic Analysis ◽

Semantic Analysis ◽

Nearest Neighbor ◽

Islamic Law ◽

Computation Time ◽

K Nearest Neighbor ◽

Space Model ◽

Binary Relevance ◽

Long Time ◽

Vector Dimension

Hadith is the second source of Islamic law after the Al-Quran, which makes it important to study. However, learning hadith presents some difficulties, such as determining whether a hadith belongs to the topic of suggestions, prohibitions, or information. This obstructs the hadith learning process, especially for Muslims. It is therefore necessary to classify hadiths into the topics of suggestions, prohibitions, information, and combinations of the three, which is known as multi-label topic classification. The classification can be done with K-Nearest Neighbor (KNN), one of the best methods for the Vector Space Model and the simplest, yet quite effective. However, KNN copes poorly with high vector dimensionality, which results in long classification times. For that reason, Sahih Bukhari's hadiths are classified into the topics of suggestions, prohibitions, and information using the Latent Semantic Analysis - K-Nearest Neighbor (LSA-KNN) method. The Binary Relevance method is also employed to process the multi-label data. The results show that LSA-KNN achieves a performance of 90.28% with a computation time of 19 minutes 21 seconds, whereas KNN achieves 90.23% with a computation time of 37 minutes 06 seconds, which means that LSA-KNN performs better than KNN.
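Binary Relevance, as mentioned in this abstract, treats a multi-label problem as one independent yes/no decision per label. The sketch below combines that idea with a plain k-nearest-neighbor vote. The two-dimensional vectors stand in for documents that have already been reduced (for example by LSA); the data, the function names, and the majority threshold are assumptions for illustration and are not taken from the paper.

```python
import math

LABELS = ("suggestion", "prohibition", "information")

# Hypothetical reduced document vectors, each with its set of topic labels.
TRAIN = [
    ((0.9, 0.1), {"suggestion"}),
    ((0.8, 0.2), {"suggestion", "information"}),
    ((0.1, 0.9), {"prohibition"}),
    ((0.2, 0.8), {"prohibition", "information"}),
    ((0.5, 0.5), {"information"}),
]

def knn_binary(train, query, label, k):
    # One binary problem: does a strict majority of the k nearest
    # neighbors (Euclidean distance) carry this label?
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = sum(1 for _, labels in nearest if label in labels)
    return votes * 2 > k

def predict(train, query, k=3):
    # Binary Relevance: run the binary decision once per label and
    # return the set of labels that were accepted.
    return {label for label in LABELS if knn_binary(train, query, label, k)}

print(sorted(predict(TRAIN, (0.85, 0.15))))
```

Because each label is decided separately, a document can receive several labels or none, which is the behavior a multi-label setting needs.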

Finding Optimal Stations Using Euclidean Distance and Adjustable Surrounding Sphere

Applied Sciences ◽

10.3390/app11020848 ◽

2021 ◽

Vol 11 (2) ◽

pp. 848

Author(s):

Athita Onuean ◽

Hanmin Jung ◽

Krisana Chinnasarn

Keyword(s):

Air Pollution ◽

Euclidean Distance ◽

Nearest Neighbor ◽

Monitoring Network ◽

K Nearest Neighbor ◽

Rapid Urbanization ◽

Spatial Coverage ◽

K Nearest Neighbor Algorithm ◽

Effective Network ◽

Better Than

An air quality monitoring network (AQMN) plays an important role in air pollution management. However, a city setting up an initial network often lacks necessary information such as historical pollution and geographical data, which makes it challenging to establish an effective network. Meanwhile, in cities with an existing network, the network may not adequately represent the spatial coverage of air pollution issues, or rapid urbanization may call for additional stations. To address both cases, we propose four methods for finding stations and constructing a network using Euclidean distance and the k-nearest neighbor algorithm: Euclidean Distance (ED), Fixed Surrounding Sphere (FSS), Euclidean Distance + Fixed Surrounding Sphere (ED + FSS), and Euclidean Distance + Adjustable Surrounding Sphere (ED + ASS). We introduce a coverage percentage and a weighted coverage degree to evaluate the results of the proposed methods. Our experimental results show that ED + ASS is better than the other methods at finding stations that enhance spatial coverage. When setting up initial networks, coverage percentages improved by up to 22%, 37%, and 56% compared with the existing network, and adding a station to the existing network improved coverage by 34%, 130%, and 39%, in Sejong, Bonn, and Bangkok, respectively. Our method yields acceptable results and will be implemented as a guide for establishing a new network; it can also serve as a tool for improving the spatial coverage of an existing network in future expansions of air monitoring.
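The abstract does not define its coverage percentage, so the following is only one plausible reading, sketched under stated assumptions: a location counts as covered when it lies within a fixed Euclidean radius of at least one station, and a new station is chosen greedily from a candidate list. The function names, the radius, and the coordinates are hypothetical.

```python
import math

def coverage_percentage(points, stations, radius):
    # Share of points within `radius` (Euclidean) of at least one station.
    # Assumed definition; the paper's own metric may differ.
    if not points:
        return 0.0
    covered = sum(
        1 for p in points
        if any(math.dist(p, s) <= radius for s in stations)
    )
    return 100.0 * covered / len(points)

def best_new_station(points, stations, candidates, radius):
    # Pick the candidate whose addition yields the highest coverage.
    return max(
        candidates,
        key=lambda c: coverage_percentage(points, stations + [c], radius),
    )

points = [(0, 0), (1, 0), (5, 5), (6, 5)]
stations = [(0, 0)]
candidates = [(0.5, 0), (5.5, 5)]
print(coverage_percentage(points, stations, radius=1.5))
print(best_new_station(points, stations, candidates, radius=1.5))
```

An adjustable-radius variant would replace the single `radius` argument with a per-station value; that extension is left out here to keep the sketch short.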

A Scalable K-Nearest Neighbor Algorithm for Recommendation System Problems

2020 43rd International Convention on Information, Communication and Electronic Technology (MIPRO) ◽

10.23919/mipro48935.2020.9245195 ◽

2020 ◽

Author(s):

A. Sagdic ◽

C. Tekinbas ◽

E. Arslan ◽

T. Kucukyilmaz

Keyword(s):

Recommendation System ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm

Perancangan Aplikasi Prediksi Kelulusan Tepat Waktu Bagi Mahasiswa Baru Dengan Teknik Data Mining (Studi Kasus: Data Akademik Mahasiswa STMIK Dipanegara Makassar)

Creative Information Technology Journal ◽

10.24076/citec.2014v1i4.27 ◽

2015 ◽

Vol 1 (4) ◽

pp. 270

Author(s):

Muhammad Syukri Mustafa ◽

I. Wayan Simpen

Keyword(s):

Data Mining ◽

Nearest Neighbor ◽

Test Results ◽

K Nearest Neighbor ◽

Accuracy Rate ◽

Sample Data ◽

New Students ◽

K Nearest Neighbor Algorithm ◽

Using Data ◽

Existing Data

This research aims to predict whether new students are likely to complete their studies on time, using data mining analysis to explore accumulated historical data with the K-Nearest Neighbor (KNN) algorithm. The resulting application uses a variety of attributes, including national examination (Ujian Nasional, UN) scores, school or region of origin, gender, parents' occupation and income, number of siblings, and others. By applying KNN analysis, a prediction can be made, based on the proximity of the existing historical data to the new data, as to whether a student is likely to complete their studies on time. Testing the KNN algorithm with sample data from alumni who graduated in 2004 through 2010 as the old cases, and data from alumni who graduated in 2011 as the new cases, yielded an accuracy rate of 83.36%.

Handwritten Balinesse Character Recognition using K-Nearest Neighbor

10.31227/osf.io/z6m8u ◽

2018 ◽

Author(s):

I Wayan Agus Surya Darma

Keyword(s):

Feature Extraction ◽

Success Rate ◽

Character Recognition ◽

Nearest Neighbor ◽

Recognition System ◽

Extraction Process ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm ◽

Character Feature

Balinese character recognition is a technique for recognizing the features or patterns of Balinese characters. The features of a Balinese character are obtained through a feature extraction process. This research uses handwritten Balinese characters, and its feature extraction process generates semantic and direction features for them. Recognition uses the K-Nearest Neighbor algorithm to recognize 81 handwritten Balinese characters. The features of the test character images are compared with reference features. With K=3 and reference=10, the recognition system achieves a success rate of 97.53%.

A Temperature Identification Method Based on Chromaticity Statistical Features of Raw Format Visible Image and K-nearest Neighbor Algorithm

2020 IEEE 1st China International Youth Conference on Electrical Engineering (CIYCEE) ◽

10.1109/ciycee49808.2020.9332599 ◽

2020 ◽

Author(s):

Wenmao Li ◽

Qizheng Ye ◽

Zhe Yuan ◽

Yang He

Keyword(s):

Nearest Neighbor ◽

Statistical Features ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

Visible Image ◽

Identification Method ◽

K Nearest Neighbor Algorithm

Identification and Classification of Technical Lignins by means of Principle Component Analysis and k‐Nearest Neighbor Algorithm

Chemistry–Methods ◽

10.1002/cmtd.202100065 ◽

2021 ◽

Vol 1 (8) ◽

pp. 352-353

Author(s):

Friedrich Fink ◽

Franziska Emmerling ◽

Jana Falkenhagen

Keyword(s):

Principle Component Analysis ◽

Nearest Neighbor ◽

Component Analysis ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

Principle Component ◽

K Nearest Neighbor Algorithm ◽

Technical Lignins
