Evaluating Information-Retrieval Models and Machine-Learning Classifiers for Measuring the Social Perception towards Infectious Diseases

Oscar Apolinardo-Arzube; José Antonio García-Díaz; José Medina-Moreira; Harry Luna-Aveiga; Rafael Valencia-García

doi:10.3390/app9142858

Applied Sciences ◽

10.3390/app9142858 ◽

2019 ◽

Vol 9 (14) ◽

pp. 2858 ◽

Cited By ~ 7

Author(s):

Oscar Apolinardo-Arzube ◽

José Antonio García-Díaz ◽

José Medina-Moreira ◽

Harry Luna-Aveiga ◽

Rafael Valencia-García

Keyword(s):

Machine Learning ◽

Infectious Diseases ◽

Research Field ◽

Online Information ◽

Surveillance Systems ◽

Retrieval Models ◽

Machine Learning Classifiers ◽

Detection Systems ◽

Learning Classifiers ◽

The One

Recent outbreaks of infectious diseases remind us of the importance of improving early-detection systems. Infodemiology is a novel research field that analyzes online information regarding public health and aims to complement traditional surveillance methods. However, the large volume of information requires the development of algorithms that handle natural language efficiently. The literature describes different techniques for carrying out these infodemiology studies. However, to the best of our knowledge, there are no comprehensive studies that compare the accuracy of these techniques. Consequently, we conducted an infodemiology-based study to extract positive or negative utterances related to infectious diseases so that future syndromic surveillance systems can be improved. The contribution of this paper is two-fold. On the one hand, we use Twitter to compile and label a balanced corpus of infectious diseases with 6164 utterances written in Spanish and collected from Central America. On the other hand, we compare two statistical models: word-grams and char-grams. The experimentation involved the analysis of different gram sizes, different partitions of the corpus, and two machine-learning classifiers: Random Forest and Sequential Minimal Optimization. The results reach an accuracy of 90.80% when applying the char-grams model with five-char-gram sequences. As a final contribution, the compiled corpus is released.
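The difference between the word-gram and char-gram models mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`char_ngrams`, `word_ngrams`, `bag_of_ngrams`), the whitespace tokenization, and the sample utterance are all assumptions made for illustration.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> list[str]:
    """Return all overlapping character sequences of length n."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text: str, n: int) -> list[str]:
    """Return all overlapping sequences of n whitespace-separated tokens."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_ngrams(grams: list[str]) -> Counter:
    """Count how often each n-gram occurs; the counts can serve as features."""
    return Counter(grams)


# Hypothetical utterance, not taken from the released corpus.
utterance = "casos de dengue"
five_char_features = bag_of_ngrams(char_ngrams(utterance, 5))
two_word_features = bag_of_ngrams(word_ngrams(utterance, 2))
```

Feature counts of this kind would then be passed to a classifier such as Random Forest; char-grams include spaces and word fragments, which makes them less sensitive to spelling variation than word-grams.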

Detecting Critical Conceptual Mistakes in Google Translated Medical Information on Infectious Diseases: using Bayesian Machine Learning Classifiers (Preprint)

JMIR Medical Informatics ◽

10.2196/31743 ◽

2021 ◽

Author(s):

Wenxiu Xie ◽

Meng Ji ◽

Tianyong Hao ◽

Chi-Yin Chow

Keyword(s):

Machine Learning ◽

Infectious Diseases ◽

Medical Information ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Bayesian Machine Learning

Machine Learning for Detecting Credit Card Frauds

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1003.0982s1219 ◽

2020 ◽

Vol 8 (2S12) ◽

pp. 16-23

Keyword(s):

Machine Learning ◽

Credit Card ◽

Data Science ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

System Administrator ◽

The One ◽

Local Outlier ◽

Isolation Forest ◽

Different Parts

Credit card fraud has evolved into a major source of loss for the financial sector, costing billions of dollars in different parts of the world. The area also needs attention from researchers, because fraud detection can be automated using machine learning classifiers and data science. If the fraud model encounters a fraudulent transaction, it raises an alarm to the system administrator. The paper proposes a model that uses machine learning classifiers to detect fraudulent transactions. The classifiers used in the paper are SVM (Support Vector Machine), Isolation Forest, and Local Outlier. The focus of the research is to detect 100% of the fraudulent transactions, and we also emphasise that no normal transaction should be wrongly detected as fraud. The process starts with preprocessing the data, and then the classifiers are applied. The results from each classifier are evaluated to identify the one with the better performance. The performance can be increased with deep learning algorithms, but at a higher expense.
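The two goals stated in the abstract, catching every fraud and never flagging a normal transaction, correspond to a recall of 1.0 and a false-positive rate of 0.0. The sketch below shows one way to compute both from predicted labels. The function name `fraud_metrics` and the label encoding (1 for fraud, 0 for normal) are assumptions for illustration, not details taken from the paper.

```python
def fraud_metrics(y_true: list[int], y_pred: list[int]) -> dict[str, float]:
    """Compute fraud recall and false-positive rate (1 = fraud, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # frauds caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # frauds missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal flagged as fraud
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal left alone
    # Guard against empty classes to avoid division by zero.
    recall = tp / (tp + fn) if tp + fn else 0.0
    false_positive_rate = fp / (fp + tn) if fp + tn else 0.0
    return {"recall": recall, "false_positive_rate": false_positive_rate}


# Hypothetical labels for five transactions.
metrics = fraud_metrics([1, 1, 0, 0, 0], [1, 0, 0, 1, 0])
```

Comparing classifiers on these two numbers, rather than on overall accuracy, matters here because fraudulent transactions are typically a small fraction of the data.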

Predicting Infectious Diseases by Using Machine Learning Classifiers

Bioinformatics and Biomedical Engineering - Lecture Notes in Computer Science ◽

10.1007/978-3-030-45385-5_53 ◽

2020 ◽

pp. 590-599

Author(s):

Juan A. Gómez-Pulido ◽

José M. Romero-Muelas ◽

José M. Gómez-Pulido ◽

José L. Castillo Sequera ◽

José Sanz Moreno ◽

...

Keyword(s):

Machine Learning ◽

Infectious Diseases ◽

Machine Learning Classifiers ◽

Learning Classifiers

Prediction of Chronic and Infectious Diseases using Machine Learning Classifiers- A Systematic Approach

International Journal of Intelligent Engineering and Systems ◽

10.22266/ijies2020.0831.02 ◽

2020 ◽

Vol 13 (4) ◽

pp. 11-20

Author(s):

N Kumar ◽

K Sikamani

Keyword(s):

Machine Learning ◽

Infectious Diseases ◽

Systematic Approach ◽

Machine Learning Classifiers ◽

Learning Classifiers

Testing Machine Learning Classifiers based on Compositional Metamorphic Relations

International Journal of Performability Engineering ◽

10.23940/ijpe.20.01.p8.6777 ◽

2020 ◽

Vol 16 (1) ◽

pp. 67

Author(s):

Minghua Jia ◽

Xiaodong Wang ◽

Yue Xu ◽

Zhanqi Cui ◽

Ruilin Xie

Keyword(s):

Machine Learning ◽

Testing Machine ◽

Machine Learning Classifiers ◽

Learning Classifiers

Performance Evaluation of Machine Learning Classifiers for Epileptic Seizure Detection

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i8.122129 ◽

2019 ◽

Vol 7 (8) ◽

pp. 122-129

Author(s):

Mirwais Farahi ◽

Doreswamy .

Keyword(s):

Machine Learning ◽

Performance Evaluation ◽

Epileptic Seizure ◽

Seizure Detection ◽

Epileptic Seizure Detection ◽

Machine Learning Classifiers ◽

Learning Classifiers

Botnet Detection with Machine Learning Classifiers

Journal of Research on the Lepidoptera ◽

10.36872/lepi/v51i2/301100 ◽

2020 ◽

Vol 51 (2) ◽

pp. 329-335

Author(s):

Pokuri Ashok Kumar

Keyword(s):

Machine Learning ◽

Botnet Detection ◽

Machine Learning Classifiers ◽

Learning Classifiers

Sr-Mlc: Scalable Resilience Machine Learning Classifiers Approach in Cyber Security

SSRN Electronic Journal ◽

10.2139/ssrn.3492708 ◽

2019 ◽

Author(s):

Anil Lamba ◽

Natasha Dutta

Keyword(s):

Machine Learning ◽

Cyber Security ◽

Machine Learning Classifiers ◽

Learning Classifiers

Machine Learning Classifiers for Efficient Spammers Detection in Twitter OSN

SSRN Electronic Journal ◽

10.2139/ssrn.3734170 ◽

2020 ◽

Author(s):

Praveen Kumar Sadineni

Keyword(s):

Machine Learning ◽

Machine Learning Classifiers ◽

Learning Classifiers

Assessing the Effect of Training Sampling Design on the Performance of Machine Learning Classifiers for Land Cover Mapping Using Multi-Temporal Remote Sensing Data and Google Earth Engine

Remote Sensing ◽

10.3390/rs13081433 ◽

2021 ◽

Vol 13 (8) ◽

pp. 1433

Author(s):

Shobitha Shetty ◽

Prasun Kumar Gupta ◽

Mariana Belgiu ◽

S. K. Srivastav

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Random Sampling ◽

Sampling Design ◽

Remote Sensing Data ◽

Google Earth ◽

Machine Learning Classifiers ◽

Learning Classifiers ◽

Multi Temporal ◽

Google Earth Engine

Machine learning classifiers are increasingly used for Land Use and Land Cover (LULC) mapping from remote sensing images. However, choosing the right classifier requires understanding the main factors that influence its performance. The present study first investigated the effect of training sampling design on the classification results obtained by the Random Forest (RF) classifier and then compared its performance with other machine learning classifiers for LULC mapping using multi-temporal satellite remote sensing data and the Google Earth Engine (GEE) platform. We evaluated the impact of three sampling methods, namely Stratified Equal Random Sampling (SRS(Eq)), Stratified Proportional Random Sampling (SRS(Prop)), and Stratified Systematic Sampling (SSS), on the classification results obtained by the RF-trained LULC model. Our results showed that the SRS(Prop) method favors major classes while achieving good overall accuracy. The SRS(Eq) method provides good class-level accuracies, even for minority classes, whereas the SSS method performs well for areas with large intra-class variability. In the comparison of machine learning classifiers, RF outperformed Classification and Regression Trees (CART), Support Vector Machine (SVM), and Relevance Vector Machine (RVM) with a >95% confidence level. The performance of the CART and SVM classifiers was found to be similar. RVM achieved good classification results with a limited number of training samples.
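The contrast between equal and proportional stratified sampling can be sketched as follows. This is an illustrative approximation, not the study's Google Earth Engine workflow: the function names, the `(item, label)` sample format, and the rounding rule for proportional allocation are all assumptions.

```python
import random
from collections import defaultdict


def group_by_class(samples):
    """Group (item, label) pairs into a dict mapping label -> list of items."""
    groups = defaultdict(list)
    for item, label in samples:
        groups[label].append(item)
    return groups


def stratified_equal(samples, per_class, seed=0):
    """Draw the same number of random samples from every class."""
    rng = random.Random(seed)
    return {
        label: rng.sample(items, min(per_class, len(items)))
        for label, items in group_by_class(samples).items()
    }


def stratified_proportional(samples, total, seed=0):
    """Draw samples from each class in proportion to its share of the data."""
    rng = random.Random(seed)
    drawn = {}
    for label, items in group_by_class(samples).items():
        # Assumed allocation rule: round the proportional share, at least 1.
        k = max(1, round(total * len(items) / len(samples)))
        drawn[label] = rng.sample(items, min(k, len(items)))
    return drawn


# Hypothetical pixel IDs: 90 "forest" pixels and 10 "water" pixels.
pixels = [(i, "forest") for i in range(90)] + [(i, "water") for i in range(90, 100)]
equal = stratified_equal(pixels, per_class=5)
proportional = stratified_proportional(pixels, total=20)
```

With these hypothetical inputs, the equal design gives the minority class as many training samples as the majority class, while the proportional design mirrors the class imbalance, which is consistent with the abstract's observation that SRS(Prop) favors major classes.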
