Deep Learning-based Sentiment Analysis and Topic Modeling on Tourism During Covid-19 Pandemic

Frontiers in Computer Science ◽

10.3389/fcomp.2021.775368 ◽

2021 ◽

Vol 3 ◽

Author(s):

Ram Krishn Mishra ◽

Siddhaling Urolagin ◽

J. Angel Arul Jothi ◽

Ashwin Sanjay Neogi ◽

Nishad Nawaz

Keyword(s):

Social Media ◽

Deep Learning ◽

Topic Modeling ◽

Tourism Industry ◽

Classification Model ◽

Training Set ◽

Test Set ◽

Multiple Parameters ◽

Social Media Platforms ◽

Flow Of Information

The Covid-19 pandemic has disrupted the world economy and significantly influenced the tourism industry. Millions of people have shared their emotions, views, facts, and circumstances on numerous social media platforms, which has resulted in a massive flow of information. This high-density social media data has drawn many researchers to extract valuable information and understand users’ emotions during the pandemic. The research looks at data collected from the micro-blogging site Twitter for the tourism sector, emphasizing the sub-domains of hospitality and healthcare. The sentiment of approximately 20,000 tweets has been calculated using the Valence Aware Dictionary and sEntiment Reasoner (VADER) model. Furthermore, topic modeling was used to reveal certain hidden themes and determine the narrative and direction of the topics related to tourism healthcare and hospitality. Topic modeling also helped us to identify inter-cluster similar terms and analyze the flow of information from a group of similar opinion. Finally, a cutting-edge deep learning classification model was used with different epoch sizes of the dataset to anticipate and classify people’s feelings. The deep learning model has been tested with multiple parameters such as training set accuracy, test set accuracy, validation loss, and validation accuracy. It achieved more than 90% training set accuracy, while tourism hospitality and healthcare reported test set accuracies of 80.9% and 78.7%, respectively.
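
The VADER-style scoring mentioned in the abstract above can be illustrated with a minimal sketch. It shows only the final normalization step, in which summed word valences are squashed into the range (-1, 1) with the constant alpha = 15 used by the reference VADER implementation, followed by the commonly recommended cutoffs of +/-0.05. The tiny lexicon and its valence values are hypothetical, the abstract does not say which cutoffs the authors used, and the real model also handles negation, intensifiers, punctuation, and capitalization, none of which is reproduced here.

```python
import math

# Hypothetical mini-lexicon with made-up valences; the real VADER lexicon is far larger.
TOY_LEXICON = {"great": 3.1, "safe": 1.9, "cancelled": -1.5, "terrible": -2.1}


def compound_score(text, lexicon=TOY_LEXICON, alpha=15.0):
    """Sum word valences and squash the total into (-1, 1), VADER-style."""
    total = sum(lexicon.get(tok.strip(".,!?").lower(), 0.0) for tok in text.split())
    return total / math.sqrt(total * total + alpha)


def label(score, threshold=0.05):
    """Map a compound score to a sentiment class using symmetric cutoffs."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["The hotel was great and safe",
                  "Flight cancelled, terrible service",
                  "We arrived on Monday"]:
        print(tweet, "->", label(compound_score(tweet)))
```

In practice one would call an existing VADER implementation rather than re-implement the scoring; the sketch is only meant to make the labelling step concrete.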

A Simple Method to Train the AI Diagnosis Model of Pulmonary Nodules

Computational and Mathematical Methods in Medicine ◽

10.1155/2020/2812874 ◽

2020 ◽

Vol 2020 ◽

pp. 1-6

Author(s):

Zhehao He ◽

Wang Lv ◽

Jian Hu

Keyword(s):

Deep Learning ◽

Data Augmentation ◽

Pulmonary Nodules ◽

Learning System ◽

Pathological Diagnosis ◽

Classification Model ◽

Lung Nodules ◽

Training Set ◽

Simple Method ◽

Test Set

Background. The differential diagnosis of subcentimetre lung nodules (diameter less than 1 cm) has always been a problem for imaging doctors and thoracic surgeons. We plan to create a deep learning model for the diagnosis of pulmonary nodules using a simple method. Methods. Image data and pathological diagnoses of patients came from the First Affiliated Hospital of Zhejiang University School of Medicine from October 1, 2016, to October 1, 2019. After data preprocessing and data augmentation, the training set is used to train the model. The test set is used to evaluate the trained model. At the same time, clinicians also diagnose the test set. Results. A total of 2,295 images of 496 lung nodules and their corresponding pathological diagnoses were selected as the training set and test set. After data augmentation, the training set reached 12,510 images, including 6,648 malignant nodular images and 5,862 benign nodular images. The area under the P-R curve of the trained model is 0.836 in the classification of malignant and benign nodules. The area under the ROC curve of the trained model is 0.896 (95% CI: 78.96%~100.18%), which is higher than that of three doctors. However, the P value is not less than 0.05. Conclusion. With the help of an automatic machine learning system, clinicians can create a deep learning pulmonary nodule pathology classification model without the help of deep learning experts. The diagnostic efficiency of this model is not inferior to that of the clinician.
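
For readers unfamiliar with the area under the ROC curve reported above, the following sketch computes it directly from its rank interpretation: the probability that a randomly chosen malignant case receives a higher score than a randomly chosen benign one, with ties counted as one half. The labels and scores in the usage example are made up for illustration and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels (1 = malignant, 0 = benign) and model scores.
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a small test set; library implementations use sorting instead.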

Feature-Weighted Sampling for Proper Evaluation of Classification Models

Applied Sciences ◽

10.3390/app11052039 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2039

Author(s):

Hyunseok Shin ◽

Sejong Oh

Keyword(s):

Random Sampling ◽

Sampling Method ◽

Classification Model ◽

Training Set ◽

Test Set ◽

Feature Importance ◽

Proper Training ◽

Machine Learning Applications ◽

Test Sets ◽

The Given

In machine learning applications, classification schemes have been widely used for prediction tasks. Typically, to develop a prediction model, the given dataset is divided into training and test sets; the training set is used to build the model and the test set is used to evaluate the model. Furthermore, random sampling is traditionally used to divide datasets. The problem, however, is that the performance of the model is evaluated differently depending on how we divide the training and test sets. Therefore, in this study, we proposed an improved sampling method for the accurate evaluation of a classification model. We first generated numerous candidate cases of train/test sets using the R-value-based sampling method. We evaluated the similarity of distributions of the candidate cases with the whole dataset, and the case with the smallest distribution–difference was selected as the final train/test set. Histograms and feature importance were used to evaluate the similarity of distributions. The proposed method produces more proper training and test sets than previous sampling methods, including random and non-random sampling.
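
The selection step described above (generate many candidate train/test splits, then keep the one whose distributions differ least from the whole dataset) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' method: candidates are drawn by plain random shuffling rather than R-value-based sampling, similarity is measured only with per-feature histograms (feature importance is omitted), and all function names and defaults are hypothetical.

```python
import random


def histogram(values, lo, hi, bins=5):
    """Relative frequency of values in equal-width bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def distribution_gap(subset, full, bins=5):
    """Sum over features of the L1 distance between subset and full histograms."""
    gap = 0.0
    for j in range(len(full[0])):
        col = [row[j] for row in full]
        lo, hi = min(col), max(col)
        h_full = histogram(col, lo, hi, bins)
        h_sub = histogram([row[j] for row in subset], lo, hi, bins)
        gap += sum(abs(a - b) for a, b in zip(h_full, h_sub))
    return gap


def pick_split(data, test_fraction=0.3, candidates=200, seed=0):
    """Return the (train, test) candidate that best matches the whole dataset."""
    rng = random.Random(seed)
    n_test = max(1, round(len(data) * test_fraction))
    best = None
    for _ in range(candidates):
        idx = list(range(len(data)))
        rng.shuffle(idx)
        test = [data[i] for i in idx[:n_test]]
        train = [data[i] for i in idx[n_test:]]
        gap = distribution_gap(train, data) + distribution_gap(test, data)
        if best is None or gap < best[0]:
            best = (gap, train, test)
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical two-feature dataset.
    data = [(float(i), float(i % 3)) for i in range(30)]
    train, test = pick_split(data)
    print(len(train), len(test))
```

Fixing the seed makes the chosen split reproducible, which matters when the point of the exercise is a stable evaluation.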

Intelligent Detection of False Information in Arabic Tweets Utilizing Hybrid Harris Hawks Based Feature Selection and Machine Learning Models

Symmetry ◽

10.3390/sym13040556 ◽

2021 ◽

Vol 13 (4) ◽

pp. 556

Author(s):

Thaer Thaher ◽

Mahmoud Saheb ◽

Hamza Turabieh ◽

Hamouda Chantar

Keyword(s):

Machine Learning ◽

Social Media ◽

Feature Selection ◽

Language Processing ◽

User Profile ◽

Vital Role ◽

Classification Model ◽

Fake News ◽

False Information ◽

Social Media Platforms

Fake or false information on social media platforms is a significant challenge: it deliberately misleads users through rumors, propaganda, or deceptive information about a person, organization, or service. Twitter is one of the most widely used social media platforms, especially in the Arab region, where the number of users is steadily increasing, accompanied by an increase in the rate of fake news. This has drawn the attention of researchers seeking to provide a safe online environment free of misleading information. This paper proposes a smart classification model for the early detection of fake news in Arabic tweets utilizing Natural Language Processing (NLP) techniques, Machine Learning (ML) models, and Harris Hawks Optimizer (HHO) as a wrapper-based feature selection approach. An Arabic Twitter corpus composed of 1862 previously annotated tweets was used to assess the efficiency of the proposed model. The Bag of Words (BoW) model is used with different term-weighting schemes for feature extraction. Eight well-known learning algorithms are investigated with varying combinations of features, including user-profile, content-based, and words-features. Reported results showed that Logistic Regression (LR) with the Term Frequency-Inverse Document Frequency (TF-IDF) model ranks best. Moreover, feature selection based on the binary HHO algorithm plays a vital role in reducing dimensionality, thereby enhancing the learning model’s performance for fake news detection. Interestingly, the proposed BHHO-LR model yields an improvement of 5% compared with previous works on the same dataset.
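
To make the TF-IDF term weighting mentioned above concrete, here is a minimal sketch of one common variant: term frequency normalized by document length, multiplied by log(N / df). The abstract does not specify which variant or tokenizer the authors used, so the formula choice, the whitespace tokenizer, and the English toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


if __name__ == "__main__":
    # Hypothetical toy corpus.
    docs = ["breaking news vaccine", "vaccine news today", "weather today"]
    for vec in tfidf(docs):
        print(vec)
```

Terms that occur in every document receive weight zero under this variant, which is why rarer terms dominate the resulting feature vectors.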

Weakly supervised deep learning for determining the prognostic value of 18F-FDG PET/CT in extranodal natural killer/T cell lymphoma, nasal type

European Journal of Nuclear Medicine and Molecular Imaging ◽

10.1007/s00259-021-05232-3 ◽

2021 ◽

Author(s):

Rui Guo ◽

Xiaobin Hu ◽

Haoming Song ◽

Pengpeng Xu ◽

Haoping Xu ◽

...

Keyword(s):

Deep Learning ◽

Fdg Pet ◽

Cell Lymphoma ◽

Training Set ◽

Test Set ◽

Natural Killer T Cell ◽

Pet Ct ◽

Weakly Supervised ◽

Fdg Pet Ct ◽

Killer T Cell

Purpose. To develop a weakly supervised deep learning (WSDL) method that could utilize incomplete/missing survival data to predict the prognosis of extranodal natural killer/T cell lymphoma, nasal type (ENKTL) based on pretreatment 18F-FDG PET/CT results. Methods. One hundred and sixty-seven patients with ENKTL who underwent pretreatment 18F-FDG PET/CT were retrospectively collected. Eighty-four patients were followed up for at least 2 years (training set = 64, test set = 20). A WSDL method was developed to enable the integration of the remaining 83 patients with incomplete/missing follow-up information in the training set. To test generalization, these data were derived from three types of scanners. Prediction similarity index (PSI) was derived from deep learning features of images. Its discriminative ability was calculated and compared with that of a conventional deep learning (CDL) method. Univariate and multivariate analyses helped explore the significance of PSI and clinical features. Results. PSI achieved area under the curve scores of 0.9858 and 0.9946 (training set) and 0.8750 and 0.7344 (test set) in the prediction of progression-free survival (PFS) with the WSDL and CDL methods, respectively. PSI threshold of 1.0 could significantly differentiate the prognosis. In the test set, WSDL and CDL achieved prediction sensitivity, specificity, and accuracy of 87.50% and 62.50%, 83.33% and 83.33%, and 85.00% and 75.00%, respectively. Multivariate analysis confirmed PSI to be an independent significant predictor of PFS in both the methods. Conclusion. The WSDL-based framework was more effective for extracting 18F-FDG PET/CT features and predicting the prognosis of ENKTL than the CDL method.
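
The test-set percentages reported above can be checked with a short worked example. The confusion-matrix counts below are not stated in the abstract; they are inferred as an assumption from the reported sensitivity, specificity, and accuracy on a test set of 20 patients, which are consistent with 8 positive and 12 negative cases. Which outcome counts as "positive" is likewise assumed.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Counts inferred from the reported percentages (an assumption, not given in the abstract):
# 8 positive and 12 negative patients in a test set of 20.
wsdl = binary_metrics(tp=7, fn=1, tn=10, fp=2)
cdl = binary_metrics(tp=5, fn=3, tn=10, fp=2)

if __name__ == "__main__":
    for name, metrics in [("WSDL", wsdl), ("CDL", cdl)]:
        print(name, [round(100 * m, 2) for m in metrics])
```

Under these assumed counts the two methods differ only in sensitivity, that is, in how many positive cases each one identifies.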

Sustainable Tourism Empowered by Social Network Analysis to Gain a Competitive Edge at a Historic Site

Tourism and Hospitality ◽

10.3390/tourhosp2040022 ◽

2021 ◽

Vol 2 (4) ◽

pp. 332-346

Author(s):

Cathrine Linnes ◽

Holly Itoga ◽

Jerome Agrusa ◽

Joseph Lema

Keyword(s):

Social Media ◽

Social Network ◽

Social Network Analysis ◽

Network Analysis ◽

Tourism Industry ◽

Tourism Destination ◽

Strategic Direction ◽

Social Media Platforms ◽

Hospitality And Tourism Industry ◽

Effective Use

Social media has had a strong presence in many people’s lives over the last decade. In addition, social media platforms have allowed people to share opinions, provide advice on numerous factors, including where to visit, as well as to stay connected and maintain friendships. The hospitality and tourism industry, however, can make effective use of these powerful tools for marketing purposes, collaboration and information sharing, and service offerings. Reviewing social media followers’ behaviors and interests offers a wealth of information and valuable data for a variety of tourism organizations. This case study focuses on an analysis of the social networks applied to the fortified town of Fredrikstad in Norway. The data used in this research study were collected from the Facebook site of the tourist authority. The results of this research project demonstrate the strengths of applying a social network analysis to a dataset, which can aid in the strategic direction of a tourism destination. The conversations of the greatest interest can successfully be identified as well as the growth of the online network. This paper adds knowledge to the literature through the application of a social network analysis regarding the success of a tourism destination and its future potential.

Detecting Damage Building Using Real-Time Crowdsourced Images And Transfer Learning

10.21203/rs.3.rs-964756/v1 ◽

2021 ◽

Author(s):

Gaurav Chachra ◽

Qingkai Kong ◽

Jim Huang ◽

Srujay Korlakunta ◽

Jennifer Grannen ◽

...

Keyword(s):

Social Media ◽

Deep Learning ◽

Real Time ◽

Transfer Learning ◽

Learning Model ◽

Research Community ◽

Rescue Work ◽

The Public ◽

Social Media Platforms ◽

Deep Learning Model

After significant earthquakes, we can see images posted on social media platforms by individuals and media agencies owing to the mass usage of smartphones these days. These images can be utilized to provide information about the shaking damage in the earthquake region both to the public and research community, and potentially to guide rescue work. This paper presents an automated way to extract the damaged building images after earthquakes from social media platforms such as Twitter and thus identify the particular user posts containing such images. Using transfer learning and ~6500 manually labelled images, we trained a deep learning model to recognize images with damaged buildings in the scene. The trained model achieved good performance when tested on newly acquired images of earthquakes at different locations and ran in near real-time on Twitter feed after the 2020 M7.0 earthquake in Turkey. Furthermore, to better understand how the model makes decisions, we also implemented the Grad-CAM method to visualize the important locations on the images that facilitate the decision.

Deep Learning and Machine Learning Techniques for Analyzing Travelers' Online Reviews

10.4018/978-1-7998-8306-7.ch002 ◽

2022 ◽

pp. 20-39

Author(s):

Elliot Mbunge ◽

Benhildah Muchemwa

Keyword(s):

Machine Learning ◽

Social Media ◽

Deep Learning ◽

Hospitality Industry ◽

Learning Models ◽

Online Data ◽

Social Media Platforms ◽

Using Data ◽

Tourism And Hospitality Industry ◽

Tourism And Hospitality

Social media platforms play a tremendous role in the tourism and hospitality industry and are increasingly becoming a source of information. The complexity and increasing size of tourists' online data make it difficult to extract meaningful insights using traditional models. Therefore, this scoping and comprehensive review aimed to analyze machine learning and deep learning models applied to tourism data. The study revealed that deep learning and machine learning models are used for forecasting and predicting tourism demand using data from search queries, Google Trends, and social media platforms. Also, the study revealed that data-driven models can assist managers and policymakers in mapping and segmenting tourism hotspots and attractions, predicting the revenue that is likely to be generated, exploring targeted marketing, and segmenting tourists based on their spending patterns, lifestyle, and age group. However, hybrid deep learning models such as InceptionV3, MobileNetV3, and YOLOv4 are not yet explored in the tourism and hospitality industry.

Transitive Topic Modeling with Conversational Structure Context: Discovering Topics that are Most Popular in Online Discussions

International Journal of Semantic Computing ◽

10.1142/s1793351x20400103 ◽

2020 ◽

Vol 14 (02) ◽

pp. 273-293

Author(s):

Yingcheng Sun ◽

Richard Kolacinski ◽

Kenneth Loparo

Keyword(s):

Social Media ◽

Topic Modeling ◽

Topic Model ◽

Online Discussions ◽

Challenging Problem ◽

Topic Extraction ◽

Limited Success ◽

Social Media Platforms ◽

Improved Performance ◽

Conversational Structure

With the explosive growth of online discussions published every day on social media platforms, comprehension and discovery of the most popular topics have become a challenging problem. Conventional topic models have had limited success in online discussions because the corpus is extremely sparse and noisy. To overcome their limitations, we use the discussion thread tree structure and propose a “popularity” metric to quantify the number of replies to a comment to extend the frequency of word occurrences, and the “transitivity” concept to characterize topic dependency among nodes in a nested discussion thread. We build a Conversational Structure Aware Topic Model (CSATM) based on popularity and transitivity to infer topics and their assignments to comments. Experiments on real forum datasets are used to demonstrate improved performance for topic extraction with six different measurements of coherence and impressive accuracy for topic assignments.
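
One plausible reading of the "popularity" idea above is that words in a comment are counted more heavily when the comment attracts more replies. The sketch below illustrates that reading only; the weight of 1 plus the number of direct replies, the data layout, and the function names are hypothetical, and neither the transitivity concept nor the CSATM model itself is reproduced.

```python
from collections import Counter


def reply_counts(parent_of):
    """Count direct replies per comment from a {comment_id: parent_id or None} map."""
    counts = Counter()
    for parent in parent_of.values():
        if parent is not None:
            counts[parent] += 1
    return counts


def popularity_weighted_counts(comments, parent_of):
    """Word counts in which each comment's words are weighted by 1 + its reply count."""
    replies = reply_counts(parent_of)
    totals = Counter()
    for cid, text in comments.items():
        weight = 1 + replies[cid]  # hypothetical weighting scheme
        for word in text.lower().split():
            totals[word] += weight
    return totals


if __name__ == "__main__":
    # Hypothetical thread: comments 2 and 3 reply to comment 1.
    comments = {1: "battery life", 2: "battery drains", 3: "agree", 4: "screen"}
    parent_of = {1: None, 2: 1, 3: 1, 4: None}
    print(popularity_weighted_counts(comments, parent_of).most_common(3))
```

Weighting of this kind densifies the word statistics of short, sparse comments, which is the limitation of conventional topic models that the abstract points to.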

Multiclass Classifier for P-Glycoprotein Substrates, Inhibitors, and Non-Active Compounds

Molecules ◽

10.3390/molecules24102006 ◽

2019 ◽

Vol 24 (10) ◽

pp. 2006 ◽

Cited By ~ 1

Author(s):

Liadys Mora Lagares ◽

Nikola Minovski ◽

Marjana Novič

Keyword(s):

In Silico ◽

Transmembrane Protein ◽

External Validation ◽

Assessment Process ◽

Classification Model ◽

Training Set ◽

Test Set ◽

Active Compounds ◽

P Glycoprotein ◽

Validation Set

P-glycoprotein (P-gp) is a transmembrane protein that actively transports a wide variety of chemically diverse compounds out of the cell. It is highly associated with the ADMET (absorption, distribution, metabolism, excretion and toxicity) properties of drugs/drug candidates and contributes to decreasing toxicity by eliminating compounds from cells, thereby preventing intracellular accumulation. Therefore, in the drug discovery and toxicological assessment process it is advisable to pay attention to whether a compound under development could be transported by P-gp or not. In this study, an in silico multiclass classification model capable of predicting the probability that a compound interacts with P-gp was developed using a counter-propagation artificial neural network (CP ANN) based on a set of 2D molecular descriptors, as well as an extensive dataset of 2512 compounds (1178 P-gp inhibitors, 477 P-gp substrates and 857 P-gp non-active compounds). The model provided a good classification performance, producing non-error rate (NER) values of 0.93 for the training set and 0.85 for the test set, while the average precision (AvPr) was 0.93 for the training set and 0.87 for the test set. An external validation set of 385 compounds was used to challenge the model’s performance. On the external validation set the NER and AvPr values were 0.70 for both indices. We believe that this in silico classifier could be effectively used as a reliable virtual screening tool for identifying potential P-gp ligands.
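
The two indices reported above can be computed from a multiclass confusion matrix. The sketch below assumes the usual chemometrics definitions: NER as the mean of the per-class sensitivities and AvPr as the mean of the per-class precisions. The abstract does not spell out the formulas, and the 3x3 matrix in the usage example is hypothetical, not the study's data.

```python
def ner_and_avpr(confusion):
    """Return (NER, AvPr), where confusion[i][j] counts class-i items predicted as class j."""
    k = len(confusion)
    sensitivities, precisions = [], []
    for c in range(k):
        true_total = sum(confusion[c])                       # items truly in class c
        pred_total = sum(confusion[r][c] for r in range(k))  # items predicted as class c
        sensitivities.append(confusion[c][c] / true_total)
        precisions.append(confusion[c][c] / pred_total)
    return sum(sensitivities) / k, sum(precisions) / k


if __name__ == "__main__":
    # Hypothetical matrix; rows are true classes, columns are predicted classes
    # (for example inhibitors, substrates, non-active compounds).
    matrix = [[9, 1, 0],
              [2, 6, 2],
              [1, 0, 9]]
    ner, avpr = ner_and_avpr(matrix)
    print(round(ner, 3), round(avpr, 3))
```

Because both indices average over classes rather than over compounds, a small class such as the substrates weighs as much as the larger ones; the sketch assumes every class appears at least once among both the true and the predicted labels.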

CID: Categorical Influencer Detection on microtext-based social media

Online Information Review ◽

10.1108/oir-02-2019-0062 ◽

2020 ◽

Vol 44 (5) ◽

pp. 1027-1055

Author(s):

Thanh-Tho Quan ◽

Duc-Trung Mai ◽

Thanh-Duy Tran

Keyword(s):

Social Media ◽

Deep Learning ◽

Topic Modeling ◽

Latent Dirichlet Allocation ◽

Learning Approaches ◽

Content Type ◽

Modeling Process ◽

Variational Autoencoder ◽

Media Channels

Purpose. This paper proposes an approach to identify categorical influencers (i.e. influencers who are active in the targeted categories) in social media channels. Categorical influencers are important for media marketing, but detecting them automatically remains a challenge. Design/methodology/approach. We deployed emerging deep learning approaches. Specifically, we used word embedding to encode semantic information of words occurring in the common microtext of social media and used a variational autoencoder (VAE) to approximate the topic modeling process, through which the active categories of influencers are automatically detected. We developed a system known as Categorical Influencer Detection (CID) to realize those ideas. Findings. The approach of using VAE to simulate the Latent Dirichlet Allocation (LDA) process can effectively handle the task of topic modeling on the vast dataset of microtext on social media channels. Research limitations/implications. This work has two major contributions. The first one is the detection of topics on microtexts using a deep learning approach. The second is the identification of categorical influencers in social media. Practical implications. This work can help brands to do digital marketing on social media effectively by approaching appropriate influencers. A real case study is given to illustrate it. Originality/value. In this paper, we discuss an approach to automatically identify the active categories of influencers by performing topic detection from the microtext related to the influencers in social media channels. To do so, we use deep learning to approximate the topic modeling process of the conventional approaches (such as LDA).
