Customized Data Extraction and Effective Text Data Preprocessing Technique for Hydroxychloroquin Related Twitter Data

This chapter mainly discusses the dynamic behavior of railway passengers, as observed through Twitter data during regular and emergency situations. Social network data provides dynamic, realistic information in many fields. In keeping with the chapter's theme, Twitter data from the railway domain can be used to improve railway services. Using this data, the chapter discusses a comprehensive framework for modeling passenger tweets that incorporates passenger opinions about the facilities provided by railways. The major issues of dynamic data extraction, preparation of Twitter text content, and text processing for determining sentiment levels are presented through two case studies: sentiment analysis of passengers' opinions about the quality of railway services, and identification of passenger travel demand using geotagged Twitter data. The sentiment analysis classifies passenger opinions about railway facilities as positive or negative based on their journey experiences.

Research and Application of a Novel Combined Model Based on Multiobjective Optimization for Multistep-Ahead Electric Load Forecasting

Energies ◽ 10.3390/en12101931 ◽ 2019 ◽ Vol 12 (10) ◽ pp. 1931 ◽ Cited By ~ 3

Author(s): Yechi Zhang ◽ Jianzhou Wang ◽ Haiyan Lu

Keyword(s): Power Distribution ◽ Data Preprocessing ◽ Combined Model ◽ Forecasting Accuracy ◽ Forecasting Models ◽ Power Stations ◽ South Wales ◽ Model Combining ◽ Forecasting Performance ◽ Preprocessing Technique

Accurate forecasting of electric loads has a great impact on actual power generation, power distribution, and tariff pricing. Therefore, in recent years, scholars all over the world have been proposing more forecasting models aimed at improving forecasting performance; however, many of them are conventional forecasting models which do not take the limitations of individual predicting models or data preprocessing into account, leading to poor forecasting accuracy. In this study, to overcome these drawbacks, a novel model combining a data preprocessing technique, forecasting algorithms and an advanced optimization algorithm is developed. Thirty-minute electrical load data from power stations in New South Wales and Queensland, Australia, are used as the testing data to estimate our proposed model’s effectiveness. From experimental results, our proposed combined model shows absolute superiority in both forecasting accuracy and forecasting stability compared with other conventional forecasting models.
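
The idea of a combined model can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: a trailing moving average stands in for the data preprocessing technique, two naive forecasters stand in for the forecasting algorithms, and a single-objective grid search over one weight stands in for the advanced optimization algorithm. All function names are hypothetical.

```python
from statistics import mean


def moving_average_filter(series, window=3):
    """Smooth a series with a trailing moving average (stand-in preprocessing)."""
    smoothed = []
    for i in range(len(series)):
        start = max(0, i - window + 1)
        smoothed.append(mean(series[start:i + 1]))
    return smoothed


def persistence_forecast(history):
    """Predict that the next value equals the last observed value."""
    return history[-1]


def mean_forecast(history, window=4):
    """Predict the mean of the last `window` observed values."""
    return mean(history[-window:])


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def fit_combination_weight(series, start=4, steps=20):
    """Grid-search w in [0, 1] for the combined one-step-ahead forecast
    w * persistence + (1 - w) * mean, minimizing MAE on the series."""
    actual = series[start:]
    p1 = [persistence_forecast(series[:t]) for t in range(start, len(series))]
    p2 = [mean_forecast(series[:t]) for t in range(start, len(series))]
    best_w, best_err = 0.0, float("inf")
    for k in range(steps + 1):
        w = k / steps
        combined = [w * a + (1 - w) * b for a, b in zip(p1, p2)]
        err = mae(actual, combined)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err


if __name__ == "__main__":
    # Hypothetical half-hourly load values, invented for this example.
    load = [310, 305, 298, 300, 320, 345, 360, 358, 350, 340, 335, 330]
    weight, error = fit_combination_weight(moving_average_filter(load))
    print(f"weight on persistence: {weight:.2f}, combined MAE: {error:.2f}")
```

Because the grid includes w = 0 and w = 1, the combined forecast is never worse on the fitting data than either individual forecaster; whether it generalizes is a separate question that this sketch does not address.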

Neural network model based on data preprocessing technique for foreign tourists prediction

10.1063/1.5043018 ◽ 2018

Author(s): Purwanto ◽ Sunardi ◽ Fenty Tristanti Julfia

Keyword(s): Neural Network ◽ Network Model ◽ Neural Network Model ◽ Data Preprocessing ◽ Model Based ◽ Preprocessing Technique

Enhanced Hybrid Data Preprocessing Technique for Eliminating Inconsistencies in the Diabetic Dataset to Improve Mining Results

Journal of Computational and Theoretical Nanoscience ◽ 10.1166/jctn.2018.7396 ◽ 2018 ◽ Vol 15 (6) ◽ pp. 1999-2002

Author(s): S Sathya ◽ A Rajesh

Keyword(s): Data Preprocessing ◽ Hybrid Data ◽ Preprocessing Technique

Sentiment Analysis of Twitter Data to Examine the Movement of Exchange Rate and Sensex

Journal of Computational and Theoretical Nanoscience ◽ 10.1166/jctn.2020.9179 ◽ 2020 ◽ Vol 17 (8) ◽ pp. 3323-3327

Author(s): N. Chethan ◽ R. Sangeetha

Keyword(s): Exchange Rate ◽ Sentiment Analysis ◽ Stock Price ◽ Text Data ◽ Price Movement ◽ Word Cloud ◽ Twitter Data ◽ R Programming ◽ Sentiment Score ◽ Key Events

In this paper, tweets about the USD/INR exchange rate, BSE Sensex, and NSE Nifty are collected, and sentiment analysis is performed using R programming. A sentiment score is obtained for each sentence, and word cloud plots are produced. The Twitter feeds are collected in R using the keywords USD/INR, #USD/INR, #BSE, #Sensex, and #NSE. To obtain the word cloud plot, sentiment is classified across eight categories: anticipation, anger, trust, surprise, sadness, joy, fear, and disgust. On a day-to-day basis, the sentiment analysis gives the overall sentiment for a given day as positive, negative, or neutral. It also breaks the tweets down into categories, which helps identify the mood of investors not only by sentiment but also by the number of tweets. Further, the word cloud plot offers a simple and effective way of capturing the key events or news discussed on Twitter. Investors can use sentiment analysis to predict the direction of stock price movements based on the sentiment prevailing in the market. This study also shows how R programming can be used to perform sentiment analysis of stock price movements based on Twitter feeds. A word cloud can be used to visualize text data, with the size of each word denoting its significance.
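
The per-tweet scoring and daily positive/negative/neutral labelling described in this abstract can be sketched as follows. The paper works in R; this sketch uses Python instead, and the six-word lexicon, the sample tweets, and all function names are hypothetical placeholders rather than the lexicon or data used by the authors.

```python
import re
from collections import Counter, defaultdict

# Toy lexicon invented for this example: word -> (polarity, emotion category).
LEXICON = {
    "gain": ("positive", "joy"),
    "rally": ("positive", "anticipation"),
    "strong": ("positive", "trust"),
    "fall": ("negative", "fear"),
    "crash": ("negative", "fear"),
    "loss": ("negative", "sadness"),
}


def tokenize(text):
    """Lowercase the text and keep alphabetic runs only."""
    return re.findall(r"[a-z]+", text.lower())


def score_tweet(text):
    """Return (net polarity score, emotion counts) for one tweet."""
    score = 0
    emotions = Counter()
    for token in tokenize(text):
        if token in LEXICON:
            polarity, emotion = LEXICON[token]
            score += 1 if polarity == "positive" else -1
            emotions[emotion] += 1
    return score, emotions


def label(score):
    """Map a net score to Positive, Negative, or Neutral."""
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


def daily_sentiment(tweets):
    """Sum tweet scores per day and label each day.

    `tweets` is an iterable of (date_string, text) pairs.
    """
    totals = defaultdict(int)
    for day, text in tweets:
        score, _ = score_tweet(text)
        totals[day] += score
    return {day: label(total) for day, total in totals.items()}


if __name__ == "__main__":
    sample = [
        ("2020-01-01", "#Sensex rally, strong gain"),
        ("2020-01-02", "USD/INR: rupee crash and loss"),
        ("2020-01-03", "Markets open today"),
    ]
    print(daily_sentiment(sample))
```

The emotion counts returned by `score_tweet` could feed a per-category breakdown or a word cloud; plotting is left out to keep the sketch dependency-free.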

PERFECTION OF CLASSIFICATION ACCURACY IN TEXT CATEGORIZATION

International Journal of Advanced Research ◽ 10.21474/ijar01/13437 ◽ 2021 ◽ Vol 9 (09) ◽ pp. 484-488

Author(s): Rajeev Tripathi ◽

Keyword(s): Sentiment Analysis ◽ Text Classification ◽ Classification Accuracy ◽ Text Categorization ◽ Classification Model ◽ Text Data ◽ Twitter Data ◽ Long Time ◽ Google Alerts ◽ Email Spam

Problems and strategies for text classification have been known for a long time. They're widely used by companies like Google and Yahoo for email spam screening, sentiment analysis of Twitter data, and automatic news categorization in Google Alerts. We're still working on making the results as accurate as possible. When dealing with large amounts of text data, however, the model's performance and accuracy become a challenge. The types of words used in the corpus and the types of features produced for classification have a big impact on the performance of a text classification model.

Importance of Text Data Preprocessing & Implementation in RapidMiner

Proceedings of the First International Conference on Information Technology and Knowledge Management ◽ 10.15439/2017km46 ◽ 2018 ◽ Cited By ~ 2

Author(s): Vaishali Kalra ◽ Rashmi Aggarwal

Keyword(s): Data Preprocessing ◽ Text Data

A Highly Effective Data Preprocessing in Side-Channel Attack Using Empirical Mode Decomposition

Security and Communication Networks ◽ 10.1155/2019/6124165 ◽ 2019 ◽ Vol 2019 ◽ pp. 1-10

Author(s): ShuaiWei Zhang ◽ XiaoYuan Yang ◽ Lin Chen ◽ Weidong Zhong

Keyword(s): Noise Reduction ◽ Empirical Mode Decomposition ◽ Data Preprocessing ◽ Side Channel ◽ Intrinsic Mode Functions ◽ Side Channel Attack ◽ Time Frequency ◽ Mode Decomposition ◽ Highly Effective ◽ Preprocessing Technique

Side-channel attacks on cryptographic chips in embedded systems have attracted considerable interest in the information security field in recent years. Many studies have sought to improve side-channel attack efficiency, and most assume that the noise in the encryption signal follows a linear, stable Gaussian distribution. However, their noise-reduction performance has been moderate. In this paper, we therefore describe a highly effective data-preprocessing technique for noise reduction based on empirical mode decomposition (EMD) and demonstrate its application to a side-channel attack. EMD is a time-frequency analysis method for nonlinear, unstable signals that requires no prior knowledge of the cryptographic chip. During data preprocessing, the collected traces are adaptively decomposed into a sum of several intrinsic mode functions (IMFs) based on their own characteristics. The meaningful IMFs are then recombined to reduce noise and increase the efficiency of key recovery through a correlation power analysis attack. This technique decreases the total number of traces needed for key recovery by 17.7% compared with traditional attack methods, as verified by attack efficiency analysis of the SM4 block cipher algorithm on an FPGA power consumption analysis platform.
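
The correlation power analysis step mentioned in this abstract can be illustrated with a small Python simulation. This is a sketch under several assumptions, none of which come from the paper: it omits the EMD denoising stage entirely, uses the 4-bit PRESENT S-box as a compact stand-in for the SM4 S-box, models each trace as a single Hamming-weight sample with Gaussian noise, and uses hypothetical function names throughout.

```python
import random

# 4-bit S-box of the PRESENT cipher, used here only as a small stand-in.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(x):
    """Number of set bits in x."""
    return bin(x).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def simulate_traces(key, n_traces, noise_sigma, rng):
    """Simulate one leakage sample per trace: HW(SBOX[p ^ key]) plus noise."""
    plaintexts = [rng.randrange(16) for _ in range(n_traces)]
    samples = [hamming_weight(SBOX[p ^ key]) + rng.gauss(0, noise_sigma)
               for p in plaintexts]
    return plaintexts, samples


def recover_key(plaintexts, samples):
    """Correlate the Hamming-weight model for every key guess with the
    measured samples and return the best guess plus all scores."""
    scores = {}
    for guess in range(16):
        model = [hamming_weight(SBOX[p ^ guess]) for p in plaintexts]
        scores[guess] = pearson(model, samples)
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    rng = random.Random(0)
    pts, leak = simulate_traces(key=0xA, n_traces=500, noise_sigma=0.5, rng=rng)
    guess, _ = recover_key(pts, leak)
    print(f"recovered key nibble: {guess:#x}")
```

In this toy setting, raising `noise_sigma` increases the number of traces needed before the correct guess stands out, which is the quantity a denoising preprocessing step aims to reduce.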
