Detecting Fake News using Machine Learning Algorithms

10.36227/techrxiv.12089133.v1 ◽

2020 ◽

Author(s):

Harika Kudarvalli ◽

Jinan Fiaidhi

Keyword(s):

Social Media ◽

Real Time ◽

Short Term Memory ◽

Vital Role ◽

Machine Learning Algorithms ◽

Support Vector ◽

Fake News ◽

Time Data ◽

Time News ◽

Social Media Platforms

The spread of fake news has become a serious issue in today's social media world. It is broadcast with the dishonest intention of misleading people, and it has caused many unfortunate incidents in different countries. The most recent was the latest presidential election, in which voters were misled into supporting a leader. Twitter is one of the most popular social media platforms where users look for real-time news. We extracted real-time data on multiple domains through Twitter and analyzed it. The dataset was preprocessed, and the user_verified column played a vital role. Multiple machine learning algorithms were then applied to the features extracted from the preprocessed dataset. Logistic Regression and Support Vector Machine gave promising results, both with above 92% accuracy. Naive Bayes and Long Short-Term Memory did not achieve the desired accuracies. The model can also be applied to images and videos for better detection of fake news.


Intelligent Detection of False Information in Arabic Tweets Utilizing Hybrid Harris Hawks Based Feature Selection and Machine Learning Models

Symmetry ◽

10.3390/sym13040556 ◽

2021 ◽

Vol 13 (4) ◽

pp. 556

Author(s):

Thaer Thaher ◽

Mahmoud Saheb ◽

Hamza Turabieh ◽

Hamouda Chantar

Keyword(s):

Machine Learning ◽

Social Media ◽

Feature Selection ◽

Language Processing ◽

User Profile ◽

Vital Role ◽

Classification Model ◽

Fake News ◽

False Information ◽

Social Media Platforms

Fake or false information on social media platforms is a significant challenge: it deliberately misleads users through rumors, propaganda, or deceptive information about a person, organization, or service. Twitter is one of the most widely used social media platforms, especially in the Arab region, where the number of users is steadily increasing, accompanied by an increase in the rate of fake news. This has drawn the attention of researchers seeking to provide a safe online environment free of misleading information. This paper proposes a smart classification model for the early detection of fake news in Arabic tweets using Natural Language Processing (NLP) techniques, Machine Learning (ML) models, and the Harris Hawks Optimizer (HHO) as a wrapper-based feature selection approach. An Arabic Twitter corpus of 1862 previously annotated tweets was used to assess the efficiency of the proposed model. The Bag of Words (BoW) model is used with different term-weighting schemes for feature extraction. Eight well-known learning algorithms are investigated with varying combinations of features, including user-profile, content-based, and word features. The reported results show that Logistic Regression (LR) with the Term Frequency-Inverse Document Frequency (TF-IDF) model ranks best. Moreover, feature selection based on the binary HHO algorithm plays a vital role in reducing dimensionality, thereby enhancing the learning model’s performance for fake news detection. Interestingly, the proposed BHHO-LR model yields an improvement of 5% compared with previous works on the same dataset.
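As an illustration of the Bag of Words model with TF-IDF term weighting mentioned in this abstract, here is a minimal sketch in plain Python. It is not the authors' pipeline: the function name `tfidf`, the unsmoothed weighting formula (relative term frequency times log(N / document frequency)), and the toy English documents are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Hypothetical helper: weight = (count / doc length) * log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus (assumed data, not from the paper).
docs = [
    "breaking news vaccine".split(),
    "breaking news election".split(),
    "weather report".split(),
]
vectors = tfidf(docs)
```

In this sketch a term that appears in fewer documents ("vaccine") receives a higher weight than one shared across documents ("breaking"), which is the property that makes TF-IDF features useful as classifier input.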


Detecting Fake News Over Job Posts via Bi-Directional Long Short-Term Memory (BIDLSTM)

International Journal of Web-Based Learning and Teaching Technologies ◽

10.4018/ijwltt.287096 ◽

2021 ◽

Vol 16 (6) ◽

pp. 1-18

Author(s):

T. V. Divya ◽

Barnali Gupta Banik

Keyword(s):

Social Media ◽

Performance Metrics ◽

Short Term Memory ◽

Word Embedding ◽

Support Vector ◽

Fake News ◽

Short Term ◽

Term Memory ◽

Online Social Media ◽

Long Short Term Memory

Fake news detection in job advertisements has attracted the attention of many researchers over the past decade. Classifiers such as Support Vector Machine (SVM), XGBoost, and Random Forest (RF) are widely used to distinguish fake from real job advertisement posts on social media. A Bi-Directional Long Short-Term Memory (Bi-LSTM) classifier is used to learn word representations in a lower-dimensional vector space and to learn the significant words or terms revealed by a word embedding algorithm. The Bi-LSTM classifier detects fake and real news in job posts from online social media, and its performance is then evaluated. Performance metrics such as Precision, Recall, F1-score, and Accuracy are used to assess its effectiveness at identifying fraudulent job posts. The outcome indicates the effectiveness and prominence of the features for detecting false news.
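The four performance metrics named in this abstract can be computed from a confusion matrix. The sketch below uses standard definitions only; the function name `binary_metrics`, the convention that label 1 means "fake", and the sample labels are assumptions for illustration and do not come from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1 and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    # Guard against empty denominators by falling back to 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Assumed toy labels: 1 = fake job post, 0 = real job post.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

With these toy labels there are three true positives, one false positive, one false negative and three true negatives, so all four metrics come out at 0.75.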


Smart Cardiac Framework for an Early Detection of Cardiac Arrest Condition and Risk

Frontiers in Public Health ◽

10.3389/fpubh.2021.762303 ◽

2021 ◽

Vol 9 ◽

Author(s):

Apeksha Shah ◽

Swati Ahirrao ◽

Sharnil Pandya ◽

Ketan Kotecha ◽

Suresh Rathod

Keyword(s):

Cardiac Arrest ◽

Real Time ◽

Cox Regression ◽

Risk Classification ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Time Data ◽

Extreme Gradient Boosting ◽

Real Time Data

Cardiovascular disease (CVD) is considered one of the most widespread diseases in the world today. Predicting CVDs, such as cardiac arrest, is a difficult task in healthcare. The healthcare industry has a vast collection of datasets for analysis and prediction purposes. However, predictions made on these publicly available datasets may be erroneous. To make predictions accurate, real-time data need to be collected. This study collected real-time data using sensors and stored it on a cloud computing platform, such as Google Firebase. The acquired data are then classified using six machine-learning algorithms: Artificial Neural Network (ANN), Random Forest Classifier (RFC), Extreme Gradient Boosting (XGBoost) classifier, Support Vector Machine (SVM), Naïve Bayes (NB), and Decision Tree (DT). Furthermore, we present two novel risk classification approaches, one gender-based and one age-wise. These approaches use Kaplan-Meier and Cox regression survival analysis methodologies for risk detection and classification. They also assist health experts in identifying the risk probability and predicting the 10-year risk score. The proposed system is an economical alternative to the existing system due to its low cost. The outcome shows an enhanced level of performance, with an overall accuracy of 98% using DT on our collected dataset for cardiac risk prediction. We also introduce two risk classification models, by gender and by age, to estimate survival probability. The outcome of the proposed model shows accurate probability in both classes.
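To make the Kaplan-Meier survival analysis mentioned in this abstract more concrete, the following is a minimal sketch of the standard product-limit estimator in plain Python. It is not the authors' implementation: the function name `kaplan_meier`, the input encoding (1 = event observed, 0 = censored), and the sample follow-up times are assumptions for the example.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per subject.
    events: 1 if the event was observed at that time, 0 if censored.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after t.
        at_risk -= removed
    return curve


# Assumed toy data: five subjects, two of them censored.
durations = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(durations, events)
```

Running the estimator separately on subgroups (for example, one call per gender or per age band) gives one survival curve per group, which is the general idea behind group-wise risk comparison.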


Identification of user’s credibility on twitter social networks

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v24.i1.pp554-563 ◽

2021 ◽

Vol 24 (1) ◽

pp. 554

Author(s):

Faraz Ahmad ◽

S. A. M. Rizvi

Keyword(s):

Social Networks ◽

Social Media ◽

Machine Learning Algorithms ◽

Support Vector ◽

Nearest Neighbours ◽

The Social ◽

Class Labelling ◽

Social Media Platforms ◽

Media Platform ◽

The Given

Twitter, one of the most influential social media platforms, facilitates the spreading of information in the form of text, images, and videos. However, the credibility of posted content is still in question. Introduction: In this paper, a model has been developed for determining a user’s credibility based on the tweets they have posted on the Twitter social network. The model consists of machine learning algorithms that assist not only in categorizing tweets into credibility classes but also in finding users’ credibility ratings on the social media platform. Methods and results: The dataset and associated features of 100,000 tweets were extracted and pre-processed. Furthermore, the credibility class labelling of tweets was performed by four different human annotators. The meaning cloud and natural language understanding platforms were used for calculating the polarity, sentiment, and emotion scores. The K-Means algorithm was applied to find clusters of tweets based on the feature set, whereas random forest, support vector machine, naïve Bayes, K-nearest-neighbours (KNN), J48 decision tree, and multilayer perceptron were used to classify the tweets into credibility classes. All the classifiers provided a significant level of accuracy, precision, and recall for all the given credibility classes.


Redis-Based Messaging Queue and Cache-Enabled Parallel Processing Social Media Analytics Framework

The Computer Journal ◽

10.1093/comjnl/bxaa114 ◽

2020 ◽

Author(s):

Ravindra Kumar Singh ◽

Harsh Kumar Verma

Keyword(s):

Machine Learning ◽

Social Media ◽

Real Time ◽

Data Analytics ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Social Media Analytics ◽

Data Engineering ◽

Extreme Gradient Boosting

The extensive use of social media polarity analysis calls for real-time analytics and runtime outcomes on dashboards. In data analytics, only 30% of the time is spent on the modeling and evaluation stages, while 70% is spent on data engineering tasks. Many machine learning algorithms achieve desirable prediction outcomes, but they fall short in handling and transforming data (the so-called data engineering tasks), and reducing the time these tasks take remains challenging. The contribution of this research paper is to address these challenges by presenting a parallel, scalable, effective, responsive and fault-tolerant framework for performing end-to-end data analytics tasks in both real-time and batch-processing modes. An experimental analysis on Twitter posts supports the claims and demonstrates the benefits of parallelism in data processing units. This research highlights the importance of processing mentioned URLs and embedded images along with post content to boost prediction efficiency. Furthermore, it compares naive Bayes, support vector machines, extreme gradient boosting and long short-term memory (LSTM) machine learning techniques for sentiment analysis on Twitter posts and concludes that LSTM is the most effective technique in this regard.


VARTTA: A Visual Analytics System for Making Sense of Real-Time Twitter Data

Data ◽

10.3390/data5010020 ◽

2020 ◽

Vol 5 (1) ◽

pp. 20

Author(s):

Amir Haghighati ◽

Kamran Sedig

Keyword(s):

Social Media ◽

Real Time ◽

Visual Analytics ◽

Time Data ◽

Making Sense ◽

Twitter Data ◽

Social Media Platforms ◽

Media Platform ◽

Analytical Tools ◽

Data Visualizations

Through social media platforms, massive amounts of data are being produced. As a microblogging social media platform, Twitter enables its users to post short updates as “tweets” on an unprecedented scale. Once analyzed using machine learning (ML) techniques and in aggregate, Twitter data can be an invaluable resource for gaining insight into different domains of discussion and public opinion. However, when applied to real-time data streams, due to covariate shifts in the data (i.e., changes in the distributions of the inputs of ML algorithms), existing ML approaches result in different types of biases and provide uncertain outputs. In this paper, we describe VARTTA (Visual Analytics for Real-Time Twitter datA), a visual analytics system that combines data visualizations, human-data interaction, and ML algorithms to help users monitor, analyze, and make sense of the streams of tweets in a real-time manner. As a case study, we demonstrate the use of VARTTA in political discussions. VARTTA not only provides users with powerful analytical tools, but also enables them to diagnose and to heuristically suggest fixes for the errors in the outcome, resulting in a more detailed understanding of the tweets. Finally, we outline several issues to be considered while designing other similar visual analytics systems.


A Framework for Enhancing Real-time Social Media Data to Improve Disaster Management Process

Proceedings of the ICA ◽

10.5194/ica-proc-1-101-2018 ◽

2018 ◽

Vol 1 ◽

pp. 1-5

Author(s):

Syed Attique Shah ◽

Dursun Zafer Şeker ◽

Hande Demirel

Keyword(s):

Social Media ◽

Real Time ◽

Disaster Management ◽

Management System ◽

Design Science ◽

Vital Role ◽

Time Data ◽

Social Media Data ◽

Media Data ◽

Disaster Management System

Social media datasets play a vital role in providing information that can support decision making in nearly all domains of technology. This is because social media is a quick and economical means of collecting data from the public through methods like crowdsourcing. Existing research has already shown that in the case of any disaster (natural or man-made), the information extracted from social media sites is critical to Disaster Management Systems for response and reconstruction. This study comprises two components: the first proposes a framework that provides updated and filtered real-time input data for the disaster management system through social media, and the second consists of a designed web user API for a structured and defined real-time data input process. This study contributes to the discipline of design science for the information systems domain. The aim of this study is to propose a framework that can filter and organize data from unstructured social media sources through recognized methods, and to bring this retrieved data to the same level as data obtained through the structured and predefined mechanism of a web API. Both components are designed to a level such that they can potentially collaborate and produce updated information for a disaster management system to carry out accurate and effective.


Machine Learning Algorithms for Disease Prediction Using IoT Environment

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f8914.088619 ◽

2019 ◽

Vol 8 (6) ◽

pp. 4303-4307 ◽

Cited By ~ 1

Keyword(s):

Real Time ◽

Healthcare System ◽

Learning Algorithms ◽

Vital Role ◽

Machine Learning Algorithms ◽

Sensor Nodes ◽

Proper Solution ◽

Time Data ◽

Supervised Learning Algorithms ◽

Rate Sensor

In the most advanced healthcare application environments, the use of IoT technologies brings convenience to medical professionals and patients as these technologies are applied to health areas. In IoT, Body Sensor Network (BSN) technology plays a vital role in the healthcare system, where lightweight, wireless, low-powered sensor nodes are used for monitoring patients. In this paper, we propose a healthcare system using IoT and BSN technology. The system includes various sensors, such as a pulse rate sensor, a temperature sensor, and a blood pressure sensor. These sensors sense the parameters and send the data to the controller. Depending on the conditions, a buzzer turns on when the temperature exceeds the given range. The controller also sends the sensed data to an LCD for display. At the same time, the data are sent to doctors over the internet so that they can give a quick and proper solution in real time. Many patients suffer because they do not get a timely and appropriate solution and help for their problem. The proposed system therefore offers a real-time solution and help in case of emergency. The system is convenient enough for a person to carry with them, which makes continuous health checking possible. The system also predicts the disease for a particular patient based on current readings using various supervised learning algorithms.


FNDNLSTM

10.4018/978-1-7998-8061-5.ch012 ◽

2021 ◽

pp. 218-232

Author(s):

Steni Mol T. S. ◽

P. S. Sreeja

Keyword(s):

Social Media ◽

Short Term Memory ◽

Classification Model ◽

Fake News ◽

Term Memory ◽

The Social ◽

Social Media Platforms ◽

Long Short Term Memory ◽

False News ◽

Model Technique

In the present scenario, social media platforms have become more accessible sources for news. Social media posts need not always be truthful information. These posts are widely disseminated with little regard for the truth. It is necessary to realize the evolution and origins of false news patterns in order to improve the progression of quality news and combat fake news on social media. This chapter discusses the most frequently used social media (Facebook) and the type of information exchanged to solve this issue. This chapter proposes a novel framework based on the “Fake News Detection Network – Long Short-Term Memory” (FNDN-LSTM) model to discriminate between fake news and real news. The social media news dataset is to be taken and preprocessed using the TF BERT model (technique). The preprocessed data will be passed through a feature selection model, which will select the significant features for classification. The selected features will be passed through the FNDN-LSTM classification model for identifying fake news.
