VARTTA: A Visual Analytics System for Making Sense of Real-Time Twitter Data

Data ◽  
2020 ◽  
Vol 5 (1) ◽  
pp. 20
Author(s):  
Amir Haghighati ◽  
Kamran Sedig

Through social media platforms, massive amounts of data are being produced. As a microblogging social media platform, Twitter enables its users to post short updates as “tweets” on an unprecedented scale. Once analyzed using machine learning (ML) techniques and in aggregate, Twitter data can be an invaluable resource for gaining insight into different domains of discussion and public opinion. However, when applied to real-time data streams, due to covariate shifts in the data (i.e., changes in the distributions of the inputs of ML algorithms), existing ML approaches result in different types of biases and provide uncertain outputs. In this paper, we describe VARTTA (Visual Analytics for Real-Time Twitter datA), a visual analytics system that combines data visualizations, human-data interaction, and ML algorithms to help users monitor, analyze, and make sense of the streams of tweets in a real-time manner. As a case study, we demonstrate the use of VARTTA in political discussions. VARTTA not only provides users with powerful analytical tools, but also enables them to diagnose and to heuristically suggest fixes for the errors in the outcome, resulting in a more detailed understanding of the tweets. Finally, we outline several issues to be considered while designing other similar visual analytics systems.
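The covariate shift mentioned in the abstract above (a change in the distribution of a model's inputs) can be illustrated with a small sketch. This is not VARTTA's method; it is one simple way, assumed here for illustration, to flag such a shift by comparing token frequencies in a reference batch of tweets against a newer window using the Jensen-Shannon divergence. The function names and the 0.5 threshold are hypothetical.

```python
import math
from collections import Counter


def token_distribution(tweets):
    """Relative frequency of each lowercase, whitespace-separated token."""
    counts = Counter(tok for tweet in tweets for tok in tweet.lower().split())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()} if total else {}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2), which lies in [0, 1]."""
    vocab = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the joint vocabulary.
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}

    def kl_to_mixture(a):
        # KL(a || m); terms with a[w] == 0 contribute nothing.
        return sum(a[w] * math.log2(a[w] / m[w]) for w in a if a[w] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


def shift_detected(reference, window, threshold=0.5):
    """Flag a shift when the divergence exceeds a hypothetical threshold."""
    p = token_distribution(reference)
    q = token_distribution(window)
    return js_divergence(p, q) > threshold
```

A divergence near 0 means the new window uses much the same vocabulary as the reference batch, while a value near 1 means the two barely overlap, which is a hint that a model trained on the reference batch may no longer be reliable.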

The rise of social media platforms such as Twitter, and their increasing adoption by people who want to stay connected, provides a large source of data for analyzing trends, events, and even individual personalities. Such analysis also provides insight into a person’s likes and inclinations in real time, independent of the data size. Several techniques have been developed to retrieve such data; however, the most efficient technique is clustering. This paper provides an overview of the algorithms behind the various clustering methods and examines their efficiency in determining trending information. The clustered data may be further classified by topic for real-time analysis on a large dynamic data set. In this paper, data classification is performed and analyzed for flaws, followed by another classification on the same data set.


2020 ◽  
Author(s):  
Harika Kudarvalli ◽  
Jinan Fiaidhi

The spread of fake news has become a serious issue in the current social media world. Fake news is broadcast with dishonest intent to mislead people, and it has caused many unfortunate incidents in different countries. The most recent was the latest presidential election, in which voters were misled into supporting a leader. Twitter is one of the most popular social media platforms where users look for real-time news. We extracted real-time data on multiple domains through Twitter and performed analysis. The dataset was preprocessed, and the user_verified column played a vital role. Multiple machine learning algorithms were then applied to the features extracted from the preprocessed dataset. Logistic Regression and Support Vector Machine had promising results, both with above 92% accuracy. Naive Bayes and Long Short-Term Memory did not achieve the desired accuracies. The model can also be applied to images and videos for better detection of fake news.


Sentiment can be described as any attitude, thought, or judgment that arises from certain emotions. Analyzing it is also known as opinion extraction, in which the emotions of different people with respect to particular elements are investigated. Social media platforms are the best sources of opinion-related data. Twitter is a social media platform that is accessible to numerous followers, and a message that a follower posts on Twitter is known as a tweet. The sentiment of Twitter data can be analyzed with a feature extraction and classification approach. This work designs a hybrid classification that combines KNN and random forest: the KNN classifier extracts features from the dataset, and the random forest classifies the data. The hybrid classification approach is applied to sentiment analysis, and the performance of the proposed model is tested in terms of accuracy and execution time.


2021 ◽  
Vol 1 (2) ◽  
Author(s):  
Dilmini Rathnayaka ◽  
Pubudu K.P.N Jayasena ◽  
Iraj Ratnayake

Sentiment analysis mainly supports identifying polarity and extracts valuable information from raw data on social media platforms. Many fields, such as health, business, and security, require real-time data analysis for instant decision-making. Since Twitter is a popular social media platform from which data can be collected easily, this paper considers methods for analyzing Twitter data, including real-time Twitter data analysis based on geo-location. Twitter data classification and analysis can be done with diverse algorithms, and the most appropriate algorithm can be determined by implementing and testing them. This paper discusses sentiment analysis in general, data collection methods, data pre-processing, feature extraction, and sentiment analysis methods related to Twitter data. Real-time data analysis has emerged as a major method of analyzing data available online, and the real-time Twitter data analysis process is described throughout this paper. Several methods of classifying polarized Twitter data are discussed, and a proposed algorithm for analyzing Twitter data is presented. Location-based Twitter data analysis is another crucial aspect of sentiment analysis that enables data to be sorted by geo-location, and this paper describes how to analyze Twitter data based on geo-location. Further, a comparison of several sentiment analysis algorithms used by previous researchers is reported, followed by a conclusion.


2018 ◽  
Vol 6 (1) ◽  
pp. 46-58
Author(s):  
Halvdan Haugsbakken

In recent years, several Norwegian public organizations have introduced Enterprise Social Media Platforms. The rationale for their implementation pertains to a goal of improving internal communications and work processes in organizational life. Such objectives can be attained on the condition that employees adopt the platform and embrace the practice of sharing. Although sharing work on Enterprise Social Media Platforms can bring benefits, making sense of the practice of sharing constitutes a challenge. In this regard, the paper performs an analysis on a case whereby an Enterprise Social Media Platform was introduced in a Norwegian public organization. The analytical focus is on the challenges and experiences of making sense of the practice of sharing. The research results show that users faced challenges in making sense of sharing. The paper indicates that sharing is interpreted and performed as an informing practice, which results in an information overload problem and causes users to become disengaged. The study suggests a continued need for the application of theoretical lenses that emphasize interpretation and practice in the implementation of new digital technologies in organizations.  


2020 ◽  
Vol 8 (6) ◽  
pp. 1042-1044

Social media has developed drastically over the years. These days, individuals from all around the globe use social networking sites to share data and information. Twitter is a well-known communication site where users post information or messages known as tweets. Users share their daily lives and post their opinions on everything, for example, brands and places. Consumers and marketers use these tweets to gather insights into products and opinions on them. The aim of this paper is to present a model that can perform sentiment analysis of real-time data collected from Twitter and classify the tweets as positive, negative, or neutral based on the sentiment expressed in them.
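The three-way labeling described in the abstract above can be sketched with a toy lexicon-based scorer. The abstract does not say which model the paper uses, so this is only an assumed illustration of the positive/negative/neutral output; the word lists and the function name `classify_tweet` are hypothetical and far too small for real use.

```python
# Tiny illustrative lexicons (hypothetical, not taken from the paper).
POSITIVE = frozenset({"good", "great", "love", "excellent", "happy"})
NEGATIVE = frozenset({"bad", "terrible", "hate", "awful", "sad"})


def classify_tweet(text):
    """Label a tweet by counting positive and negative lexicon hits."""
    # Lowercase each token and strip common punctuation, '#' and '@'.
    tokens = [tok.strip(".,!?;:\"'()#@").lower() for tok in text.split()]
    # Each positive word adds 1 to the score, each negative word subtracts 1.
    score = sum((tok in POSITIVE) - (tok in NEGATIVE) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Calling `classify_tweet` on each incoming tweet yields one of the three labels. A tweet with no lexicon words, or with equal numbers of positive and negative words, is labeled neutral.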


2020 ◽  
Author(s):  
Emily Chen ◽  
Kristina Lerman ◽  
Emilio Ferrara

BACKGROUND At the time of this writing, the coronavirus disease (COVID-19) pandemic outbreak has already put tremendous strain on many countries' citizens, resources, and economies around the world. Social distancing measures, travel bans, self-quarantines, and business closures are changing the very fabric of societies worldwide. With people forced out of public spaces, much of the conversation about these phenomena now occurs online on social media platforms like Twitter. OBJECTIVE In this paper, we describe a multilingual COVID-19 Twitter data set that we are making available to the research community via our COVID-19-TweetIDs GitHub repository. METHODS We started this ongoing data collection on January 28, 2020, leveraging Twitter’s streaming application programming interface (API) and Tweepy to follow certain keywords and accounts that were trending at the time data collection began. We used Twitter’s search API to query for past tweets, resulting in the earliest tweets in our collection dating back to January 21, 2020. RESULTS Since the inception of our collection, we have actively maintained and updated our GitHub repository on a weekly basis. We have published over 123 million tweets, with over 60% of the tweets in English. This paper also presents basic statistics that show that Twitter activity responds and reacts to COVID-19-related events. CONCLUSIONS It is our hope that our contribution will enable the study of online conversation dynamics in the context of a planetary-scale epidemic outbreak of unprecedented proportions and implications. This data set could also help track COVID-19-related misinformation and unverified rumors or enable the understanding of fear and panic—and undoubtedly more.


Author(s):  
Tariq Soussan ◽  
Marcello Trovati

Social media platforms are widely used to share opinions, facts, and real-time general information on specific events. This chapter discusses and presents data analytics approaches that combine a variety of techniques based on text mining, machine learning, network analysis, and mathematical modelling to assess real-time data extracted from social media and other suitable data related to pandemic outbreaks. Real-time insights regarding pandemic outbreaks provide a valuable tool to inform and validate existing modelling techniques and methods. Furthermore, they support the discovery of actionable information that facilitates decision-making by enabling the most informed and appropriate decision based on the available data. The chapter also focuses on the visualisation and usability of the insights identified during the process, in order to address a non-technical audience.


2016 ◽  
Vol 18 (3) ◽  
pp. 255-276 ◽  
Author(s):  
Martin Sykora

Purpose The purpose of this paper is to explore implicit crowdsourcing, leveraging social media in real-time scenarios for intelligent systems. Design/methodology/approach A case study using an illustrative example system, which systematically used a custom social media platform for automated financial news analysis and summarisation, was developed, evaluated and discussed. A literature review related to crowdsourcing and collective intelligence in intelligent systems was also conducted to provide context and to further explore the case study. Findings It was shown that, and how, useful intelligent systems can be constructed from appropriately engineered custom social media platforms which are integrated with intelligent automated processes. A recent inter-rater agreement measure for evaluating the quality of implicit crowd contributions was also explored and found to be of value. Practical implications This paper argues that when social media platforms are closely integrated with other automated processes into a single system, this may provide a highly worthwhile online and real-time approach to intelligent systems through implicit crowdsourcing. Key practical issues, such as achieving high-quality crowd contributions, challenges of efficient workflows and real-time crowd integration into intelligent systems, were discussed. Important ethical and related considerations were also covered. Originality/value A contribution to existing theory was made by proposing how social media Web platforms may benefit crowdsourcing. As opposed to traditional crowdsourcing platforms, the presented approach and example system have a set of social elements that encourage implicit crowdsourcing. Instances of crowdsourcing with existing social media, such as Twitter, often also called crowd piggybacking, have been used in the past; however, using an entirely custom-built social media system for implicit crowdsourcing is relatively novel and has several advantages. Some of the discussion in the context of intelligent systems construction is novel and contributes to the existing body of literature in this field.

