More emojis, less :) The competition for paralinguistic function in microblog writing

Author(s):  
Umashanthi Pavalanathan ◽  
Jacob Eisenstein

Many non-standard elements of ‘netspeak’ writing can be viewed as efforts to replicate the linguistic role played by nonverbal modalities in speech, conveying contextual information such as affect and interpersonal stance. Recently, a new non-standard communicative tool has emerged in online writing: emojis. These Unicode characters form a standardized set of pictographs, some of which are visually similar to well-known emoticons. Do emojis play the same linguistic role as emoticons and other ASCII-based writing innovations? If so, might the introduction of emojis eventually displace the earlier, user-created forms of contextual expression? Using a matching approach to causal statistical inference, we show that as social media users adopt emojis, they dramatically reduce their use of emoticons, suggesting that these linguistic resources compete for the same communicative function. Furthermore, we demonstrate that the adoption of emojis leads to a corresponding increase in the use of standard spellings, suggesting that all forms of non-standard writing are losing out in a competition with emojis. Finally, we identify specific textual features that make some emoticons especially likely to be replaced by emojis.
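The abstract names a matching approach but gives no procedural detail, so the following is only a generic illustration of the idea, not the authors' method. It assumes greedy one-to-one nearest-neighbor matching on a single pre-adoption covariate, followed by a comparison of pre-to-post changes between adopters and their matched non-adopters. All names (`match_nearest`, `matched_effect`) and all numbers (emoticon counts per 100 tweets) are hypothetical.

```python
from statistics import mean


def match_nearest(treated, controls, key):
    """Greedily pair each treated unit with the closest unused control on `key`."""
    if len(controls) < len(treated):
        raise ValueError("need at least as many controls as treated units")
    available = list(controls)
    pairs = []
    for t in treated:
        # Pick the control whose covariate value is closest to this treated unit.
        best = min(available, key=lambda c: abs(c[key] - t[key]))
        available.remove(best)  # one-to-one matching: each control is used once
        pairs.append((t, best))
    return pairs


def matched_effect(pairs):
    """Mean difference in pre-to-post change between treated and matched controls."""
    return mean(
        (t["post"] - t["pre"]) - (c["post"] - c["pre"]) for t, c in pairs
    )


# Hypothetical emoticon rates (per 100 tweets) before and after the adoption date.
adopters = [{"pre": 10.0, "post": 4.0}, {"pre": 6.0, "post": 3.0}]
non_adopters = [
    {"pre": 9.5, "post": 9.0},
    {"pre": 6.5, "post": 6.0},
    {"pre": 2.0, "post": 2.0},
]

pairs = match_nearest(adopters, non_adopters, "pre")
effect = matched_effect(pairs)
print(f"Estimated change in emoticon rate attributable to adoption: {effect:+.1f}")
```

A negative estimate in this toy setup would correspond to adopters reducing emoticon use more than comparable non-adopters. A real analysis would match on many covariates and check balance before interpreting the difference.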

2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Suppawong Tuarob ◽  
Poom Wettayakorn ◽  
Ponpat Phetchai ◽  
Siripong Traivijitkhun ◽  
Sunghoon Lim ◽  
...  

The explosion of online information with the recent advent of digital technology in information processing, information storing, information sharing, natural language processing, and text mining techniques has enabled stock investors to uncover market movement and volatility from heterogeneous content. For example, a typical stock market investor reads the news, explores market sentiment, and analyzes technical details in order to make a sound decision prior to purchasing or selling a particular company’s stock. However, capturing a dynamic stock market trend is challenging owing to high fluctuation and the non-stationary nature of the stock market. Although existing studies have attempted to enhance stock prediction, few have provided a complete decision-support system for investors to retrieve real-time data from multiple sources and extract insightful information for sound decision-making. To address the above challenge, we propose a unified solution for data collection, analysis, and visualization in real-time stock market prediction to retrieve and process relevant financial data from news articles, social media, and company technical information. We aim to provide not only useful information for stock investors but also meaningful visualization that enables investors to effectively interpret storyline events affecting stock prices. Specifically, we utilize an ensemble stacking of diversified machine-learning-based estimators and innovative contextual feature engineering to predict the next day’s stock prices. Experiment results show that our proposed stock forecasting method outperforms a traditional baseline with an average mean absolute percentage error of 0.93. Our findings confirm that leveraging an ensemble scheme of machine learning methods with contextual information improves stock prediction performance. Finally, our study could be further extended to a wide variety of innovative financial applications that seek to incorporate external insight from contextual information such as large-scale online news articles and social media data.
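For readers unfamiliar with the reported metric, mean absolute percentage error (MAPE) can be computed as in the minimal sketch below. The function name and the price series are hypothetical, and the sketch assumes the common convention of reporting MAPE in percent. The abstract does not state whether its figure of 0.93 is a percentage or a fraction.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Return MAPE in percent for two equal-length sequences of values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    total = 0.0
    for a, p in zip(actual, predicted):
        if a == 0:
            # MAPE is undefined when a true value is zero.
            raise ValueError("actual values must be non-zero")
        total += abs((a - p) / a)
    return 100.0 * total / len(actual)


# Hypothetical next-day closing prices and model forecasts.
actual_prices = [100.0, 102.0, 101.0, 105.0]
predicted_prices = [101.0, 101.0, 102.0, 104.0]

mape = mean_absolute_percentage_error(actual_prices, predicted_prices)
print(f"MAPE: {mape:.2f}%")
```

Lower values indicate forecasts closer to the observed prices, which is why the metric suits comparing a stacked ensemble against a baseline on the same test period.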


2020 ◽  
Vol 176 ◽  
pp. 612-621
Author(s):  
Meisy Fortunatus ◽  
Patricia Anthony ◽  
Stuart Charters

2021 ◽  
Vol 2 (4) ◽  
pp. 418-433
Author(s):  
Nabi Rezvani ◽  
Amin Beheshti

Cyberbullying detection is a rising research topic due to its paramount impact on social media users, especially youngsters and adolescents. While there has been an enormous amount of progress in utilising efficient machine learning and NLP techniques for tackling this task, recent methods have not fully addressed contextualizing the textual content to the highest possible extent. The textual content of social media posts and comments is normally long, noisy and mixed with many irrelevant tokens and characters, so an attention-based approach that can focus on the more relevant parts of the text is well suited to the task. Moreover, social media information is normally multi-modal in nature and may contain various metadata and contextual information that can contribute to enhancing a cyberbullying prediction system. In this research, we propose a novel machine learning method that (i) fine-tunes a variant of BERT, a deep attention-based language model, which is capable of detecting patterns in long and noisy bodies of text; (ii) extracts contextual information from multiple sources including metadata information, images and even external knowledge sources and uses these features to complement the learner model; and (iii) efficiently combines textual and contextual features using boosting and a wide-and-deep architecture. We compare our proposed method with state-of-the-art methods and highlight how our approach significantly outperforms those methods in most cases.


2013 ◽  
Vol 49 (1) ◽  
pp. 222-247 ◽  
Author(s):  
Flavio Figueiredo ◽  
Henrique Pinto ◽  
Fabiano Belém ◽  
Jussara Almeida ◽  
Marcos Gonçalves ◽  
...  

2016 ◽  
Vol 5 (2) ◽  
pp. 339-355 ◽  
Author(s):  
Kim Holmberg ◽  
Johan Bastubacka ◽  
Mike Thelwall

This study investigates religious communication in social media by analyzing messages sent to God on Twitter. More specifically, the goal of this research is to map and analyze the various contexts in which God is addressed on Twitter, and how the tweets may reflect religious beliefs, ritual functions, and life issues. Using content analysis techniques and phenomenography, tweets addressing God were investigated. The results of this descriptive and indicative study show that religion and religiosity are communicated on Twitter in a manner that creates a unique sphere in which praise and profanities coexist. The tweets in the sample vary a great deal in their content and communicative function, ranging from profanities to prayers and from requests to win the lottery to conversations with and comments about God. Some tweets address God as a form of humour or satire, cursing, or otherwise without any deeper religious intention, while other tweets are apparently genuine messages directed to the transcendent, prayers, with which the senders want to show and share their belief with their followers on Twitter.


2020 ◽  
Vol 8 (2) ◽  
pp. 169
Author(s):  
Afiyati Afiyati ◽  
Azhari Azhari ◽  
Anny Kartika Sari ◽  
Abdul Karim

Nowadays, sarcasm recognition and detection is made easier by knowledge from various domains, including computer science, social science, psychology, and mathematics. This article explains trends in sentiment analysis, especially sarcasm detection, over the last ten years and its direction in the future. We review journal articles with the keyword “sarcasm” in the title, published from 2008 to 2018. The articles were classified by the most frequently discussed topics, including the dataset, pre-processing, annotations, approaches, features, context, and methods used. The significant increase in the number of articles on “sarcasm” in recent years indicates that research in this area still has enormous opportunities. Research on sarcasm is also of interest because only a few researchers offer solutions for unstructured language. Some hybrid approaches combine classification and feature extraction to identify sarcastic sentences using deep learning models. This article further explains the most widely used algorithms for sarcasm detection in social media. Finally, the article shows that a critical direction for future research is the use of datasets in various languages that address the unstructured-data problem; combined with contextual information, such datasets should make sarcasm detection more effective and improve on existing performance.


Author(s):  
Junfang Gong ◽  
Runjia Li ◽  
Hong Yao ◽  
Xiaojun Kang ◽  
Shengwen Li

Human daily activity categories, such as sports and shopping, reflect personal habits, lifestyle, and preferences, and are of great value for human health and many other application fields. Compared to questionnaires, social media as a sensor provides low-cost and easy-to-access data sources, offering new opportunities for obtaining human daily activity category data. However, accurately recognizing the activity category of posts remains challenging because existing studies ignore contextual information or word order in posts and remain unsatisfactory for capturing the activity semantics of words. To address this problem, we propose a general model for recognizing the human activity category based on deep learning. This model describes not only how to extract a sequence of higher-level word phrase representations in posts based on the deep learning sequence model but also how to integrate temporal information and external knowledge to capture the activity semantics in posts. Considering that no benchmark dataset is available in such studies, we built a dataset that was used for training and evaluating the model. The experimental results show that the proposed model significantly improves the accuracy of recognizing the human activity category compared with traditional classification methods.


2021 ◽  
Author(s):  
Jinwei Liu ◽  
Long Cheng ◽  
Hongmei Chi ◽  
Cong Liu ◽  
Richard A. Alo

2017 ◽  
Vol 57 (7) ◽  
pp. 883-898 ◽  
Author(s):  
Huy Quan Vu ◽  
Gang Li ◽  
Rob Law ◽  
Yanchun Zhang

Approaches to traditional travel diary construction rely on tourist participation and manual recording; hence, they are not only time-consuming but also limited in the scale and the number of samples. Online social network platforms have been used as alternative data sources for capturing the movements and travel patterns of tourists at a large scale. However, they fail to provide detailed contextual information on tourist activities for further analysis. In this paper, we present a new approach to travel diary construction based on the venue check-in data available in mobile social media with rich information on locations, time, and activities. Our case study focuses on the inbound tourism in Hong Kong using a data set composed of 17,355 check-ins generated by 600 tourists. We demonstrate how the proposed travel diary can provide useful practical implications for applications in location management, transportation management, impact management, and tourist experience promotion among others.

