Multimodal Hate Speech Detection in Greek Social Media

2021 ◽  
Vol 5 (7) ◽  
pp. 34
Author(s):  
Konstantinos Perifanos ◽  
Dionysis Goutsos

Hateful and abusive speech presents a major challenge for all online social media platforms. Recent advances in Natural Language Processing and Natural Language Understanding allow for more accurate detection of hate speech in textual streams. This study presents a new multimodal approach to hate speech detection by combining Computer Vision and Natural Language Processing models for abusive context detection. Our study focuses on Twitter messages and, more specifically, on hateful, xenophobic, and racist speech in Greek aimed at refugees and migrants. In our approach, we combine transfer learning and fine-tuning of Bidirectional Encoder Representations from Transformers (BERT) and Residual Neural Networks (ResNet). Our contribution includes the development of a new dataset for hate speech classification, consisting of tweet IDs, along with the code to obtain their visual appearance, as they would have been rendered in a web browser. We have also released a pre-trained Language Model trained on Greek tweets, which has been used in our experiments. We report a consistently high level of accuracy (accuracy score = 0.970, f1-score = 0.947 in our best model) in racist and xenophobic speech detection.
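For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how accuracy and a binary F1-score are typically computed from predictions. The function name `accuracy_and_f1` and the toy labels are illustrative assumptions, not taken from the paper, and the abstract does not state whether its F1-score is binary or averaged across classes.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int],
                    positive: int = 1) -> Tuple[float, float]:
    """Return (accuracy, binary F1) for the given positive class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


# Hypothetical toy labels (1 = hateful, 0 = not hateful), for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
acc, f1 = accuracy_and_f1(y_true, y_pred)
```

Accuracy counts every correct prediction, whereas F1 depends only on how the positive class is handled, so the two can differ on the same predictions.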

Author(s):  
Mitta Roja

Abstract: Cyberbullying is a major problem on the internet that affects teenagers as well as adults. It has led to serious harm, including depression and suicide. Regulation of content on social media platforms has become a growing need. The following study uses data from two different forms of cyberbullying, hate speech tweets from Twitter and comments based on personal attacks from Wikipedia forums, to build a model for detecting cyberbullying in text data using Natural Language Processing and Machine Learning. Three methods for feature extraction and four classifiers are studied to outline the best approach. For Tweet data the model provides accuracies above 90%, and for Wikipedia data it gives accuracies above 80%. Keywords: Cyberbullying, Hate speech, Personal attacks, Machine learning, Feature extraction, Twitter, Wikipedia


Author(s):  
Sayani Ghosal ◽  
Amita Jain

Hate content detection is a promising and challenging research area within the natural language processing domain. Hate speech abuses individuals or groups of people based on religion, caste, language, or sex. The enormous growth of digital media and cyberspace has encouraged researchers to work on hate speech detection. A widely accepted automatic hate detection system is required to stop the flow of hate-motivated content. Anonymous hate content is affecting the young generation and adults on social networking sites. Through numerous studies and review papers, the chapter identifies the need for artificial intelligence (AI) in hate speech research. The chapter explores the current state of the art and prospects of AI in natural language processing (NLP) and machine learning algorithms. The chapter aims to identify the most successful methods or techniques for hate speech detection to date. Advances in this research help social media provide a healthy environment for everyone.


The spread of fake news in online social media is a major nuisance to the public, and there is no state-of-the-art tool that automatically detects whether a news item is fake or genuine. Hence, this paper analyses online social media and news feeds for the detection of fake news. The work proposes a solution using Natural Language Processing and Deep Learning techniques for detecting fake news in online social media.


2021 ◽  
pp. 1-10
Author(s):  
Shuai Zhao ◽  
Fucheng You ◽  
Wen Chang ◽  
Tianyu Zhang ◽  
Man Hu

The BERT pre-trained language model has achieved good results in various subtasks of natural language processing, but its performance in generating Chinese summaries is not ideal. The most intuitive reason is that the BERT model is based on character-level composition, while the Chinese language is mostly in the form of phrases. Directly fine-tuning the BERT model cannot achieve the expected effect. This paper proposes a novel summary generation model with BERT augmented by a pooling layer. In our model, we perform an average pooling operation on token embeddings to improve the model’s ability to capture phrase-level semantic information. We use LCSTS and NLPCC2017 to verify our proposed method. Experimental results show that introducing average pooling effectively improves the quality of the generated summaries. Furthermore, comparative analysis shows that different datasets require different pooling kernel sizes to achieve the best results. In addition, our proposed method has strong generalizability. It can be applied not only to the task of generating summaries, but also to other natural language processing tasks.
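The average pooling operation described above can be illustrated with a minimal sketch in plain Python. The function name `average_pool`, the `stride` parameter, and the toy two-dimensional embeddings are assumptions made for illustration; the abstract does not specify the stride, padding, or where in the network the pooling is applied, so this should not be read as the authors' implementation.

```python
from typing import List


def average_pool(embeddings: List[List[float]], kernel_size: int,
                 stride: int = 1) -> List[List[float]]:
    """Average neighbouring token embeddings along the sequence axis.

    Each output vector is the mean of `kernel_size` consecutive input
    vectors, so it mixes information from adjacent characters.
    """
    if kernel_size < 1 or stride < 1:
        raise ValueError("kernel_size and stride must be positive")
    pooled = []
    # Slide a window over the sequence; no padding is applied here.
    for start in range(0, len(embeddings) - kernel_size + 1, stride):
        window = embeddings[start:start + kernel_size]
        dim = len(window[0])
        pooled.append(
            [sum(vec[d] for vec in window) / kernel_size for d in range(dim)]
        )
    return pooled


# Hypothetical character-level embeddings for a four-token input.
tokens = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 6.0]]
phrase_level = average_pool(tokens, kernel_size=2)
```

A larger `kernel_size` averages over longer spans, which matches the abstract's observation that the best kernel size depends on the dataset.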


2015 ◽  
Vol 7 (1) ◽  
Author(s):  
Arash Shaban-Nejad ◽  
Sonia Menon ◽  
David Buckeridge

The Vaccon Sentiment Ontology (VASON) provides knowledge on the factors driving vaccine refusal by analyzing the content of online social media. VASON facilitates concept extraction and analysis of the extracted concepts using a Natural Language Processing (NLP) module.

