Detecting Aggressiveness in Tweets: A Hybrid Model for Detecting Cyberbullying in the Spanish Language

Manuel Lepe-Faúndez; Alejandra Segura-Navarrete; Christian Vidal-Castro; Claudia Martínez-Araneda; Clemente Rubio-Manzano

doi:10.3390/app112210706

Applied Sciences ◽

10.3390/app112210706 ◽

2021 ◽

Vol 11 (22) ◽

pp. 10706

Author(s):

Manuel Lepe-Faúndez ◽

Alejandra Segura-Navarrete ◽

Christian Vidal-Castro ◽

Claudia Martínez-Araneda ◽

Clemente Rubio-Manzano

Keyword(s):

Machine Learning ◽

Social Networks ◽

Web Application ◽

English Language ◽

Science Research ◽

Spanish Language ◽

Machine Learning Algorithms ◽

Future Research ◽

Main Work ◽

Computer Science Research

In recent years, the use of social networks has increased exponentially, which has led to a significant increase in cyberbullying. In Computer Science, research has been conducted on how to detect aggressiveness in texts, which is a prelude to detecting cyberbullying. Most of this work has addressed English language texts, mainly using Machine Learning (ML) approaches, Lexicon approaches to a lesser extent, and hybrid approaches in very few works. Hybrid approaches combine Lexicons and Machine Learning algorithms, for example by counting the number of bad words in a sentence using a Lexicon of bad words and using that count as an input feature for classification algorithms. This research aims to contribute to detecting aggressiveness in Spanish language texts by creating different models that combine the Lexicon and ML approaches. Twenty-two models that combine techniques and algorithms from both approaches are proposed, and for their application, certain hyperparameters are adjusted on the training datasets of the corpora to obtain the best results on the test datasets. Three Spanish language corpora are used in the evaluation: Chilean, Mexican, and Chilean-Mexican corpora. The results indicate that hybrid models obtain the best results in all 3 corpora, outperforming the implemented models that do not use Lexicons. This shows that mixing approaches improves aggressiveness detection. Finally, a web application is developed that gives applicability to each model by classifying tweets, making it possible to evaluate the models' performance on external corpora and to receive feedback on each model's predictions for future research. In addition, an API is available that can be integrated into technological tools for parental control, online plugins for writing analysis in social networks, and educational tools, among others.
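The hybrid idea described in this abstract, a Lexicon-derived count used as a classifier input feature, can be illustrated with a minimal sketch. The lexicon contents, the function name `lexicon_features`, and the added ratio feature are assumptions made for illustration; they are not taken from the paper.

```python
import re

# Hypothetical toy lexicon; the paper's actual Lexicons are not reproduced here.
BAD_WORDS = {"idiota", "estúpido", "imbécil"}

def lexicon_features(text, lexicon=BAD_WORDS):
    """Return (bad-word count, bad-word ratio) for one tweet."""
    # Lowercase and split into word tokens; \w matches accented letters in Python 3.
    tokens = re.findall(r"\w+", text.lower())
    hits = sum(1 for tok in tokens if tok in lexicon)
    ratio = hits / len(tokens) if tokens else 0.0
    return hits, ratio

# The resulting numbers would be appended to the other features fed to a classifier.
features = lexicon_features("Eres un idiota, de verdad un IDIOTA")
```

In a full pipeline these values would sit alongside other text features (for example, term frequencies) in the vector passed to the ML algorithm.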

Is Hidden Safe? Location Protection against Machine-Learning Prediction Attacks in Social Networks

MIS Quarterly ◽

10.25300/misq/2021/16266 ◽

2021 ◽

Vol 45 (2) ◽

pp. 821-858

Author(s):

Xiao Han ◽

Leye Wang ◽

Weiguo Fan

Keyword(s):

Machine Learning ◽

Social Networks ◽

Private Information ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Future Research ◽

Exposure Risk ◽

Hidden Information ◽

Risk Estimator ◽

Exposure Risks

User privacy protection is a vital issue of concern for online social networks (OSNs). Even though users often intentionally hide their private information in OSNs, since adversaries may conduct prediction attacks to predict hidden information using advanced machine learning techniques, private information that users intend to hide may still be at risk of being exposed. Taking the current city listed on Facebook profiles as a case, we propose a solution that estimates and manages the exposure risk of users’ hidden information. First, we simulate an aggressive prediction attack using advanced state-of-the-art machine learning algorithms by proposing a new current city prediction framework that integrates location indications based on various types of information exposed by users, including demographic attributes, behaviors, and relationships. Second, we study prediction attack results to model patterns of prediction correctness (as correct predictions lead to information exposures) and construct an exposure risk estimator. The proposed exposure risk estimator has the ability not only to notify users of exposure risks related to their hidden current city but can also help users mitigate exposure risks by overhauling and selecting countermeasures. Moreover, our exposure risk estimator can improve the privacy management of OSNs by facilitating empirical studies on the exposure risks of OSN users as a group. Taking the current city as a case, this work offers insight on how to protect other types of private information against machine-learning prediction attacks and reveals several important implications for both practice management and future research.

Personality Prediction from Social Networks text using Machine Learning

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d7146.118419 ◽

2019 ◽

Vol 8 (4) ◽

pp. 2384-2389

Keyword(s):

Machine Learning ◽

Social Networks ◽

Machine Learning Algorithms ◽

Future Research ◽

Important Research Topic ◽

Broad Variety ◽

Future Research Directions ◽

Personality Prediction ◽

And Function ◽

Work Done

Personality is a characteristic way of thinking, feeling, and behaving. It embraces moods, attitudes and views and is expressed most obviously in relationships with others. It involves both intrinsic and acquired behavioural features that differentiate one individual from another and can be observed in people's relationships with their surroundings and with their social group. With the development of social networks, a broad variety of techniques have been developed to identify user personalities based on their social activities and language usage practices. Particular methods vary in terms of machine learning algorithms, information sources and function sets. Personality prediction has been an important research topic for describing user profiles and persons, not only in psychology but also in computer science. This paper presents a systematic survey of current work on personality prediction from social networks. We also prepared a comparison chart of existing techniques for personality prediction on the basis of relevant parameters. Based on this survey, we finally present a few future research directions related to personality prediction.

Adversarial Machine Learning on Social Network: A Survey

Frontiers in Physics ◽

10.3389/fphy.2021.766540 ◽

2021 ◽

Vol 9 ◽

Author(s):

Sensen Guo ◽

Xiaoyu Li ◽

Zhiying Mu

Keyword(s):

Machine Learning ◽

Social Networks ◽

Social Network ◽

Sentiment Analysis ◽

Recommendation System ◽

Learning Algorithms ◽

Real Life ◽

Machine Learning Algorithms ◽

Research Progress ◽

Future Research

In recent years, machine learning technology has made great improvements in social network applications such as recommendation systems, sentiment analysis, and text generation. However, it cannot be ignored that machine learning algorithms are vulnerable to adversarial examples: adding perturbations that are imperceptible to the human eye to the original data can cause machine learning algorithms to produce wrong outputs with high probability. This also restricts the widespread use of machine learning algorithms in real life. In this paper, we focus on adversarial machine learning algorithms on social networks in recent years from three aspects: sentiment analysis, recommendation systems, and spam detection. We review some typical applications of machine learning algorithms, as well as adversarial example generation and defense algorithms, in these three aspects. We also analyze the current research progress and prospects for future research directions.
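To make the notion of an adversarial example concrete, here is a minimal sketch on a toy linear classifier. It uses a sign-based perturbation in the spirit of the fast gradient sign method; the weights, inputs, and `epsilon` are invented for illustration and are not drawn from the survey.

```python
def score(weights, bias, x):
    """Linear decision score; a positive value means the positive class."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def perturb(weights, x, epsilon):
    """Shift each feature by epsilon against its weight's sign to lower the score."""
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]

# Hypothetical model and input, chosen so the small shift flips the decision.
weights, bias = [2.0, -1.0, 0.5], -0.2
x = [0.3, 0.2, 0.1]
x_adv = perturb(weights, x, epsilon=0.1)
```

Each feature moves by at most 0.1, yet the score changes sign, which is the behaviour the abstract describes: a small change to the input produces a wrong output.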

Prediction of aircraft estimated time of arrival using machine learning methods

The Aeronautical Journal ◽

10.1017/aer.2021.13 ◽

2021 ◽

pp. 1-15

Author(s):

O. Basturk ◽

C. Cetek

Keyword(s):

Machine Learning ◽

Web Application ◽

Absolute Error ◽

Machine Learning Algorithms ◽

Weather Data ◽

Time Of Arrival ◽

Learning Models ◽

Trajectory Data ◽

Different Sources ◽

Machine Learning Models

In this study, prediction of aircraft Estimated Time of Arrival (ETA) is proposed using machine learning algorithms. Accurate prediction of ETA is important for management of delay and air traffic flow, runway assignment, gate assignment, collaborative decision making (CDM), coordination of ground personnel and equipment, and optimisation of arrival sequence, among others. Machine learning is able to learn from experience and make predictions with weak assumptions or no assumptions at all. In the proposed approach, general flight information, trajectory data and weather data were obtained from different sources in various formats. Raw data were converted to tidy data and inserted into a relational database. To obtain the features for training the machine learning models, the data were explored, cleaned and transformed into convenient features. New features were also derived from the available data. Random forests and deep neural networks were used to train the machine learning models. Both models can predict the ETA with a mean absolute error (MAE) of less than 6 min after departure, and less than 3 min after terminal manoeuvring area (TMA) entrance. Additionally, a web application was developed to dynamically predict the ETA using the proposed models.
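The mean absolute error (MAE) quoted in this abstract is the average of the absolute differences between predicted and actual values. A minimal sketch follows; the arrival times are made-up numbers, not data from the study.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical flight durations in minutes for three flights.
actual_minutes = [42.0, 55.0, 61.0]
predicted_minutes = [40.0, 58.0, 60.0]
mae = mean_absolute_error(actual_minutes, predicted_minutes)  # (2 + 3 + 1) / 3
```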

PredNTS: Improved and Robust Prediction of Nitrotyrosine Sites by Integrating Multiple Sequence Features

International Journal of Molecular Sciences ◽

10.3390/ijms22052704 ◽

2021 ◽

Vol 22 (5) ◽

pp. 2704

Author(s):

Andi Nur Nilamyani ◽

Firda Nurul Auliah ◽

Mohammad Ali Moni ◽

Watshara Shoombuatong ◽

Md Mehedi Hasan ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Web Application ◽

Computational Prediction ◽

Vital Role ◽

Machine Learning Algorithms ◽

Recursive Feature Elimination ◽

Post Translational Modification ◽

Multiple Sequence ◽

Sequence Features

Nitrotyrosine, which is generated by numerous reactive nitrogen species, is a type of protein post-translational modification. Identification of site-specific nitration modification on tyrosine is a prerequisite to understanding the molecular function of nitrated proteins. Thanks to the progress of machine learning, computational prediction can play a vital role before the biological experimentation. Herein, we developed a computational predictor PredNTS by integrating multiple sequence features including K-mer, composition of k-spaced amino acid pairs (CKSAAP), AAindex, and binary encoding schemes. The important features were selected by the recursive feature elimination approach using a random forest classifier. Finally, we linearly combined the successive random forest (RF) probability scores generated by the different, single encoding-employing RF models. The resultant PredNTS predictor achieved an area under a curve (AUC) of 0.910 using five-fold cross validation. It outperformed the existing predictors on a comprehensive and independent dataset. Furthermore, we investigated several machine learning algorithms to demonstrate the superiority of the employed RF algorithm. The PredNTS is a useful computational resource for the prediction of nitrotyrosine sites. The web-application with the curated datasets of the PredNTS is publicly available.
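Two ingredients named in this abstract, K-mer sequence features and a linear combination of per-encoding probability scores, can be sketched as follows. The function names, the example peptide, the scores, and the weights are all assumptions for illustration; the abstract does not state the weights PredNTS actually uses.

```python
from collections import Counter

def kmer_composition(sequence, k=2):
    """Relative frequency of each overlapping k-mer in a sequence window."""
    total = len(sequence) - k + 1
    if total <= 0:
        return {}
    counts = Counter(sequence[i:i + k] for i in range(total))
    return {kmer: n / total for kmer, n in counts.items()}

def combine_scores(scores, weights):
    """Weighted linear combination of probability scores from separate models."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))

# Hypothetical peptide fragment and hypothetical per-encoding model outputs.
composition = kmer_composition("AYAYG", k=2)
combined = combine_scores([0.8, 0.6, 0.7], [0.5, 0.25, 0.25])
```

With weights that sum to 1, the combined value stays in the same 0 to 1 range as the individual scores, so it can be thresholded like any single model's probability.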

An integrated review on machine learning approaches for heart disease prediction: Direction towards future research gaps

Bio-Algorithms and Med-Systems ◽

10.1515/bams-2020-0069 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Fathima Aliyar Vellameeran ◽

Thomas Brindha

Keyword(s):

Machine Learning ◽

Heart Disease ◽

Prediction Models ◽

Machine Learning Algorithms ◽

Future Research ◽

Disease Prediction ◽

Learning Approaches ◽

Research Papers ◽

Using Data ◽

Intelligent Methods

Objectives: To provide a clear literature review of state-of-the-art heart disease prediction models. Methods: The review covers 61 research papers and presents the significant analysis. Initially, the analysis addresses the contributions of each work and observes the simulation environment. The different types of machine learning algorithms deployed in each contribution are noted. In addition, the datasets used by existing heart disease prediction models are observed. Results: The performance measures computed across the papers, such as prediction accuracy, prediction error, specificity, sensitivity, and F-measure, are examined. Further, the best performance is also checked to confirm the effectiveness of the contributions. Conclusions: The comprehensive research challenges and gaps are portrayed based on the development of intelligent methods concerning the unresolved challenges in heart disease prediction using data mining techniques.

Sarcasm Detection on Social Networks using Machine Learning Algorithms: A Systematic Review

2021 5th International Conference on Trends in Electronics and Informatics (ICOEI) ◽

10.1109/icoei51242.2021.9452954 ◽

2021 ◽

Author(s):

Ranganath Kanakam ◽

Rudra Kalyan Nayak

Keyword(s):

Machine Learning ◽

Systematic Review ◽

Social Networks ◽

Learning Algorithms ◽

Machine Learning Algorithms

Machine Learning

Machine Learning ◽

10.4018/978-1-60960-818-7.ch102 ◽

2012 ◽

pp. 13-22 ◽

Cited By ~ 1

Author(s):

João Gama ◽

André C.P.L.F. de Carvalho

Keyword(s):

Machine Learning ◽

Language Processing ◽

Text Processing ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Background Information ◽

Future Research ◽

Personal View ◽

Learning Techniques ◽

Future Research Directions

Machine learning techniques have been successfully applied to several real world problems in areas as diverse as image analysis, Semantic Web, bioinformatics, text processing, natural language processing, telecommunications, finance, medical diagnosis, and so forth. A particular application where machine learning plays a key role is data mining, where machine learning techniques have been extensively used for the extraction of association, clustering, prediction, diagnosis, and regression models. This text presents our personal view of the main aspects, major tasks, frequently used algorithms, current research, and future directions of machine learning research. To that end, it is organized as follows: Background information concerning machine learning is presented in the second section. The third section discusses different definitions of Machine Learning. Common tasks faced by Machine Learning systems are described in the fourth section. Popular Machine Learning algorithms and the importance of the loss function are discussed in the fifth section. The sixth and seventh sections present the current trends and future research directions, respectively.

Machine Learning

Encyclopedia of Information Science and Technology, Second Edition ◽

10.4018/978-1-60566-026-4.ch392 ◽

2011 ◽

pp. 2462-2468 ◽

Cited By ~ 3

Author(s):

João Gama ◽

André C.P.L.F. de Carvalho

Keyword(s):

Machine Learning ◽

Language Processing ◽

Text Processing ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Background Information ◽

Future Research ◽

Personal View ◽

Learning Techniques ◽

Future Research Directions

Web-Based Machine Learning Application for Heart Disease Prediction

10.4018/978-1-7998-7709-7.ch022 ◽

2022 ◽

pp. 383-393

Author(s):

Lokesh M. Giripunje ◽

Tejas Prashant Sonar ◽

Rohit Shivaji Mali ◽

Jayant C. Modhave ◽

Mahesh B. Gaikwad

Keyword(s):

Machine Learning ◽

Heart Disease ◽

Random Forest ◽

Web Application ◽

Machine Learning Algorithms ◽

World Health ◽

Web Based ◽

The World ◽

Comprehensive Survey ◽

Health Organization

The risk posed by heart disease is increasing throughout the world. According to the World Health Organization report, the number of deaths caused by heart disease is increasing drastically compared to other diseases. Multiple factors are responsible for causing heart-related issues. Many approaches have been suggested for the prediction of heart disease, but none of them has been satisfactory in clinical terms. The available heart disease therapies and operations are costly, and so is follow-up care after treatment. This chapter provides a comprehensive survey of existing machine learning algorithms and compares them in terms of accuracy. The authors found that the random forest classifier is the most accurate model; hence, they use random forest for further processing. The machine learning model was deployed as a web application with the help of Flask, HTML, GitHub, and Heroku servers. The webpages take input attributes from users and give output regarding the patient's heart condition, with the accuracy of predicting coronary heart disease in the next 10 years.
