Incorporating Background Checks with Sentiment Analysis to Identify Violence Risky Chinese Microblogs

2019 ◽  
Vol 11 (9) ◽  
pp. 200
Author(s):  
Yun-Fei Jia ◽  
Shan Li ◽  
Renbiao Wu

With Web 2.0 technology, more and more people express their attitudes and opinions on the Internet. Radical ideas, rumors, terrorism, and violent content are also propagated online, causing several incidents of social panic every year in China. In fact, most of this content is joking or emotional catharsis, so detecting it with conventional techniques usually incurs a high false alarm rate. To address this problem, this paper introduces a technique that combines sentiment analysis with background checks. State-of-the-art sentiment analysis usually depends on training datasets in a specific topic area. Unfortunately, for some domains, such as violence risk speech detection, there is no definitive training data. In particular, topic-independent sentiment analysis of short Chinese text has rarely been reported in the literature. In this paper, the violence risk of Chinese microblogs is calculated from multiple perspectives. First, a lexicon-based method is used to retrieve violence-related microblogs, and then a similarity-based method is used to extract sentiment words. Semantic rules and emoticons are employed to obtain the sentiment polarity and sentiment strength of short texts. Second, the activity risk is calculated from the characteristics of the part-of-speech (PoS) sequence and from semantic rules, and a threshold is then set to capture the key users. Finally, the risk is confirmed by the historical posts of the key users and the opinions of their friend circles. The experimental results show that the proposed approach outperforms the support vector machine (SVM) method on a topic-independent corpus and can effectively reduce the false alarm rate.
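The lexicon, semantic-rule, and emoticon steps described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the English toy lexicon, the weights, the two-token modifier window, and the threshold are hypothetical and are not taken from the paper, which works on Chinese text with its own resources.

```python
# Hypothetical toy resources; the paper's actual lexicon and rules are not given here.
SENTIMENT_LEXICON = {"hate": -2.0, "kill": -3.0, "angry": -1.5, "love": 2.0, "happy": 1.5}
NEGATIONS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "really": 1.5, "slightly": 0.5}
EMOTICONS = {":)": 1.0, ":(": -1.0}


def score_text(text):
    """Return a signed sentiment strength for a short, whitespace-tokenized text."""
    tokens = text.lower().split()
    score = 0.0
    for i, tok in enumerate(tokens):
        if tok in EMOTICONS:
            # Emoticons contribute their own polarity directly.
            score += EMOTICONS[tok]
            continue
        if tok not in SENTIMENT_LEXICON:
            continue
        weight = SENTIMENT_LEXICON[tok]
        # Assumed semantic rule: modifiers in the two preceding tokens
        # flip (negation) or scale (intensifier) the word's weight.
        for prev in tokens[max(0, i - 2):i]:
            if prev in NEGATIONS:
                weight = -weight
            elif prev in INTENSIFIERS:
                weight *= INTENSIFIERS[prev]
        score += weight
    return score


def is_flagged(text, threshold=-2.0):
    """Flag a text whose score falls at or below an assumed risk threshold."""
    return score_text(text) <= threshold
```

In this sketch a negated hostile word ends up with a positive score, which mirrors the abstract's point that surface keywords alone produce false alarms.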

2018 ◽  
Vol 34 (3) ◽  
pp. 569-581 ◽  
Author(s):  
Sujata Rani ◽  
Parteek Kumar

In this article, an innovative approach to sentiment analysis (SA) is presented. The proposed system handles Romanized or abbreviated text and spelling variations when performing sentiment analysis. The training data set of 3,000 movie reviews and tweets was manually labeled by native speakers of Hindi in three classes, i.e. positive, negative, and neutral. The system uses the WEKA (Waikato Environment for Knowledge Analysis) tool to convert these string data into numerical matrices and applies three machine learning techniques, i.e. Naive Bayes (NB), J48, and support vector machine (SVM). The proposed system was tested on 100 movie reviews and tweets. SVM performed best among the classifiers, with an accuracy of 68% for movie reviews and 82% for tweets. The results of the proposed system are very promising and can be used in emerging applications like SA of product reviews and social media analysis. Additionally, the proposed system can be used for other cultural/social benefits like predicting/fighting human riots.


2019 ◽  
Vol 11 (2) ◽  
pp. 144
Author(s):  
Danar Wido Seno ◽  
Arief Wibowo

The growth of written content on social media produces many new words and abbreviations on Twitter, which makes it increasingly difficult for sentiment analysis to achieve high accuracy on Twitter text. In this study, the authors analyzed sentiment toward the pairs of candidates for President and Vice President of Indonesia in the 2019 elections. To obtain higher accuracy and to accommodate the evolving vocabulary on Twitter, the authors combined an unsupervised method, namely Lexicon Based, with a supervised method, Support Vector Machine. The study used Twitter data from October 2018, collected with search keywords containing the names of each pair of candidates, totaling 800 records. The best accuracy, 92.5%, was obtained with 80% of the data used for training and 20% for testing, with per-class precision between 85.7% and 97.2% and per-class recall between 78.2% and 93.5%. Because the Lexicon Based method labels the dataset, the Support Vector Machine training data no longer has to be labeled manually, and the lexicon dictionary can be extended as content on Twitter evolves.


2012 ◽  
Vol 2012 ◽  
pp. 1-10 ◽  
Author(s):  
S. Ganapathy ◽  
P. Yogesh ◽  
A. Kannan

Intrusion detection systems have been used in the past along with various techniques to detect intrusions in networks effectively. However, most of these systems detect intruders only with a high false alarm rate. In this paper, we propose a new intelligent agent-based intrusion detection model for mobile ad hoc networks using a combination of attribute selection, outlier detection, and enhanced multiclass SVM classification methods. For this purpose, an effective preprocessing technique is proposed that improves the detection accuracy and reduces the processing time. Moreover, two new algorithms, namely, an Intelligent Agent Weighted Distance Outlier Detection algorithm and an Intelligent Agent-based Enhanced Multiclass Support Vector Machine algorithm, are proposed for detecting intruders in a distributed database environment that uses intelligent agents for trust management and coordination in transaction processing. The experimental results show that the proposed model detects anomalies with a low false alarm rate and a high detection rate when tested on the KDD Cup 99 data set.


Author(s):  
Daniel Febrian Sengkey ◽  
Agustinus Jacobus ◽  
Fabian Johanes Manoppo

Support vector machine (SVM) is a well-known method for supervised learning in sentiment analysis, and there are many studies on the use of SVM for classifying sentiment in lecturer evaluations. SVM has various parameters that can be tuned and kernels that can be chosen to improve classifier accuracy. However, not all options have been explored. Therefore, in this study we compared four SVM kernels (radial, linear, polynomial, and sigmoid) to discover how each kernel influences the accuracy of the classifier. To make a proper assessment, we used our labeled dataset of students' evaluations of their lecturers. The dataset was split into one part for training the classifier and another for testing the model. In addition, we used several different training:testing ratios, with the training proportion ranging from 0.5 to 0.95 in increments of 0.05. Because the dataset was split randomly, the splitting-training-testing process was repeated 1,000 times for each kernel and splitting ratio. At the end of the experiment, we therefore had 40,000 accuracy values. We then applied statistical methods to see whether the differences are significant. Based on the statistical test, we found that in this particular case the linear kernel has significantly higher accuracy than the other kernels. However, there is a tradeoff: the results become more varied as a higher proportion of the data is used for training.
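The size of this experimental grid follows from the numbers in the abstract: 4 kernels, 10 split ratios (0.5 to 0.95 in steps of 0.05), and 1,000 repetitions give 40,000 accuracy values. The sketch below only reproduces that bookkeeping and a random split; the function names are hypothetical, and the SVM training and scoring step is deliberately left out rather than guessed at.

```python
import random

KERNELS = ["radial", "linear", "polynomial", "sigmoid"]
# 0.50, 0.55, ..., 0.95, built from integers to avoid floating-point drift.
SPLIT_RATIOS = [pct / 100 for pct in range(50, 100, 5)]
REPEATS = 1000


def random_split(items, train_ratio, rng):
    """Shuffle a copy of items and cut it into (train, test) parts."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


def count_runs():
    """Total number of split-train-test runs in the full grid."""
    return len(KERNELS) * len(SPLIT_RATIOS) * REPEATS
```

A full replication would loop over `KERNELS`, `SPLIT_RATIOS`, and `range(REPEATS)`, call `random_split` with a fresh shuffle each time, and record one accuracy per run.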


Social media content on the Internet is increasing day by day. Since this knowledge helps people make decisions, web-based businesses give their customers an opportunity to express opinions about items available on the web in the form of surveys and reviews. Sentiment analysis can be applied to product reviews, tweets, comments, and blogs to infer an individual's feelings or attitudes. Here, Aspect Based Sentiment Analysis is used to extract the most interesting aspects of a particular product from unlabeled text. We have developed two models for aspect/feature extraction. Model 1 uses POS tagging, whereas Model 2 uses TF-IDF. In Model 1, we start with a noun phrase algorithm and extend it to adjectives and adverbs to extract all the aspect terms. In Model 2, the TF-IDF technique is applied after data preprocessing. The relative importance of the aspects is calculated, and the most important positive, negative, and neutral aspects are presented to the user. Naïve Bayes, Support Vector Machine, Decision Tree, and KNN were used to classify the sentiment polarity of the generated aspects.
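As a rough illustration of the TF-IDF weighting mentioned for the second model, the sketch below uses one common unsmoothed formulation, term frequency times log(N / document frequency). The abstract does not say which variant or preprocessing the authors used, so the formula, the whitespace tokenizer, and the function name are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores
```

Under this formulation a term that occurs in every review gets weight zero, so the highest-weighted terms in a review are candidates for its distinctive aspects.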


Author(s):  
Tameem Ahmad ◽  
Mohd Asad Anwar ◽  
Misbahul Haque

This chapter proposes a hybrid classifier technique for a network Intrusion Detection System that combines the Random Forest classification technique with K-Means and Gaussian Mixture clustering algorithms. Random Forest builds patterns of intrusion from training data for misuse detection, while anomalous intrusions are identified by the outlier-detection mechanism. The implementation and simulation of the proposed method are carried out under varying threshold values. The effectiveness of the proposed method is evaluated using metrics such as precision, recall, accuracy rate, false alarm rate, and detection rate. Various existing algorithms are analyzed extensively. It is observed experimentally that the proposed method gives superior results compared to existing simpler classifiers as well as existing hybrid classifier techniques. The proposed hybrid classifier technique outperforms other common existing classifiers with an accuracy of 99.84%, a false alarm rate of 0.09%, and a detection rate of 99.7%.
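The metrics named in this abstract are usually derived from confusion-matrix counts. The sketch below uses the conventional definitions from the intrusion-detection literature (detection rate as recall, false alarm rate as false positives over all negatives); the chapter's exact definitions are not given here, so treat these formulas and the function name as assumptions.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute common detection metrics from confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Return 0.0 instead of failing when a class is empty.
        return numerator / denominator if denominator else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),            # also called detection rate
        "detection_rate": ratio(tp, tp + fn),
        "false_alarm_rate": ratio(fp, fp + tn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }
```

For example, 90 detected attacks, 10 missed attacks, 5 false alarms, and 95 correctly passed normal records give a detection rate of 0.9 and a false alarm rate of 0.05 under these definitions.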


2012 ◽  
Vol 19 (4) ◽  
pp. 753-761 ◽  
Author(s):  
Yanlong Cao ◽  
Yuanfeng He ◽  
Huawen Zheng ◽  
Jiangxin Yang

To reduce the false alarm rate and missed detection rate of a Loose Parts Monitoring System (LPMS) for nuclear power plants, a new hybrid method that combines Linear Predictive Coding (LPC) and Support Vector Machine (SVM) to discriminate the loose part signal is proposed. The alarm process is divided into two stages. The first stage detects the weak burst signal in order to reduce the missed detection rate. The signal is whitened to improve the SNR, and the weak burst signal is then detected by checking the short-term Root Mean Square (RMS) of the whitened signal. The second stage identifies the detected burst signal in order to reduce the false alarm rate. Taking the signal's LPC coefficients as its characteristics, SVM is used to determine whether the signal was generated by the impact of a loose part. The experiment shows that whitening the signal in the first stage makes it possible to detect a loose part burst signal even at very low SNR and thus significantly reduces the rate of missed detection. In the second alarm stage, the loose parts' burst signal can be distinguished from pulse disturbance by using SVM. Even when the SNR is −15 dB, the system can still achieve a 100% recognition rate.
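The first-stage check on short-term RMS can be sketched as follows. This is only an illustration under stated assumptions: the whitening step is omitted, the windows are non-overlapping, and the decision rule (RMS above a multiple of the median window RMS) is a hypothetical stand-in, since the abstract does not state the actual threshold.

```python
import math


def short_term_rms(signal, window):
    """RMS of consecutive non-overlapping windows of the given length."""
    return [
        math.sqrt(sum(x * x for x in signal[start:start + window]) / window)
        for start in range(0, len(signal) - window + 1, window)
    ]


def detect_bursts(signal, window, factor=3.0):
    """Return indices of windows whose RMS exceeds factor * median RMS.

    The median-based background estimate and the factor are assumptions.
    """
    rms = short_term_rms(signal, window)
    if not rms:
        return []
    baseline = sorted(rms)[len(rms) // 2]  # median (upper median if even)
    return [i for i, value in enumerate(rms) if value > factor * baseline]
```

Windows flagged here would then be passed to the second stage, where the abstract describes classifying LPC coefficients with an SVM.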


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Cach N. Dang ◽  
María N. Moreno-García ◽  
Fernando De la Prieta

Sentiment analysis of public opinion expressed in social networks, such as Twitter or Facebook, has been developed into a wide range of applications, but there are still many challenges to be addressed. Hybrid techniques have been shown to be promising models for reducing sentiment errors on increasingly complex training data. This paper aims to test the reliability of several hybrid techniques on various datasets from different domains. Our research questions aim to determine whether it is possible to produce hybrid models that outperform single models across different domains and types of datasets. Hybrid deep sentiment analysis learning models that combine long short-term memory (LSTM) networks, convolutional neural networks (CNN), and support vector machines (SVM) are built and tested on eight textual tweet and review datasets from different domains. The hybrid models are compared against three single models: SVM, LSTM, and CNN. Both reliability and computation time were considered in the evaluation of each technique. The hybrid models increased the accuracy of sentiment analysis compared with single models on all types of datasets, especially the combination of deep learning models with SVM. The reliability of the latter was significantly higher.


2018 ◽  
Vol 18 (01) ◽  
pp. e05 ◽  
Author(s):  
John Adedapo Ojo ◽  
Jamiu Alabi Oladosu

Video-based fire detection (VFD) technologies have recently received significant attention from both academic and industrial communities. However, existing VFD approaches are still susceptible to false alarms due to changes in illumination, camera noise, variability of shape, motion, colour, irregular patterns of smoke and flames, and modelling and training inaccuracies. Hence, this work aimed at developing a VSD system with a high detection rate, a low false-alarm rate, and a short response time. Moving blocks in video frames were segmented and analysed in HSI colour space, and wavelet energy analysis of the smoke candidate blocks was performed. In addition, dynamic texture descriptors were obtained using Weber Local Descriptor in Three Orthogonal Planes (WLD-TOP). These features were combined and used as inputs to a Support Vector Classifier with a radial basis kernel function, while the post-processing stage employs temporal image filtering to reduce false alarms. The algorithm was implemented in MATLAB 8.1.0.604 (R2013a). An accuracy of 99.30%, a detection rate of 99.28%, and a false alarm rate of 0.65% were obtained when tested with some online videos. The output of this work would find applications in early fire detection systems and in other areas such as robot vision and automated inspection.

