A Hybrid System for Emotion Extraction from Suicide Notes

2012
Vol 5s1
pp. BII.S8981
Author(s):
Azadeh Nikfarjam
Ehsan Emadzadeh
Graciela Gonzalez

The reasons that drive someone to commit suicide are complex, and their study has attracted the attention of scientists in different domains. Analyzing this phenomenon could significantly improve preventive efforts. In this paper we present a method for sentiment analysis of suicide notes submitted to the i2b2/VA/Cincinnati Shared Task 2011. In this task the sentences of 900 suicide notes were labeled with the possible emotions that they reflect. To label the sentences with emotions, we propose a hybrid approach which utilizes both rule-based and machine learning techniques. To solve the multi-class problem, a rule-based engine and an SVM model are used for each category. A set of syntactic and semantic features is selected for each sentence to build the rules and train the classifier. The rules are generated manually based on a set of lexical and emotional clues. We propose a new approach to extract the sentence's clauses and constitutive grammatical elements and to use them in syntactic and semantic feature generation. The approach uses a novel method to measure the polarity of the sentence based on the extracted grammatical elements, reaching a precision of 41.79% with a recall of 55.03%, for an f-measure of 47.50%. The overall mean f-measure of all submissions was 48.75% with a standard deviation of 7%.
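
The reported f-measure can be checked against the reported precision and recall. The sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not state the weighting, and the function name `f_measure` is only illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
reported_f = round(f_measure(41.79, 55.03), 2)
print(reported_f)
```

Under that assumption the result agrees with the 47.50 given in the abstract, which suggests the three figures are mutually consistent.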

2016
Vol 42 (6)
pp. 782-797
Author(s):
Haifa K. Aldayel
Aqil M. Azmi

The fact that people freely express their opinions and ideas in no more than 140 characters makes Twitter one of the most prevalent social networking websites in the world. Because Twitter is popular in Saudi Arabia, we believe that tweets are a good source for capturing the public’s sentiment, especially since the country is in a fractious region. After going over the challenges and difficulties that Arabic tweets present, using Saudi Arabia as a basis, we propose our solution. A typical problem is the practice of tweeting in dialectal Arabic. Based on our observations, we recommend a hybrid approach that combines semantic orientation and machine learning techniques. In this approach, the lexical-based classifier labels the training data, a time-consuming task that is often done manually. The output of the lexical classifier is then used as training data for the SVM machine learning classifier. The experiments show that our hybrid approach improved the F-measure of the lexical classifier by 5.76% while the accuracy jumped by 16.41%, achieving an overall F-measure and accuracy of 84% and 84.01%, respectively.
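
The two-stage idea described above (a lexicon-based classifier labels raw tweets, and its output trains an SVM) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the lexicon and tweets are made-up English stand-ins, the SVM is a linear one without a bias term trained by plain subgradient descent on the hinge loss, and all names and hyperparameters (`LEXICON`, `lexicon_label`, `train_linear_svm`, `lr`, `reg`) are hypothetical.

```python
from collections import Counter

# Hypothetical toy polarity lexicon (English stand-ins, not from the paper).
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "awful": -1, "hate": -1}


def lexicon_label(tweet):
    """Semantic-orientation step: sum word polarities; None if the sum is zero."""
    score = sum(LEXICON.get(word, 0) for word in tweet.lower().split())
    if score == 0:
        return None
    return 1 if score > 0 else -1


def train_linear_svm(samples, epochs=20, lr=0.1, reg=0.01):
    """Linear SVM (no bias term) fitted by subgradient descent on the
    L2-regularized hinge loss over bag-of-words counts."""
    weights = {}
    for _ in range(epochs):
        for text, label in samples:
            feats = Counter(text.lower().split())
            margin = label * sum(weights.get(w, 0.0) * c for w, c in feats.items())
            # Gradient step for the regularizer: shrink every weight.
            for w in weights:
                weights[w] *= 1 - lr * reg
            # Gradient step for the hinge loss, active only inside the margin.
            if margin < 1:
                for w, c in feats.items():
                    weights[w] = weights.get(w, 0.0) + lr * label * c
    return weights


def predict(weights, text):
    """Classify by the sign of the linear score."""
    score = sum(weights.get(w, 0.0) for w in text.lower().split())
    return 1 if score > 0 else -1


# Made-up unlabeled tweets; the lexicon supplies the training labels.
tweets = [
    "great service today",
    "love new metro",
    "awful traffic again",
    "hate long delays",
    "meeting at noon",  # no lexicon hit, so it is left out of training
]
labeled = [(t, lexicon_label(t)) for t in tweets if lexicon_label(t) is not None]
weights = train_linear_svm(labeled)

print(predict(weights, "traffic delays"))
print(predict(weights, "new metro service"))
```

The point of the second stage is visible in the last two lines: neither test tweet contains a lexicon word, so the lexicon step alone could not label them, while the trained classifier can use words that co-occurred with lexicon entries in the automatically labeled data.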


2020
Vol 54 (4)
pp. 1161-1181
Author(s):
Paolo Omero
Massimiliano Valotto
Riccardo Bellana
Ramona Bongelli
Ilaria Riccioni
...  

In a previous study, we manually identified seven categories (verbs, non-verbs, modal verbs in the simple present, modal verbs in the conditional mood, if, uncertain questions, and epistemic future) of Uncertainty Markers (UMs) in a corpus of 80 articles from the British Medical Journal randomly sampled from a 167-year period (1840–2007). The UMs detected on the basis of an epistemic stance approach were those referring only to the authors of the articles and only in the present. We also performed preliminary experiments to assess the manually annotated corpus and to establish a baseline for the automatic detection of UMs. The results of the experiments showed that most UMs could be recognized with good accuracy, except for the if-category, which includes four subcategories: if-clauses in a narrow sense; if-less clauses; as if/as though; and if and whether introducing embedded questions. The unsatisfactory results concerning the if-category were probably due to both its complexity and the inadequacy of the detection rules, which were only lexical, not grammatical. In the current article, we describe a different approach, which combines grammatical and syntactic rules. The experiments performed show that the identification of uncertainty in the if-category has improved substantially, roughly twofold, compared to our previous results. The complex overall process of uncertainty detection can greatly profit from a hybrid approach that combines supervised machine learning techniques with a knowledge-based approach, constituted by a rule-based inference engine devoted to the if-clause case and designed on the basis of the above-mentioned epistemic stance approach.


Author(s):  
Padmavathi .S ◽  
M. Chidambaram

Text classification has become more significant in managing and organizing text data due to the tremendous growth of online information. It classifies documents into a fixed number of predefined categories. The rule-based approach and the machine learning approach are the two ways of performing text classification. In the rule-based approach, documents are classified according to manually defined rules. In the machine learning approach, the classification rules or the classifier are defined automatically using example documents. This approach has higher recall and is a quick process. This paper presents an investigation of text classification using different machine learning techniques.


2021
Vol 63 (12)
pp. 1104-1111
Author(s):
Furkan Sarsilmaz
Gürkan Kavuran

In this work, a pair of dissimilar AA2024/AA7075 plates was experimentally welded in order to consider the effect of friction-stir welding (FSW) parameters on mechanical properties. First, the main mechanical properties of the welded joints, such as ultimate tensile strength (UTS) and hardness, were determined experimentally. Second, these data were used to model and optimize the FSW process, and to find an optimal parametric combination for tensile strength and hardness, using a support vector machine (SVM) and an artificial neural network (ANN). In this study, a new ANN model incorporating the Nelder-Mead algorithm was used for the first time in the FSW process and compared with the SVM model. It was concluded that the ANN approach works better than SVM techniques. The validity and accuracy of the proposed method were confirmed by simulation studies.


2012
Vol 4 (2)
pp. 32-59
Author(s):
K. K. Chaturvedi
V.B. Singh

Bug severity is the degree of impact that a defect has on the development or operation of a component or system, and it can be classified into different levels based on that impact. Identifying the severity level can help the bug triager allocate the bug to the appropriate bug fixer. Various researchers have applied text mining techniques to predicting the severity of bugs, detecting duplicate bug reports, and assigning bugs to a suitable fixer. In this paper, an attempt has been made to compare the performance of different machine learning techniques, namely Support Vector Machine (SVM), probability-based Naïve Bayes (NB), decision-tree-based J48 (a Java implementation of C4.5), rule-based Repeated Incremental Pruning to Produce Error Reduction (RIPPER), and Random Forests (RF) learners, in predicting the severity level (1 to 5) of a reported bug by analyzing the summary or short description of the bug report. The bug report data have been taken from NASA’s PITS (Projects and Issue Tracking System) datasets as closed-source projects and from components of the Eclipse, Mozilla, and GNOME datasets as open-source projects. The analysis has been carried out in the RapidMiner and STATISTICA data mining tools. The authors measured the performance of the different machine learning techniques by considering (i) the values of accuracy and F-measure for all severity levels and (ii) the number of best cases at different threshold levels of accuracy and F-measure.
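
Measure (i) above, accuracy together with an F-measure for each severity level, can be computed as in the following sketch. It assumes the balanced F-measure computed one-vs-rest per level; the severity labels are made-up illustration data rather than values from the PITS or open-source datasets, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of bug reports whose predicted severity equals the true one."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_level_f_measure(y_true, y_pred, levels):
    """One-vs-rest balanced F-measure for each severity level."""
    scores = {}
    for level in levels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == level and p == level for t, p in pairs)
        fp = sum(t != level and p == level for t, p in pairs)
        fn = sum(t == level and p != level for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores[level] = 2 * precision * recall / total if total else 0.0
    return scores


# Made-up true and predicted severity levels for six bug reports.
y_true = [1, 2, 2, 3, 3, 3]
y_pred = [1, 2, 3, 3, 3, 2]

acc = accuracy(y_true, y_pred)
f_by_level = per_level_f_measure(y_true, y_pred, levels=range(1, 6))
print(acc, f_by_level)
```

Levels that never occur in either list (4 and 5 here) get an F-measure of 0.0 by convention in this sketch; a real evaluation would need to decide how to treat such empty classes.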


Author(s):  
Niddal Imam
Biju Issac
Seibu Mary Jacob

Twitter has changed the way people get information by allowing them to express their opinions and comments on daily tweets. Unfortunately, due to the high popularity of Twitter, it has become very attractive to spammers. Unlike other types of spam, Twitter spam has become a serious issue in the last few years. The large number of users and the large amount of information being shared on Twitter play an important role in accelerating the spread of spam. In order to protect users, Twitter and the research community have been developing different spam detection systems by applying different machine learning techniques. However, a recent study showed that current machine learning-based detection systems are not able to detect spam accurately because the characteristics of spam tweets vary over time. This issue is called “Twitter Spam Drift”. In this paper, a semi-supervised learning approach (SSLA) is proposed to tackle this issue. The new approach uses unlabeled data to learn the structure of the domain. Different experiments were performed on English and Arabic datasets to test and evaluate the proposed approach, and the results show that the proposed SSLA can reduce the effect of Twitter spam drift and outperform existing techniques.


Author(s):  
Zhao Zhang
Yun Yuan
Xianfeng (Terry) Yang

Accurate and timely estimation of freeway traffic speeds by short segments plays an important role in traffic monitoring systems. In the literature, the ability of machine learning techniques to capture the stochastic characteristics of traffic has been demonstrated. In addition, the deployment of intelligent transportation systems (ITSs) has provided enriched traffic data, which enables the adoption of a variety of machine learning methods to estimate freeway traffic speeds. However, limitations in data quality and coverage remain a major challenge in current traffic monitoring systems. To overcome this problem, this study aims to develop a hybrid machine learning approach that creates a new training variable based on the second-order traffic flow model in order to improve the accuracy of traffic speed estimation. Grounded on a novel integrated framework, the estimation is performed using three machine learning techniques: Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Artificial Neural Network (ANN). All three models are trained with the integrated dataset, which includes the traffic flow model estimates and the iPeMS and PeMS data from the Utah Department of Transportation (DOT). Using the PeMS data as the ground truth for model evaluation, comparisons between the hybrid approach and pure machine learning models show that the hybrid approach can effectively capture the time-varying pattern of the traffic and help improve the estimation accuracy.


2017
Vol 48 (5)
pp. 705-713
Author(s):
G. Perna
M. Grassi
D. Caldirola
C. B. Nemeroff

Personalized medicine (PM) aims to establish a new approach to clinical decision-making, based on a patient's individual profile, in order to tailor treatment to each patient's characteristics. Although this has also become a focus of discussion in the psychiatric field, with evidence of its high potential coming from several proof-of-concept studies, almost no tools have yet been developed that are ready to be applied in clinical practice. In this paper, we discuss recent technological advances that can enable a shift toward clinical application of the PM paradigm. We focus specifically on those technologies that allow both the collection of massive as well as real-time data, i.e., electronic medical records and smart wearable devices, and the generation of relevant predictions from these data, i.e., the application of machine learning techniques.

