Detecting Duplicate Bug Report Using Character N-Gram-Based Features

Author(s):  
Ashish Sureka ◽  
Pankaj Jalote
Author(s):  
Pannavat Terdchanakul ◽  
Hideaki Hata ◽  
Passakorn Phannachitta ◽  
Kenichi Matsumoto

Sensors ◽  
2019 ◽  
Vol 19 (13) ◽  
pp. 2964 ◽  
Author(s):  
Ashima Kukkar ◽  
Rajni Mohana ◽  
Anand Nayyar ◽  
Jeamin Kim ◽  
Byeong-Gwon Kang ◽  
...  

Accurate severity classification of bug reports is an important part of bug fixing. Bug reports are submitted to bug tracking systems at a high rate, so bug repositories grow rapidly, and this growth introduces biases into the bug triage process. Classifying the severity of each bug report is therefore necessary to balance triage. Many machine learning models have been proposed to automate bug severity classification, but their accuracy falls short because they do not extract the important feature patterns needed to train the classifier. To address these challenges, this paper proposes a novel deep learning model for multiclass severity classification, called Bug Severity classification using a Convolutional Neural Network and Random forest with Boosting (BCR). The model directly learns latent and highly representative features. First, natural language processing techniques preprocess the bug report text, and n-grams are used to extract features. Next, the Convolutional Neural Network extracts the important feature patterns of the respective severity classes. Finally, the random forest with boosting classifies the multiple bug severity classes. The proposed model achieves an average accuracy of 96.34% on multiclass severity classification across five open source projects. On binary severity classification, the average F-measures of the proposed BCR and the existing approach were 96.43% and 84.24%, respectively. The results indicate that the proposed BCR approach improves bug severity classification over state-of-the-art techniques.
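The preprocessing and n-gram steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the tokenization rule, the use of word-level unigrams and bigrams, and the function names `preprocess`, `word_ngrams`, and `ngram_counts` are all assumptions made for illustration, and the CNN and boosted random forest stages are omitted.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; the paper's list is not given.
STOP_WORDS = {"the", "a", "an", "is", "on", "when", "to", "of", "in"}


def preprocess(text):
    """Lowercase the text, keep alphanumeric tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def word_ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, max_n=2):
    """Count all 1-grams up to max_n-grams in one bug report."""
    tokens = preprocess(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(word_ngrams(tokens, n))
    return counts


# Made-up bug report text, used only to exercise the functions.
report = "Application crashes when saving a file; crashes on startup too"
features = ngram_counts(report)
```

Each report becomes a sparse count vector keyed by n-gram; in a pipeline like the one described, such vectors would be the input to the later feature-learning and classification stages.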


Author(s):  
Vitaly Kuznetsov ◽  
Hank Liao ◽  
Mehryar Mohri ◽  
Michael Riley ◽  
Brian Roark

2020 ◽  
Author(s):  
Grant P. Strimel ◽  
Ariya Rastrow ◽  
Gautam Tiwari ◽  
Adrien Piérard ◽  
Jon Webb

2019 ◽  
Vol 1193 ◽  
pp. 012032
Author(s):  
D Purwantoro ◽  
H Akbar ◽  
A Hidayati ◽  
Sfenrianto

2020 ◽  
Vol 12 (1) ◽  
pp. 1-24 ◽  
Author(s):  
Al Hafiz Akbar Maulana Siagian ◽  
Masayoshi Aritsugi

2021 ◽  
pp. 1-14
Author(s):  
Hamed Zargari ◽  
Morteza Zahedi ◽  
Marziea Rahimi

Words are among the most essential elements for expressing sentiment in context, but they are not the only ones: syntactic relationships between words, morphology, punctuation, and other linguistic phenomena are also influential. Treating words as isolated phenomena causes many mistakes in sentiment analysis systems. So far, a large body of research has focused on generating sentiment dictionaries that contain only sentiment words. Some of these dictionaries address combinations of sentiment words, negators, and intensifiers, but almost none considers the heterogeneous effect of multiple linguistic phenomena occurring together in sentiment compounds. To address this weakness, specifically the heterogeneous effect of multiple intensifiers, this research presents a sentiment dictionary based on the analysis of sentiment compounds comprising sentiment words, negators, and intensifiers. It considers multiple intensifiers relative to the sentiment word and assigns a location-based coefficient to each intensifier, which increases the number of sentiment phrases covered by the dictionary and improves the efficiency of the proposed dictionary-based sentiment analysis methods by up to 7% compared with the latest methods.
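One way to picture a location-based coefficient for intensifiers is the sketch below. It is purely illustrative and does not reproduce the paper's dictionary: the lexicon entries, the numeric strengths, the `1 / distance` decay, and the names `location_coefficient` and `score_compound` are all assumptions.

```python
# All lexicon entries and numeric values are made up for illustration.
SENTIMENT = {"good": 1.0, "bad": -1.0}
INTENSIFIERS = {"very": 0.5, "really": 0.4, "slightly": -0.3}
NEGATORS = {"not"}


def location_coefficient(distance):
    """Assumed decay: an intensifier farther from the sentiment word counts less."""
    return 1.0 / distance


def score_compound(tokens):
    """Score a phrase whose last token is the sentiment word."""
    *modifiers, head = tokens
    score = SENTIMENT[head]
    negated = False
    # Walk the modifiers from nearest to farthest from the sentiment word.
    for distance, word in enumerate(reversed(modifiers), start=1):
        if word in INTENSIFIERS:
            score *= 1.0 + INTENSIFIERS[word] * location_coefficient(distance)
        elif word in NEGATORS:
            negated = not negated
    return -score if negated else score


single = score_compound(["very", "good"])
double = score_compound(["really", "very", "good"])
negated = score_compound(["not", "very", "good"])
```

Under these assumed values, a second intensifier still strengthens the phrase but contributes less than it would directly next to the sentiment word, which is the kind of position-dependent effect the abstract refers to.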

