Stopping duplicate bug reports before they start with Continuous Querying for bug reports

Author(s):  
Abram Hindle

Bug deduplication is a hot topic in software engineering information retrieval research, but it is often not deployed. Typically, to de-duplicate bug reports, developers rely upon the search capabilities of the bug report software they employ, such as Bugzilla, Jira, or GitHub Issues. These search capabilities range from simple SQL string search to the IR-based word indexing methods employed by search engines. Yet too often these searches do very little to stop the creation of duplicate bug reports: some bug trackers have more than 10% of their bug reports marked as duplicates. Perhaps these bug tracker search engines are not enough? In this paper we propose a method of attempting to prevent duplicate bug reports before they start: continuous querying. That is, as bug reporters type in their bug report, their text is used to query the bug database to find duplicate or related bug reports. This continuous querying allows the reporter to be alerted to duplicate bug reports as they report the bug, rather than having to formulate queries to find the duplicate bug report. Thus this work ushers in a new way of evaluating bug report deduplication techniques, as well as a new kind of bug deduplication task. We show that simple IR measures show some promise for addressing this problem, but also that further research is needed to refine this novel process, which can be integrated into modern bug report systems.
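The continuous querying idea described above can be illustrated with a minimal sketch. It assumes a small in-memory TF-IDF index with cosine similarity as the "simple IR measure"; the abstract does not name the measure actually used, and `BugIndex`, `continuous_query`, the `step` parameter, and the sample reports are all hypothetical names and data chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class BugIndex:
    """Tiny in-memory TF-IDF index over existing bug reports (hypothetical helper)."""

    def __init__(self, reports):
        # reports: dict mapping bug id -> report text
        docs = {bug_id: Counter(tokenize(text)) for bug_id, text in reports.items()}
        n = len(docs)
        df = Counter()
        for counts in docs.values():
            df.update(counts.keys())
        # Smoothed inverse document frequency, always positive
        self.idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
        self.vectors = {bug_id: self._weigh(c) for bug_id, c in docs.items()}

    def _weigh(self, counts):
        # Build a unit-length TF-IDF vector; terms unknown to the index are dropped
        vec = {t: tf * self.idf[t] for t, tf in counts.items() if t in self.idf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def query(self, text, k=3):
        """Return up to k (bug_id, cosine score) pairs with a non-zero score."""
        q = self._weigh(Counter(tokenize(text)))
        scored = []
        for bug_id, vec in self.vectors.items():
            score = sum(w * vec.get(t, 0.0) for t, w in q.items())
            if score > 0:
                scored.append((bug_id, score))
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        return scored[:k]


def continuous_query(index, draft, step=3, k=3):
    """Re-query the index each time the draft grows by `step` words."""
    words = draft.split()
    for end in range(step, len(words) + step, step):
        prefix = " ".join(words[:end])
        yield min(end, len(words)), index.query(prefix, k)


# Hypothetical existing reports and a draft being typed by a reporter
reports = {
    101: "Browser crashes when opening a PDF attachment",
    102: "Toolbar icons are blurry on high DPI displays",
    103: "Crash on startup after update",
}
index = BugIndex(reports)
draft = "The browser crashes every time I open a PDF attachment from mail"
for typed, candidates in continuous_query(index, draft):
    print(typed, candidates)
```

In a real tracker the re-query would be triggered by keystrokes or a timer in the report form rather than by word count, and the index would be the tracker's own search backend. The sketch only shows that candidate duplicates can surface from a partial report, before the reporter has finished typing.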

2016 ◽  
Author(s):  
Abram Hindle



2016 ◽  
Vol 29 (3) ◽  
pp. e1821 ◽  
Author(s):  
Karan Aggarwal ◽  
Finbarr Timbers ◽  
Tanner Rutgers ◽  
Abram Hindle ◽  
Eleni Stroulia ◽  
...  

Author(s):  
Marzie Rahmati ◽  
Mohammad Ali Zare Chahooki

Bug localization uses bug reports received from users, developers and testers to locate buggy files. Since finding a buggy file among thousands of files is time-consuming and tedious for developers, various methods based on information retrieval have been suggested to automate this process. In addition to information retrieval methods, machine learning methods are also used for bug localization. Machine learning-based approaches improve the description of bug reports and program code by representing them as feature vectors. Hypothesis learning with the Extreme Learning Machine (ELM) has recently been effective in many areas. This paper shows the effectiveness of the non-linear kernel of ELM in bug localization. Furthermore, the effectiveness of different kernels in ELM compared to other kernel-based learning methods is analyzed. The experimental results for hypothesis evaluation on the Mozilla Firefox dataset show the effectiveness of Kernel ELM for bug localization in software projects.
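For orientation, the kernel form of ELM is usually written as follows in the general ELM literature. The abstract does not state the formulation the paper uses, so this is an assumed standard form: $\mathbf{x}_1, \dots, \mathbf{x}_N$ are the training feature vectors, $\mathbf{T}$ the training targets, $C$ a regularization constant, and the RBF kernel with width parameter $\gamma$ is shown only as one example of a non-linear kernel.

```latex
f(\mathbf{x}) =
\begin{bmatrix} K(\mathbf{x}, \mathbf{x}_1) & \cdots & K(\mathbf{x}, \mathbf{x}_N) \end{bmatrix}
\left( \frac{\mathbf{I}}{C} + \boldsymbol{\Omega} \right)^{-1} \mathbf{T},
\qquad
\Omega_{ij} = K(\mathbf{x}_i, \mathbf{x}_j),
\qquad
K(\mathbf{u}, \mathbf{v}) = \exp\!\left( -\gamma \lVert \mathbf{u} - \mathbf{v} \rVert^{2} \right)
```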


2014 ◽  
Vol 5 (2) ◽  
pp. 54-60
Author(s):  
Ivransa Zuhdi Pane

Management of engineering activities based on information systems is expected to increase engineers' performance in executing their daily tasks. The software of such a management information system should be built on a platform which is easy to use and adaptable to the future dynamics of engineering activity management. Software engineering, consisting of analysis, design and implementation, was carried out to realize a prototype which is ready to be applied in further development stages. Index Terms - engineering activity, Engineering, information system, software engineering.


Author(s):  
Yu Zhou ◽  
Yanxiang Tong ◽  
Taolue Chen ◽  
Jin Han

Bug localization represents one of the most expensive, as well as time-consuming, activities during software maintenance and evolution. To alleviate the workload of developers, numerous methods have been proposed to automate this process and narrow down the scope of reviewing buggy files. In this paper, we present a novel buggy source-file localization approach, using the information from both the bug reports and the source files. We leverage the part-of-speech features of bug reports and the invocation relationship among source files. We also integrate an adaptive technique to further optimize the performance of the approach. The adaptive technique discriminates Top 1 and Top N recommendations for a given bug report and consists of two modules. One module maximizes the accuracy of the first recommended file, and the other aims at improving the accuracy of the fixed defect file list. We evaluate our approach on six large-scale open source projects, namely AspectJ, Eclipse, SWT, ZXing, Birt and Tomcat. Compared to the previous work, empirical results show that our approach can improve the overall prediction performance in all of these cases. In particular, in terms of the Top 1 recommendation accuracy, our approach achieves an enhancement from 22.73% to 39.86% for AspectJ, from 24.36% to 30.76% for Eclipse, from 31.63% to 46.94% for SWT, from 40% to 55% for ZXing, from 7.97% to 21.99% for Birt, and from 33.37% to 38.90% for Tomcat.
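The Top 1 and Top N figures above can be read against a common definition of Top N accuracy in bug localization work: the fraction of bug reports for which at least one truly buggy file appears among the first N recommended files. The abstract does not spell out its exact metric, so the sketch below assumes that definition; `top_n_accuracy` and the sample file names are hypothetical.

```python
def top_n_accuracy(rankings, buggy_files, n):
    """Fraction of bug reports with at least one truly buggy file in the top n.

    rankings:    dict mapping bug id -> list of files, best candidate first
    buggy_files: dict mapping bug id -> set of files actually fixed for that bug
    """
    if not rankings:
        return 0.0
    hits = 0
    for bug_id, ranked in rankings.items():
        # A report counts as a hit if any of its top-n files was really fixed
        if set(ranked[:n]) & buggy_files[bug_id]:
            hits += 1
    return hits / len(rankings)


# Hypothetical rankings for two bug reports
rankings = {
    "bug-1": ["Parser.java", "Lexer.java", "Main.java"],
    "bug-2": ["Cache.java", "Store.java", "Index.java"],
}
buggy_files = {
    "bug-1": {"Lexer.java"},
    "bug-2": {"Network.java"},
}
print(top_n_accuracy(rankings, buggy_files, 1))
print(top_n_accuracy(rankings, buggy_files, 2))
```

Under this definition Top 1 accuracy rewards only the first recommendation, which is why an approach may tune the first recommended file separately from the rest of the list, as the two modules of the adaptive technique do.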


Author(s):  
Bancha Luaphol ◽  
Jantima Polpinij ◽  
Manasawee Kaenampornpan

Most studies relating to bug reports aim to automatically identify the information in bug reports that is necessary for software bug fixing. Unfortunately, such studies focus on only one issue, whereas more complete and comprehensive software bug fixing would be facilitated by assessing multiple issues concurrently. This is the challenge addressed in this study, which aims to present a method of identifying severe bug reports in a bug report repository, together with assembling their related bug reports to visualize the overall picture of a software problem domain. The proposed method is called “mining bug report repositories”. Two text mining techniques are applied as the main mechanisms in this method. First, classification is applied to identify severe bug reports, called “bug severity classification”; then “threshold-based similarity analysis” is applied to assemble the bug reports that are related to a severe bug report. Our datasets are from three open-source projects, namely SeaMonkey, Firefox, and Core:Layout, downloaded from Bugzilla. Finally, the best models from the proposed method are selected and compared with two baseline methods. For identifying severe bug reports using the classification technique, the results show that our method improved accuracy, F1, and AUC scores over the baseline by 11.39, 11.63, and 19% respectively. Meanwhile, for assembling related bug reports using the threshold-based similarity technique, the results show that our method improved precision and likelihood scores over the other baseline by 15.76 and 9.14% respectively. This demonstrates that our proposed method may help increase the chance of fixing bugs completely.
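The threshold-based similarity step can be pictured with a short sketch. It assumes bag-of-words cosine similarity and an arbitrary cut-off of 0.3; the abstract names neither the similarity measure nor the threshold value, so both are placeholders, and `related_reports` and the sample reports are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase alphanumeric tokens in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def related_reports(severe_text, candidates, threshold=0.3):
    """Return (bug_id, score) for candidates at or above the threshold, best first."""
    target = bag_of_words(severe_text)
    related = []
    for bug_id, text in candidates.items():
        score = cosine(target, bag_of_words(text))
        if score >= threshold:
            related.append((bug_id, score))
    return sorted(related, key=lambda pair: -pair[1])


# Hypothetical severe report and candidate reports from the same repository
severe = "layout engine crashes on nested tables"
candidates = {
    1: "crashes in layout engine with nested tables",
    2: "update translation strings for preferences dialog",
}
print(related_reports(severe, candidates))
```

Raising the threshold trades recall for precision: fewer reports are assembled around each severe report, but those that remain are more likely to be genuinely related.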

