Automatically Identifying the Quality of Developer Chats for Post Hoc Use

2021 ◽  
Vol 30 (4) ◽  
pp. 1-28
Author(s):  
Preetha Chatterjee ◽  
Kostadin Damevski ◽  
Nicholas A. Kraft ◽  
Lori Pollock

Software engineers are crowdsourcing answers to their everyday challenges on Q&A forums (e.g., Stack Overflow) and, more recently, in public chat communities such as Slack, IRC, and Gitter. Many software-related chat conversations contain valuable expert knowledge that is useful both for mining to improve programming support tools and for readers who did not participate in the original conversations. However, most chat platforms and communities do not contain built-in quality indicators (e.g., accepted answers, vote counts). Therefore, it is difficult to identify conversations that contain useful information for mining or reading, i.e., conversations of post hoc quality. In this article, we investigate automatically detecting developer conversations of post hoc quality from public chat channels. We first describe an analysis of 400 developer conversations that indicates potential characteristics of post hoc quality, followed by a machine learning-based approach for automatically identifying conversations of post hoc quality. Our evaluation on 2,000 annotated Slack conversations in four programming communities (python, clojure, elm, and racket) indicates that our approach can achieve a precision of 0.82, recall of 0.90, F-measure of 0.86, and MCC of 0.57. To our knowledge, this is the first automated technique for detecting developer conversations of post hoc quality.
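
As a rough illustration of the classification task, here is a minimal Python sketch; the TF-IDF features, logistic regression model, and toy data are illustrative stand-ins, since the abstract does not detail the paper's actual feature set or learner.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score, f1_score, matthews_corrcoef

# Toy stand-in for an annotated corpus; label 1 marks conversations judged
# to have post hoc quality (useful to later readers or to mining tools).
conversations = [
    "you need to pin the transitive dependency, here is the fix ...",
    "the GC pause comes from large object heap fragmentation ...",
    "use a context manager so the file handle is always closed ...",
    "profile first; the regex backtracking is the real bottleneck ...",
    "good morning everyone", "anyone around?", "lol same", "brb coffee",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

X = TfidfVectorizer().fit_transform(conversations)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, stratify=labels, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
pred = clf.predict(X_test)

# The four metrics reported in the evaluation above.
for name, metric in [("precision", precision_score), ("recall", recall_score),
                     ("F-measure", f1_score), ("MCC", matthews_corrcoef)]:
    print(name, round(metric(y_test, pred), 2))
```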

Author(s):  
Jelber Sayyad Shirabad ◽  
Timothy C. Lethbridge ◽  
Stan Matwin

This chapter presents the notion of relevance relations, an abstraction for representing relationships between software entities. Relevance relations map tuples of software entities to values that reflect how related the entities are to each other. Although these relationships lack clear definitions, software engineers can typically identify instances of them. We show how a classifier can model a relevance relation, and we present the process of creating such models using data mining and machine learning techniques. In a case study, we applied this process to a large legacy system; our system learned models of a relevance relation that predict whether a change in one file may require a change in another file. Our empirical evaluation shows that the predictive quality of such models makes them a viable choice for field deployment. We also show how, by assigning different misclassification costs, such models can be tuned to meet the user's precision and recall needs.
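
The cost-sensitive tuning the chapter describes can be sketched in Python with scikit-learn, where class weights stand in for misclassification costs; the file-pair features below are invented for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Each row describes a pair of files with hypothetical features: number of
# past co-changes, number of shared routine references, same-subsystem flag.
X = np.array([[12, 5, 1], [0, 0, 0], [7, 2, 1], [1, 0, 0], [9, 4, 1], [0, 1, 0]])
y = np.array([1, 0, 1, 0, 1, 0])  # 1 = changing one file likely requires changing the other

# Weighting the "relevant" class 5x penalizes missed co-change pairs more
# heavily, trading precision for recall, as the chapter's tuning does with costs.
clf = DecisionTreeClassifier(class_weight={0: 1, 1: 5}, random_state=0).fit(X, y)
print(clf.predict([[3, 1, 1]]))  # predict relevance for a new file pair
```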


2018 ◽  
Author(s):  
Daniel Cañueto ◽  
Miriam Navarro ◽  
Mónica Bulló ◽  
Xavier Correig ◽  
Nicolau Cañellas

The quality of automatic metabolite profiling in NMR datasets from complex matrices can be compromised by the multiple sources of variability in the samples. These sources cause uncertainty in the metabolite signal parameters and produce many low-intensity signals. Lineshape fitting approaches may yield suboptimal resolutions or distort the fitted signals to adapt them to the complex spectrum lineshape. As a result, tools tend to restrict their use to specific matrices and strict protocols to limit this uncertainty. However, analyzing and modelling the signal parameters collected during a first profiling iteration can further reduce the uncertainty by generating narrow, accurate predictions of the expected signal parameters. In this study, we show that these predictions enable better profiling quality indicators and maximize the performance of automatic profiling. Because our workflow learns and models the sample properties, it can overcome restrictions on the matrix or protocol and the limitations of lineshape fitting approaches.
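
The core idea, predicting a signal parameter from information gathered in a first profiling iteration, can be sketched as follows; the data and the linear model are synthetic assumptions, not the workflow's actual modelling step.

```python
import numpy as np

# Synthetic first-iteration results: the position of a pH-sensitive reference
# signal and of a target metabolite signal across 50 samples (ppm).
rng = np.random.default_rng(0)
ref_shift = rng.normal(4.05, 0.010, 50)
target_shift = 3.20 + 0.8 * (ref_shift - 4.05) + rng.normal(0, 0.002, 50)

# Model the target position from the reference, then use the residual spread
# to produce a narrow expected window for the next fitting iteration.
slope, intercept = np.polyfit(ref_shift, target_shift, 1)
sigma = np.std(target_shift - (slope * ref_shift + intercept))

new_ref = 4.06  # reference position observed in a new sample
print(f"expected target shift: {slope * new_ref + intercept:.4f} +/- {2 * sigma:.4f} ppm")
```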


2019 ◽  
pp. 3-8
Author(s):  
N.Yu. Bobrovskaya ◽  
M.F. Danilov

Criteria for the quality of coordinate measurements in pilot and experimental production are considered, based on contemporary quality management methods and on traditional metrological methods for assessing measurement quality. Measurement duration is proposed as an additional quality criterion. In analyzing the problem of assessing measurement quality, the authors pay particular attention to the role of technological heredity among the sources of uncertainty in coordinate measurements, covering not only the manufacturing of the part but all stages in the development of design and technological documentation. Along with criteria such as the degree of confidence in the measurement results and their accuracy, convergence, reproducibility, and speed, one must take into account the correctness of the technical specification and such form characteristics of the geometric elements to be controlled as flatness, roundness, and cylindricity. It is noted that one of the main ways to reduce the uncertainty of coordinate measurements is to reduce the uncertainty in the initial data and measurement conditions, and to increase the stability of the measurement tasks through a reasoned choice of the basic geometric elements (measurement datums) of the part. A prerequisite for obtaining reliable quality indicators is a quantitative assessment of the conditions and organization of the measurement process. To plan and normalize measurement time, the authors propose analytical formulas on the basis of which quality indicators, including measurement speed, can be quantitatively analyzed and optimized.
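
The abstract does not reproduce the authors' analytical formulas, so the sketch below only illustrates the kind of time-budget model such planning could rest on; the structure and parameter values are invented.

```python
# Hypothetical time budget for a coordinate measurement: fixed setup time
# plus a per-point cost for machine travel and probing. Not the authors'
# published formulas, which are not given in the abstract.
def measurement_time(n_points: int, t_setup: float = 300.0,
                     t_move: float = 4.0, t_probe: float = 2.5) -> float:
    """Total measurement duration in seconds."""
    return t_setup + n_points * (t_move + t_probe)

# More sampled points lower measurement uncertainty but raise duration,
# making speed a quality indicator to optimize against accuracy.
for n in (50, 100, 200):
    print(f"{n} points -> {measurement_time(n):.0f} s")
```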


2020 ◽  
Vol 29 (12) ◽  
pp. 52-58
Author(s):  
E.P. Meleshkina ◽  
S.N. Kolomiets ◽  
A.S. Cheskidova ◽  
...  

Indicators of the rheological properties of dough, determined objectively and reliably with an alveograph, were identified in order to create a future system for classifying wheat, and the flour milled from it, by intended purpose. The relationships between standardized quality indicators and newly developed indicators were analyzed to identify those that differentiate the quality of wheat flour by intended purpose, i.e., by finished product. Methods of mathematical statistics were used for this analysis.
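
A minimal Python sketch of such a correlation analysis is shown below; the indicator values are illustrative, not the paper's data, and the alveograph columns stand for the standard deformation-energy (W) and curve-ratio (P/L) indicators.

```python
import pandas as pd

# Illustrative flour samples: standardized indicators (protein, gluten)
# alongside alveograph rheological indicators.
df = pd.DataFrame({
    "protein_pct": [10.5, 11.2, 12.8, 13.1, 11.9, 12.2],
    "gluten_pct":  [22.0, 24.1, 27.5, 28.0, 25.3, 26.0],
    "alveo_W":     [180, 210, 265, 280, 230, 245],  # deformation energy, 1e-4 J
    "alveo_P_L":   [0.9, 1.0, 1.2, 1.3, 1.1, 1.1],  # curve configuration ratio
})

# Pairwise Pearson correlations indicate which standardized indicators
# track the rheological (intended-purpose) indicators.
print(df.corr(method="pearson").round(2))
```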


Author(s):  
Feidu Akmel ◽  
Ermiyas Birihanu ◽  
Bahir Siraj

Software systems are software products or applications that support business domains such as manufacturing, aviation, health care, and insurance. Software quality is a measure of how well software is designed and how well it conforms to that design. Among the variables considered in assessing software quality are correctness, product quality, scalability, completeness, and absence of bugs. However, because quality standards differ from one organization to another, it is better to apply software metrics to measure software quality. Attributes gathered from source code through software metrics can serve as input to a software defect predictor. Software defects are errors introduced by software developers and stakeholders. Finally, in this study we survey applications of machine learning to software defect data gathered from previous research.
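
The metrics-to-predictor pipeline described above can be sketched as follows; the metric values and labels are fabricated for illustration, and a random forest stands in for whichever learner a given study uses.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical per-module source-code metrics: lines of code, cyclomatic
# complexity, coupling. Label 1 marks modules with a recorded defect.
X = np.array([[120, 14, 6], [40, 3, 1], [300, 25, 9], [55, 4, 2],
              [210, 18, 7], [80, 6, 2], [150, 12, 5], [35, 2, 1]])
y = np.array([1, 0, 1, 0, 1, 0, 1, 0])

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=4).mean())
```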


2020 ◽  
Author(s):  
Saeed Nosratabadi ◽  
Amir Mosavi ◽  
Puhong Duan ◽  
Pedram Ghamisi ◽  
Ferdinand Filip ◽  
...  

This paper provides a state-of-the-art investigation of advances in data science in emerging economic applications. The analysis covers novel data science methods in four classes: deep learning models, hybrid deep learning models, hybrid machine learning models, and ensemble models. Application domains include a wide and diverse range of economics research, from the stock market, marketing, and e-commerce to corporate banking and cryptocurrency. The PRISMA method, a systematic literature review methodology, was used to ensure the quality of the survey. The findings reveal that the trends follow the advancement of hybrid models, which, on the accuracy metric, outperform other learning algorithms. It is further expected that the trends will converge toward the advancement of sophisticated hybrid deep learning models.


2007 ◽  
Vol 7 (5-6) ◽  
pp. 53-60
Author(s):  
D. Inman ◽  
D. Simidchiev ◽  
P. Jeffrey

This paper examines the use of influence diagrams (IDs) in water demand management (WDM) strategy planning with the specific objective of exploring how IDs can be used in developing computer-based decision support tools (DSTs) to complement and support existing WDM decision processes. We report the results of an expert consultation carried out in collaboration with water industry specialists in Sofia, Bulgaria. The elicited information is presented as influence diagrams and the discussion looks at their usefulness in WDM strategy design and the specification of suitable modelling techniques. The paper concludes that IDs themselves are useful in developing model structures for use in evidence-based reasoning models such as Bayesian Networks, and this is in keeping with the objectives set out in the introduction of integrating DSTs into existing decision processes. The paper will be of interest to modellers, decision-makers and scientists involved in designing tools to support resource conservation strategy implementation.
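
As a toy illustration of moving from an influence-diagram fragment to an evidence-based reasoning model, the two-node Bayesian network below is computed by direct enumeration; the variables and probabilities are invented, not drawn from the Sofia consultation.

```python
# M = a metering programme is in place; R = household demand is reduced.
p_M = 0.6                              # prior P(M) from (hypothetical) expert elicitation
p_R_given_M = {True: 0.7, False: 0.2}  # P(R | M)

# Marginal probability of demand reduction, enumerating over M.
p_R = sum((p_M if m else 1.0 - p_M) * p_R_given_M[m] for m in (True, False))

# Diagnostic query via Bayes' rule: how likely is metering, given a reduction?
p_M_given_R = p_M * p_R_given_M[True] / p_R
print(f"P(R) = {p_R:.2f},  P(M | R) = {p_M_given_R:.2f}")
```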

