Automatically Identifying the Quality of Developer Chats for Post Hoc Use

2021 ◽  
Vol 30 (4) ◽  
pp. 1-28
Author(s):  
Preetha Chatterjee ◽  
Kostadin Damevski ◽  
Nicholas A. Kraft ◽  
Lori Pollock

Software engineers are crowdsourcing answers to their everyday challenges on Q&A forums (e.g., Stack Overflow) and, more recently, in public chat communities such as Slack, IRC, and Gitter. Many software-related chat conversations contain valuable expert knowledge that is useful both for mining to improve programming support tools and for readers who did not participate in the original conversations. However, most chat platforms and communities do not contain built-in quality indicators (e.g., accepted answers, vote counts). Therefore, it is difficult to identify conversations that contain useful information for mining or reading, i.e., conversations of post hoc quality. In this article, we investigate automatically detecting developer conversations of post hoc quality from public chat channels. We first describe an analysis of 400 developer conversations that indicates potential characteristics of post hoc quality, followed by a machine learning-based approach for automatically identifying conversations of post hoc quality. Our evaluation on 2,000 annotated Slack conversations in four programming communities (python, clojure, elm, and racket) indicates that our approach can achieve a precision of 0.82, recall of 0.90, F-measure of 0.86, and MCC of 0.57. To our knowledge, this is the first automated technique for detecting developer conversations of post hoc quality.
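
As a rough illustration of the classification task, here is a minimal Python sketch; the TF-IDF features, logistic regression model, and toy data are illustrative stand-ins, since the abstract does not detail the paper's actual feature set or learner.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score, f1_score, matthews_corrcoef

# Toy stand-in for an annotated corpus; label 1 marks conversations judged
# to have post hoc quality (useful to later readers or to mining tools).
conversations = [
    "you need to pin the transitive dependency, here is the fix ...",
    "the GC pause comes from large object heap fragmentation ...",
    "use a context manager so the file handle is always closed ...",
    "profile first; the regex backtracking is the real bottleneck ...",
    "good morning everyone", "anyone around?", "lol same", "brb coffee",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

X = TfidfVectorizer().fit_transform(conversations)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, stratify=labels, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
pred = clf.predict(X_test)

# The four metrics reported in the evaluation above.
for name, metric in [("precision", precision_score), ("recall", recall_score),
                     ("F-measure", f1_score), ("MCC", matthews_corrcoef)]:
    print(name, round(metric(y_test, pred), 2))
```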

Author(s):  
Jelber Sayyad Shirabad ◽  
Timothy C. Lethbridge ◽  
Stan Matwin

This chapter presents the notion of relevance relations, an abstraction for representing relationships between software entities. Relevance relations map tuples of software entities to values that reflect how related the entities are to each other. Although these relationships lack clear definitions, software engineers can typically identify instances of them. We show how a classifier can model a relevance relation, and we present the process of creating such models using data mining and machine learning techniques. In a case study, we applied this process to a large legacy system; our system learned models of a relevance relation that predict whether a change in one file may require a change in another file. Our empirical evaluation shows that the predictive quality of such models makes them a viable choice for field deployment. We also show how, by assigning different misclassification costs, such models can be tuned to meet the user's precision and recall needs.
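
The cost-sensitive tuning the chapter describes can be sketched in Python with scikit-learn, where class weights stand in for misclassification costs; the file-pair features below are invented for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Each row describes a pair of files with hypothetical features: number of
# past co-changes, number of shared routine references, same-subsystem flag.
X = np.array([[12, 5, 1], [0, 0, 0], [7, 2, 1], [1, 0, 0], [9, 4, 1], [0, 1, 0]])
y = np.array([1, 0, 1, 0, 1, 0])  # 1 = changing one file likely requires changing the other

# Weighting the "relevant" class 5x penalizes missed co-change pairs more
# heavily, trading precision for recall, as the chapter's tuning does with costs.
clf = DecisionTreeClassifier(class_weight={0: 1, 1: 5}, random_state=0).fit(X, y)
print(clf.predict([[3, 1, 1]]))  # predict relevance for a new file pair
```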


2018 ◽  
Author(s):  
Daniel Cañueto ◽  
Miriam Navarro ◽  
Mónica Bulló ◽  
Xavier Correig ◽  
Nicolau Cañellas

The quality of automatic metabolite profiling in NMR datasets from complex matrices can be compromised by the multiple sources of variability in the samples. These sources cause uncertainty in the metabolite signal parameters and produce many low-intensity signals. Lineshape fitting approaches may yield suboptimal resolutions or distort the fitted signals to adapt them to the complex spectrum lineshape. As a result, tools tend to restrict their use to specific matrices and strict protocols to limit this uncertainty. However, analyzing and modelling the signal parameters collected during a first profiling iteration can further reduce the uncertainty by generating narrow, accurate predictions of the expected signal parameters. In this study, we show that these predictions enable better profiling quality indicators and maximize the performance of automatic profiling. Because our workflow learns and models the sample properties, it can overcome restrictions on the matrix or protocol and the limitations of lineshape fitting approaches.
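
The core idea, predicting a signal parameter from information gathered in a first profiling iteration, can be sketched as follows; the data and the linear model are synthetic assumptions, not the workflow's actual modelling step.

```python
import numpy as np

# Synthetic first-iteration results: the position of a pH-sensitive reference
# signal and of a target metabolite signal across 50 samples (ppm).
rng = np.random.default_rng(0)
ref_shift = rng.normal(4.05, 0.010, 50)
target_shift = 3.20 + 0.8 * (ref_shift - 4.05) + rng.normal(0, 0.002, 50)

# Model the target position from the reference, then use the residual spread
# to produce a narrow expected window for the next fitting iteration.
slope, intercept = np.polyfit(ref_shift, target_shift, 1)
sigma = np.std(target_shift - (slope * ref_shift + intercept))

new_ref = 4.06  # reference position observed in a new sample
print(f"expected target shift: {slope * new_ref + intercept:.4f} +/- {2 * sigma:.4f} ppm")
```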


2019 ◽  
pp. 3-8
Author(s):  
N.Yu. Bobrovskaya ◽  
M.F. Danilov

Criteria for the quality of coordinate measurements in pilot and experimental production are considered, based on contemporary quality management methods and on traditional metrological methods for assessing measurement quality. Measurement duration is proposed as an additional quality criterion. In analyzing the problem of assessing measurement quality, the authors pay particular attention to the role of technological heredity among the sources of uncertainty in coordinate measurements, covering not only the manufacturing of the part but all stages in the development of design and technological documentation. Along with criteria such as the degree of confidence in the measurement results and their accuracy, convergence, reproducibility, and speed, one must take into account the correctness of the technical specification and such form characteristics of the geometric elements to be controlled as flatness, roundness, and cylindricity. It is noted that one of the main ways to reduce the uncertainty of coordinate measurements is to reduce the uncertainty in the initial data and measurement conditions, and to increase the stability of the measurement tasks through a reasoned choice of the basic geometric elements (measurement datums) of the part. A prerequisite for obtaining reliable quality indicators is a quantitative assessment of the conditions and organization of the measurement process. To plan and normalize measurement time, the authors propose analytical formulas on the basis of which quality indicators, including measurement speed, can be quantitatively analyzed and optimized.
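
The abstract does not reproduce the authors' analytical formulas, so the sketch below only illustrates the kind of time-budget model such planning could rest on; the structure and parameter values are invented.

```python
# Hypothetical time budget for a coordinate measurement: fixed setup time
# plus a per-point cost for machine travel and probing. Not the authors'
# published formulas, which are not given in the abstract.
def measurement_time(n_points: int, t_setup: float = 300.0,
                     t_move: float = 4.0, t_probe: float = 2.5) -> float:
    """Total measurement duration in seconds."""
    return t_setup + n_points * (t_move + t_probe)

# More sampled points lower measurement uncertainty but raise duration,
# making speed a quality indicator to optimize against accuracy.
for n in (50, 100, 200):
    print(f"{n} points -> {measurement_time(n):.0f} s")
```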


2020 ◽  
Vol 29 (12) ◽  
pp. 52-58
Author(s):  
E.P. Meleshkina ◽  
S.N. Kolomiets ◽  
A.S. Cheskidova ◽  
...  

Indicators of the rheological properties of dough, determined objectively and reliably with an alveograph, were identified in order to create a future system for classifying wheat, and the flour milled from it, by intended purpose. The relationships between standardized quality indicators and newly developed indicators were analyzed to identify those that differentiate the quality of wheat flour by intended purpose, i.e., by finished product. Methods of mathematical statistics were used for this analysis.
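
A minimal Python sketch of such a correlation analysis is shown below; the indicator values are illustrative, not the paper's data, and the alveograph columns stand for the standard deformation-energy (W) and curve-ratio (P/L) indicators.

```python
import pandas as pd

# Illustrative flour samples: standardized indicators (protein, gluten)
# alongside alveograph rheological indicators.
df = pd.DataFrame({
    "protein_pct": [10.5, 11.2, 12.8, 13.1, 11.9, 12.2],
    "gluten_pct":  [22.0, 24.1, 27.5, 28.0, 25.3, 26.0],
    "alveo_W":     [180, 210, 265, 280, 230, 245],  # deformation energy, 1e-4 J
    "alveo_P_L":   [0.9, 1.0, 1.2, 1.3, 1.1, 1.1],  # curve configuration ratio
})

# Pairwise Pearson correlations indicate which standardized indicators
# track the rheological (intended-purpose) indicators.
print(df.corr(method="pearson").round(2))
```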


Author(s):  
Feidu Akmel ◽  
Ermiyas Birihanu ◽  
Bahir Siraj

Software systems are software products or applications that support business domains such as manufacturing, aviation, health care, and insurance. Software quality is a measure of how well software is designed and how well it conforms to that design. Among the variables considered in assessing software quality are correctness, product quality, scalability, completeness, and absence of bugs. However, because quality standards differ from one organization to another, it is better to apply software metrics to measure software quality. Attributes gathered from source code through software metrics can serve as input to a software defect predictor. Software defects are errors introduced by software developers and stakeholders. Finally, in this study we survey applications of machine learning to software defect data gathered from previous research.
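
The metrics-to-predictor pipeline described above can be sketched as follows; the metric values and labels are fabricated for illustration, and a random forest stands in for whichever learner a given study uses.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical per-module source-code metrics: lines of code, cyclomatic
# complexity, coupling. Label 1 marks modules with a recorded defect.
X = np.array([[120, 14, 6], [40, 3, 1], [300, 25, 9], [55, 4, 2],
              [210, 18, 7], [80, 6, 2], [150, 12, 5], [35, 2, 1]])
y = np.array([1, 0, 1, 0, 1, 0, 1, 0])

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=4).mean())
```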


2020 ◽  
Author(s):  
Saeed Nosratabadi ◽  
Amir Mosavi ◽  
Puhong Duan ◽  
Pedram Ghamisi ◽  
Ferdinand Filip ◽  
...  

This paper provides a state-of-the-art investigation of advances in data science in emerging economic applications. The analysis covers novel data science methods in four classes: deep learning models, hybrid deep learning models, hybrid machine learning models, and ensemble models. Application domains include a wide and diverse range of economics research, from the stock market, marketing, and e-commerce to corporate banking and cryptocurrency. The PRISMA method, a systematic literature review methodology, was used to ensure the quality of the survey. The findings reveal that the trends follow the advancement of hybrid models, which, on the accuracy metric, outperform other learning algorithms. It is further expected that the trends will converge toward the advancement of sophisticated hybrid deep learning models.


2007 ◽  
Vol 7 (5-6) ◽  
pp. 53-60
Author(s):  
D. Inman ◽  
D. Simidchiev ◽  
P. Jeffrey

This paper examines the use of influence diagrams (IDs) in water demand management (WDM) strategy planning with the specific objective of exploring how IDs can be used in developing computer-based decision support tools (DSTs) to complement and support existing WDM decision processes. We report the results of an expert consultation carried out in collaboration with water industry specialists in Sofia, Bulgaria. The elicited information is presented as influence diagrams and the discussion looks at their usefulness in WDM strategy design and the specification of suitable modelling techniques. The paper concludes that IDs themselves are useful in developing model structures for use in evidence-based reasoning models such as Bayesian Networks, and this is in keeping with the objectives set out in the introduction of integrating DSTs into existing decision processes. The paper will be of interest to modellers, decision-makers and scientists involved in designing tools to support resource conservation strategy implementation.
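
As a toy illustration of moving from an influence-diagram fragment to an evidence-based reasoning model, the two-node Bayesian network below is computed by direct enumeration; the variables and probabilities are invented, not drawn from the Sofia consultation.

```python
# M = a metering programme is in place; R = household demand is reduced.
p_M = 0.6                              # prior P(M) from (hypothetical) expert elicitation
p_R_given_M = {True: 0.7, False: 0.2}  # P(R | M)

# Marginal probability of demand reduction, enumerating over M.
p_R = sum((p_M if m else 1.0 - p_M) * p_R_given_M[m] for m in (True, False))

# Diagnostic query via Bayes' rule: how likely is metering, given a reduction?
p_M_given_R = p_M * p_R_given_M[True] / p_R
print(f"P(R) = {p_R:.2f},  P(M | R) = {p_M_given_R:.2f}")
```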

