Severity Prediction for Bug Reports Using Multi-Aspect Features: A Deep Learning Approach

Mathematics ◽  
2021 ◽  
Vol 9 (14) ◽  
pp. 1644
Author(s):  
Anh-Hien Dao ◽  
Cheng-Zen Yang

The severity of software bug reports plays an important role in maintaining software quality. Many approaches have been proposed to predict the severity of bug reports using textual information. In this research, we propose a deep learning framework called MASP that uses convolutional neural networks (CNNs) and the content-aspect, sentiment-aspect, quality-aspect, and reporter-aspect features of bug reports to improve prediction performance. We performed experiments on datasets collected from Eclipse and Mozilla. The results show that the MASP model outperforms the state-of-the-art CNN model in terms of average Accuracy, Precision, Recall, F1-measure, and Matthews Correlation Coefficient (MCC) by 1.83%, 0.46%, 3.23%, 1.72%, and 6.61%, respectively.
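As an illustration of the last metric above, the Matthews Correlation Coefficient can be computed directly from the confusion-matrix counts. This is a generic sketch of the metric for a binary severe/non-severe labeling — the labels are hypothetical, and this is not the authors' code (MASP itself handles multi-class severity):

```python
import math

def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical severe(1)/non-severe(0) predictions:
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1]
print(round(mcc(y_true, y_pred), 4))  # 0.7071
```

Unlike plain accuracy, MCC stays informative on imbalanced classes, which is why it is often reported alongside F1 for bug-severity data.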

2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
S Rao ◽  
Y Li ◽  
R Ramakrishnan ◽  
A Hassaine ◽  
D Canoy ◽  
...  

Abstract Background/Introduction Predicting incident heart failure has been challenging. Deep learning models applied to rich electronic health records (EHR) offer some theoretical advantages. However, empirical evidence for their superior performance is limited, and they commonly remain uninterpretable, hampering their wider use in medical practice. Purpose We developed a deep learning framework for more accurate yet interpretable prediction of incident heart failure. Methods We used longitudinally linked EHR from practices across England, involving 100,071 patients, 13% of whom had been diagnosed with incident heart failure during follow-up. We investigated the predictive performance of a novel transformer deep learning model, “Transformer for Heart Failure” (BEHRT-HF), and validated it using both an external held-out dataset and internal five-fold cross-validation, with the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). Predictor groups included all outpatient and inpatient diagnoses within their temporal context, medications, age, and calendar year for each encounter. Treating diagnoses as anchors, we alternately removed the other modalities (ablation study) to understand the importance of individual modalities to the performance of incident heart failure prediction. Using perturbation-based techniques, we investigated the importance of associations between selected predictors and heart failure to improve model interpretability. Results BEHRT-HF achieved high accuracy, with AUROC 0.932 and AUPRC 0.695 on external validation, and AUROC 0.933 (95% CI: 0.928, 0.938) and AUPRC 0.700 (95% CI: 0.682, 0.718) on internal validation. BEHRT-HF outperformed the state-of-the-art recurrent deep learning model RETAIN-EX by 0.079 in AUPRC and 0.030 in AUROC. The ablation study showed that medications were strong predictors and that calendar year was more important than age. Utilising perturbation, we identified and ranked the strength of associations between diagnoses and heart failure. For instance, the method showed that established risk factors, including myocardial infarction, atrial fibrillation and flutter, and hypertension, were all strongly associated with the heart failure prediction. Additionally, when the population was stratified into age groups, incident occurrence of a given disease generally contributed more to heart failure prediction at younger ages than when diagnosed later in life. Conclusions Our state-of-the-art deep learning framework outperforms the predictive performance of existing models whilst enabling a data-driven way of exploring the relative contribution of a range of risk factors in the context of other temporal information. Funding Acknowledgement Type of funding source: Private grant(s) and/or Sponsorship. Main funding source(s): National Institute for Health Research, Oxford Martin School, Oxford Biomedical Research Centre


Electronics ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 1136
Author(s):  
David Augusto Ribeiro ◽  
Juan Casavílca Silva ◽  
Renata Lopes Rosa ◽  
Muhammad Saadi ◽  
Shahid Mumtaz ◽  
...  

Light field (LF) imaging has multi-view properties that enable many applications, including auto-refocusing, depth estimation, and 3D reconstruction of images, which are required particularly for intelligent transportation systems (ITSs). However, cameras can have limited angular resolution, which becomes a bottleneck in vision applications, making it challenging to incorporate angular data due to disparities in the LF images. In recent years, different machine learning algorithms have been applied in both the image processing and ITS research areas for various purposes. In this work, a Lightweight Deformable Deep Learning Framework is implemented that addresses the problem of disparity in LF images. To this end, an angular alignment module and a soft activation function are incorporated into the Convolutional Neural Network (CNN). For performance assessment, the proposed solution is compared with recent state-of-the-art methods on different LF datasets, each with specific characteristics. Experimental results demonstrate that the proposed solution achieves better performance than the other methods, and the image quality results obtained outperform state-of-the-art LF image reconstruction methods. Furthermore, the model has lower computational complexity, reducing execution time.


Author(s):  
He Jiang ◽  
Najam Nazar ◽  
Jingxuan Zhang ◽  
Tao Zhang ◽  
Zhilei Ren

During software maintenance, bug reports are widely employed to improve the software project’s quality. A developer often refers to stored bug reports in a repository for bug resolution. However, this reference process often requires a developer to peruse a substantial amount of lengthy and tedious textual information in bug reports. Automatic summarization of bug reports is one way to overcome this problem. Both supervised and unsupervised methods have been proposed for automatic summary generation of bug reports. However, existing methods disregard the significance of duplicate bug reports in summarizing bug reports. In this study, we propose a PageRank-based Summarization Technique (PRST), which utilizes the textual information contained in bug reports and the additional information in associated duplicate bug reports. PRST uses three variants of PageRank, based on the Vector Space Model (VSM), Jaccard, and WordNet similarity metrics. These variants calculate the textual similarity of sentences between the master bug reports and their duplicates. PRST further trains a regression model to predict the probability of sentences belonging to the summary. Finally, we combine the PageRank and regression-model scores to rank the sentences and produce the summary for the master bug reports. In addition, we construct two corpora of bug reports and duplicates, i.e., MBRC and OSCAR. Empirical results suggest that PRST outperforms the state-of-the-art method BRC in terms of Precision, Recall, F-score, and Pyramid Precision. Meanwhile, PRST with WordNet achieves the best results compared with PRST with VSM and Jaccard.
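The PageRank-over-sentence-similarity idea at the core of such techniques can be sketched as follows. This is a minimal illustration assuming a Jaccard token-overlap similarity (one of the three variants named above) and plain power iteration with damping — not the authors' implementation, which also incorporates duplicates and a regression model:

```python
def jaccard(a, b):
    """Jaccard similarity between the token sets of two sentences."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def pagerank_scores(sentences, d=0.85, iters=50):
    """Score sentences by PageRank over a Jaccard-similarity graph."""
    n = len(sentences)
    w = [[jaccard(si, sj) if i != j else 0.0
          for j, sj in enumerate(sentences)]
         for i, si in enumerate(sentences)]
    out = [sum(row) or 1.0 for row in w]   # out-weight normaliser per node
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [(1 - d) / n + d * sum(w[j][i] / out[j] * scores[j]
                                        for j in range(n))
                  for i in range(n)]
    return scores

reports = ["the app crashes on startup",
           "app crashes when startup fails",
           "unrelated lorem ipsum text"]
scores = pagerank_scores(reports)
# Sentences sharing vocabulary reinforce each other's scores;
# the isolated third sentence ranks lowest.
```

Top-ranked sentences would then be selected (up to a length budget) to form the extractive summary.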


2019 ◽  
Vol 11 (6) ◽  
pp. 684 ◽  
Author(s):  
Maria Papadomanolaki ◽  
Maria Vakalopoulou ◽  
Konstantinos Karantzalos

Deep learning architectures have received much attention in recent years, demonstrating state-of-the-art performance in several segmentation, classification, and other computer vision tasks. Most of these deep networks are based on either convolutional or fully convolutional architectures. In this paper, we propose a novel object-based deep-learning framework for semantic segmentation in very high-resolution satellite data. In particular, we exploit object-based priors integrated into a fully convolutional neural network by incorporating an anisotropic diffusion data preprocessing step and an additional loss term during the training process. Under this constrained framework, the goal is to enforce that pixels belonging to the same object are classified into the same semantic category. We thoroughly compared the novel object-based framework with the currently dominant convolutional and fully convolutional deep networks. In particular, numerous experiments were conducted on the publicly available ISPRS WGII/4 benchmark datasets, namely Vaihingen and Potsdam, for validation and inter-comparison based on a variety of metrics. Quantitatively, experimental results indicate that, overall, the proposed object-based framework outperformed the current state-of-the-art fully convolutional networks by more than 1% in terms of overall accuracy, while intersection-over-union results improved for all semantic categories. Qualitatively, man-made classes with stricter geometry, such as buildings, benefited most from our method, especially along object boundaries, highlighting the great potential of the developed approach.
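The anisotropic diffusion preprocessing mentioned above is, in its classical Perona-Malik formulation, an iterative edge-preserving smoothing: each pixel moves toward its neighbors, weighted by an edge-stopping function that suppresses diffusion across strong gradients. A minimal single-step sketch (the parameter values and 4-neighbor stencil are illustrative, not the paper's settings):

```python
import math

def perona_malik_step(img, lam=0.2, k=15.0):
    """One anisotropic (Perona-Malik) diffusion update on a 2-D grayscale grid."""
    h, w = len(img), len(img[0])
    g = lambda d: math.exp(-(d / k) ** 2)   # edge-stopping function
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    diff = img[ny][nx] - img[y][x]
                    total += g(abs(diff)) * diff   # small diffs diffuse, edges persist
            out[y][x] = img[y][x] + lam * total
    return out
```

Repeating the step smooths homogeneous regions (e.g., a building roof) while leaving object boundaries largely intact, which is what makes it a useful prior for object-consistent labeling.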


2019 ◽  
Author(s):  
Raghav Shroff ◽  
Austin W. Cole ◽  
Barrett R. Morrow ◽  
Daniel J. Diaz ◽  
Isaac Donnell ◽  
...  

Abstract While deep learning methods exist to guide protein optimization, examples of novel proteins generated with these techniques require a priori mutational data. Here we report a 3D convolutional neural network that associates amino acids with neighboring chemical microenvironments at state-of-the-art accuracy. This algorithm enables the identification of novel gain-of-function mutations, and subsequent experiments confirm substantive improvements in stability-associated phenotypes in vivo across three diverse proteins.


2021 ◽  
Vol 14 (3) ◽  
pp. 1-28
Author(s):  
Abeer Al-Hyari ◽  
Hannah Szentimrey ◽  
Ahmed Shamli ◽  
Timothy Martin ◽  
Gary Gréwal ◽  
...  

The ability to accurately and efficiently estimate the routability of a circuit based on its placement is one of the most challenging and difficult tasks in the Field Programmable Gate Array (FPGA) flow. In this article, we present a novel deep learning framework based on a Convolutional Neural Network (CNN) model for predicting the routability of a placement. Since the performance of the CNN model is strongly dependent on the hyper-parameters selected for it, we perform exhaustive hyper-parameter tuning, which significantly improves the model’s performance while avoiding overfitting. We also incorporate the deep learning model into a state-of-the-art placement tool and show how the model can be used to (1) avoid costly, but futile, place-and-route iterations, and (2) improve the placer’s ability to produce routable placements for hard-to-route circuits using feedback based on routability estimates generated by the proposed model. The model is trained and evaluated using over 26K placement images derived from 372 benchmarks supplied by Xilinx Inc. We also explore several opportunities to further improve the reliability of the predictions made by the proposed DLRoute technique by splitting the model into two separate deep learning models for (a) global and (b) detailed placement during the optimization process. Experimental results show that the proposed framework achieves a routability prediction accuracy of 97% while exhibiting runtimes of only a few milliseconds.


Author(s):  
Emad E. Abdallah ◽  
Ashraf Aljammal ◽  
Ahmed Fawzi Otoom ◽  
Maen Hammad ◽  
Doaa Al Shdaifat

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Kwang-Sig Lee ◽  
Sang-Hyuk Son ◽  
Sang-Hyun Park ◽  
Eun Sun Kim

Abstract Background This study developed a diagnostic tool to automatically detect normal, unclear, and tumor images from colonoscopy videos using artificial intelligence. Methods To create the training and validation sets, 47,555 images in JPG format were extracted from colonoscopy videos of 24 patients at Korea University Anam Hospital. A gastroenterologist with 15 years of clinical experience divided the 47,555 images into three classes: Normal (25,895), Unclear (2038), and Tumor (19,622). A single shot detector, a deep learning framework designed for object detection, was trained using the 47,555 images and validated with two validation sets of 150 images each (300 in total; 50 normal, 50 unclear, and 50 tumor cases per set). Half of the 47,555 images were used for building the model and the other half for testing it. The learning rate of the model was 0.0001 over 250 epochs (training cycles). Results The average accuracy, precision, recall, and F1 score over the categories were 0.9067, 0.9744, 0.9067, and 0.9393, respectively. These performance measures did not change across intersection-over-union thresholds (0.45, 0.50, and 0.55), suggesting the stability of the model. Conclusion Automated detection of normal, unclear, and tumor images from colonoscopy videos is possible using a deep learning framework. This is expected to provide an invaluable decision-support system for clinical experts.
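The reported figures are internally consistent: the F1 score is the harmonic mean of precision and recall, which can be verified directly from the values above:

```python
precision, recall = 0.9744, 0.9067  # values reported in the abstract
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9393 — matches the reported F1 score
```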


2020 ◽  
Vol 12 (1) ◽  
pp. 90-108
Author(s):  
Mahmoud Kalash ◽  
Mrigank Rochan ◽  
Noman Mohammed ◽  
Neil Bruce ◽  
Yang Wang ◽  
...  

In this article, the authors propose a deep learning framework for malware classification. There has been a huge increase in the volume of malware in recent years, which poses serious security threats to financial institutions, businesses, and individuals. To combat the proliferation of malware, new strategies are essential to quickly identify and classify malware samples. Machine learning approaches are becoming popular for malware classification; however, most of these approaches are based on shallow learning algorithms (e.g., SVMs). Recently, convolutional neural networks (CNNs), a deep learning approach, have shown superior performance compared to traditional learning algorithms, especially in tasks such as image classification. Inspired by this, the authors propose a CNN-based architecture to classify malware samples. They convert malware binaries to grayscale images and subsequently train a CNN for classification. Experiments on two challenging malware classification datasets, Malimg and Microsoft, demonstrate that their method outperforms competing state-of-the-art algorithms.
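The binary-to-grayscale step described above can be sketched as follows: each byte of the binary becomes one pixel intensity (0-255), and the byte stream is reshaped into a fixed-width 2-D grid. The row width and zero-padding policy here are assumptions for illustration (work in this line typically picks the width based on file size), not the authors' exact procedure:

```python
def bytes_to_grayscale(data: bytes, width: int = 16):
    """Reshape a binary's raw bytes into a 2-D grayscale image.

    Each byte (0-255) becomes one pixel; rows are `width` pixels wide
    and the trailing partial row is zero-padded.
    """
    pad = (-len(data)) % width
    padded = data + b"\x00" * pad
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]

# 40 bytes at width 16 -> 3 rows, last row zero-padded:
img = bytes_to_grayscale(bytes(range(40)), width=16)
print(len(img), len(img[0]))  # 3 16
```

The resulting 2-D arrays can then be fed to a standard image-classification CNN, which is what lets texture patterns shared across a malware family surface as visual similarity.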


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
MingQi Lv ◽  
Chao Huang ◽  
TieMing Chen ◽  
Ting Wang

With the rapid growth of mobile Apps, it is necessary to classify mobile Apps into predefined categories. However, two problems make this task challenging. First, the name of a mobile App is usually short and too ambiguous to reflect its real semantic meaning. Second, it is usually difficult to collect adequate labeled samples to train a good classifier when a customized taxonomy of mobile Apps is required. For the first problem, we leverage Web knowledge to enrich the textual information of mobile Apps. For the second problem, the most widely used approach is semisupervised learning, which exploits unlabeled samples in a cotraining scheme. However, how to enhance the diversity between base learners to maximize the power of the cotraining scheme is still an open problem. To address this problem, we exploit totally different machine learning paradigms (i.e., shallow learning and deep learning) to ensure a greater degree of diversity. To this end, this paper proposes Co-DSL, a collaborative deep and shallow semisupervised learning framework for mobile App classification using only a few labeled samples and a large number of unlabeled samples. The experimental results demonstrate the effectiveness of Co-DSL, which achieves over 85% classification accuracy using only two labeled samples from each mobile App category.
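The cotraining control flow described above can be sketched as follows. Toy per-class centroid learners stand in for the deep and shallow models, and the two-view data, confidence margin, and threshold are all hypothetical (Web-knowledge enrichment is omitted); this is a sketch of the scheme, not Co-DSL itself:

```python
def fit_centroids(X, y):
    """Toy learner: one centroid per class (stand-in for the deep/shallow models)."""
    cents = {}
    for lbl in sorted(set(y)):
        pts = [x for x, t in zip(X, y) if t == lbl]
        cents[lbl] = [sum(col) / len(pts) for col in zip(*pts)]
    return cents

def predict(cents, x):
    """Return (label, confidence), confidence being the distance margin."""
    d = sorted((sum((a - b) ** 2 for a, b in zip(x, c)), lbl)
               for lbl, c in cents.items())
    return d[0][1], d[1][0] - d[0][0]

def co_train(Xa, Xb, y, Ua, Ub, rounds=2, thresh=0.5):
    """Each view pseudo-labels its confident unlabeled samples for the other view."""
    La, ya = list(Xa), list(y)
    Lb, yb = list(Xb), list(y)
    pool = list(range(len(Ua)))
    for _ in range(rounds):
        ma, mb = fit_centroids(La, ya), fit_centroids(Lb, yb)
        for i in pool[:]:
            lbl_a, conf_a = predict(ma, Ua[i])
            lbl_b, conf_b = predict(mb, Ub[i])
            if conf_a >= thresh:          # view A teaches view B
                Lb.append(Ub[i]); yb.append(lbl_a); pool.remove(i)
            elif conf_b >= thresh:        # view B teaches view A
                La.append(Ua[i]); ya.append(lbl_b); pool.remove(i)
    return fit_centroids(La, ya), fit_centroids(Lb, yb)

# Hypothetical two-view data: two labeled samples, two unlabeled ones.
Xa, Xb, y = [[0.0], [10.0]], [[0.0], [10.0]], [0, 1]
Ua, Ub = [[1.0], [9.0]], [[1.2], [8.8]]
model_a, model_b = co_train(Xa, Xb, y, Ua, Ub)
```

The point of the scheme is that the two learners err differently, so each view's confident pseudo-labels carry information the other view lacks; Co-DSL maximizes that diversity by pairing a deep learner with a shallow one.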

