scholarly journals Transfer Convolutional Neural Network for Cross-Project Defect Prediction

2019 ◽  
Vol 9 (13) ◽  
pp. 2660 ◽  
Author(s):  
Shaojian Qiu ◽  
Hao Xu ◽  
Jiehan Deng ◽  
Siyu Jiang ◽  
Lu Lu

Cross-project defect prediction (CPDP) is a practical solution that allows software defect prediction (SDP) to be used earlier in the software lifecycle. With the CPDP technique, the software defect predictor trained by labeled data of mature projects can be applied for the prediction task of a new project. Most previous CPDP approaches ignored the semantic information in the source code, and existing semantic-feature-based SDP methods do not take into account the data distribution divergence between projects. These limitations may weaken defect prediction performance. To solve these problems, we propose a novel approach, the transfer convolutional neural network (TCNN), to mine the transferable semantic (deep-learning (DL)-generated) features for CPDP tasks. Specifically, our approach first parses the source file into integer vectors as the network inputs. Next, to obtain the TCNN model, a matching layer is added into convolutional neural network where the hidden representations of the source and target project-specific data are embedded into a reproducing kernel Hilbert space for distribution matching. By simultaneously minimizing classification error and distribution divergence between projects, the constructed TCNN could extract the transferable DL-generated features. Finally, without losing the information contained in handcrafted features, we combine them with transferable DL-generated features to form the joint features for CPDP performing. Experiments based on 10 benchmark projects (with 90 pairs of CPDP tasks) showed that the proposed TCNN method is superior to the reference methods.

2021 ◽  
Vol 7 ◽  
pp. e739
Author(s):  
Ahmed Bahaa Farid ◽  
Enas Mohamed Fathy ◽  
Ahmed Sharaf Eldin ◽  
Laila A. Abd-Elmegid

In recent years, the software industry has invested substantial effort to improve software quality in organizations. Applying proactive software defect prediction will help developers and white box testers to find the defects earlier, and this will reduce the time and effort. Traditional software defect prediction models concentrate on traditional features of source code including code complexity, lines of code, etc. However, these features fail to extract the semantics of source code. In this research, we propose a hybrid model that is called CBIL. CBIL can predict the defective areas of source code. It extracts Abstract Syntax Tree (AST) tokens as vectors from source code. Mapping and word embedding turn integer vectors into dense vectors. Then, Convolutional Neural Network (CNN) extracts the semantics of AST tokens. After that, Bidirectional Long Short-Term Memory (Bi-LSTM) keeps key features and ignores other features in order to enhance the accuracy of software defect prediction. The proposed model CBIL is evaluated on a sample of seven open-source Java projects of the PROMISE dataset. CBIL is evaluated by applying the following evaluation metrics: F-measure and area under the curve (AUC). The results display that CBIL model improves the average of F-measure by 25% compared to CNN, as CNN accomplishes the top performance among the selected baseline models. In average of AUC, CBIL model improves AUC by 18% compared to Recurrent Neural Network (RNN), as RNN accomplishes the top performance among the selected baseline models used in the experiments.


2021 ◽  
Vol 11 (9) ◽  
pp. 4292
Author(s):  
Mónica Y. Moreno-Revelo ◽  
Lorena Guachi-Guachi ◽  
Juan Bernardo Gómez-Mendoza ◽  
Javier Revelo-Fuelagán ◽  
Diego H. Peluffo-Ordóñez

Automatic crop identification and monitoring is a key element in enhancing food production processes as well as diminishing the related environmental impact. Although several efficient deep learning techniques have emerged in the field of multispectral imagery analysis, the crop classification problem still needs more accurate solutions. This work introduces a competitive methodology for crop classification from multispectral satellite imagery mainly using an enhanced 2D convolutional neural network (2D-CNN) designed at a smaller-scale architecture, as well as a novel post-processing step. The proposed methodology contains four steps: image stacking, patch extraction, classification model design (based on a 2D-CNN architecture), and post-processing. First, the images are stacked to increase the number of features. Second, the input images are split into patches and fed into the 2D-CNN model. Then, the 2D-CNN model is constructed within a small-scale framework, and properly trained to recognize 10 different types of crops. Finally, a post-processing step is performed in order to reduce the classification error caused by lower-spatial-resolution images. Experiments were carried over the so-named Campo Verde database, which consists of a set of satellite images captured by Landsat and Sentinel satellites from the municipality of Campo Verde, Brazil. In contrast to the maximum accuracy values reached by remarkable works reported in the literature (amounting to an overall accuracy of about 81%, a f1 score of 75.89%, and average accuracy of 73.35%), the proposed methodology achieves a competitive overall accuracy of 81.20%, a f1 score of 75.89%, and an average accuracy of 88.72% when classifying 10 different crops, while ensuring an adequate trade-off between the number of multiply-accumulate operations (MACs) and accuracy. Furthermore, given its ability to effectively classify patches from two image sequences, this methodology may result appealing for other real-world applications, such as the classification of urban materials.


Sign in / Sign up

Export Citation Format

Share Document