Effect of dual-convolutional neural network model fusion for Aluminum profile surface defects classification and recognition

2021 ◽  
Vol 19 (1) ◽  
pp. 997-1025
Author(s):  
Xiaochen Liu ◽  
Weidong He ◽  
Yinghui Zhang ◽  
Shixuan Yao ◽  
...  

Classifying and identifying surface defects is essential during the production and use of aluminum profiles. Recently, the dual-convolutional neural network (CNN) model fusion framework has shown promising performance for defect classification and recognition. Spurred by this trend, this paper proposes an improved dual-CNN model fusion framework to classify and identify defects in aluminum profiles. Compared with traditional dual-CNN model fusion frameworks, the proposed architecture involves an improved fusion layer, fusion strategy, and classifier block. Specifically, the suggested method extracts the feature map of the aluminum profile RGB image from the pre-trained VGG16 model's pool5 layer and the feature map of the maximum pooling layer of the suggested A4 network, which is added after the AlexNet model. Then, weighted bilinear interpolation upsamples the feature maps extracted from the maximum pooling layer of the A4 part. The network layer and upsampling schemes ensure equal feature map dimensions, so that the feature maps can be merged using an improved wavelet transform. Finally, global average pooling is employed in the classifier block instead of dense layers to reduce the model's parameters and avoid overfitting. The fused feature map is then input into the classifier block for classification. The experimental setup involves data augmentation and transfer learning to prevent overfitting on the small data sets used, while K-fold cross-validation is employed to evaluate the model's performance during training.
The experimental results demonstrate that the proposed dual-CNN model fusion framework attains a higher classification accuracy than current techniques: 4.3% higher than AlexNet, 2.5% higher than VGG16, 2.9% higher than Inception v3, 2.2% higher than VGG19, 3.6% higher than ResNet50, 3% higher than ResNet101, and 0.7% and 1.2% higher than conventional dual-CNN fusion frameworks 1 and 2, respectively, which supports the effectiveness of the proposed strategy.
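
The steps named in the abstract (upsample one feature map to the size of the other, fuse the two, then apply global average pooling) can be sketched as follows. This is only a minimal illustration under stated assumptions: plain bilinear interpolation stands in for the paper's weighted variant, an elementwise weighted average stands in for the improved wavelet-transform fusion, and the names `bilinear_upsample`, `fuse`, and `global_average_pool` are hypothetical rather than taken from the paper.

```python
def bilinear_upsample(fmap, out_h, out_w):
    """Resize a 2-D feature map (list of rows) with bilinear interpolation."""
    in_h, in_w = len(fmap), len(fmap[0])
    out = []
    for i in range(out_h):
        # Map the output row to a fractional source row (corners are aligned).
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        dy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            dx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
            bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out


def fuse(map_a, map_b, weight=0.5):
    """Assumed stand-in for wavelet fusion: elementwise weighted average."""
    return [
        [weight * a + (1 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(map_a, map_b)
    ]


def global_average_pool(fmap):
    """Collapse a 2-D feature map to a single scalar by averaging."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


small = [[0.0, 2.0], [4.0, 6.0]]           # toy 2x2 feature map
large = [[1.0] * 3 for _ in range(3)]      # toy 3x3 feature map
upsampled = bilinear_upsample(small, 3, 3)
pooled = global_average_pool(fuse(upsampled, large))
```

Global average pooling yields one value per channel, so a classifier built on it needs far fewer parameters than one that flattens the full map into dense layers, which is the motivation the abstract gives.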

Author(s):  
В’ячеслав Васильович Москаленко ◽  
Альона Сергіївна Москаленко ◽  
Артем Геннадійович Коробов ◽  
Микола Олександрович Зарецький ◽  
Віктор Анатолійович Семашко

An efficient model and learning algorithm are developed for a small-object detection system on a compact aerial vehicle, under restricted computing resources and a limited volume of labeled training data. A four-stage learning algorithm for the object detector is proposed. The first stage selects the type of deep convolutional neural network and the number of low-level layers, pretrained on the ImageNet dataset, to be reused. The second stage involves unsupervised learning of high-level convolutional sparse coding layers using a modification of growing neural gas, which automatically determines the required number of neurons and distributes them optimally over the data. This makes it possible to use unlabeled datasets to adapt the high-level feature description to the application domain. At the third stage, the output feature map is formed by concatenating feature maps from different levels of the deep convolutional neural network. The output feature map is then reduced using principal component analysis, after which decision rules are built. An information-extreme classifier trained on the principles of boosting is proposed for classifying the output feature map. In addition, an orthogonal incremental extreme learning machine is used to build the regression model that predicts the bounding box of the detected small object. The last stage fine-tunes the high-level layers of the deep network with a simulated annealing metaheuristic in order to approximate the global optimum of a complex criterion of the detection model's learning efficiency. With the proposed approach, 96% of objects in the images of an open test dataset were detected correctly, which indicates that the model and learning algorithm are suitable for practical use.
The learning dataset used to construct the model comprised 500 unlabeled and 200 labeled samples.
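
The fine-tuning stage relies on simulated annealing, which can be sketched generically as below. This is a hedged illustration, not the authors' procedure: the toy objective, the geometric cooling schedule, and the parameter names (`step`, `t_start`, `t_end`, `cooling`) are all assumptions, and in the paper the quantity being maximised would be the detection model's learning-efficiency criterion rather than a closed-form function.

```python
import math
import random


def simulated_annealing(objective, start, step=0.5, t_start=1.0,
                        t_end=1e-3, cooling=0.95, seed=0):
    """Maximise `objective` over a list of floats with simulated annealing."""
    rng = random.Random(seed)
    current = list(start)
    current_score = objective(current)
    best, best_score = list(current), current_score
    temperature = t_start
    while temperature > t_end:
        # Propose a random perturbation of the current parameters.
        candidate = [x + rng.uniform(-step, step) for x in current]
        candidate_score = objective(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept a worse candidate with
        # probability exp(delta / T), which shrinks as T cools.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return best, best_score


def toy_objective(params):
    """Hypothetical criterion with its maximum at (1, -2)."""
    x, y = params
    return -((x - 1.0) ** 2) - ((y + 2.0) ** 2)


best_params, best_value = simulated_annealing(toy_objective, [0.0, 0.0])
```

Accepting occasional worse candidates early on is what lets the search escape local optima, which matters when the criterion is not differentiable and gradient-based fine-tuning is unavailable.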


2021 ◽  
Vol 61 (5) ◽  
pp. 1579-1583
Author(s):  
Wenyan Wang ◽  
Kun Lu ◽  
Ziheng Wu ◽  
Hongming Long ◽  
Jun Zhang ◽  
...  

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Bambang Tutuko ◽  
Siti Nurmaini ◽  
Alexander Edo Tondas ◽  
Muhammad Naufal Rachmatullah ◽  
Annisa Darmawahyuni ◽  
...  

Background: The generalization capacity of deep learning (DL) models for atrial fibrillation (AF) detection remains lacking. In previous research, DL models were built using only a single sampling frequency from a specific device. Moreover, each electrocardiogram (ECG) acquisition dataset has a different length and sampling frequency to ensure sufficient precision of the R–R intervals for determining heart rate variability (HRV). An accurate HRV is the gold standard for predicting the AF condition; therefore, a current challenge is to determine whether a DL approach can be used to analyze raw ECG data from a broad range of devices. This paper demonstrates strong results for an end-to-end implementation of AF detection based on a convolutional neural network (AFibNet). The method uses a single learning system regardless of the variety of signal lengths and sampling frequencies. For implementation, AFibNet is processed with a cloud-based DL approach. This study used a one-dimensional convolutional neural network (1D-CNN) model for 11,842 subjects. It was trained and validated with 8232 records from three datasets and tested with 3610 records from eight datasets. The predicted results, when compared with the diagnoses made by human practitioners, showed 99.80% accuracy, sensitivity, and specificity. Result: When tested on unseen data, AF detection reaches 98.94% accuracy, 98.97% sensitivity, and 98.97% specificity at a sample period of 0.02 seconds using the DL cloud system. To improve confidence in the AFibNet model, it was also validated with 18 arrhythmia conditions defined as the Non-AF class. The data thus increased from 11,842 to 26,349 instances for three classes, i.e., normal sinus (N), AF, and Non-AF. This yielded 96.36% accuracy, 93.65% sensitivity, and 96.92% specificity.
Conclusion: These findings demonstrate that the proposed approach can use unknown data to derive feature maps and reliably detect AF periods. We have found that our cloud-DL system is suitable for practical deployment.
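
For reference, the three reported metrics follow directly from the confusion-matrix counts of a binary AF vs. not-AF decision. The sketch below uses made-up counts purely for illustration; they are not the study's data, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # share of true AF records that are detected
    specificity = tn / (tn + fp)   # share of non-AF records correctly rejected
    return accuracy, sensitivity, specificity


# Illustrative counts only (hypothetical, not taken from the paper).
accuracy, sensitivity, specificity = binary_metrics(tp=90, fp=20, tn=80, fn=10)
```

Reporting sensitivity and specificity alongside accuracy matters here because a classifier can score high accuracy on an imbalanced ECG dataset while still missing many AF episodes.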


Entropy ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. 949
Author(s):  
Jiangyi Wang ◽  
Min Liu ◽  
Xinwu Zeng ◽  
Xiaoqiang Hua

Convolutional neural networks perform strongly in many visual tasks because of their hierarchical structure and powerful feature extraction capabilities. The SPD (symmetric positive definite) matrix has attracted attention in visual classification because of its excellent ability to learn proper statistical representations and to distinguish samples carrying different information. In this paper, a deep neural network signal detection method based on spectral convolution features is proposed. In this method, local features extracted by a convolutional neural network are used to construct the SPD matrix, and a deep learning algorithm for SPD matrices is used to detect target signals. Feature maps extracted by two kinds of convolutional neural network models are applied in this study. With this method, signal detection becomes a binary classification problem on signal samples. To demonstrate the effectiveness and superiority of the method, simulated and semi-physical simulated data sets are used. The results show that, under low SCR (signal-to-clutter ratio), compared with the spectral signal detection method based on the deep neural network, this method obtains a gain of 0.5–2 dB on both the simulated and the semi-physical simulated data sets.
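
One common way to build an SPD matrix from local features is a regularised covariance matrix, sketched below. The abstract does not say which construction the authors use, so the covariance choice, the `eps` regulariser, and the function names are assumptions made for illustration.

```python
def covariance_spd(features, eps=1e-6):
    """Build a d x d SPD matrix from n local feature vectors of dimension d."""
    n, d = len(features), len(features[0])
    mean = [sum(f[k] for f in features) / n for k in range(d)]
    centered = [[f[k] - mean[k] for k in range(d)] for f in features]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    for i in range(d):
        # A sample covariance is only positive SEMI-definite in general;
        # a small diagonal term makes it strictly positive definite.
        cov[i][i] += eps
    return cov


def is_positive_definite(matrix):
    """Check positive definiteness of a symmetric matrix via Cholesky."""
    d = len(matrix)
    lower = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0:
                    return False  # factorisation fails: not positive definite
                lower[i][j] = pivot ** 0.5
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


# Perfectly correlated toy features: the raw covariance would be singular.
local_features = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
spd = covariance_spd(local_features)
```

The toy input is deliberately degenerate to show why the regulariser is there: without `eps` the matrix has a zero eigenvalue and would not lie on the SPD manifold.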


2018 ◽  
Vol 4 (9) ◽  
pp. 107 ◽  
Author(s):  
Mohib Ullah ◽  
Ahmed Mohammed ◽  
Faouzi Alaya Cheikh

Articulation modeling, feature extraction, and classification are the important components of pedestrian segmentation. Usually, these components are modeled independently of each other and then combined sequentially. However, this approach is prone to poor segmentation if any individual component is weakly designed. To cope with this problem, we propose a spatio-temporal convolutional neural network named PedNet, which exploits temporal information for spatial segmentation. The backbone of PedNet consists of an encoder–decoder network for downsampling and upsampling the feature maps, respectively. The input to the network is a set of three frames, and the output is a binary mask of the segmented regions in the middle frame. Unlike classical deep models, where the convolution layers are followed by a fully connected layer for classification, PedNet is a Fully Convolutional Network (FCN). It is trained end-to-end, and the segmentation is achieved without any pre- or post-processing. The main characteristic of PedNet is its design: it performs segmentation frame by frame but uses the temporal information from the previous and the future frame to segment the pedestrian in the current frame. Moreover, to combine the low-level features with the high-level semantic information learned by the deeper layers, we use long-skip connections from the encoder to the decoder network and concatenate the output of low-level layers with that of the higher-level layers. This approach helps to produce segmentation maps with sharp boundaries. To show the potential benefits of temporal information, we also visualized different layers of the network. The visualization showed that the network learned different information from the consecutive frames and then combined it to segment the middle frame.
We evaluated our approach on eight challenging datasets in which humans are involved in different activities with severe articulation (football, road crossing, surveillance). On CamVid, the dataset most commonly used for measuring segmentation performance, PedNet is evaluated against seven state-of-the-art methods. Performance is reported in terms of precision/recall, F1, F2, and mIoU. The qualitative and quantitative results show that PedNet achieves promising results against state-of-the-art methods, with substantial improvement on all the performance metrics.
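
The reported metrics can be computed from a predicted and a ground-truth binary mask as sketched below. This is a minimal illustration with hypothetical names and toy masks; it computes IoU for the foreground class only, whereas mIoU averages IoU over classes, and it is not the authors' evaluation code.

```python
def mask_metrics(pred, truth, beta=1.0):
    """Return (precision, recall, F-beta, IoU) for flat 0/1 masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    beta_sq = beta * beta
    denom = beta_sq * precision + recall
    # F-beta weights recall beta times as much as precision
    # (beta=1 gives F1, beta=2 gives F2).
    f_beta = (1 + beta_sq) * precision * recall / denom if denom else 0.0
    union = tp + fp + fn
    iou = tp / union if union else 0.0
    return precision, recall, f_beta, iou


# Toy flattened masks (hypothetical): 1 = pedestrian pixel, 0 = background.
predicted = [1, 1, 1, 0, 0, 0]
reference = [1, 1, 0, 1, 0, 0]
precision, recall, f1, iou = mask_metrics(predicted, reference, beta=1.0)
_, _, f2, _ = mask_metrics(predicted, reference, beta=2.0)
```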


Entropy ◽  
2022 ◽  
Vol 24 (1) ◽  
pp. 102
Author(s):  
Michele Lo Giudice ◽  
Giuseppe Varone ◽  
Cosimo Ieracitano ◽  
Nadia Mammone ◽  
Giovanbattista Gaspare Tripodi ◽  
...  

The differential diagnosis of epileptic seizures (ES) and psychogenic non-epileptic seizures (PNES) may be difficult, due to the lack of distinctive clinical features. The interictal electroencephalographic (EEG) signal may also be normal in patients with ES. Innovative diagnostic tools that exploit non-linear EEG analysis and deep learning (DL) could provide important support to physicians for clinical diagnosis. In this work, 18 patients with new-onset ES (12 males, 6 females) and 18 patients with video-recorded PNES (2 males, 16 females) with normal interictal EEG at visual inspection were enrolled. None of them was taking psychotropic drugs. A convolutional neural network (CNN) scheme using DL classification was designed to classify the two categories of subjects (ES vs. PNES). The proposed architecture performs an EEG time-frequency transformation and a classification step with a CNN. The CNN was able to classify the EEG recordings of subjects with ES vs. subjects with PNES with 94.4% accuracy. CNN provided high performance in the assigned binary classification when compared to standard learning algorithms (multi-layer perceptron, support vector machine, linear discriminant analysis and quadratic discriminant analysis). In order to interpret how the CNN achieved this performance, information theoretical analysis was carried out. Specifically, the permutation entropy (PE) of the feature maps was evaluated and compared in the two classes. The achieved results, although preliminary, encourage the use of these innovative techniques to support neurologists in early diagnoses.
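
Permutation entropy, used above to interpret the feature maps, can be sketched as follows for a one-dimensional sequence (a feature map would first be flattened). The `order` and `delay` defaults are illustrative assumptions, since the abstract does not state the values used.

```python
import math
from collections import Counter


def permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Shannon entropy of the ordinal patterns found in `signal`."""
    n_windows = len(signal) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("signal too short for the chosen order and delay")
    patterns = Counter()
    for start in range(n_windows):
        window = [signal[start + k * delay] for k in range(order)]
        # The ordinal pattern is the index order that sorts the window;
        # ties keep their original order because Python's sort is stable.
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1
    probabilities = [count / n_windows for count in patterns.values()]
    entropy = -sum(p * math.log2(p) for p in probabilities)
    if normalize:
        # Divide by the maximum possible entropy, log2(order!).
        entropy /= math.log2(math.factorial(order))
    return entropy


monotonic_pe = permutation_entropy([1, 2, 3, 4, 5, 6], order=3)
alternating_pe = permutation_entropy([1, 2, 1, 2, 1], order=2)
```

A strictly increasing sequence contains a single ordinal pattern and therefore has zero entropy, while a sequence whose patterns are all equally frequent reaches the normalised maximum of 1.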


Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7468
Author(s):  
Yui-Kai Weng ◽  
Shih-Hsu Huang ◽  
Hsu-Yu Kao

In a CNN (convolutional neural network) accelerator, the sparsity of activation values needs to be exploited to reduce memory traffic and power consumption. Some research efforts have therefore been devoted to skipping ineffectual computations (i.e., multiplications by zero). Unlike previous works, in this paper we point out the similarity of activation values: (1) in the same layer of a CNN model, most feature maps are either highly dense or highly sparse; (2) in the same layer of a CNN model, feature maps in different channels are often similar. Based on these two observations, we propose a block-based compression approach, which uses both the sparsity and the similarity of activation values to further reduce the data volume. We also design an encoder, a decoder, and an indexing module to support the proposed approach. The encoder translates output activations into the proposed block-based compression format, while the decoder and the indexing module align nonzero values for effectual computations. Compared with previous works, benchmark data consistently show that the proposed approach can greatly reduce both memory traffic and power consumption.
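
To make the idea of block-based compression concrete, the sketch below encodes activations block by block, storing nothing for an all-zero block and a bitmask plus the nonzero values otherwise. This is an assumed, simplified format for illustration only: it exploits sparsity but not the cross-channel similarity described in the abstract, and it is not the paper's actual encoding.

```python
def compress_blocks(values, block_size=4):
    """Encode a flat activation list as one entry per block."""
    blocks = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        nonzeros = [v for v in block if v != 0]
        if not nonzeros:
            blocks.append(None)  # all-zero block: no payload stored
        else:
            mask = [1 if v != 0 else 0 for v in block]
            blocks.append((mask, nonzeros))
    return blocks


def decompress_blocks(blocks, length, block_size=4):
    """Rebuild the original activation list from the block encoding."""
    values = []
    for block in blocks:
        if block is None:
            values.extend([0] * block_size)
        else:
            mask, nonzeros = block
            remaining = iter(nonzeros)
            # The mask says where each stored nonzero value belongs.
            values.extend(next(remaining) if bit else 0 for bit in mask)
    return values[:length]  # drop padding from a short final block


activations = [0, 0, 0, 0, 3, 0, 0, 5, 0, 0]
encoded = compress_blocks(activations)
decoded = decompress_blocks(encoded, len(activations))
```

The bitmask plays the role the abstract assigns to the indexing module: it lets a decoder align each stored nonzero value with its position so that only effectual multiplications are issued.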


Author(s):  
Ranganath Singari ◽  
Karun Singla ◽  
Gangesh Chawla

Deep learning has offered new avenues in the field of industrial management. Traditional methods of quality inspection, such as acceptance sampling, rely on a probabilistic measure derived from inspecting a sample of finished products. Evaluating a fixed number of products to derive the quality level for the complete batch is not a robust approach. Visual inspection solutions based on deep learning can be employed in large manufacturing units to improve quality inspection for steel surface defect detection. This optimizes the use of human capital by reducing manual intervention and turnaround time in the overall supply chain of the industry. Consequently, the sample size in acceptance sampling can be increased with minimal effort, along with an increase in the overall accuracy of the inspection. This work is built on a convolutional neural network, which is used to extract feature representations from grayscale images and classify the inputs into six types of surface defects. The neural network architecture is compiled in the Keras framework using the TensorFlow backend with the state-of-the-art NADAM optimizer (Adam with Nesterov momentum). The proposed classification algorithm holds the potential to identify the dominant flaws in the manufacturing system that are responsible for leaking costs.
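
The probabilistic measure behind acceptance sampling can be illustrated with a single sampling plan: inspect `n` items and accept the lot if at most `c` are defective. Assuming a binomial model with defect rate `p`, the acceptance probability is the sum of C(n, k) p^k (1 - p)^(n - k) for k from 0 to c. The plan parameters below are hypothetical and chosen only to show how the measure behaves.

```python
from math import comb


def acceptance_probability(n, c, p):
    """Probability of accepting a lot under a single sampling plan.

    n: sample size, c: maximum defectives allowed, p: true defect rate.
    Assumes defects are independent, so the count is binomial.
    """
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(c + 1))


# Hypothetical plan: sample 50 items, accept with at most 1 defective.
good_lot = acceptance_probability(n=50, c=1, p=0.01)
poor_lot = acceptance_probability(n=50, c=1, p=0.05)
```

Under this model a lot with a clearly worse defect rate can still be accepted with noticeable probability, which is the weakness of fixed-size sampling that the abstract points to.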

