CF-CNN: Coarse-to-Fine Convolutional Neural Network

Applied Sciences ◽

10.3390/app11083722 ◽

2021 ◽

Vol 11 (8) ◽

pp. 3722

Author(s):

Jinho Park ◽

Heegwang Kim ◽

Joonki Paik

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Class Group ◽

Feature Maps ◽

Baseline Model ◽

Classification Boundary ◽

Grouping Method ◽

Classification Tasks ◽

Upper Level ◽

Coarse To Fine

In this paper, we present a coarse-to-fine convolutional neural network (CF-CNN) for learning multilabel classes. The basis of the proposed CF-CNN is a disjoint grouping method that first creates class groups with hierarchical association and then assigns a new label to each class in a group, so that each class acquires multiple labels. CF-CNN consists of one main network and two subnetworks. Each subnetwork performs coarse prediction using the group labels created by the disjoint grouping method. The main network includes a refine convolution layer and performs fine prediction by fusing the feature maps acquired from the subnetworks. The generated class set in the upper level has the same classification boundary as that in the lower level. Since the classes belonging to an upper-level label are classified with higher priority, parameter optimization becomes easier. In experiments on various classification tasks, the proposed method improves classification accuracy by up to 3% with a much smaller number of parameters and without modification of the baseline model.
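The idea of giving each class several labels through disjoint groups can be pictured with a small sketch. This is an illustration under assumptions only: the function name `assign_group_labels`, the example classes, and the hand-written groups are hypothetical, and the abstract does not say how the paper actually forms its groups.

```python
def assign_group_labels(groups):
    """Give every class a (coarse group label, fine class label) pair.

    `groups` is a list of disjoint lists of class names (hypothetical input).
    """
    labels = {}
    fine_label = 0
    for group_label, members in enumerate(groups):
        for cls in members:
            if cls in labels:
                # Disjoint grouping: a class may belong to one group only.
                raise ValueError(f"class {cls!r} appears in more than one group")
            labels[cls] = (group_label, fine_label)
            fine_label += 1
    return labels


# Hypothetical two-group hierarchy: animals and vehicles.
groups = [["cat", "dog"], ["car", "truck", "bus"]]
labels = assign_group_labels(groups)
```

In a coarse-to-fine setup of this kind, a subnetwork would be trained against the first element of each pair and the main network against the second.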


AFibNet: an implementation of atrial fibrillation detection with convolutional neural network

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01571-1 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Bambang Tutuko ◽

Siti Nurmaini ◽

Alexander Edo Tondas ◽

Muhammad Naufal Rachmatullah ◽

Annisa Darmawahyuni ◽

...

Keyword(s):

Neural Network ◽

Atrial Fibrillation ◽

Convolutional Neural Network ◽

Learning System ◽

Single Frequency ◽

Feature Maps ◽

Normal Sinus ◽

Unseen Data ◽

Specific Device ◽

Model Formation

Background: The generalization capacity of deep learning (DL) models for atrial fibrillation (AF) detection remains lacking. In previous research, DL models were built using only a single sampling frequency from a specific device. Moreover, each electrocardiogram (ECG) acquisition dataset has a different record length and sampling frequency, chosen to ensure sufficient precision of the R–R intervals for determining heart rate variability (HRV). An accurate HRV is the gold standard for predicting the AF condition; therefore, a current challenge is to determine whether a DL approach can be used to analyze raw ECG data from a broad range of devices. This paper demonstrates powerful results for an end-to-end implementation of AF detection based on a convolutional neural network (AFibNet). The method uses a single learning system regardless of the variety of signal lengths and sampling frequencies. For implementation, AFibNet is processed with a cloud-based DL approach. This study utilized a one-dimensional convolutional neural network (1D-CNN) model for 11,842 subjects. It was trained and validated with 8232 records from three datasets and tested with 3610 records from eight datasets. The predicted results, when compared with the diagnoses made by human practitioners, showed 99.80% accuracy, sensitivity, and specificity.
Results: When tested on unseen data, AF detection reaches 98.94% accuracy, 98.97% sensitivity, and 98.97% specificity at a sample period of 0.02 seconds using the DL cloud system. To improve confidence in the AFibNet model, it was also validated with 18 arrhythmia conditions defined as the Non-AF class. The data thus increased from 11,842 to 26,349 instances for three classes, i.e., normal sinus (N), AF, and Non-AF. The result was 96.36% accuracy, 93.65% sensitivity, and 96.92% specificity.
Conclusion: These findings demonstrate that the proposed approach can use unknown data to derive feature maps and reliably detect AF periods. We have found that our cloud-DL system is suitable for practical deployment.
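The link between sampling frequency and R–R interval precision can be made concrete with a short sketch. It assumes R-peak positions are already available as sample indices; the function names, the 250 Hz rate, and the peak positions are hypothetical, and the statistics shown (mean heart rate and SDNN) are generic HRV summaries rather than anything AFibNet is stated to compute.

```python
import statistics


def rr_intervals(r_peak_samples, fs_hz):
    """Convert R-peak sample indices into R-R intervals in seconds."""
    times = [s / fs_hz for s in r_peak_samples]
    return [later - earlier for earlier, later in zip(times, times[1:])]


def hrv_summary(r_peak_samples, fs_hz):
    """Mean heart rate (bpm) and SDNN (sample std of R-R intervals, seconds)."""
    rr = rr_intervals(r_peak_samples, fs_hz)
    mean_rr = statistics.mean(rr)
    return {"mean_hr_bpm": 60.0 / mean_rr, "sdnn_s": statistics.stdev(rr)}


# Hypothetical R-peak positions at 250 Hz; one sample is 1/250 = 0.004 s,
# which bounds how finely each R-R interval can be resolved.
summary = hrv_summary([0, 250, 500, 760, 1000], fs_hz=250)
```

A lower sampling frequency makes each sample step longer, so the same recording yields coarser R–R intervals, which is why a single model across devices is nontrivial.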


Spectral Convolution Feature-Based SPD Matrix Representation for Signal Detection Using a Deep Neural Network

Entropy ◽

10.3390/e22090949 ◽

2020 ◽

Vol 22 (9) ◽

pp. 949

Author(s):

Jiangyi Wang ◽

Min Liu ◽

Xinwu Zeng ◽

Xiaoqiang Hua

Keyword(s):

Neural Network ◽

Signal Detection ◽

Convolutional Neural Network ◽

Deep Neural Network ◽

Detection Method ◽

Learning Algorithm ◽

Simulated Data ◽

Data Sets ◽

Feature Maps ◽

Simulated Data Sets

Convolutional neural networks perform strongly in many visual tasks because of their hierarchical structures and powerful feature extraction capabilities. The symmetric positive definite (SPD) matrix has attracted attention in visual classification because of its excellent ability to learn proper statistical representations and distinguish samples carrying different information. In this paper, a deep neural network signal detection method based on spectral convolution features is proposed. In this method, local features extracted by a convolutional neural network are used to construct the SPD matrix, and a deep learning algorithm for SPD matrices is used to detect target signals. Feature maps extracted by two kinds of convolutional neural network models are applied in this study. With this method, signal detection becomes a binary classification problem over the signals in the samples. To demonstrate the validity and superiority of this method, simulated and semi-physical simulated data sets are used. The results show that, under low SCR (signal-to-clutter ratio), this method obtains a gain of 0.5–2 dB over the spectral signal detection method based on the deep neural network on both simulated and semi-physical simulated data sets.
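One common way to turn a set of local feature vectors into an SPD matrix is a regularized covariance descriptor, sketched below. The abstract does not state the exact construction used in the paper, so this is an assumption; the function name `covariance_descriptor`, the `eps` value, and the toy features are hypothetical.

```python
def covariance_descriptor(features, eps=1e-3):
    """Sample covariance of feature vectors plus eps on the diagonal.

    The sample covariance is positive semidefinite; adding eps * I makes it
    strictly positive definite, hence SPD.
    """
    n = len(features)
    d = len(features[0])
    means = [sum(f[k] for f in features) / n for k in range(d)]
    centered = [[f[k] - means[k] for k in range(d)] for f in features]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    for i in range(d):
        cov[i][i] += eps
    return cov


# Three hypothetical 2-dimensional local features.
spd = covariance_descriptor([[1.0, 2.0], [2.0, 4.1], [3.0, 5.9]])
```

For a 2x2 matrix, symmetry together with a positive top-left entry and a positive determinant confirms positive definiteness.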


Research on Real-Time Multiple Single Garbage Classification Based on Convolutional Neural Network

Mathematical Problems in Engineering ◽

10.1155/2020/5795976 ◽

2020 ◽

Vol 2020 ◽

pp. 1-6

Author(s):

Jian-ye Yuan ◽

Xin-yuan Nan ◽

Cheng-rong Li ◽

Le-le Sun

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Real Time ◽

Environmental Science ◽

Fine Tuning ◽

Classification Model ◽

Classification Models ◽

Data Set ◽

Classification Tasks ◽

Accuracy Rates

Considering the urgency of garbage classification, this paper designs a 23-layer convolutional neural network (CNN) model, with an emphasis on real-time garbage classification, to address the low accuracy of garbage classification and recycling and the difficulty of manual recycling. First, depthwise separable convolution is used to reduce the number of parameters (Params) of the model. Then, an attention mechanism is used to improve the accuracy of the garbage classification model. Finally, model fine-tuning is used to further improve its performance. In addition, we compared the model with classic image classification models, including AlexNet, VGG16, and ResNet18, and lightweight classification models, including MobileNetV2 and ShuffleNetV2, and found that the model (GAF_dense) has a higher accuracy rate, fewer Params, and fewer FLOPs. To further check the performance of the model, we tested it on the CIFAR-10 data set and found that the accuracy rates of GAF_dense are 0.018 and 0.03 higher than those of ResNet18 and ShuffleNetV2, respectively. On the ImageNet data set, the accuracy rates of GAF_dense are 0.225 and 0.146 higher than those of ResNet18 and ShuffleNetV2, respectively. Therefore, the garbage classification model proposed in this paper is suitable for garbage classification and other classification tasks that help protect the ecological environment, and can be applied in areas such as environmental science, children’s education, and environmental protection.
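Why depthwise separable convolution reduces the parameter count can be checked with simple arithmetic. The sketch below compares a standard convolution with its depthwise separable counterpart; the layer sizes are hypothetical and are not taken from the paper's 23-layer model, and biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)          # 73728
separable = depthwise_separable_params(3, 64, 128)   # 8768
reduction = standard / separable
```

For this layer the separable form needs roughly 8.4 times fewer weights, which is the kind of saving the abstract refers to.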


Mixup of Feature Maps in a Hidden Layer for Training of Convolutional Neural Network

Neural Information Processing - Lecture Notes in Computer Science ◽

10.1007/978-3-030-04179-3_56 ◽

2018 ◽

pp. 635-644

Author(s):

Hideki Oki ◽

Takio Kurita

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Feature Maps ◽

Hidden Layer


PedNet: A Spatio-Temporal Deep Convolutional Neural Network for Pedestrian Segmentation

Journal of Imaging ◽

10.3390/jimaging4090107 ◽

2018 ◽

Vol 4 (9) ◽

pp. 107 ◽

Cited By ~ 5

Author(s):

Mohib Ullah ◽

Ahmed Mohammed ◽

Faouzi Alaya Cheikh

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Performance Metrics ◽

State Of The Art ◽

Temporal Information ◽

Feature Maps ◽

Current Frame ◽

Low Level ◽

Art Methods ◽

Spatio Temporal

Articulation modeling, feature extraction, and classification are the important components of pedestrian segmentation. Usually, these components are modeled independently of each other and then combined sequentially. However, this approach is prone to poor segmentation if any individual component is weakly designed. To cope with this problem, we propose a spatio-temporal convolutional neural network named PedNet, which exploits temporal information for spatial segmentation. The backbone of PedNet consists of an encoder–decoder network for downsampling and upsampling the feature maps, respectively. The input to the network is a set of three frames and the output is a binary mask of the segmented regions in the middle frame. Unlike classical deep models, where the convolution layers are followed by a fully connected layer for classification, PedNet is a Fully Convolutional Network (FCN). It is trained end-to-end and the segmentation is achieved without the need for any pre- or post-processing. The main characteristic of PedNet is its unique design: it performs segmentation on a frame-by-frame basis but uses the temporal information from the previous and the future frame to segment the pedestrian in the current frame. Moreover, to combine the low-level features with the high-level semantic information learned by the deeper layers, we use long-skip connections from the encoder to the decoder network and concatenate the output of low-level layers with that of the higher-level layers. This approach helps to obtain a segmentation map with sharp boundaries. To show the potential benefits of temporal information, we also visualized different layers of the network. The visualization showed that the network learned different information from the consecutive frames and then combined the information optimally to segment the middle frame.
We evaluated our approach on eight challenging datasets where humans are involved in different activities with severe articulation (football, road crossing, surveillance). On CamVid, the dataset most commonly used to measure segmentation performance, PedNet is evaluated against seven state-of-the-art methods. The performance is reported in terms of precision/recall, F1, F2, and mIoU. The qualitative and quantitative results show that PedNet achieves promising results against state-of-the-art methods, with substantial improvement in all the performance metrics.
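The reported metrics follow from pixel-level counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions; the function name and the counts are hypothetical, and it computes the IoU of a single class, whereas mIoU averages that value over classes.

```python
def segmentation_metrics(tp, fp, fn):
    """Precision, recall, F1, F2, and IoU for one class from pixel counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)

    def f_beta(beta):
        # F-beta weights recall beta times as much as precision.
        b2 = beta * beta
        return (1 + b2) * precision * recall / (b2 * precision + recall)

    return {
        "precision": precision,
        "recall": recall,
        "f1": f_beta(1),
        "f2": f_beta(2),
        "iou": tp / (tp + fp + fn),
    }


# Hypothetical pixel counts for the pedestrian class in one frame.
metrics = segmentation_metrics(tp=80, fp=20, fn=10)
```

Because F2 emphasizes recall, it exceeds F1 whenever recall is higher than precision, as it is for these counts.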


Permutation Entropy-Based Interpretability of Convolutional Neural Network Models for Interictal EEG Discrimination of Subjects with Epileptic Seizures vs. Psychogenic Non-Epileptic Seizures

Entropy ◽

10.3390/e24010102 ◽

2022 ◽

Vol 24 (1) ◽

pp. 102

Author(s):

Michele Lo Giudice ◽

Giuseppe Varone ◽

Cosimo Ieracitano ◽

Nadia Mammone ◽

Giovanbattista Gaspare Tripodi ◽

...

Keyword(s):

Neural Network ◽

Discriminant Analysis ◽

Convolutional Neural Network ◽

Epileptic Seizures ◽

Permutation Entropy ◽

Diagnostic Tools ◽

Support Vector ◽

Feature Maps ◽

Time Frequency ◽

Interictal Eeg

The differential diagnosis of epileptic seizures (ES) and psychogenic non-epileptic seizures (PNES) may be difficult, due to the lack of distinctive clinical features. The interictal electroencephalographic (EEG) signal may also be normal in patients with ES. Innovative diagnostic tools that exploit non-linear EEG analysis and deep learning (DL) could provide important support to physicians for clinical diagnosis. In this work, 18 patients with new-onset ES (12 males, 6 females) and 18 patients with video-recorded PNES (2 males, 16 females) with normal interictal EEG at visual inspection were enrolled. None of them was taking psychotropic drugs. A convolutional neural network (CNN) scheme using DL classification was designed to classify the two categories of subjects (ES vs. PNES). The proposed architecture performs an EEG time-frequency transformation and a classification step with a CNN. The CNN was able to classify the EEG recordings of subjects with ES vs. subjects with PNES with 94.4% accuracy. CNN provided high performance in the assigned binary classification when compared to standard learning algorithms (multi-layer perceptron, support vector machine, linear discriminant analysis and quadratic discriminant analysis). In order to interpret how the CNN achieved this performance, information theoretical analysis was carried out. Specifically, the permutation entropy (PE) of the feature maps was evaluated and compared in the two classes. The achieved results, although preliminary, encourage the use of these innovative techniques to support neurologists in early diagnoses.
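Permutation entropy measures how evenly the ordinal patterns of a sequence are distributed. The sketch below follows the standard definition for a one-dimensional sequence; the function name, the default order and delay, and the example sequence are assumptions, and the abstract does not say how the feature maps were flattened or which parameters were used.

```python
import math
from collections import Counter


def permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Shannon entropy of the ordinal patterns found in `signal`."""
    n_windows = len(signal) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("signal is too short for this order and delay")
    patterns = Counter()
    for start in range(n_windows):
        window = [signal[start + j * delay] for j in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1
    entropy = -sum(
        (count / n_windows) * math.log(count / n_windows)
        for count in patterns.values()
    )
    if normalize:
        entropy /= math.log(math.factorial(order))  # scale into [0, 1]
    return entropy


# Short illustrative sequence: 5 windows, 3 distinct ordinal patterns.
pe = permutation_entropy([4, 7, 9, 10, 6, 11, 3], order=3)
```

A strictly increasing sequence has a single pattern and entropy 0, while a sequence that visits all patterns equally often reaches the normalized maximum of 1.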


Block-Based Compression and Corresponding Hardware Circuits for Sparse Activations

Sensors ◽

10.3390/s21227468 ◽

2021 ◽

Vol 21 (22) ◽

pp. 7468

Author(s):

Yui-Kai Weng ◽

Shih-Hsu Huang ◽

Hsu-Yu Kao

Keyword(s):

Neural Network ◽

Power Consumption ◽

Convolutional Neural Network ◽

Feature Maps ◽

Benchmark Data ◽

Data Volume ◽

Block Based

In a CNN (convolutional neural network) accelerator, exploiting the sparsity of activation values is necessary to reduce memory traffic and power consumption. Some research efforts have therefore been devoted to skipping ineffectual computations (i.e., multiplications by zero). Unlike previous works, in this paper we point out the similarity of activation values: (1) in the same layer of a CNN model, most feature maps are either highly dense or highly sparse; (2) in the same layer of a CNN model, feature maps in different channels are often similar. Based on these two observations, we propose a block-based compression approach, which utilizes both the sparsity and the similarity of activation values to further reduce the data volume. Moreover, we design an encoder, a decoder, and an indexing module to support the proposed approach. The encoder translates output activations into the proposed block-based compression format, while the decoder and the indexing module align nonzero values for effectual computations. Compared with previous works, benchmark data consistently show that the proposed approach greatly reduces both memory traffic and power consumption.
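As background for the sparsity half of this idea, the sketch below shows plain zero-value compression of one activation block using a bitmap plus the nonzero values. It is not the paper's block format, which the abstract does not specify and which also exploits similarity between channels; the function names and the example block are hypothetical.

```python
def encode_block(block):
    """Return (bitmap, nonzero values); bit i is set when block[i] != 0."""
    mask = 0
    values = []
    for i, value in enumerate(block):
        if value != 0:
            mask |= 1 << i
            values.append(value)
    return mask, values


def decode_block(mask, values, size):
    """Rebuild the dense block by placing values where the bitmap has a 1."""
    remaining = iter(values)
    return [next(remaining) if (mask >> i) & 1 else 0 for i in range(size)]


# Hypothetical block of 8 post-ReLU activations with 3 nonzero entries.
block = [0, 3, 0, 0, 7, 0, 0, 1]
mask, values = encode_block(block)
restored = decode_block(mask, values, len(block))
```

Here 8 stored activations shrink to 3 values plus an 8-bit bitmap, and the bitmap plays the indexing role of aligning nonzero values with their positions.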


Label Rectification Learning through Kernel Extreme Learning Machine

Wireless Communications and Mobile Computing ◽

10.1155/2021/6669081 ◽

2021 ◽

Vol 2021 ◽

pp. 1-6

Author(s):

Qiang Cai ◽

Fenghai Li ◽

Yifan Chen ◽

Haisheng Li ◽

Jian Cao ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Image Classification ◽

Extreme Learning Machine ◽

Classification Performance ◽

Considerable Progress ◽

Strong Representation ◽

Kernel Extreme Learning Machine ◽

Classification Tasks ◽

Learning Machine

Owing to the strong representation ability of the convolutional neural network (CNN), image classification tasks have achieved considerable progress. However, the majority of works focus on designing complicated and redundant architectures for extracting informative features to improve classification performance. In this study, we concentrate on rectifying the incomplete outputs of the CNN. Concretely, we propose an innovative image classification method based on Label Rectification Learning (LRL) through a kernel extreme learning machine (KELM). It mainly consists of two steps: (1) preclassification, extracting incomplete labels through a pretrained CNN, and (2) label rectification, rectifying the generated incomplete labels with the KELM to obtain the rectified labels. Experiments conducted on publicly available datasets demonstrate the effectiveness of our method. Notably, our method is extensible and can be easily integrated with off-the-shelf networks to improve performance.


Optimizing Convolutional Neural Network Accelerator on Low-Cost FPGA

Journal of Circuits, Systems and Computers ◽

10.1142/s0218126621501930 ◽

2021 ◽

pp. 2150193

Author(s):

Truong Quang Vinh ◽

Dinh Viet Hai

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Low Cost ◽

Optimal Number ◽

Data Reuse ◽

Logic Element ◽

Input Buffer ◽

Classification Tasks ◽

Processing Engine ◽

Better Than

The convolutional neural network (CNN) is one of the most promising algorithms and outperforms traditional methods in terms of accuracy in classification tasks. However, several CNNs, such as VGG, demand heavy computation in their convolutional layers. Many accelerators implemented on powerful FPGAs have been introduced to address this problem. In this paper, we present a VGG-based accelerator optimized for a low-cost FPGA. To optimize the FPGA's logic element and memory resources, we propose a dedicated input buffer that maximizes data reuse. In addition, we design a low-resource processing engine with the optimal number of Multiply Accumulate (MAC) units. In the experiments, we use the VGG16 model for inference to evaluate the performance of our accelerator and achieve a throughput of 38.8 GOPS at a clock speed of 150 MHz on an Intel Cyclone V SX SoC. The experimental results show that our design is better than previous works in terms of resource efficiency.
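The reported throughput and clock speed imply an effective amount of work per clock cycle, which the short calculation below derives. It assumes the common convention that one MAC counts as two operations; the abstract does not give the number of MAC units or the achieved utilization, so the result is an effective rate only, and the function name is hypothetical.

```python
def ops_per_cycle(throughput_gops, clock_mhz):
    """Effective operations completed per clock cycle."""
    return throughput_gops * 1e9 / (clock_mhz * 1e6)


# Figures from the abstract: 38.8 GOPS at 150 MHz.
effective_ops = ops_per_cycle(38.8, 150.0)

# Assumption: 1 MAC = 2 operations (one multiply and one add).
effective_macs = effective_ops / 2
```

This gives roughly 259 operations, or about 129 MACs, per cycle on average.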


Classification and grading of diabetic retinopathy images using mixture of ensemble classifiers

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-211364 ◽

2021 ◽

pp. 1-13

Author(s):

R. Bhuvaneswari ◽

S. Ganesh Vaidyanathan

Keyword(s):

Neural Network ◽

Diabetic Retinopathy ◽

Convolutional Neural Network ◽

Blood Vessels ◽

Support Vector ◽

Network Architectures ◽

Ensemble Classifiers ◽

Feature Maps ◽

Class Labels ◽

Hierarchical Features

Diabetic Retinopathy (DR) is one of the most common diabetic diseases that affect the retina’s blood vessels. Excess glucose in the blood blocks the blood vessels in the retina, weakening and damaging it. Automatic classification of diabetic retinopathy is a challenging task in medical research. This work proposes a Mixture of Ensemble Classifiers (MEC) to classify and grade diabetic retinopathy images using hierarchical features. We use an ensemble of classifiers, such as support vector machine, random forest, and AdaBoost classifiers, trained on the hierarchical feature maps obtained at every pooling layer of a convolutional neural network (CNN). The feature maps are generated by applying the filters to the output of the previous layer. Lastly, we predict the class label or the grade for a given test diabetic retinopathy image by considering the class labels of all the ensembled classifiers. We tested our approach on the E-ophtha dataset for the classification task and the Messidor dataset for the grading task, achieving accuracies of 95.8% and 96.2%, respectively. A comparison between prominent convolutional neural network architectures and the proposed approach is provided.
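One simple way to combine the class labels of several classifiers is majority voting, sketched below. The abstract only says that the labels of all ensembled classifiers are considered, so majority voting is an assumption here; the function name and the example labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    Ties go to the label that reached the top count first in input order.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical grades predicted by three classifiers for one test image.
final_grade = majority_vote(["grade 2", "grade 1", "grade 2"])
```

With classifiers trained on feature maps from different pooling layers, each vote would reflect a different level of the feature hierarchy.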
