A Full Stage Data Augmentation Method in Deep Convolutional Neural Network for Natural Image Classification

2020 ◽  
Vol 2020 ◽  
pp. 1-11 ◽  
Author(s):  
Qinghe Zheng ◽  
Mingqiang Yang ◽  
Xinyu Tian ◽  
Nan Jiang ◽  
Deqiang Wang

Deep learning has achieved remarkable results in many computer vision tasks, for which the support of big data is essential. In this paper, we propose a full-stage data augmentation framework to improve the accuracy of deep convolutional neural networks; it also acts as an implicit model ensemble without introducing additional model training costs. Applying data augmentation in both the training and testing stages aids network optimization and enhances generalization ability. Augmentation in the two stages needs to be consistent to ensure the accurate transfer of domain-specific information. Furthermore, the framework is applicable to any network architecture and data augmentation strategy and can therefore be applied to a variety of deep-learning-based tasks. Finally, image classification experiments on the coarse-grained dataset CIFAR-10 (93.41%) and the fine-grained dataset CIFAR-100 (70.22%) demonstrate the effectiveness of the framework in comparison with state-of-the-art results.
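The test-stage half of the idea described above can be illustrated with a minimal sketch of test-time augmentation, in which class probabilities are averaged over the original input and its augmented views. This is not the authors' implementation: the function names, the horizontal-flip augmentation, and the `toy_model` stand-in are all assumptions made for illustration, and the same list of augmentations is assumed to have been used during training.

```python
import numpy as np

def horizontal_flip(batch):
    # batch has shape (N, H, W, C); reverse the width axis
    return batch[:, :, ::-1, :]

def softmax(logits):
    # numerically stable softmax over the class axis
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def predict_with_tta(predict_logits, batch, augmentations):
    """Average class probabilities over the batch and its augmented views."""
    views = [batch] + [aug(batch) for aug in augmentations]
    probs = [softmax(predict_logits(v)) for v in views]
    return np.mean(probs, axis=0)

def toy_model(batch):
    # hypothetical stand-in for a trained network: two logits taken from
    # the mean intensity of the left and right halves of each image
    w = batch.shape[2]
    left = batch[:, :, : w // 2, :].mean(axis=(1, 2, 3))
    right = batch[:, :, w // 2 :, :].mean(axis=(1, 2, 3))
    return np.stack([left, right], axis=1)
```

Averaging over views is what gives the "implicit ensemble" effect at no extra training cost: one set of weights produces several predictions per image, at the price of one additional forward pass per augmentation.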

Author(s):  
Oleksii Denysenko

Image classification is one of the most fundamental applications in computer vision and continues to attract enormous interest. People can recognize a large number of objects in images with little effort, even though many of the objects' characteristics can change, and even when the objects are partly difficult to detect. At the same time, an algorithmic description of the recognition problem suitable for computer implementation remains an open problem. Existing methods are effective only for certain cases (for example, geometric objects, human faces, road signs, and printed or handwritten symbols) and only under certain conditions. A model designed to identify and classify objects must be able to locate them and to distinguish between various object features, such as edges, corners, and color differences. Deep convolutional neural networks have demonstrated the best performance on images, sometimes exceeding the capabilities of human vision. However, even with this significant improvement, issues with overfitting and vanishing gradients remain. Well-known methods are used to address them: data augmentation, batch normalization, and dropout. Modern classification models, however, do not perform color space conversion of the original RGB images. The use of different color spaces in image classification is a topical problem in deep learning, which determines the relevance of this study. Solving this problem would improve the performance of the models used.


2020 ◽  
Vol 10 (3) ◽  
pp. 965 ◽  
Author(s):  
Ryosuke Sato ◽  
Yutaro Iwamoto ◽  
Kook Cho ◽  
Do-Young Kang ◽  
Yen-Wei Chen

Alzheimer’s disease (AD) is an irreversible progressive cerebral disease with most of its symptoms appearing after 60 years of age. Alzheimer’s disease has been largely attributed to accumulation of amyloid beta (Aβ), but a complete cure has remained elusive. 18F-Florbetaben amyloid positron emission tomography (PET) has been shown to be a more powerful tool for understanding AD-related brain changes than magnetic resonance imaging and computed tomography. In this paper, we propose an accurate classification method for scoring brain amyloid plaque load (BAPL) based on deep convolutional neural networks. A joint discriminative loss function was formulated by adding a discriminative intra-loss function to the conventional (cross-entropy) loss function. The performance of the proposed joint loss function was compared with that of the conventional loss function in three state-of-the-art deep neural network architectures. The intra-loss function significantly improved the BAPL classification performance. In addition, we showed that the mix-up data augmentation method, originally proposed for natural image classification, was also useful for medical image classification.
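For readers unfamiliar with the mix-up augmentation mentioned above, the following is a minimal sketch of the standard formulation: each training example is replaced by a convex combination of itself and a randomly chosen partner, and the one-hot labels are blended with the same weight. The function name, the `alpha=0.2` default, and the per-batch (rather than per-sample) mixing weight are assumptions; the abstract does not state the settings the authors used.

```python
import numpy as np

def mixup_batch(images, one_hot_labels, alpha=0.2, rng=None):
    """Blend each example with a randomly chosen partner from the same batch."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)          # mixing weight in [0, 1]
    perm = rng.permutation(len(images))   # partner index for each example
    mixed_images = lam * images + (1.0 - lam) * images[perm]
    mixed_labels = lam * one_hot_labels + (1.0 - lam) * one_hot_labels[perm]
    return mixed_images, mixed_labels, lam
```

Because the labels are mixed with the same weight as the inputs, each mixed label is still a valid probability vector and can be used directly with a cross-entropy loss.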


Information ◽  
2021 ◽  
Vol 12 (6) ◽  
pp. 249
Author(s):  
Xin Jin ◽  
Yuanwen Zou ◽  
Zhongbing Huang

The cell cycle is an important process in cellular life. In recent years, some image processing methods have been developed to determine the cell cycle stages of individual cells. However, in most of these methods, cells have to be segmented, and their features need to be extracted. During feature extraction, some important information may be lost, resulting in lower classification accuracy. Thus, we used a deep learning method to retain all cell features. To address the insufficient number and imbalanced distribution of original images, we used the Wasserstein generative adversarial network with gradient penalty (WGAN-GP) for data augmentation. A residual network (ResNet), one of the most widely used deep learning classification networks, was used for image classification. Our method improved the classification accuracy of cell cycle images, reaching 83.88%. Compared with the 79.40% accuracy of previous experiments, this is an increase of 4.48 percentage points. Another dataset was used to verify the effect of our model and, compared with the accuracy from previous results, our accuracy increased by 12.52%. The results showed that our new cell cycle image classification system based on WGAN-GP and ResNet is useful for the classification of imbalanced images. Moreover, our method could potentially solve the low classification accuracy in biomedical images caused by insufficient numbers of original images and the imbalanced distribution of original images.
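As background for the WGAN-GP augmentation step, the critic objective in its standard published form is reproduced below. The abstract does not give the objective or the penalty weight used in this study, so the symbols here follow the usual convention and are assumptions: $D$ is the critic, $P_r$ and $P_g$ are the real and generated image distributions, and $\lambda$ is the gradient-penalty coefficient.

```latex
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
    - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

The last term penalizes the critic whenever its gradient norm departs from 1 at points interpolated between real and generated samples, which is how the method enforces the Lipschitz constraint without weight clipping.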


2021 ◽  
Vol 11 (15) ◽  
pp. 7148
Author(s):  
Bedada Endale ◽  
Abera Tullu ◽  
Hayoung Shi ◽  
Beom-Soo Kang

Unmanned aerial vehicles (UAVs) are widely used for various missions in both the civilian and military sectors. Many of these missions require UAVs to perceive the environments in which they navigate. This perception can be realized by training a computing machine to classify objects in the environment. One well-known training approach is supervised deep learning, which enables a machine to classify objects. However, supervised deep learning is costly in terms of time and computational resources. Collecting large amounts of input data, pre-training processes such as labeling the training data, and the need for a high-performance computer for training are some of the challenges that supervised deep learning poses. To address these drawbacks, this study proposes mission-specific input data augmentation techniques and the design of a lightweight deep neural network architecture capable of real-time object classification. Semi-direct visual odometry (SVO) data of augmented images are used to train the network for object classification. Ten classes, each with 10,000 different images, were used as input data; 80% were used for training the network and the remaining 20% for network validation. A sequential gradient descent algorithm was implemented to optimize the designed deep neural network. This algorithm has the advantage of handling redundancy in the data more efficiently than other algorithms.
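The data split described above (ten classes of 10,000 images, 80% for training and 20% for validation) can be sketched as follows. The abstract does not say whether the split was made per class, so the per-class (stratified) split, the function name, and the fixed seed are assumptions for illustration.

```python
import numpy as np

def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return train/validation index arrays with the same fraction taken from every class."""
    rng = np.random.default_rng(seed)
    train_idx, val_idx = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))  # shuffle within the class
        cut = int(round(train_fraction * len(idx)))
        train_idx.extend(idx[:cut])
        val_idx.extend(idx[cut:])
    return np.array(train_idx), np.array(val_idx)

# Ten classes with 10,000 images each, as stated in the abstract
labels = np.repeat(np.arange(10), 10_000)
train_idx, val_idx = stratified_split(labels)
```

With these numbers the split yields 80,000 training and 20,000 validation images, with 8,000 and 2,000 per class respectively.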


2020 ◽  
Author(s):  
B Wang ◽  
Y Sun ◽  
Bing Xue ◽  
Mengjie Zhang

© 2019, Springer Nature Switzerland AG. Image classification is a difficult machine learning task to which Convolutional Neural Networks (CNNs) have been applied for over 20 years. In recent years, instead of the traditional approach of connecting the current layer only with its next layer, shortcut connections have been proposed to connect the current layer with forward layers other than its next layer, which has been shown to facilitate the training of deep CNNs. However, because there are various ways to build shortcut connections, it is hard to manually design the best ones for a particular problem, especially given that designing the network architecture is already very challenging. In this paper, a hybrid evolutionary computation (EC) method is proposed to automatically evolve both the architecture of deep CNNs and the shortcut connections. This work makes three major contributions. First, a new encoding strategy is proposed to encode a CNN, in which the architecture and the shortcut connections are encoded separately. Second, a hybrid two-level EC method, which combines particle swarm optimisation and genetic algorithms, is developed to search for the optimal CNNs. Lastly, an adjustable learning rate is introduced for the fitness evaluations, which provides a better learning rate for the training process given a fixed number of epochs. The proposed algorithm is evaluated on three widely used image classification benchmark datasets and compared with 12 peer non-EC-based competitors and one EC-based competitor. The experimental results demonstrate that the proposed method outperforms all of the peer competitors in terms of classification accuracy.


Data ◽  
2018 ◽  
Vol 3 (3) ◽  
pp. 28 ◽  
Author(s):  
Kasthurirangan Gopalakrishnan

Deep learning, more specifically deep convolutional neural networks, is fast becoming a popular choice for computer vision-based automated pavement distress detection. While pavement image analysis has been extensively researched over the past three decades or so, recent ground-breaking achievements of deep learning algorithms in the areas of machine translation, speech recognition, and computer vision have sparked interest in the application of deep learning to automated detection of distresses in pavement images. This paper provides a narrative review of recently published studies in this field, highlighting the current achievements and challenges. A comparison of the deep learning software frameworks, network architectures, and hyper-parameters employed by each study, and of their crack detection performance, is provided, which is expected to provide a good foundation for driving further research on this important topic in the context of smart pavement or asset management systems. The review concludes with potential avenues for future research, especially in the application of deep learning not only to detect but also to characterize the type, extent, and severity of distresses from 2D and 3D pavement images.


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1594
Author(s):  
Haifeng Li ◽  
Xin Dou ◽  
Chao Tao ◽  
Zhixiang Wu ◽  
Jie Chen ◽  
...  

Image classification is a fundamental task in remote sensing image processing. In recent years, deep convolutional neural networks (DCNNs) have experienced significant breakthroughs in natural image recognition. The remote sensing field, however, still lacks a large-scale benchmark similar to ImageNet. In this paper, we propose a remote sensing image classification benchmark (RSI-CB) based on massive, scalable, and diverse crowdsourced data. Using crowdsourced data, such as Open Street Map (OSM) data, ground objects in remote sensing images can be annotated effectively using points of interest, vector data from OSM, or other crowdsourced data. These annotated images can then be used in remote sensing image classification tasks. Based on this method, we construct a worldwide large-scale benchmark for remote sensing image classification. This benchmark has a wide geographical distribution and a large total number of images. It contains six categories with 35 sub-classes and more than 24,000 images of size 256 × 256 pixels. The classification system of ground objects is defined according to the national standard of land-use classification in China and is inspired by the hierarchy mechanism of ImageNet. Finally, we conduct numerous experiments to compare RSI-CB with the SAT-4, SAT-6, and UC-Merced data sets. The experiments show that RSI-CB is more suitable as a benchmark for remote sensing image classification tasks than other benchmarks in the big data era and has many potential applications.


2019 ◽  
Vol 1 (11) ◽  
Author(s):  
Chollette C. Olisah ◽  
Lyndon Smith

Deep convolutional neural networks have achieved huge successes in application domains such as object and face recognition. The performance gain is attributed to different facets of the network architecture, such as the depth of the convolutional layers, the activation function, pooling, batch normalization, forward and back propagation, and many more. However, very little emphasis has been placed on the network’s preprocessing module. Therefore, in this paper, the network’s preprocessing module is varied across different preprocessing approaches while the other facets of the deep network architecture are kept constant, to investigate the contribution preprocessing makes to the network. The commonly used preprocessors, data augmentation and normalization, are termed conventional preprocessors. The others are termed unconventional preprocessors: color space converters; grey-level resolution preprocessors; full-based and plane-based image quantization, Gaussian blur, illumination normalization and insensitive feature preprocessors. To keep the network parameters fixed, CNNs with transfer learning are employed. The aim is to transfer knowledge from the high-level feature vectors of the Inception-V3 network to offline preprocessed LFW target data; the features are then trained using the SoftMax classifier for face identification. The experiments show that the discriminative capability of deep networks can be improved by preprocessing RGB data with some of the unconventional preprocessors before feeding it to the CNNs. However, for best performance, the right setup of preprocessed data with augmentation and/or normalization is required. In summary, preprocessing data before it is fed to the deep network is found to increase the homogeneity of neighborhood pixels even at reduced bit depth, which allows better storage efficiency.
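To make the idea of a grey-level resolution (reduced bit depth) preprocessor concrete, here is a minimal sketch that keeps only the most significant bits of each 8-bit pixel. The abstract does not specify the quantization scheme the authors used, so the bit-shift approach and the function name are assumptions for illustration.

```python
import numpy as np

def quantize_bit_depth(image_uint8, bits):
    """Keep the top `bits` bits of each 8-bit pixel and zero the rest."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    shift = 8 - bits
    # Dropping the low-order bits maps nearby intensities to the same value
    return (image_uint8 >> shift) << shift
```

At 4 bits each channel can take at most 16 distinct values, so neighbouring pixels with similar intensities collapse to the same level, which is one simple way an image can become more homogeneous at reduced bit depth.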


Author(s):  
Nassima Dif ◽  
Zakaria Elberrichi

Deep learning methods are characterized by their capacity to learn data representations, in contrast to traditional machine learning algorithms. However, these methods are prone to overfitting on small volumes of data. The objective of this research is to overcome this limitation by improving generalization in the proposed deep learning framework using various techniques: data augmentation, small models, optimizer selection, and ensemble learning. For ensembling, the authors used selected models from different checkpoints and combined them with both voting and unweighted averaging. The experimental study on the lymphomas histopathological dataset highlights the efficiency of the MobileNet2 network combined with the stochastic gradient descent (SGD) optimizer in terms of generalization. The best results were achieved by combining the three best checkpoint models (98.67% accuracy). These findings provide important insights into the efficiency of the checkpoint ensemble learning method for histopathological image classification.
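The two combination rules named above, voting and unweighted averaging, can be sketched as follows for a list of per-checkpoint probability arrays. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, each array is assumed to have shape (samples, classes), and ties in the vote are broken in favour of the lowest class index.

```python
import numpy as np

def average_ensemble(prob_list):
    """Unweighted average of class probabilities, then argmax per sample."""
    return np.mean(prob_list, axis=0).argmax(axis=1)

def majority_vote(prob_list):
    """Each checkpoint votes for its top class; the most-voted class wins."""
    n_samples, n_classes = prob_list[0].shape
    counts = np.zeros((n_samples, n_classes), dtype=int)
    rows = np.arange(n_samples)
    for probs in prob_list:
        counts[rows, probs.argmax(axis=1)] += 1  # one vote per sample
    return counts.argmax(axis=1)  # ties resolve to the lowest class index
```

The two rules can disagree: averaging takes each checkpoint's confidence into account, whereas voting counts only its top choice, so a single very confident checkpoint can outweigh two marginal ones under averaging but not under voting.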

