Grouped Pointwise Convolutions Significantly Reduces Parameters in EfficientNet

2021 ◽  
Author(s):  
Joao Paulo Schwarz Schuler ◽  
Santiago Romani ◽  
Mohamed Abdel-Nasser ◽  
Hatem Rashwan ◽  
Domenec Puig

EfficientNet is a recent Deep Convolutional Neural Network (DCNN) architecture intended to be proportionally extendible in depth, width and resolution. Through its variants, it can achieve state-of-the-art accuracy on the ImageNet classification task as well as on other classical challenges. Although its name refers to its efficiency with respect to the ratio between outcome (accuracy) and needed resources (number of parameters, FLOPs), we study a method to reduce the original number of trainable parameters by more than 84% while keeping a very similar degree of accuracy. Our proposal is to improve the pointwise (1x1) convolutions, whose number of parameters grows rapidly due to the multiplication of the number of filters by the number of input channels coming from the previous layer. Basically, our tweak consists of grouping filters into parallel branches, where each branch processes a fraction of the input channels. However, doing so alone degrades the learning capability of the DCNN. To avoid this effect, we suggest interleaving the outputs of filters from different branches at intermediate layers of consecutive pointwise convolutions. Our experiments with the CIFAR-10 dataset show that our optimized EfficientNet has a learning capacity similar to the original layout when training from scratch.
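The grouping-and-interleaving idea can be sketched in NumPy (an illustrative sketch, not the authors' implementation; all function names and shapes are assumptions). With g branches, a 1x1 convolution over C channels drops from C² weights to C²/g, and the interleave step mixes channels across branches before the next grouped convolution:

```python
import numpy as np

def grouped_pointwise(x, weights):
    """Grouped 1x1 convolution: each weight matrix in `weights` processes
    one contiguous fraction of the input channels.
    x: (C_in, H, W); weights: list of (C_out_g, C_in_g) arrays."""
    groups = len(weights)
    step = x.shape[0] // groups
    outs = []
    for g, w in enumerate(weights):
        chunk = x[g * step:(g + 1) * step]           # (C_in_g, H, W)
        outs.append(np.tensordot(w, chunk, axes=1))  # (C_out_g, H, W)
    return np.concatenate(outs, axis=0)

def interleave(x, groups):
    """Interleave channels from parallel branches so the next grouped
    convolution sees channels coming from every branch."""
    c, h, w = x.shape
    return x.reshape(groups, c // groups, h, w).transpose(1, 0, 2, 3).reshape(c, h, w)
```

Without the interleave (a channel-shuffle permutation), each branch would only ever see features derived from its own fraction of the input channels, which is why plain grouping degrades learning capability.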

2021 ◽  
pp. 1-10
Author(s):  
Gayatri Pattnaik ◽  
Vimal K. Shrivastava ◽  
K. Parvathi

Pests are a major threat to the economic growth of a country. Application of pesticides is the easiest way to control pest infestation. However, excessive utilization of pesticides is hazardous to the environment. The recent advances in deep learning have paved the way for early detection and improved classification of pests in tomato plants, which will benefit farmers. This paper presents a comprehensive analysis of 11 state-of-the-art deep convolutional neural network (CNN) models with three configurations: transfer learning, fine-tuning and scratch learning. Training in transfer learning and fine-tuning starts from pre-trained weights, whereas random weights are used in the case of scratch learning. In addition, the concept of data augmentation has been explored to improve performance. Our dataset consists of 859 tomato pest images from 10 categories. The results demonstrate that the highest classification accuracy of 94.87% has been achieved in the transfer learning approach by the DenseNet201 model with data augmentation.
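The data-augmentation step can be sketched as follows (a minimal NumPy sketch assuming flip-and-crop augmentation; the abstract does not specify which transforms were actually used):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image, pad=4):
    """Simple augmentation to enlarge a small pest dataset:
    random horizontal flip, then a random crop after zero-padding.
    image: (H, W, C) array; output has the same shape."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]                       # horizontal flip
    h, w, _ = image.shape
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)))
    top = rng.integers(0, 2 * pad + 1)
    left = rng.integers(0, 2 * pad + 1)
    return padded[top:top + h, left:left + w, :]
```

Each epoch then sees a slightly different view of every image, which is particularly useful when only 859 training images are available.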


2020 ◽  
Vol 7 ◽  
Author(s):  
Uttam U. Deshpande ◽  
V. S. Malemath ◽  
Shivanand M. Patil ◽  
Sushma V. Chaugule

Automatic Latent Fingerprint Identification Systems (AFIS) are widely used by forensic experts in law enforcement and criminal investigations. One of the critical steps in automatic latent fingerprint matching is to automatically extract reliable minutiae from fingerprint images. Hence, minutiae extraction is considered a very important step in AFIS. The performance of such systems relies heavily on the quality of the input fingerprint images. Most state-of-the-art AFIS fail to produce good matching results due to poor ridge patterns and the presence of background noise. To ensure the robustness of fingerprint matching against low-quality latent fingerprint images, it is essential to include a good fingerprint enhancement algorithm before minutiae extraction and matching. In this paper, we propose an end-to-end fingerprint matching system that automatically enhances images, extracts minutiae, and produces matching results. To achieve this, we propose a method to automatically enhance poor-quality fingerprint images using "Automated Deep Convolutional Neural Network (DCNN)" and "Fast Fourier Transform (FFT)" filters. The DCNN produces a frequency-enhanced map from fingerprint domain knowledge. We propose an "FFT Enhancement" algorithm to enhance and extract the ridges from the frequency-enhanced map. Minutiae from the enhanced ridges are automatically extracted using a proposed "Automated Latent Minutiae Extractor (ALME)". Based on the extracted minutiae, the fingerprints are automatically aligned, and a matching score is calculated using a proposed "Frequency Enhanced Minutiae Matcher (FEMM)" algorithm. Experiments are conducted on the FVC2002, FVC2004, and NIST SD27 latent fingerprint databases. The minutiae extraction results show significant improvement in precision, recall, and F1 scores.
We obtained the highest Rank-1 identification rate of 100% for FVC2002/2004 and 84.5% for NIST SD27 fingerprint databases. The matching results reveal that the proposed system outperforms state-of-the-art systems.
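One classical form of FFT-based ridge enhancement multiplies the spectrum by a power of its own magnitude, so the dominant ridge frequencies are amplified. A minimal sketch of that classical filter (illustrative only; the paper's actual "FFT Enhancement" algorithm is not detailed in the abstract):

```python
import numpy as np

def fft_enhance(img, k=0.45):
    """Classical FFT fingerprint enhancement: amplify dominant ridge
    frequencies by weighting the spectrum with |F|^k, then normalise
    back to [0, 1] for downstream minutiae extraction."""
    spectrum = np.fft.fft2(img)
    enhanced = np.fft.ifft2(spectrum * np.abs(spectrum) ** k).real
    enhanced -= enhanced.min()
    span = enhanced.max()
    return enhanced / span if span > 0 else enhanced
```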


2018 ◽  
Vol 232 ◽  
pp. 01061
Author(s):  
Danhua Li ◽  
Xiaofeng Di ◽  
Xuan Qu ◽  
Yunfei Zhao ◽  
Honggang Kong

Pedestrian detection aims to localize and recognize every pedestrian instance in an image with a bounding box. The current state-of-the-art method is Faster R-CNN, a network that uses a region proposal network (RPN) to generate high-quality region proposals, while Fast R-CNN is used to extract features and classify them into the corresponding categories. The contribution of this paper is the integration of low-level and high-level features into a Faster R-CNN-based pedestrian detection framework, which efficiently increases the capacity of the features. Through our experiments we comprehensively evaluate our framework on the Caltech pedestrian detection benchmark; our method achieves state-of-the-art accuracy and presents a competitive result on the Caltech dataset.
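Integrating low-level and high-level features amounts to bringing feature maps from different depths to a common resolution and concatenating them channel-wise. A NumPy sketch, assuming a 2x resolution gap and nearest-neighbour upsampling (both assumptions; the paper's exact fusion scheme is not detailed in the abstract):

```python
import numpy as np

def fuse_features(low, high):
    """Fuse a fine low-level map (C1, H, W) with a coarse high-level
    map (C2, H/2, W/2): upsample the coarse map by nearest neighbour,
    then concatenate along the channel axis."""
    up = high.repeat(2, axis=1).repeat(2, axis=2)  # 2x nearest upsample
    return np.concatenate([low, up], axis=0)       # (C1 + C2, H, W)
```

The fused map keeps the fine spatial detail of the shallow layer (useful for small pedestrians) alongside the semantic strength of the deep layer.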


Author(s):  
Hao Zhu ◽  
Shenghua Gao

Deep Convolutional Neural Network (DCNN) based deep hashing has shown its success for fast and accurate image retrieval; however, directly minimizing the quantization error in deep hashing changes the distribution of DCNN features and consequently changes the similarity between the query and the retrieved images in hashing. In this paper, we propose a novel Locality-Constrained Deep Supervised Hashing. By simultaneously learning discriminative DCNN features and preserving the similarity between image pairs, the hash codes of our scheme preserve the distribution of DCNN features and thus favor accurate image retrieval. The contributions of this paper are two-fold: i) our analysis shows that minimizing quantization error in deep hashing makes the features less discriminative, which is not desirable for image retrieval; ii) we propose a Locality-Constrained Deep Supervised Hashing which preserves the similarity between image pairs in hashing. Extensive experiments on the CIFAR-10 and NUS-WIDE datasets show that our method significantly boosts the accuracy of image retrieval; on the CIFAR-10 dataset in particular, the improvement is usually more than 6% in terms of the MAP measurement. Further, our method is 10 times faster than state-of-the-art methods in the training phase.
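At retrieval time, deep hashing binarises DCNN features and ranks the database by Hamming distance. A minimal sketch (the linear projection here is a stand-in for the learned hash layer, not the proposed locality-constrained objective):

```python
import numpy as np

def hash_codes(features, projection):
    """Binarise features by the sign of a linear projection.
    features: (N, D); projection: (D, B) -> (N, B) binary codes."""
    return (features @ projection > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query code."""
    dists = (db_codes != query_code).sum(axis=1)
    return np.argsort(dists, kind="stable")
```

Because the codes are short binary strings, the ranking is a cheap XOR-and-popcount operation, which is where the retrieval speedup of hashing comes from.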


2020 ◽  
Vol 3 (2) ◽  
pp. 177-178
Author(s):  
John Jowil D. Orquia ◽  
El Jireh Bibangco

Manual fruit classification is the traditional way of classifying fruits. It is contact labor that is time-consuming and often results in lower productivity, inconsistency, and sometimes damage to the fruits (Prabha & Kumar, 2012). Thus, new technologies such as deep learning have paved the way for a faster and more efficient method of fruit classification (Faridi & Aboonajmi, 2017). A deep convolutional neural network, or deep learning model, is a machine learning algorithm that contains several layers of neural networks stacked together to create a more complex model capable of solving complex problems. State-of-the-art pre-trained deep learning models such as AlexNet, GoogLeNet, and ResNet-50 have been widely utilized. However, such models were not explicitly trained for fruit classification (Dyrmann, Karstoft, & Midtiby, 2016). The study aimed to create a new deep convolutional neural network and compare its performance to fine-tuned models based on accuracy, precision, sensitivity, and specificity.
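The four comparison measures are standard confusion-matrix statistics. A quick sketch of how they are computed for a single class (the counts below are illustrative, not taken from the study):

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity (recall) and specificity
    from per-class confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, precision, sensitivity, specificity

# Example: 8 true positives, 2 false positives, 85 true negatives,
# 5 false negatives for one fruit class.
acc, prec, sens, spec = binary_metrics(8, 2, 85, 5)
```

For a multi-class classifier these are computed one-vs-rest per class and then averaged.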


Author(s):  
Rishipal Singh ◽  
Rajneesh Rani ◽  
Aman Kamboj

Fruit classification is one of the influential applications of computer vision. Traditional classification models are trained on various features such as color, shape, texture, etc. These features are common across different varieties of the same fruit. Therefore, a new set of features is required to classify fruits belonging to the same class. In this paper, we propose an optimized method to classify intra-class fruits using deep convolutional layers. The proposed architecture is capable of solving the challenges of a commercial tray-based system in the supermarket. As research on intra-class classification is still in its infancy, there are challenges that have not been tackled. So, the proposed method is specifically designed to overcome the challenges related to intra-class fruit classification. The proposed method showcases an impressive performance for intra-class classification, achieved using fewer parameters than existing methods. The proposed model consists of an Inception block, residual connections and various other layers in a very precise order. To validate its performance, the proposed method is compared with state-of-the-art models and performs best in terms of accuracy, loss, parameters, and depth.
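The parameter saving from an Inception/ResNet-style bottleneck over a plain convolution is simple arithmetic. The channel sizes below are illustrative, not the paper's actual configuration:

```python
def conv_params(c_in, c_out, k):
    """Number of weights in a k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

# Plain 3x3 convolution keeping 256 channels.
plain = conv_params(256, 256, 3)

# Bottleneck: 1x1 reduce to 64, 3x3 at 64, 1x1 expand back to 256.
bottleneck = (conv_params(256, 64, 1)
              + conv_params(64, 64, 3)
              + conv_params(64, 256, 1))
```

Here the bottleneck needs 69,632 weights versus 589,824 for the plain layer, roughly an 8x reduction, which is how Inception blocks keep the parameter count low at a given depth.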


2021 ◽  
Vol 309 ◽  
pp. 01123
Author(s):  
Raju Yadav Mothukupally ◽  
P Chandra Sekhar Reddy

Face parsing is one of the advancements in computer vision that analyses the surface texture of the face; to acquire useful information about facial features it requires accurate pixel segmentation of the different parts of the face (mouth, nose, eyes, etc.). Likewise, research on emotion recognition plays an eventful role in human communication and interaction, and is also relevant to psychological activities. We consider the fact that different parts of the face contain different amounts of information about facial expression and that the weighting function is not the same for different faces. Following this analysis, the image classification task commonly leads us to the well-known Convolutional Neural Network (CNN), for which we use the VGG19 model. After exploring how a CNN typically performs on greyscale photos, we chose to start with three consecutive convolutional layers followed by a max-pooling layer; ReLU is used as the basic activation function for the convolutional layers, in a similar stacking pattern. The number of filters in the convolutional layers is expanded from 32 to 128, suggesting that a multi-layered structure (with an increasing number of filters) performs and produces the best outcomes for the DNN model. Finally, the CNN output is first flattened and afterwards passes through two dense layers to reach the output layer, in which the softmax activation function is used for multiclass classification. We use the Cohn-Kanade facial expression dataset with seven expressions: contempt, anger, disgust, happiness, fear, sadness and surprise.
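The softmax output layer mentioned above maps the seven expression logits to a probability distribution. A numerically stable NumPy sketch:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis, as used in a
    seven-way expression output layer."""
    z = logits - logits.max(axis=-1, keepdims=True)  # avoid overflow
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)
```

The predicted expression is then simply the argmax of the resulting probabilities.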


2020 ◽  
Vol 47 (10) ◽  
pp. 5158-5171 ◽  
Author(s):  
Abass Bahrami ◽  
Alireza Karimian ◽  
Emad Fatemizadeh ◽  
Hossein Arabi ◽  
Habib Zaidi

2020 ◽  
Vol 2020 (4) ◽  
pp. 4-14
Author(s):  
Vladimir Budak ◽  
Ekaterina Ilyina

The article proposes a classification of lenses with different symmetrical beam angles and offers a scale serving as a spotlight palette. A collection of spotlight images was created and classified according to the proposed scale. An analysis of 788 existing lenses and reflectors with different LEDs and COBs was carried out, and the dependence of the axial light intensity on the beam angle was obtained. Transfer learning of a new deep convolutional neural network (CNN) based on the pre-trained GoogleNet was performed using this collection. Grad-CAM analysis showed that the trained network correctly identifies the features of the objects. This work allows us to classify arbitrary spotlights with an accuracy of about 80%. Thus, a lighting designer can determine the class of a spotlight and the corresponding type of lens with its technical parameters using this new CNN-based model.

