Learning Class-Specific Features with Class Regularization for Videos

2020 ◽  
Vol 10 (18) ◽  
pp. 6241
Author(s):  
Alexandros Stergiou ◽  
Ronald Poppe ◽  
Remco C. Veltkamp

One of the main principles of Deep Convolutional Neural Networks (CNNs) is the extraction of useful features through a hierarchy of kernel operations. The kernels are not explicitly tailored to address specific target classes but are rather optimized as general feature extractors. Distinction between classes is typically left until the very last fully-connected layers. Consequently, variations between relatively similar classes are treated the same way as variations between highly dissimilar classes. In order to directly address this problem, we introduce Class Regularization, a novel method that can regularize feature map activations based on the classes of the examples used. Essentially, we amplify or suppress activations based on an educated guess of the given class. We can apply this step to each minibatch of activation maps, at different depths in the network. We demonstrate that this improves feature search during training, leading to systematic performance gains on the Kinetics, UCF-101, and HMDB-51 datasets. Moreover, Class Regularization establishes an explicit correlation between features and class, which makes it a perfect tool to visualize class-specific features at various network depths.
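The amplify-or-suppress step can be sketched as a per-channel scaling of activations by an estimated class affinity. A minimal illustration, assuming a hypothetical `class_affinity` vector in [0, 1] per channel; this is a sketch of the idea, not the authors' exact formulation:

```python
def class_regularize(feature_map, class_affinity, strength=0.5):
    """Scale per-channel activations by an estimated class affinity in
    [0, 1], mapped to a factor in [1 - strength, 1 + strength].
    A sketch of the idea, not the authors' exact formulation."""
    regularized = []
    for activation, affinity in zip(feature_map, class_affinity):
        # Channels guessed to be relevant to the class are amplified,
        # the rest suppressed.
        scale = 1.0 + strength * (2.0 * affinity - 1.0)
        regularized.append(activation * scale)
    return regularized

channel_means = [0.8, 0.2, 1.5]   # mean activation per channel (toy values)
affinity = [0.9, 0.1, 0.5]        # educated guess of relevance to the class
scaled = class_regularize(channel_means, affinity)
```

Applied per minibatch at a chosen network depth, the affinity estimate would come from the running class prediction rather than being fixed as here.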

Sensors ◽  
2021 ◽  
Vol 21 (7) ◽  
pp. 2381
Author(s):  
Jaewon Lee ◽  
Hyeonjeong Lee ◽  
Miyoung Shin

Mental stress can lead to traffic accidents by reducing a driver’s concentration or increasing fatigue while driving. In recent years, demand for methods to detect drivers’ stress in advance, before dangerous situations arise, has increased. Thus, we propose a novel method for detecting driving stress using nonlinear representations of short-term (30 s or less) physiological signals with multimodal convolutional neural networks (CNNs). Specifically, from short-term hand/foot galvanic skin response (HGSR, FGSR) and heart rate (HR) input signals, we first generate corresponding two-dimensional nonlinear representations called continuous recurrence plots (Cont-RPs). Second, from the Cont-RPs, we use multimodal CNNs to automatically extract representative features of the FGSR, HGSR, and HR signals that can effectively differentiate between stressed and relaxed states. Lastly, we concatenate the three extracted features into one integrated representation vector, which we feed to a fully connected layer for classification. For the evaluation, we use a public stress dataset collected in actual driving environments. Experimental results show that the proposed method achieves superior performance for 30-s signals, with an overall accuracy of 95.67%, an improvement of approximately 2.5–3% over previous works. Additionally, for 10-s signals, the proposed method achieves 92.33% classification accuracy, similar to or better than the performance of other methods using long-term signals (over 100 s).
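The Cont-RP construction can be illustrated as an unthresholded recurrence plot, i.e. the matrix of pairwise distances between signal samples. A minimal sketch with toy values; the paper's exact Cont-RP construction may differ in detail:

```python
def continuous_recurrence_plot(signal):
    """Matrix of pairwise distances between samples of a 1-D signal:
    the unthresholded ("continuous") variant of a recurrence plot.
    A minimal sketch; the paper's Cont-RP construction may differ."""
    return [[abs(a - b) for b in signal] for a in signal]

hr = [72, 75, 74, 80, 78]  # toy heart-rate samples (beats per minute)
rp = continuous_recurrence_plot(hr)
# rp is a symmetric matrix with zeros on the diagonal; rendered as an
# image, it becomes one input channel of the multimodal CNN.
```

One such plot per modality (HGSR, FGSR, HR) would feed the three CNN branches before feature concatenation.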


Author(s):  
Abeer Al-Hyari ◽  
Shawki Areibi

This paper proposes a framework for design space exploration of Convolutional Neural Networks (CNNs) using Genetic Algorithms (GAs). CNNs have many hyperparameters that need to be tuned carefully in order to achieve favorable results when used for image classification tasks or similar vision applications. Genetic Algorithms are adopted to efficiently traverse the huge search space of CNN hyperparameters and generate the best architecture that fits the given task. Some of the hyperparameters that were tested include the number of convolutional and fully connected layers, the number of filters for each convolutional layer, and the number of nodes in the fully connected layers. The proposed approach was tested using the MNIST dataset for handwritten digit classification, and the results obtained indicate that the proposed approach is able to generate a CNN architecture with validation accuracy of up to 96.66% on average.
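The GA traversal of the hyperparameter space can be sketched as selection, crossover, and mutation over hyperparameter dictionaries. A minimal illustration with a stand-in fitness function (in the paper, fitness would be the validation accuracy of the trained CNN); the search-space values are hypothetical:

```python
import random

# Hypothetical discrete search space over CNN hyperparameters.
SEARCH_SPACE = {
    "conv_layers": [1, 2, 3, 4],
    "filters": [16, 32, 64, 128],
    "fc_nodes": [64, 128, 256, 512],
}

def random_individual():
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def crossover(a, b):
    # Uniform crossover: each gene is drawn from either parent.
    return {k: random.choice([a[k], b[k]]) for k in SEARCH_SPACE}

def mutate(ind, rate=0.2):
    for k in SEARCH_SPACE:
        if random.random() < rate:
            ind[k] = random.choice(SEARCH_SPACE[k])
    return ind

def evolve(fitness, generations=10, pop_size=8):
    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]  # truncation selection
        children = [mutate(crossover(*random.sample(survivors, 2)))
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    return max(pop, key=fitness)

# Stand-in fitness: the paper would train a CNN with these
# hyperparameters and return its validation accuracy.
best = evolve(lambda ind: sum(ind.values()))
```

In practice the fitness evaluation (a full CNN training run) dominates the cost, which is why efficient traversal of the space matters.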


2021 ◽  
Author(s):  
Guo Jiahui ◽  
Ma Feilong ◽  
Matteo Visconti di Oleggio Castello ◽  
Samuel A Nastase ◽  
James V Haxby ◽  
...  

Deep convolutional neural networks (DCNNs) trained for face identification can rival and even exceed human-level performance. The relationships between internal representations learned by DCNNs and those of the primate face processing system are not well understood, especially in naturalistic settings. We developed the largest naturalistic dynamic face stimulus set in human neuroimaging research (700+ naturalistic video clips of unfamiliar faces) and used representational similarity analysis to investigate how well the representations learned by high-performing DCNNs match human brain representations across the entire distributed face processing system. DCNN representational geometries were strikingly consistent across diverse architectures and captured meaningful variance among faces. Similarly, representational geometries throughout the human face network were highly consistent across subjects. Nonetheless, correlations between DCNN and neural representations were very weak overall—DCNNs captured 3% of variance in the neural representational geometries at best. Intermediate DCNN layers better matched visual and face-selective cortices than the final fully-connected layers. Behavioral ratings of face similarity were highly correlated with intermediate layers of DCNNs, but also failed to capture representational geometry in the human brain. Our results suggest that the correspondence between intermediate DCNN layers and neural representations of naturalistic human face processing is weak at best, and diverges even further in the later fully-connected layers. This poor correspondence can be attributed, at least in part, to the dynamic and cognitive information that plays an essential role in human face processing but is not modeled by DCNNs. These mismatches indicate that current DCNNs have limited validity as in silico models of dynamic, naturalistic face processing in humans.
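Representational similarity analysis compares two systems by correlating their representational dissimilarity matrices (RDMs). A minimal sketch with toy vectors, using a simple Spearman correlation without tie handling; the study's pipeline is far more involved:

```python
def rdm(items, dist):
    """Upper-triangle representational dissimilarity vector."""
    n = len(items)
    return [dist(items[i], items[j]) for i in range(n) for j in range(i + 1, n)]

def rank(xs):
    # Simple ranking; does not average tied ranks.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for pos, i in enumerate(order):
        r[i] = float(pos)
    return r

def spearman(a, b):
    """Pearson correlation of the ranks."""
    ra, rb = rank(a), rank(b)
    n = len(a)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra) ** 0.5
    vb = sum((y - mb) ** 2 for y in rb) ** 0.5
    return cov / (va * vb)

euclid = lambda u, v: sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5
dcnn = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1]]   # toy DCNN embeddings of 3 faces
brain = [[1.0, 0.0], [0.9, 0.2], [0.1, 1.0]]  # toy neural response patterns
score = spearman(rdm(dcnn, euclid), rdm(brain, euclid))
```

The reported "3% of variance at best" corresponds to squaring such an RDM-to-RDM correlation; here the toy geometries agree perfectly, unlike the real data.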


2020 ◽  
Vol 7 (6) ◽  
pp. 1089
Author(s):  
Iwan Muhammad Erwin ◽  
Risnandar Risnandar ◽  
Esa Prakarsa ◽  
Bambang Sugiarto

Wood identification is needed to support the government and the wood business community in legal wood trading. Special expertise and considerable time are required to identify wood in the laboratory. In previous work, the identification process combined a manual system based on wood DNA anatomy, while computer-based approaches relied on microscopic and macroscopic images of wood cross-sections. Recently, computer vision and machine learning technologies have been developed to identify many kinds of objects, among them wood images. This research contributes a classification of several traded wood species using Deep Convolutional Neural Networks (DCNN). The novelty of this research lies in the DCNN architecture, named Kayu7Net. The proposed Kayu7Net architecture has three convolutional layers and is trained on an image dataset of seven wood species. Testing was performed with input images resized to 600×600, 300×300, and 128×128 pixels, each repeated for 50 and 100 epochs. The proposed DCNN uses the ReLU activation function with a batch size of 32; ReLU is more convergent and faster during the iteration process, and the four fully-connected (FC) layers yield a more efficient training process. The experimental results show that the proposed Kayu7Net achieves an accuracy of 95.54%, a precision of 95.99%, a recall of 95.54%, a specificity of 99.26%, and an F-measure of 95.46%. These results indicate that Kayu7Net outperforms previous work by 1.49% in accuracy, 2.49% in precision, and 5.26% in specificity.
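The reported metrics follow the standard binary confusion-matrix definitions, which can be sketched as follows (the counts here are illustrative, not the paper's):

```python
def metrics(tp, fp, fn, tn):
    """Standard classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # a.k.a. sensitivity
    specificity = tn / (tn + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, specificity, f_measure

# Illustrative counts for one class in a one-vs-rest evaluation.
acc, prec, rec, spec, f1 = metrics(tp=8, fp=2, fn=2, tn=88)
```

For a seven-class problem such as Kayu7Net's, these would typically be computed per class one-vs-rest and then averaged.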


Author(s):  
Tuan Hoang ◽  
Thanh-Toan Do ◽  
Tam V. Nguyen ◽  
Ngai-Man Cheung

This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations. First, to obtain low bit-width weights, most existing methods perform quantization on the full-precision network weights. However, this approach results in a mismatch: gradient descent updates the full-precision weights, but not the quantized weights. To address this issue, we propose a novel method that enables direct updating of quantized weights with learnable quantization levels to minimize the cost function using gradient descent. Second, to obtain low bit-width activations, existing works consider all channels equally. However, the activation quantizers can be biased toward a few channels with high variance. To address this issue, we propose a method that takes into account the quantization errors of individual channels. With this approach, we can learn activation quantizers that minimize the quantization errors in the majority of channels. Experimental results demonstrate that our proposed method achieves state-of-the-art performance on the image classification task, using AlexNet, ResNet and MobileNetV2 architectures on the CIFAR-100 and ImageNet datasets.
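The per-channel quantization error that motivates the second technique can be illustrated with a shared set of quantization levels: a high-variance channel poorly covered by the levels dominates the total error. A sketch only, not the authors' learnable quantizer:

```python
def quantize(xs, levels):
    """Map each activation to the nearest shared quantization level."""
    return [min(levels, key=lambda q: abs(q - x)) for x in xs]

def channel_error(channel, levels):
    """Mean squared quantization error of one channel."""
    q = quantize(channel, levels)
    return sum((x - y) ** 2 for x, y in zip(channel, q)) / len(channel)

levels = [0.0, 0.5, 1.0]            # shared low bit-width levels (toy)
low_var = [0.4, 0.5, 0.6, 0.5]      # well covered by the levels
high_var = [0.1, 1.9, -0.7, 2.4]    # poorly covered: large error
```

Fitting the shared levels to minimize *total* error lets the `high_var` channel dominate; weighting errors per channel, as the paper proposes, protects the majority of channels.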


2018 ◽  
Vol 16 (06) ◽  
pp. 895-919 ◽  
Author(s):  
Ding-Xuan Zhou

Deep learning based on structured deep neural networks has provided powerful applications in various fields. The structures imposed on the deep neural networks are crucial, and they make deep learning essentially different from classical schemes based on fully connected neural networks. One of the commonly used deep neural network structures is generated by convolutions; the resulting deep learning algorithms form the family of deep convolutional neural networks. Despite their power in some practical domains, little is known about the mathematical foundation of deep convolutional neural networks, such as their universality of approximation. In this paper, we propose a family of new structured deep neural networks: deep distributed convolutional neural networks. We show that these deep neural networks have the same order of computational complexity as deep convolutional neural networks, and we prove their universality of approximation. Some ideas in our analysis come from ridge approximation, wavelets, and learning theory.
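The convolutional structure the paper studies can be contrasted with a fully connected layer in one line: a length-k kernel slid across the input parameterizes an entire structured (Toeplitz) linear map, instead of one free weight per input-output pair. A minimal sketch using the CNN convention (a sliding inner product, i.e. cross-correlation):

```python
def conv1d(signal, kernel):
    """Valid 1-D convolution (CNN convention, no kernel flip): the
    weight-sharing structured linear map that distinguishes CNNs from
    fully connected layers."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

# A length-3 kernel maps a length-n input to length n-2 with only 3
# parameters; a fully connected layer of the same shape needs 3 * (n-2).
out = conv1d([1, 2, 3, 4], [1, 0, -1])
```

Stacking such maps (with nonlinearities) gives the deep convolutional networks whose approximation power the paper analyzes.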


2021 ◽  
Author(s):  
Jiaqi Huang ◽  
Peter Gerhardstein

Multiple theories of human object recognition argue for the importance of semantic parts in the formation of intermediate representations. However, the role of semantic parts in Deep Convolutional Neural Networks (DCNN), which encapsulate the most recent and successful computer vision models, is poorly examined. We extract representations of DCNNs corresponding to differential performance with stimuli in which different parts of the same exemplar are deleted, and then compare these representations with those of human observers obtained in a behavioral experiment, using representational similarity analysis (RSA). We find that DCNN representations correlate strongly with those of observers, while acknowledging that these DCNN representations may not be part-based given an equally high correlation between DCNN output and part size. Additionally, the exemplars incorrectly identified by DCNNs tend to have less “human-like” representations, which demonstrates RSA as a potential novel method for interpreting error in intermediate processes of recognition of DCNNs.

