Virtual Reality Video Image Classification Based on Texture Features

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Guofang Qin ◽  
Guoliang Qin

As one of the most widely used methods in deep learning, the convolutional neural network has powerful feature extraction and nonlinear data fitting capabilities. However, the method still has drawbacks such as complex network models, long training times, excessive consumption of computing resources, slow convergence, network overfitting, and classification accuracy that needs improvement. Therefore, this article proposes a dense convolutional neural network classification algorithm based on texture features for images in virtual reality videos. First, the texture feature of the image is introduced as prior information to reflect the spatial relationship between pixels and the distinctive characteristics of different types of ground features. Second, the grey-level co-occurrence matrix (GLCM) is used to extract the spatial grey-level correlation features of the image. Then, a Gauss-Markov random field (GMRF) is used to model the statistical correlation between neighbouring pixels, and the extracted GLCM-GMRF texture feature is combined with the image intensity vector. Finally, based on DenseNet, an improved shallow dense convolutional neural network (L-DenseNet) is proposed, which compresses network parameters and improves the feature extraction ability of the network. The experimental results show that, compared with current classification methods, this method effectively suppresses the influence of coherent speckle noise and obtains better classification results.
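To make the GLCM step concrete, the following is a minimal numpy sketch of grey-level co-occurrence counting and one derived texture statistic (contrast). It is an illustration of the general technique, not the authors' implementation; the offset, level count, and test image are arbitrary choices.

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Count co-occurrences of grey levels at offset (dy, dx), normalised
    to a joint probability matrix."""
    h, w = img.shape
    m = np.zeros((levels, levels), dtype=np.float64)
    for y in range(h - dy):
        for x in range(w - dx):
            m[img[y, x], img[y + dy, x + dx]] += 1
    return m / m.sum()

def glcm_contrast(m):
    """Contrast = sum over (i, j) of (i - j)^2 * p(i, j)."""
    i, j = np.indices(m.shape)
    return float(((i - j) ** 2 * m).sum())

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
m = glcm(img, dx=1, dy=0, levels=4)
print(round(glcm_contrast(m), 3))  # → 0.583
```

Libraries such as scikit-image provide an optimised equivalent (with symmetric and multi-angle variants); the loop version above only shows what is being counted.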

Author(s):  
Chang Liu ◽  
Kaoru Hirota ◽  
Bo Wang ◽  
Yaping Dai ◽  
...  

An emotion recognition framework based on a two-channel convolutional neural network (CNN) is proposed to detect the affective state of humans through facial expressions. The framework consists of three parts: the frontal face detection module, the feature extraction module, and the classification module. The feature extraction module contains two channels: one for raw face images and the other for texture feature images. Local binary pattern (LBP) images are used for texture feature extraction to enrich facial features and improve network performance. An attention mechanism is adopted in both CNN feature extraction channels to highlight the features related to facial expressions. Moreover, the ArcFace loss function is integrated into the proposed network to increase the inter-class distance and decrease the intra-class distance of facial features. Experiments conducted on two public databases, FER2013 and CK+, demonstrate that the proposed method outperforms previous methods, with accuracies of 72.56% and 94.24%, respectively. The improvement in emotion recognition accuracy makes our approach applicable to service robots.
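The LBP image used in the texture channel can be sketched in a few lines of numpy. This is the basic 8-neighbour variant (no circular interpolation, border pixels dropped), offered only as a sketch of the idea, not as the paper's exact preprocessing.

```python
import numpy as np

def lbp_image(img):
    """Basic 8-neighbour local binary pattern. Each interior pixel gets an
    8-bit code: bit k is set if neighbour k is >= the centre pixel."""
    c = img[1:-1, 1:-1]
    # neighbour offsets as top-left corners of shifted views, clockwise
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2),
               (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        n = img[dy:dy + c.shape[0], dx:dx + c.shape[1]]
        codes |= (n >= c).astype(np.uint8) << bit
    return codes

img = np.array([[5, 5, 5],
                [5, 4, 5],
                [5, 5, 5]], dtype=np.int32)
print(lbp_image(img))  # centre dimmer than all neighbours → code 255
```

The resulting code map is what gets fed, image-like, into the second CNN channel.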


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Guiyong Xu ◽  
Yang Xu ◽  
Sicong Zhang ◽  
Xiaoyao Xie

In the era of big data, the convolutional neural network (CNN) has been widely used in image classification and has achieved excellent performance. In recent years, more and more researchers have begun to combine deep neural networks with steganalysis to improve performance. However, most CNN-based steganalysis algorithms have only been tested against the WOW and S-UNIWARD algorithms; meanwhile, their versatility is limited by long training times and constraints on image size. This paper proposes a new network architecture, called SFRNet, to solve these problems. The feature extraction and fusion layer extracts more features from the digital image. The RepVGG block is used to accelerate inference and increase memory utilization. The SE block improves detection accuracy because it learns channel weights, assigning large weights to informative feature maps and small weights to invalid or ineffective ones. Experimental results show that SFRNet achieves excellent detection accuracy against four state-of-the-art spatial-domain steganography algorithms, i.e., HUGO, WOW, S-UNIWARD, and MiPOD, under different payloads. SFRNet achieves a detection accuracy of 89.6% against the S-UNIWARD algorithm at a payload of 0.4 bpp and 72.5% at 0.2 bpp. At the same time, the training time of our network is reduced by 35% compared with Yedroudj-Net.
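The channel-reweighting idea behind the SE block can be shown framework-free. The sketch below implements squeeze-and-excitation with plain numpy (random weights standing in for learned ones); it illustrates the mechanism the abstract describes, not SFRNet itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def se_block(x, w1, w2):
    """Squeeze-and-Excitation on a feature tensor x of shape (C, H, W):
    global-average-pool to (C,), pass through a bottleneck MLP with
    sigmoid output, then rescale each channel by its learned weight."""
    z = x.mean(axis=(1, 2))                  # squeeze -> (C,)
    s = np.maximum(w1 @ z, 0.0)              # excitation: FC + ReLU -> (C//r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ s)))      # FC + sigmoid -> weights in (0, 1)
    return x * s[:, None, None]              # reweight feature maps

C, r = 8, 4                                  # channels, reduction ratio
x = rng.standard_normal((C, 6, 6))
w1 = rng.standard_normal((C // r, C)) * 0.1  # stand-ins for trained weights
w2 = rng.standard_normal((C, C // r)) * 0.1
y = se_block(x, w1, w2)
print(y.shape)  # → (8, 6, 6)
```

Because the sigmoid output lies in (0, 1), every channel is attenuated in proportion to its learned importance, which is exactly the "effective maps get large weights, ineffective maps get small weights" behaviour described above.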


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Seungmin Han ◽  
Seokju Oh ◽  
Jongpil Jeong

Bearings are among the most important parts of a rotating machine. Bearing failure can lead to mechanical failure, financial loss, and even personal injury. In recent years, various deep learning techniques have been used to diagnose bearing faults in rotating machines. However, deep learning suffers from a data imbalance problem because it requires huge amounts of data; to address this, we used data augmentation techniques. In addition, the convolutional neural network (CNN), one of the standard deep learning models, can perform feature learning without prior knowledge. However, since conventional CNN-based fault diagnosis can only extract single-scale features, useful information may be lost and domain shift problems may occur. In this paper, we propose a Multiscale Convolutional Neural Network (MSCNN) to extract more powerful and discriminative features from raw signals. Through multiscale convolution operations, MSCNN can learn more powerful feature representations than a conventional CNN while reducing the number of parameters and the training time. The proposed model achieved better results than 2D-CNN and 1D-CNN baselines, validating its effectiveness.
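The core of the multiscale idea is running the same raw signal through parallel convolution branches with different kernel sizes and concatenating the responses. A minimal numpy stand-in (random kernels in place of learned filters, arbitrary kernel sizes) looks like this:

```python
import numpy as np

rng = np.random.default_rng(1)

def multiscale_features(signal, kernel_sizes=(3, 7, 15)):
    """Convolve the raw 1-D signal with one kernel per scale and stack the
    responses, mimicking parallel conv branches with different receptive
    fields. Kernels here are random stand-ins for learned filters."""
    feats = []
    for k in kernel_sizes:
        kernel = rng.standard_normal(k) / k
        feats.append(np.convolve(signal, kernel, mode="same"))
    return np.stack(feats)                  # shape: (n_scales, len(signal))

# synthetic vibration-like signal: a sinusoid plus noise
sig = np.sin(np.linspace(0, 8 * np.pi, 256)) + 0.1 * rng.standard_normal(256)
f = multiscale_features(sig)
print(f.shape)  # → (3, 256)
```

Short kernels respond to impulsive, high-frequency content (typical of bearing faults) while long kernels capture slower trends; a downstream classifier sees both at once.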


2019 ◽  
Vol 11 (10) ◽  
pp. 1202 ◽  
Author(s):  
Min Ji ◽  
Lanfa Liu ◽  
Runlin Du ◽  
Manfred F. Buchroithner

Accurate and rapid mapping of damaged buildings is essential for emergency response. With the success of deep learning, there is increasing interest in applying it to earthquake-induced building damage mapping, yet its performance has not been compared with conventional methods for detecting building damage after an earthquake. In the present study, the performance of grey-level co-occurrence matrix texture features and convolutional neural network (CNN) features was comparatively evaluated with a random forest classifier. Pre- and post-event very high-resolution (VHR) remote sensing imagery was used to identify collapsed buildings after the 2010 Haiti earthquake. Overall accuracy (OA), allocation disagreement (AD), quantity disagreement (QD), Kappa, user accuracy (UA), and producer accuracy (PA) were used as evaluation metrics. The results showed that CNN features with the random forest method performed best, achieving an OA of 87.6% and a total disagreement of 12.4%. Compared to texture features with random forest, CNNs can extract deep features for identifying collapsed buildings, increasing Kappa from 61.7% to 69.5% and reducing the total disagreement from 16.6% to 14.1%. Combining CNN features with random forest also improved accuracy over the CNN approach alone: OA increased from 85.9% to 87.6%, and the total disagreement fell from 14.1% to 12.4%. The results indicate that learnt CNN features can outperform texture features for identifying collapsed buildings using VHR remotely sensed space imagery.
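The "CNN features + random forest" pipeline can be sketched with scikit-learn. The block below uses synthetic two-class feature vectors as a stand-in for CNN penultimate-layer features of collapsed vs. intact building patches; dataset sizes, dimensions, and class separation are invented for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Stand-in "deep features": two classes (intact / collapsed) drawn from
# Gaussians with shifted means, as if taken from a CNN's penultimate layer.
n, d = 400, 32
X = np.vstack([rng.standard_normal((n, d)),
               rng.standard_normal((n, d)) + 1.0])
y = np.array([0] * n + [1] * n)

Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(Xtr, ytr)
acc = clf.score(Xte, yte)
print(f"test accuracy: {acc:.2f}")
```

In the study's setting, the feature extraction step would be a forward pass of patch imagery through a trained CNN; the random forest then replaces the CNN's own softmax classifier.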


2021 ◽  
Vol 11 (6) ◽  
pp. 2838
Author(s):  
Nikitha Johnsirani Venkatesan ◽  
Dong Ryeol Shin ◽  
Choon Sung Nam

In the pharmaceutical field, early detection of lung nodules is indispensable for increasing patient survival. The quality of medical images can be enhanced by increasing the radiation dose, but a high radiation dose can provoke cancer, which forces experts to use a limited dose; this, in turn, generates noise in CT scans. We propose an optimal convolutional neural network model in which Gaussian noise is removed for better classification and increased training accuracy. Experiments on the 160 GB LUNA16 dataset show that our proposed method exhibits superior results. Classification accuracy, specificity, sensitivity, precision, recall, F1 measure, and area under the ROC curve (AUC) are taken as evaluation metrics. We compared the performance of our proposed model on several platforms, such as Apache Spark, GPU, and CPU, to reduce training time without compromising accuracy. Our results show that Apache Spark, integrated with a deep learning framework, is suitable for parallel training computation with high accuracy.
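As a minimal illustration of Gaussian noise removal before classification (not the paper's specific denoiser), the sketch below applies a separable Gaussian blur to a noisy synthetic "scan" and checks that it moves the image closer to the clean reference:

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """1-D normalised Gaussian kernel."""
    ax = np.arange(size) - size // 2
    k = np.exp(-ax ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def smooth2d(img, k):
    """Separable Gaussian blur: 1-D convolution along rows, then columns."""
    img = np.apply_along_axis(np.convolve, 1, img, k, mode="same")
    return np.apply_along_axis(np.convolve, 0, img, k, mode="same")

rng = np.random.default_rng(3)
clean = np.outer(np.hanning(64), np.hanning(64))   # smooth synthetic "scan"
noisy = clean + 0.2 * rng.standard_normal(clean.shape)
denoised = smooth2d(noisy, gaussian_kernel())

mse = lambda a, b: float(((a - b) ** 2).mean())
print(mse(denoised, clean) < mse(noisy, clean))  # → True
```

Real CT pipelines would use stronger edge-preserving denoisers (or a learned one), since plain blurring also smears nodule boundaries; the point here is only the preprocessing stage's role.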


2021 ◽  
pp. 1-10
Author(s):  
Chien-Cheng Lee ◽  
Zhongjian Gao ◽  
Xiu-Chi Huang

This paper proposes a Wi-Fi-based indoor human detection system using a deep convolutional neural network. The system detects different human states in various situations, including different environments and propagation paths. Its main advantage is that no overhead cameras or body-mounted sensors are required. The system captures useful amplitude information from the channel state information and converts it into an image-like two-dimensional matrix. Next, the two-dimensional matrix is used as input to a deep convolutional neural network (CNN) to distinguish human states. In this work, a deep residual network (ResNet) architecture is used to perform human state classification with hierarchical topological feature extraction. Several combinations of datasets for different environments and propagation paths are used in this study. ResNet's powerful inference simplifies feature extraction and improves the accuracy of human state classification. The experimental results show that the fine-tuned ResNet-18 model performs well in indoor human detection, covering the states of no person present, a person still, and a person moving. Compared with traditional machine learning using handcrafted features, this method is simple and effective.
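The CSI-to-image conversion can be sketched in numpy. The antenna/subcarrier/sample counts below are hypothetical (typical Wi-Fi CSI tools report around 30 subcarriers per antenna), and random complex values stand in for real measurements:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical setup: 3 antennas x 30 subcarriers, 100 time samples of
# complex channel state information (CSI).
n_ant, n_sub, n_t = 3, 30, 100
csi = (rng.standard_normal((n_ant, n_sub, n_t))
       + 1j * rng.standard_normal((n_ant, n_sub, n_t)))

amp = np.abs(csi)                               # keep amplitude, drop phase
image = amp.reshape(n_ant * n_sub, n_t)         # (90, 100) image-like matrix
image = (image - image.min()) / (image.max() - image.min())  # scale to [0, 1]
print(image.shape)  # → (90, 100)
```

Each column is one time snapshot and each row one antenna/subcarrier pair, so human motion shows up as texture along the time axis, which is what makes a 2-D CNN a natural classifier for it.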


Sensors ◽  
2021 ◽  
Vol 21 (7) ◽  
pp. 2540
Author(s):  
Zhipeng Yu ◽  
Jianghai Zhao ◽  
Yucheng Wang ◽  
Linglong He ◽  
Shaonan Wang

In recent years, surface electromyography (sEMG)-based human–computer interaction has been developed to improve the quality of life for people. Gesture recognition based on the instantaneous values of sEMG has the advantages of accurate prediction and low latency. However, the low generalization ability of the hand gesture recognition method limits its application to new subjects and new hand gestures, and brings a heavy training burden. For this reason, based on a convolutional neural network, a transfer learning (TL) strategy for instantaneous gesture recognition is proposed to improve the generalization performance of the target network. CapgMyo and NinaPro DB1 are used to evaluate the validity of our proposed strategy. Compared with the non-transfer learning (non-TL) strategy, our proposed strategy improves the average accuracy of new subject and new gesture recognition by 18.7% and 8.74%, respectively, when up to three repeated gestures are employed. The TL strategy reduces the training time by a factor of three. Experiments verify the transferability of spatial features and the validity of the proposed strategy in improving the recognition accuracy of new subjects and new gestures, and reducing the training burden. The proposed TL strategy provides an effective way of improving the generalization ability of the gesture recognition system.
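The essence of the TL strategy, reusing source-trained layers as a frozen feature extractor and training only a new head on the target subject's small dataset, can be sketched without a deep learning framework. Everything below (the random projection as a stand-in for pretrained conv layers, the synthetic two-gesture data, the training schedule) is illustrative, not the paper's network:

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_extractor(x, W):
    """Stand-in for pretrained layers: a fixed projection + ReLU.
    W is never updated during target training (it is 'frozen')."""
    return np.maximum(x @ W, 0.0)

d_in, d_feat, n = 16, 32, 300
W_frozen = rng.standard_normal((d_in, d_feat)) / np.sqrt(d_in)

# Small target-subject dataset: two synthetic "gesture" classes.
X = np.vstack([rng.standard_normal((n, d_in)) - 0.5,
               rng.standard_normal((n, d_in)) + 0.5])
y = np.array([0.0] * n + [1.0] * n)
F = frozen_extractor(X, W_frozen)        # features from the frozen backbone

w = np.zeros(d_feat)                     # new head, trained from scratch
for _ in range(200):                     # gradient descent on logistic loss
    p = 1.0 / (1.0 + np.exp(-(F @ w)))
    w -= 0.1 * F.T @ (p - y) / len(y)

pred = (1.0 / (1.0 + np.exp(-(F @ w)))) > 0.5
acc = float((pred == y.astype(bool)).mean())
print(f"head-only training accuracy: {acc:.2f}")
```

Training only the head is what cuts the training burden: the backbone's parameters (the bulk of the model) need no gradients, which is consistent with the roughly threefold training-time reduction reported above.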


2021 ◽  
Vol 21 (01) ◽  
pp. 2150005
Author(s):  
ARUN T NAIR ◽  
K. MUTHUVEL

Nowadays, retinal image analysis remains one of the most challenging areas of study. Numerous retinal diseases can be recognized by analyzing the variations taking place in the retina, but the main disadvantage of existing studies is their limited recognition accuracy. The proposed framework includes four phases: (i) blood vessel segmentation, (ii) feature extraction, (iii) optimal feature selection, and (iv) classification. Initially, the input fundus image is subjected to blood vessel segmentation, from which two binary thresholded images (one from a High Pass Filter (HPF) and the other from top-hat reconstruction) are acquired. These two images are differenced, the areas common to both are taken as the major vessels, and the leftover regions are fused to form a vessel sub-image. The vessel sub-images are classified with a Gaussian Mixture Model (GMM) classifier, and the result is combined with the major vessels to form the segmented blood vessels. The segmented images then undergo feature extraction, where features such as the proposed Local Binary Pattern (LBP), Gray-Level Co-Occurrence Matrix (GLCM), and Gray-Level Run Length Matrix (GLRM) features are extracted. As the curse of dimensionality is a major issue, it is important to select the appropriate features from the extracted ones for classification. In this paper, a new improved optimization algorithm, Moth Flame with New Distance Formulation (MF-NDF), is introduced for selecting the optimal features. Finally, the selected optimal features are fed to a Deep Convolutional Neural Network (DCNN) model for classification. Further, to make the diagnosis precise, the weights of the DCNN are optimally tuned by the same optimization algorithm. The performance of the proposed algorithm is compared against conventional algorithms in terms of positive and negative measures.
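The GMM classification step can be illustrated with scikit-learn. The sketch below fits a two-component mixture to synthetic 1-D pixel intensities (a dim background mode and a brighter vessel mode, with made-up means and spreads), then labels the brighter component as "vessel"; it shows the technique, not the paper's feature set:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)

# Synthetic intensities for candidate vessel-region pixels.
background = rng.normal(0.2, 0.05, size=(500, 1))
vessel = rng.normal(0.7, 0.05, size=(500, 1))
X = np.vstack([background, vessel])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)

# GMM component order is arbitrary: map the brighter component to "vessel".
vessel_comp = int(np.argmax(gmm.means_.ravel()))
truth = np.arange(len(X)) >= 500
acc = float(((labels == vessel_comp) == truth).mean())
print(f"agreement with ground truth: {acc:.2f}")
```

The unsupervised fit recovers the two intensity modes, which is why a GMM is a reasonable way to separate residual vessel pixels from background in the fused sub-image.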

