convolution structure Latest Research Papers

Walnut Ripeness Detection Based on Coupling Information and Lightweight YOLOv4

International Journal of Circuits, Systems and Signal Processing ◽

10.46300/9106.2022.16.29 ◽

2022 ◽

Vol 16 ◽

pp. 239-247

Author(s):

Kaixuan Cui ◽

Shuchai Su ◽

Jiawei Cai ◽

Fengjun Chen

Keyword(s):

Oil Content ◽

Pearson Correlation ◽

Mean Average Precision ◽

Average Precision ◽

Dynamic Learning ◽

Training Strategy ◽

Kernel Oil ◽

Convolution Structure ◽

Validation Set ◽

Detection Speed

To realize rapid and accurate walnut ripeness detection on mobile terminals such as mobile phones, we propose a method based on coupling information and a lightweight YOLOv4. First, we collected 50 walnuts at each ripeness stage (Unripe, Mid-ripe, Ripe, Over-ripe) to determine the kernel oil content. Pearson correlation analysis and one-way analysis of variance (ANOVA) show that the division of walnut ripeness reflects the change in kernel oil content, so it is feasible to estimate the kernel oil content by detecting walnut ripeness. Next, we achieve ripeness detection with a lightweight YOLOv4. We adopt MobileNetV3 as the backbone feature extractor and replace traditional convolution with depthwise separable convolution. We design a parallel convolution structure with depthwise convolution stacking (PCSDCS) to reduce parameters and improve feature extraction ability. To enhance the model’s detection ability for walnuts in densely growing areas, we design a Gaussian Soft DIoU non-maximum suppression (GSDIoU-NMS) algorithm. The dataset used for model optimization contains 3600 images, of which 2880 are in the training set, 320 in the validation set, and 400 in the test set. We adopt a multi-training strategy based on dynamic learning rate and transfer learning to obtain the training weights. The lightweight YOLOv4 model achieves 94.05% mean average precision, 90.72% precision, 88.30% recall, an average detection speed of 76.92 FPS, and a weight size of 38.14 MB. Compared with the Faster R-CNN, EfficientDet-D1, YOLOv3, and YOLOv4 models, the lightweight YOLOv4 model improves mean average precision by 8.77%, 4.84%, 5.43%, and 0.06%, and detection speed by 74.60 FPS, 55.60 FPS, 38.83 FPS, and 46.63 FPS, respectively. The lightweight YOLOv4 is also 84.4% smaller than the original YOLOv4 model in weight size. This paper provides a theoretical reference for rapid walnut ripeness detection and an exploration of model lightweighting.
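The abstract names GSDIoU-NMS but does not spell it out. The sketch below shows one plausible reading, assuming it combines Gaussian Soft-NMS score decay with DIoU as the overlap measure. The function names, the `sigma` and `score_thresh` defaults, and the clamping of negative DIoU to zero are illustrative assumptions, not the paper's implementation.

```python
import math

def diou(a, b):
    """Distance-IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centres.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - d2 / c2 if c2 > 0 else iou

def gaussian_soft_diou_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Return (index, score) pairs in the order boxes are selected.

    Assumption: overlapping boxes are not discarded outright; their scores
    decay by exp(-overlap^2 / sigma), with overlap = max(0, DIoU).
    """
    scores = list(scores)
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        keep.append((best, scores[best]))
        survivors = []
        for i in remaining:
            overlap = max(0.0, diou(boxes[best], boxes[i]))
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
            if scores[i] >= score_thresh:
                survivors.append(i)
        remaining = survivors
    return keep
```

With decay instead of hard suppression, a heavily overlapped detection keeps a reduced score rather than being removed, which is the usual motivation for Soft-NMS variants in crowded scenes.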

An Effective Approach in Fusion of Multispectral Medical Images Using Convolution Structure Sparse Coding

2021 6th International Conference on Communication and Electronics Systems (ICCES) ◽

10.1109/icces51350.2021.9489232 ◽

2021 ◽

Author(s):

S.PradeepKumar Reddy ◽

R V Krishnaiah ◽

Y Rajasree Rao

Keyword(s):

Sparse Coding ◽

Medical Images ◽

Convolution Structure

Skeleton-based action recognition with temporal action graph and temporal adaptive graph convolution structure

Multimedia Tools and Applications ◽

10.1007/s11042-021-11136-z ◽

2021 ◽

Author(s):

Yi Cao ◽

Chen Liu ◽

Zilong Huang ◽

Yongjian Sheng ◽

Yongjian Ju

Keyword(s):

Action Recognition ◽

Convolution Structure ◽

Temporal Action

Music Feature Classification Based on Recurrent Neural Networks with Channel Attention Mechanism

Mobile Information Systems ◽

10.1155/2021/7629994 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Jie Gan

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Traditional Music ◽

Attention Mechanism ◽

Feature Classification ◽

Data Set ◽

Hard Drives ◽

Music Classification ◽

Online Music ◽

Convolution Structure

With the advancement of multimedia and digital technologies, music resources on the Internet are increasing rapidly, which has shifted listeners’ habits from hard drives to online music platforms. This has allowed researchers to use classification technologies for efficient storage, organization, retrieval, and recommendation of music resources. Traditional music classification methods use many hand-designed acoustic features, which require knowledge of the music field, and the features used for one classification task often do not transfer to another. This paper addresses this problem by proposing a novel recurrent neural network method with a channel attention mechanism for music feature classification. Music classification methods based on a convolutional neural network ignore the timing characteristics of the audio itself. Therefore, this paper combines a convolution structure with a bidirectional recurrent neural network and uses the attention mechanism to assign different attention weights to the output of the recurrent neural network at different times, so as to obtain a better representation of the overall characteristics of the music. The classification accuracy of the model on the GTZAN data set reaches 93.1%. The AUC on the multilabel tagging data set MagnaTagATune reaches 92.3%, surpassing the other methods compared. An analysis of the individual music labels shows that the method labels most music genre labels well and also performs well on some labels in the musical instrument, singing, and emotion categories.
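The step of assigning attention weights to the recurrent network's output at different times can be sketched as attention pooling. This is a minimal illustration that assumes dot-product scoring against a learned query vector. The abstract does not state the scoring function, so `attention_pool` and its `query` argument are hypothetical.

```python
import math

def attention_pool(hidden_states, query):
    """Weight each time step's RNN output and sum them into one vector.

    hidden_states: list of per-time-step output vectors (equal length).
    query: vector of the same length (assumed to be learned in training).
    """
    # One scalar relevance score per time step (dot product).
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    # Softmax, shifted by the maximum score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum over time.
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context
```

The returned `context` is the single representation a classifier layer would consume, and `weights` shows which time steps contributed most.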

Modeling 3D Convolution Architecture for Actions Recognition

ASME 2021 30th Conference on Information Storage and Processing Systems ◽

10.1115/isps2021-65036 ◽

2021 ◽

Author(s):

Bogdan Alexandru Radulescu ◽

Victorita Radulescu

Keyword(s):

Behavior Analysis ◽

Action Recognition ◽

Activation Function ◽

Training Time ◽

Temporal Features ◽

Proposed Model ◽

Convolution Model ◽

Convolution Structure ◽

Spatio Temporal ◽

Behavior Characteristics

Action recognition infrastructure can be applied wherever behavior analysis is required and is currently a highly topical domain in security and surveillance. The model based on 3D convolutions is a middle ground between simple key-frame approaches based on 2D convolutions and more complex approaches based on recurrent neural networks. Behavior analysis is a domain greatly improved by action recognition. By placing human actions in different categories, it is possible to extract statistics regarding a person’s behavior, characteristics, abilities, and preferences, which can be processed later by specialized personnel, depending on the selected domain. The proposed model follows a simple 3D convolution architecture. Hidden layers are composed of a convolution operation, an activation function and, sometimes, a pooling layer. Leaky ReLU was used as the activation function to alleviate the problem of vanishing gradients. Batch normalization, a technique for scaling and adjusting the output of an activation layer, was used to reduce over-fitting and decrease the training time. The 3D convolution structure has the advantage of learning spatio-temporal features, because the convolution is applied over a sequence of frames. The proposed 3D convolution model achieves average results, with an accuracy of approximately 55% on the NTU RGB+D dataset.
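To make concrete why a 3D convolution can learn spatio-temporal features, the sketch below slides one kernel over frames, rows, and columns at once and applies Leaky ReLU. It is a single-channel, stride-1, no-padding illustration. The function names and the 0.01 slope are assumptions, and pooling and batch normalization are omitted.

```python
def leaky_relu(x, slope=0.01):
    """Pass positives through; scale negatives by a small slope."""
    return x if x > 0 else slope * x

def conv3d_valid(volume, kernel):
    """Naive single-channel 3D convolution (cross-correlation, as in most
    deep learning libraries) followed by Leaky ReLU.

    volume: nested list indexed [frame][row][col].
    kernel: nested list indexed [frame][row][col], no larger than volume.
    """
    frames, rows, cols = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(frames - kt + 1):
        plane = []
        for y in range(rows - kh + 1):
            row = []
            for x in range(cols - kw + 1):
                acc = 0.0
                # The window spans kt consecutive frames, so each output
                # value mixes spatial and temporal neighbours.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(leaky_relu(acc))
            plane.append(row)
        out.append(plane)
    return out
```

A 2D key-frame model would correspond to `kt == 1`; any larger temporal extent lets the kernel respond to motion between frames.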

Quantification of Cognitive Function in Alzheimer’s Disease Based on Deep Learning

Frontiers in Neuroscience ◽

10.3389/fnins.2021.651920 ◽

2021 ◽

Vol 15 ◽

Author(s):

Yanxian He ◽

Jun Wu ◽

Li Zhou ◽

Yi Chen ◽

Fang Li ◽

...

Keyword(s):

Alzheimer’s Disease ◽

Graph Theory ◽

Visual Analysis ◽

Three Dimensional ◽

Brain Network ◽

Test Method ◽

Node Degree ◽

Convolution Process ◽

Convolution Structure

Alzheimer’s disease (AD) mainly manifests as insidious onset, chronic progressive cognitive decline, and non-cognitive neuropsychiatric symptoms, which seriously affect the quality of life of the elderly and place a heavy burden on society and families. This paper uses graph theory to analyze the constructed brain network and extracts the node degree, node efficiency, and node betweenness centrality parameters of the two modal brain networks. A t-test is used to analyze the differences in graph theory parameters between normal people and AD patients, and brain regions with significant differences are selected as brain network features. The computational complexity of the conventional convolutional layer and the depthwise separable convolution unit is compared by analyzing how each is calculated. The depthwise separable convolution unit decomposes the traditional convolution process into a spatial convolution for feature extraction and a pointwise convolution for feature combination, which greatly reduces the number of multiplication and addition operations in the convolution process while still obtaining comparable results. Aiming at the special convolution structure of the depthwise separable convolution unit, this paper proposes a channel pruning method based on the convolution structure and explains its pruning process. Multimodal neuroimaging can provide complete information for the quantification of Alzheimer’s disease. This paper proposes a cascaded three-dimensional neural network framework based on single-modal and multi-modal images, using MRI and PET images to distinguish AD and MCI from normal samples. Multiple three-dimensional CNNs are used to extract recognizable information in local image blocks. A high-level two-dimensional CNN fuses the multi-modal features and selects the features of discriminative regions to perform quantitative predictions on samples. The proposed algorithm can automatically extract and fuse multi-modality and multi-region features layer by layer, and the visual analysis results show that the abnormally changed regions affected by Alzheimer’s disease provide important information for clinical quantification.
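The reduction in multiplication and addition operations can be checked with a short calculation. This sketch uses the standard operation counts for a stride-1 layer with a square kernel and ignores bias terms. The layer sizes in the usage note are illustrative and not taken from the paper.

```python
def standard_conv_mult_adds(out_h, out_w, c_in, c_out, k):
    """Multiply-adds of a standard k x k convolution layer."""
    return out_h * out_w * c_in * c_out * k * k

def separable_conv_mult_adds(out_h, out_w, c_in, c_out, k):
    """Multiply-adds of a depthwise separable unit: one k x k spatial filter
    per input channel, then a 1 x 1 pointwise convolution across channels."""
    depthwise = out_h * out_w * c_in * k * k
    pointwise = out_h * out_w * c_in * c_out
    return depthwise + pointwise

def cost_ratio(c_out, k):
    """Separable cost divided by standard cost: 1 / c_out + 1 / k^2."""
    return 1.0 / c_out + 1.0 / (k * k)
```

For a hypothetical 32 x 32 output with 64 input channels, 128 output channels, and a 3 x 3 kernel, the counts are 75,497,472 versus 8,978,432, roughly a ninth of the standard cost, which matches `cost_ratio(128, 3)`.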

Research on multi-path dense networks for MRI spinal segmentation

PLoS ONE ◽

10.1371/journal.pone.0248303 ◽

2021 ◽

Vol 16 (3) ◽

pp. e0248303

Author(s):

ShuFen Liang ◽

Huilin Liu ◽

Chen Chen ◽

Chuanbo Qin ◽

FangChen Yang ◽

...

Keyword(s):

Feature Fusion ◽

Recognition Rate ◽

Semantic Segmentation ◽

Magnetic Resonance Images ◽

Anatomical Structures ◽

Dense Networks ◽

Robust Segmentation ◽

Level Information ◽

Convolution Structure ◽

New Type

Accurate and robust segmentation of anatomical structures from magnetic resonance images is valuable in many computer-aided clinical tasks. Traditional codec networks are not satisfactory because of their low edge-segmentation accuracy, low target recognition rate, and loss of detailed information. To address these problems, this study proposes a series of improved models for semantic segmentation and progressively optimizes them in three aspects: the convolution module, the codec unit, and feature fusion. Instead of the standard convolution structure, we apply a new type of convolution module for feature extraction. The networks integrate a multi-path method to obtain richer edge detail. Finally, a dense network is used to strengthen feature fusion and integrate information from more levels. The best network reaches an Accuracy, Dice coefficient, and Jaccard index of 0.9855, 0.9185, and 0.8507, respectively; these metrics increased by 1.0%, 4.0%, and 6.1%, respectively. The Boundary F1-Score reached 0.9124, indicating that the proposed networks can segment smaller targets and obtain smoother edges. Our methods obtain more key information than traditional methods and achieve superior segmentation performance.
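For readers unfamiliar with the overlap metrics reported above, the sketch below computes the Dice coefficient and Jaccard index for a pair of binary masks. The flat-list mask representation and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_and_jaccard(pred, truth):
    """Dice coefficient and Jaccard index of two binary masks.

    pred, truth: equal-length flat sequences of 0/1 labels (one per pixel).
    """
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_area = sum(1 for p in pred if p)
    truth_area = sum(1 for t in truth if t)
    union = pred_area + truth_area - inter
    if union == 0:
        # Both masks empty: treat as perfect agreement (a convention).
        return 1.0, 1.0
    dice = 2.0 * inter / (pred_area + truth_area)
    jaccard = inter / union
    return dice, jaccard
```

For a single pair of masks the two are linked by J = D / (2 - D). Scores averaged over many images, as reported metrics typically are, need not satisfy that identity exactly.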

High Accuracy Interpolation of DEM Using Generative Adversarial Network

Remote Sensing ◽

10.3390/rs13040676 ◽

2021 ◽

Vol 13 (4) ◽

pp. 676

Author(s):

Li Yan ◽

Xingfen Tang ◽

Yi Zhang

Keyword(s):

Generative Adversarial Networks ◽

Traditional Methods ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Adversarial Networks ◽

Interpolation Accuracy ◽

Quantitative Accuracy ◽

Digital Elevation ◽

Convolution Structure ◽

Elevation Model

Digital elevation model (DEM) interpolation aims to predict the elevation values of unobserved locations, given a series of collected points. Traditional interpolation methods have been widely used over the years but can easily lead to accuracy degradation. In recent years, generative adversarial networks (GANs) have been proven to be more efficient than the traditional methods. However, the interpolation accuracy is not guaranteed. In this paper, we propose a GAN-based network named gated and symmetric-dilated U-net GAN (GSUGAN) for improved DEM interpolation, which performs visibly and quantitatively better than the traditional methods and the conditional encoder-decoder GAN (CEDGAN). We also discuss combinations of new techniques in the generator; the results show that the gated convolution and symmetric dilated convolution structure perform slightly better. Furthermore, based on the performance of the different methods, we conclude that the Convolutional Neural Network (CNN)-based method has an advantage in quantitative accuracy, but the GAN-based method can obtain better visual quality, especially in complex terrains. In summary, we propose a GAN-based network for improved DEM interpolation and further illustrate the GAN-based method’s performance compared to that of the CNN-based method.
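The two generator ingredients named above can be illustrated in one dimension. This is a generic sketch of dilated and gated convolution, not the GSUGAN layers: the paper works on 2D elevation grids, the "symmetric" arrangement is not reproduced, and the feature branch is left linear for brevity. All function names are hypothetical.

```python
import math

def dilated_conv1d(signal, weights, dilation=1):
    """1D convolution (cross-correlation) whose taps are `dilation` samples
    apart, widening the receptive field without adding weights."""
    span = (len(weights) - 1) * dilation
    return [sum(w * signal[i + k * dilation] for k, w in enumerate(weights))
            for i in range(len(signal) - span)]

def gated_dilated_conv1d(signal, feature_weights, gate_weights, dilation=1):
    """Gated convolution: a second filter produces a sigmoid gate in (0, 1)
    that scales each feature value, so the layer can learn to down-weight
    unreliable (for example unobserved) positions.

    Both weight lists are assumed to have the same length.
    """
    features = dilated_conv1d(signal, feature_weights, dilation)
    gates = dilated_conv1d(signal, gate_weights, dilation)
    return [f / (1.0 + math.exp(-g)) for f, g in zip(features, gates)]
```

With zero gate weights every gate equals 0.5, so the output is simply half the plain dilated convolution, which makes the gating effect easy to see.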

Convolution structures for an Orlicz space with respect to vector measures on a compact group

Proceedings of the Edinburgh Mathematical Society ◽

10.1017/s0013091521000018 ◽

2021 ◽

Vol 64 (1) ◽

pp. 87-98

Author(s):

Manoj Kumar ◽

N. Shravan Kumar

Keyword(s):

Abelian Group ◽

Compact Group ◽

Orlicz Space ◽

Vector Measure ◽

Young Function ◽

Natural Conditions ◽

Vector Measures ◽

Convolution Structure

The aim of this paper is to present some results about the space $L^{\varPhi }(\nu ),$ where $\nu$ is a vector measure on a compact (not necessarily abelian) group and $\varPhi$ is a Young function. We show that under natural conditions, the space $L^{\varPhi }(\nu )$ becomes an $L^{1}(G)$-module with respect to the usual convolution of functions. We also define one more convolution structure on $L^{\varPhi }(\nu ).$
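For orientation, the "usual convolution of functions" on a compact group $G$ is normally defined with respect to the normalized Haar measure. The display below is the textbook definition and the associativity property behind a module action. It is stated here as background, assuming $f, f_1, f_2 \in L^{1}(G)$ and $g \in L^{\varPhi}(\nu)$, and is not quoted from the paper.

```latex
% Convolution on a compact group G with normalized Haar measure dy
(f * g)(x) = \int_{G} f(y)\, g(y^{-1}x)\, dy , \qquad x \in G .

% Associativity, which makes g \mapsto f * g a module action of L^1(G)
(f_1 * f_2) * g = f_1 * (f_2 * g) .
```

Read this way, the module statement in the abstract would mean that $f * g$ again lies in $L^{\varPhi }(\nu )$ whenever $f \in L^{1}(G)$ and $g \in L^{\varPhi }(\nu )$.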

An Antinoise Fault Diagnosis Method Based on Multiscale 1DCNN

Shock and Vibration ◽

10.1155/2020/8819313 ◽

2020 ◽

Vol 2020 ◽

pp. 1-10

Author(s):

Jie Cao ◽

Zhidong He ◽

Jinhua Wang ◽

Ping Yu

Keyword(s):

Activation Function ◽

Vibration Sensor ◽

Training Set ◽

Convolutional Network ◽

Convolution Structure ◽

Diagnosis Method ◽

Intense Noise ◽

Comparison Algorithms ◽

Classification Information ◽

Improved Methods

In practice, the bearing state signal collected by the vibration sensor contains a large amount of environmental noise, which reduces the accuracy of the convolutional network in identifying bearing faults. To solve this problem, a one-dimensional convolutional neural network with a multiscale kernel (MSK-1DCNN) is proposed to enhance the classification information of the input. A two-layer multiscale convolution structure (MSK) is used at the front of the network. MSK has five convolutional kernels of different sizes, which are used to extract features at varying resolutions from the original signal. In the multiscale convolution structure, the ELU activation function is used instead of the ReLU function to improve the antinoise ability of MSK-1DCNN. In addition, pepper noise is added to the training set to corrupt the input data, forcing the network to learn more representative features and improving its robustness. Experimental results show that the improved methods proposed in this paper effectively enhance the diagnostic performance of MSK-1DCNN under intense noise, and the diagnostic accuracy is higher than that of the other algorithms compared.
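The multiscale idea can be sketched as several kernels of different sizes applied to the same signal, each followed by ELU. This is a single-layer illustration under stated assumptions: the kernel sizes, zero padding, and the modelling of pepper noise as zeroing a random fraction of samples are hypothetical choices, since the abstract gives none of these details.

```python
import math
import random

def elu(x, alpha=1.0):
    """ELU keeps a smooth, bounded negative response instead of ReLU's 0."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def conv1d_same(signal, kernel):
    """Zero-padded 1D convolution (cross-correlation) plus ELU.

    Output length equals input length; intended for odd kernel sizes.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [elu(sum(k * padded[i + j] for j, k in enumerate(kernel)))
            for i in range(len(signal))]

def multiscale_features(signal, kernels):
    """Apply every kernel to the same input; short kernels capture fine
    detail and long kernels capture coarser trends."""
    return [conv1d_same(signal, kernel) for kernel in kernels]

def add_pepper_noise(signal, fraction, rng):
    """Corrupt training data by zeroing roughly `fraction` of the samples
    (an assumed interpretation of pepper noise for 1D signals)."""
    return [0.0 if rng.random() < fraction else v for v in signal]
```

A call such as `add_pepper_noise(x, 0.1, random.Random(0))` corrupts about a tenth of the samples reproducibly; the five kernels mentioned in the abstract would be passed as the `kernels` list.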
