Multiscale Efficient Channel Attention for Fusion Lane Line Segmentation

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Kang Liu ◽  
Xin Gao

The use of multimodal sensors for lane line segmentation has become a growing trend. To achieve robust multimodal fusion, we introduce a new multimodal fusion method and demonstrate its effectiveness in an improved fusion network. Specifically, a multiscale fusion module is proposed to extract effective features from data of different modalities, and a channel attention module is used to adaptively calculate the contribution of the fused feature channels. We verified the effect of multimodal fusion on the KITTI benchmark dataset and the A2D2 dataset, and demonstrated the effectiveness of the proposed method on the enhanced KITTI dataset. Our method achieves robust lane line segmentation, with precision 4.53% higher than that of direct fusion and the highest F2 score, 79.72%. We believe that our method introduces an optimization idea at the level of modal data structure for multimodal fusion.
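The channel attention step described above can be illustrated with a minimal sketch. It assumes an ECA-style design (global average pooling, a 1D convolution across neighboring channels, then a sigmoid), which is suggested by the title but not detailed in the abstract. The function name `channel_attention`, the uniform kernel standing in for learned weights, and the default `kernel_size` are all illustrative assumptions, not the authors' implementation.

```python
import math

def channel_attention(features, kernel_size=3):
    """Reweight the channels of a fused feature map (illustrative sketch).

    features: list of C channels, each a 2D list (H x W) of floats.
    kernel_size: odd width of the 1D convolution over the channel axis.
    Returns (weights, reweighted_features).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
              for ch in features]

    # Local cross-channel interaction: 1D convolution with zero padding.
    # Assumption: a uniform kernel replaces the weights a network would learn.
    kernel = [1.0 / kernel_size] * kernel_size
    pad = kernel_size // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [sum(k * padded[c + i] for i, k in enumerate(kernel))
             for c in range(len(pooled))]

    # Excite: a sigmoid maps each score to a weight in (0, 1).
    weights = [1.0 / (1.0 + math.exp(-m)) for m in mixed]

    # Scale every value in a channel by that channel's weight.
    scaled = [[[w * v for v in row] for row in ch]
              for w, ch in zip(weights, features)]
    return weights, scaled

# Three hypothetical fused channels of size 2 x 2.
fused = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
weights, reweighted = channel_attention(fused)
```

Each channel receives a single weight computed from its own pooled value and those of its neighbors, so the cost grows with the number of channels rather than with its square.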

Author(s):  
Zhenhong Zou ◽  
Xinyu Zhang ◽  
Huaping Liu ◽  
Zhiwei Li ◽  
Amir Hussain ◽  
...  

Symmetry ◽  
2020 ◽  
Vol 12 (4) ◽  
pp. 679
Author(s):  
Muhammad Anwar Ma’sum

Classification of multi-modal data is one of the challenges in the machine learning field. Multi-modal data need special treatment because their features are distributed in several areas. This study proposes multi-codebook fuzzy neural networks that use intelligent clustering and dynamic incremental learning for multi-modal data classification. We utilized intelligent K-means clustering based on anomalous patterns and intelligent K-means clustering based on histogram information. Clustering is used to generate codebook candidates before the training process, while incremental learning is utilized when the condition to generate a new codebook is satisfied. The condition to generate a new codebook in incremental learning is based on the similarity of the winner class and other classes. The proposed method was evaluated on synthetic and benchmark datasets. The experimental results showed that the proposed multi-codebook fuzzy neural networks that use dynamic incremental learning have significant improvements compared to the original fuzzy neural networks. The improvements were 15.65%, 5.31%, and 11.42% on the synthetic dataset, the benchmark dataset, and the average of all datasets, respectively, for incremental version 1. Incremental learning version 2 improved by 21.08%, 4.63%, and 14.35% on the synthetic dataset, the benchmark dataset, and the average of all datasets, respectively. The multi-codebook fuzzy neural networks that use intelligent clustering also had significant improvements compared to the original fuzzy neural networks, achieving 23.90%, 2.10%, and 15.02% improvements on the synthetic dataset, the benchmark dataset, and the average of all datasets, respectively.
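To make the idea of generating codebook candidates by clustering more concrete, here is a minimal sketch of anomalous-pattern initialization as that term is commonly used for intelligent K-means: repeatedly seed a cluster at the point farthest from the grand mean, refine it against that fixed reference, and peel it off. The abstract does not specify the exact variant, so the function name `anomalous_pattern_centers`, the `min_size` threshold, and the iteration cap are assumptions rather than the paper's procedure.

```python
def anomalous_pattern_centers(points, min_size=2, max_iter=100):
    """Return candidate centers by peeling off anomalous clusters.

    points: list of equal-length tuples of floats.
    min_size: clusters smaller than this are discarded (assumed threshold).
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def mean(group):
        return tuple(sum(col) / len(group) for col in zip(*group))

    reference = mean(points)      # grand mean, kept fixed throughout
    remaining = list(points)
    centers = []
    while remaining:
        # Seed with the remaining point farthest from the reference.
        seed = max(remaining, key=lambda p: dist2(p, reference))
        center, cluster = seed, [seed]
        for _ in range(max_iter):
            # Members are points closer to the center than to the reference.
            members = [p for p in remaining
                       if dist2(p, center) < dist2(p, reference)]
            if not members:
                break
            cluster = members
            new_center = mean(cluster)
            if new_center == center:
                break
            center = new_center
        if len(cluster) >= min_size:
            centers.append(center)
        # Remove the cluster so the next pass looks at what is left.
        remaining = [p for p in remaining if p not in cluster]
    return centers

# Hypothetical 2D data: three small groups plus one point near the mean.
data = [(0.0, 10.0), (0.0, 11.0), (10.0, -5.0), (11.0, -5.0),
        (-10.0, -5.0), (-12.0, -5.0), (0.0, 0.0)]
candidates = anomalous_pattern_centers(data)
```

The number of surviving centers doubles as an estimate of K, which is why this style of initialization is attractive when the number of codebooks per class is not known in advance.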


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Hu Zhu ◽  
Ze Wang ◽  
Yu Shi ◽  
Yingying Hua ◽  
Guoxia Xu ◽  
...  

Multimodal fusion is one of the popular directions of multimodal research, and it is also an emerging research field of artificial intelligence. Multimodal fusion aims to take advantage of the complementarity of heterogeneous data and to provide reliable classification for the model. Multimodal data fusion transforms data from multiple single-mode representations into a compact multimodal representation. Most previous multimodal data fusion studies used tensor-based multimodal representations. As the input is converted into a tensor, the dimensions and computational complexity increase exponentially. In this paper, we propose a low-rank tensor multimodal fusion method with an attention mechanism, which improves efficiency and reduces computational complexity. We evaluate our model on three multimodal fusion tasks based on the public datasets CMU-MOSI, IEMOCAP, and POM. Our model achieves good performance while flexibly capturing the global and local connections. Compared with other tensor-based multimodal fusion methods, experiments show that our model consistently achieves better results under a series of attention mechanisms.
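The efficiency argument behind low-rank tensor fusion can be checked with a small sketch. Assuming two modality vectors and a scalar output, projecting the explicit outer-product tensor with a weight matrix gives the same result as summing products of per-modality dot products whenever that weight matrix is a sum of rank-one terms. The function names and toy numbers are illustrative, and the attention mechanism of the paper is not modeled here.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def weight_from_factors(factors1, factors2):
    """Build W = sum_r outer(factors1[r], factors2[r])."""
    rows, cols = len(factors1[0]), len(factors2[0])
    return [[sum(f1[i] * f2[j] for f1, f2 in zip(factors1, factors2))
             for j in range(cols)] for i in range(rows)]

def full_tensor_fusion(z1, z2, weight):
    """y = sum_ij W[i][j] * z1[i] * z2[j], via the explicit outer product."""
    outer = [[a * b for b in z2] for a in z1]   # len(z1) * len(z2) entries
    return sum(weight[i][j] * outer[i][j]
               for i in range(len(z1)) for j in range(len(z2)))

def low_rank_fusion(z1, z2, factors1, factors2):
    """Same y, computed without ever forming the outer product."""
    return sum(dot(f1, z1) * dot(f2, z2)
               for f1, f2 in zip(factors1, factors2))

# Hypothetical features for two modalities and rank-2 factors.
z1, z2 = [1.0, 2.0], [3.0, 4.0, 5.0]
factors1 = [[1.0, 0.0], [0.5, -1.0]]
factors2 = [[1.0, 1.0, 1.0], [2.0, 0.0, -1.0]]

y_full = full_tensor_fusion(z1, z2, weight_from_factors(factors1, factors2))
y_low = low_rank_fusion(z1, z2, factors1, factors2)
```

With M modalities of dimension d, the explicit tensor has d^M entries, while the factored form needs only R·M·d weights for rank R, which is the source of the savings the abstract refers to.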


Author(s):  
Yinhuan ZHANG ◽  
Qinkun XIAO ◽  
Chaoqin CHU ◽  
Heng XING

The proposed multi-modal data fusion method, based on IA-Net and CHMM, is designed to address the low accuracy of human behavior recognition caused by incomplete target behavior information in complex home environments. Two improved neural networks (STA-ResNet50 and STA-GoogleNet) are each combined with an LSTM to form two IA-Nets, which extract RGB and skeleton modal behavior features from video. The two modal feature sequences are input to the CHMM to construct a probabilistic fusion model for multi-modal behavior recognition. The experimental results show that the proposed human behavior recognition model achieves higher accuracy than previous fusion methods on the HMDB51 and UCF101 datasets. New contributions: an attention mechanism is introduced to improve the efficiency of video target feature extraction and utilization. A skeleton-based feature extraction framework is proposed, which can be used for human behavior recognition in complex environments. In the field of human behavior recognition, probability theory and neural networks are combined, which provides a new method for multi-modal information fusion.


2021 ◽  
Vol 3 ◽  
Author(s):  
Juan Song ◽  
Jian Zheng ◽  
Ping Li ◽  
Xiaoyuan Lu ◽  
Guangming Zhu ◽  
...  

Alzheimer's disease (AD) is an irreversible brain disease that severely damages human thinking and memory. Early diagnosis plays an important part in the prevention and treatment of AD. Neuroimaging-based computer-aided diagnosis (CAD) has shown that deep learning methods using multimodal images are beneficial to guide AD detection. In recent years, many methods based on multimodal feature learning have been proposed to extract and fuse latent representation information from different neuroimaging modalities including magnetic resonance imaging (MRI) and 18-fluorodeoxyglucose positron emission tomography (FDG-PET). However, these methods lack the interpretability required to clearly explain the specific meaning of the extracted information. To make the multimodal fusion process more persuasive, we propose an image fusion method to aid AD diagnosis. Specifically, we fuse the gray matter (GM) tissue area of brain MRI and FDG-PET images by registration and mask coding to obtain a new fused modality called “GM-PET.” The resulting single composite image emphasizes the GM area that is critical for AD diagnosis, while retaining both the contour and metabolic characteristics of the subject's brain tissue. In addition, we use the three-dimensional simple convolutional neural network (3D Simple CNN) and 3D Multi-Scale CNN to evaluate the effectiveness of our image fusion method in binary classification and multi-classification tasks. Experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset indicate that the proposed image fusion method achieves better overall performance than unimodal and feature fusion methods, and that it outperforms state-of-the-art methods for AD diagnosis.
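The mask-based fusion idea can be sketched as follows. The abstract does not spell out the mask coding step, so this is only one plausible reading: a gray matter probability map derived from MRI is thresholded into a binary mask and applied to an already registered PET volume. The function name `fuse_gm_pet`, the threshold of 0.5, and the nested-list volume format are assumptions for illustration.

```python
def fuse_gm_pet(gm_probability, pet, threshold=0.5):
    """Keep PET intensities only inside the gray matter mask.

    gm_probability, pet: 3D nested lists of identical shape, assumed to be
    registered to the same space beforehand.
    threshold: assumed cutoff that turns GM probabilities into a binary mask.
    """
    fused = []
    for gm_slice, pet_slice in zip(gm_probability, pet):
        fused.append([
            # Voxels outside the mask are set to zero.
            [p if g >= threshold else 0.0 for g, p in zip(gm_row, pet_row)]
            for gm_row, pet_row in zip(gm_slice, pet_slice)
        ])
    return fused

# Hypothetical 1 x 2 x 2 volumes.
gm = [[[0.9, 0.1], [0.6, 0.4]]]
pet = [[[5.0, 7.0], [3.0, 2.0]]]
gm_pet = fuse_gm_pet(gm, pet)
```

The output is a single volume with the shape of the inputs, so it can be fed to a 3D CNN in the same way as a unimodal image.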

