Extreme Image Classification Algorithm Based on Multicore Dense Connection Network

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Daolei Wang ◽  
Tianyu Zhang ◽  
Rui Zhu ◽  
Mingshan Li ◽  
Jiajun Sun

Extreme images are low-quality images captured under extreme environmental conditions such as haze, heavy rain, strong light, or camera shake, which prevent a vision system from effectively recognizing targets. Most existing extreme-image restoration algorithms handle only one type of degradation; effectively recognizing all kinds of extreme images remains a challenge. This paper therefore proposes a classification and restoration algorithm for extreme images. Because the features of extreme images vary widely, existing models such as DenseNet struggle to extract deep features effectively. To solve the classification problem in the algorithm, we propose a Multicore Dense Connection Network (MDCNet). MDCNet consists of a dense part, an attention part, and a classification part. The dense part uses two dense blocks with different convolution kernel sizes to extract features at different scales; the attention part uses channel and spatial attention mechanisms to amplify the effective information in the feature maps; the classification part consists mainly of two convolutional layers and two fully connected layers that extract and classify the features. Experiments show that MDCNet achieves a recall of 92.75% on an extreme image dataset. Moreover, the mAP of object detection improves by about 16% after images are processed by the classification and restoration algorithm.
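The two attention stages can be illustrated with a minimal pure-Python sketch. The sigmoid gates below are placeholders; the abstract does not specify MDCNet's exact attention layers, so this shows only the general channel-then-spatial re-weighting idea on a feature map stored as nested lists `[channel][row][col]`:

```python
import math

def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(fmap):
    """Channel attention sketch: re-weight each channel by a gate computed
    from its global average (squeeze-and-excitation style)."""
    means = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]
    gates = [_sigmoid(m) for m in means]
    return [[[v * g for v in row] for row in ch] for ch, g in zip(fmap, gates)]

def spatial_attention(fmap):
    """Spatial attention sketch: re-weight each position by a gate computed
    from the cross-channel mean at that position."""
    h, w = len(fmap[0]), len(fmap[0][0])
    gate = [[_sigmoid(sum(ch[i][j] for ch in fmap) / len(fmap))
             for j in range(w)] for i in range(h)]
    return [[[ch[i][j] * gate[i][j] for j in range(w)] for i in range(h)]
            for ch in fmap]
```

Channels or positions with stronger activations receive gates closer to 1 and are amplified relative to weaker ones, which is the "amplify the effective information" behavior described above.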

2020 ◽  
Vol 34 (04) ◽  
pp. 5742-5749
Author(s):  
Xiaoshuang Shi ◽  
Fuyong Xing ◽  
Yuanpu Xie ◽  
Zizhao Zhang ◽  
Lei Cui ◽  
...  

Although attention mechanisms are widely used in deep learning, they are rarely applied to multiple instance learning (MIL) problems, where only a general category label is given for the multiple instances contained in one bag. Moreover, previous deep MIL methods first use an attention mechanism to learn instance weights and then employ a fully connected layer to predict the bag label, so the bag prediction depends heavily on the quality of the learned instance weights. To alleviate this issue, this paper proposes a novel loss-based attention mechanism that simultaneously learns instance weights, instance predictions, and bag predictions for deep multiple instance learning. Specifically, it calculates instance weights from the loss function (e.g., softmax+cross-entropy) and shares parameters with the fully connected layer that produces the instance and bag predictions. In addition, a regularization term combining the learned weights with cross-entropy functions boosts instance recall, and a consistency cost smooths the training of the neural networks to improve generalization. Extensive experiments on multiple types of benchmark databases demonstrate that the proposed attention mechanism is a general, effective, and efficient framework that achieves superior bag and image classification performance over other state-of-the-art MIL methods, while obtaining higher instance precision and recall than previous attention mechanisms. Source code is available at https://github.com/xsshi2015/Loss-Attention.
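The core idea of deriving instance weights from the loss can be sketched in a few lines. This is an interpretation for illustration, not the paper's exact formulation: each instance's cross-entropy against the bag label is turned into a softmax weight, so instances that fit the bag label better contribute more to the pooled bag prediction:

```python
import math

def loss_attention_pool(instance_probs, bag_label):
    """Loss-based attention sketch for MIL: instance weights come from the
    per-instance loss, and the bag score is the weight-pooled prediction."""
    # per-instance cross-entropy against the bag label (probs must be in (0, 1))
    losses = [-math.log(p if bag_label == 1 else 1.0 - p) for p in instance_probs]
    # softmax over negated losses: low loss -> high weight
    exps = [math.exp(-l) for l in losses]
    total = sum(exps)
    weights = [e / total for e in exps]
    # bag prediction is the attention-weighted instance prediction
    bag_prob = sum(w * p for w, p in zip(weights, instance_probs))
    return weights, bag_prob
```

In the paper these weights share parameters with the fully connected prediction layer; here the coupling is simplified away to keep the pooling step visible.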


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Bo Liang ◽  
Xin-xin Jia ◽  
Yuan Lu

Image restoration is a research hotspot in computer vision and computer graphics. It uses the effective information in an image to fill in a designated damaged area, with high application value in environmental design, film and television special effects, old-photo restoration, and the removal of text or obstacles from images. In traditional sparse-representation restoration algorithms, the size of the dictionary atoms is fixed: atoms that are too large blur textured regions during repair, while atoms that are too small cause region extension when repairing smooth areas, degrading the result. In this paper, the structural sparsity of the block to be repaired is used to adjust the repair priority, and the atom size is determined adaptively by analyzing the structural information of repair blocks located in texture, edge, and smooth regions. The paper proposes a color image restoration method that adaptively determines the dictionary atom size and discusses a model based on a partial-differential-equation restoration method. Through simulation experiments, the repair results are evaluated and analyzed against both subjective and objective criteria. The simulation results show that the algorithm effectively overcomes the blurred details and region extension of fixed-dictionary restoration and significantly improves the restoration effect; comparison with several classic algorithms confirms the effectiveness of the proposed method.
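The adaptive atom-size rule can be sketched as follows. Here local variance stands in for the structural-sparsity measure, and the thresholds and atom sizes are hypothetical; the point is only the mapping the abstract describes: small atoms for texture (to avoid blurring), large atoms for smooth regions (to avoid region extension):

```python
def choose_atom_size(patch, t_texture=0.05, t_smooth=0.01):
    """Sketch of adaptive dictionary-atom sizing (thresholds hypothetical).
    High local variance (texture) -> small atom; low variance (smooth) ->
    large atom; in between (edges) -> medium atom."""
    vals = [v for row in patch for v in row]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    if var >= t_texture:
        return 4    # small atom preserves fine texture detail
    if var <= t_smooth:
        return 16   # large atom avoids extending smooth regions
    return 8        # medium atom for edge / transitional regions
```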


2021 ◽  
Vol 11 (24) ◽  
pp. 12019
Author(s):  
Chia-Chun Chuang ◽  
Chien-Ching Lee ◽  
Chia-Hong Yeng ◽  
Edmund-Cheung So ◽  
Yeou-Jiunn Chen

Monitoring blood pressure can effectively prevent blood pressure-related diseases, so a convenient and comfortable measurement approach is of real help to patients. In this study, an attention mechanism-based convolutional long short-term memory (LSTM) neural network is proposed to estimate blood pressure easily. For easy and comfortable estimation, electrocardiogram (ECG) and photoplethysmography (PPG) signals are acquired, and both their time-domain and frequency-domain representations are selected as inputs to the proposed network to represent the signals' characteristics precisely. Convolutional neural networks (CNNs) form the first part of the network and extract features automatically; an attention mechanism forms the second part and identifies the meaningful features; an LSTM forms the third part and models the time-series characteristics; finally, fully connected layers integrate the information from the preceding stages to estimate blood pressure. The experimental results show that the proposed approach outperforms CNN and CNN-LSTM models and complies with the Association for the Advancement of Medical Instrumentation standard.
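Forming the joint time- and frequency-domain input can be sketched with a naive DFT. The abstract does not specify the exact frequency representation, so this is an assumption for illustration: the magnitude spectrum of a signal window is concatenated with its raw samples:

```python
import math

def dft_magnitudes(signal):
    """Naive DFT magnitude spectrum (O(n^2); fine for a short sketch),
    giving a frequency-domain view of an ECG/PPG window."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(signal))
        im = -sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(signal))
        mags.append(math.hypot(re, im))
    return mags

def make_input(signal):
    # concatenate time-domain samples with frequency-domain magnitudes
    return list(signal) + dft_magnitudes(signal)
```

A pure cosine at one cycle per window concentrates all its energy in the k=1 bin, which is how periodic pulse components become easy for the downstream CNN to pick up.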


Author(s):  
BRENT FERGUSON ◽  
RANADHIR GHOSH ◽  
JOHN YEARWOOD

This paper reports on an experimental approach to finding a modularized artificial neural network solution for the UCI letter recognition problem. Our experiments were carried out in two parts. First, we investigate directed task decomposition, using expert knowledge and clustering approaches to find subtasks for the modules of the network. Second, we investigate processes for combining the modules effectively in a single decision process. After finding suitable modules through task decomposition, further experimentation showed that when the modules are combined under decision tree supervision, their functional error is reduced significantly, improving their combination through a decision process implemented as a small multilayer perceptron. The experiments conclude with a modularized neural network design for this classification problem that has improved learning and generalization characteristics. Its test results are markedly better than those of a single, stand-alone network with a fully connected topology.


1990 ◽  
Vol 15 (12) ◽  
pp. 688 ◽  
Author(s):  
J. B. Abbiss ◽  
C. L. Byrne ◽  
M. A. Fiddy ◽  
B. J. Brames

2021 ◽  
Vol 7 ◽  
pp. e822
Author(s):  
Zhisheng Yang ◽  
Jinyong Cheng

In deep learning, the processing of large network models with billions or even tens of billions of nodes and numerous edge types remains flawed, and recommendation accuracy suffers greatly when large network embeddings are applied to recommender systems. To address the inaccurate recommendations caused by these processing deficiencies, this paper combines an attributed multiplex heterogeneous network with an attention mechanism that incorporates the characteristics of the softsign and sigmoid functions, deriving a new framework, SSN_GATNE-T (S for the softsign function, SN for the attention mechanism introduced via the softsign function, and GATNE-T for transductive embedding learning for attributed multiplex heterogeneous networks). The attributed multiplex heterogeneous network helps obtain richer user-item information with more attributes, the model handles graphs well regardless of the number of nodes and edge types, and the improved attention mechanism helps extract more useful information; together they mine more of the latent information that improves recommendation quality. In addition, applying the softsign function in the model's fully connected layer reduces the loss of latent user information, which the model can exploit for accurate recommendation. Optimizing the model with the Adam optimizer not only speeds up convergence but also helps considerably with model tuning. The proposed SSN_GATNE-T framework was tested on two datasets of different types, Amazon and YouTube, using three evaluation indices: ROC-AUC (receiver operating characteristic-area under curve), PR-AUC (precision recall-area under curve), and F1 (F1-score). SSN_GATNE-T improved on all three indices compared with existing mainstream recommendation models.
This demonstrates both that the framework copes well with the difficulty of obtaining accurate interaction information from embeddings of large network models with many nodes and edge types, and that addressing these shortcomings of large networks improves recommendation performance. The model also offers a good solution to the cold-start problem.
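The softsign function at the heart of SSN_GATNE-T is simple to state, and comparing it with the sigmoid shows why it can reduce information loss: softsign is zero-centered and saturates only polynomially, so its gradient decays much more slowly than the sigmoid's exponential decay for large inputs:

```python
import math

def softsign(x):
    """softsign(x) = x / (1 + |x|): zero-centered, range (-1, 1),
    with gentle polynomial saturation."""
    return x / (1.0 + abs(x))

def sigmoid(x):
    """sigmoid(x) = 1 / (1 + e^-x): range (0, 1), exponential saturation."""
    return 1.0 / (1.0 + math.exp(-x))

def softsign_grad(x):
    # d/dx softsign = 1 / (1 + |x|)^2
    return 1.0 / (1.0 + abs(x)) ** 2

def sigmoid_grad(x):
    # d/dx sigmoid = sigmoid(x) * (1 - sigmoid(x))
    s = sigmoid(x)
    return s * (1.0 - s)
```

At x = 5, for example, the softsign gradient (1/36) is several times larger than the sigmoid gradient, so less information is squashed away in the fully connected layer.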


Symmetry ◽  
2021 ◽  
Vol 13 (7) ◽  
pp. 1296
Author(s):  
Wenfang Ma ◽  
Ying Hu ◽  
Hao Huang

Pitch estimation is an essential step in many audio signal processing applications. In this paper, we propose a data-driven pitch estimation network, the Dual Attention Network (DA-Net), which operates directly on the time-domain samples of monophonic music. DA-Net comprises six Dual Attention Modules (DA-Modules), each containing two kinds of attention: element-wise and channel-wise. DA-Net performs element-wise and channel-wise attention operations on convolutional features, reflecting the idea of "symmetry", and the DA-Modules model the semantic interdependencies between element-wise and channel-wise features. In a DA-Module, the element-wise attention mechanism is realized by a Convolutional Gated Linear Unit (ConvGLU), and the channel-wise attention mechanism is realized by a Squeeze-and-Excitation (SE) block. We explored three modes of combining the two attentions: serial, parallel, and tightly coupled. Element-wise attention selectively emphasizes useful features by re-weighting the features at all positions, while channel-wise attention learns to use global information to emphasize informative feature maps and suppress less useful ones; DA-Net thus adaptively integrates local features with their global dependencies. The outputs of DA-Net are fed into a fully connected layer to generate a 360-dimensional vector corresponding to 360 pitches. We trained the proposed network on the iKala and MDB-stem-synth datasets; according to the experimental results, the dual attention network with the tightly coupled mode achieved the best performance.


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Yali Peng ◽  
Ting Liang ◽  
Xiaojiang Hao ◽  
Yu Chen ◽  
Shicheng Li ◽  
...  

The demand forecast for shared bicycles directly determines vehicle utilization and the project's operating benefits, and accurate prediction based on existing operating data can reduce unnecessary deployment. Because the use of shared bicycles is affected by time dependence and external factors, most existing works consider only some of the relevant attributes, resulting in insufficient modeling and unsatisfactory prediction performance. To address these limitations, this paper establishes a novel prediction model based on a convolutional recurrent neural network with an attention mechanism, named CNN-GRU-AM. The proposed model has four parts. First, a two-layer convolutional neural network (CNN) extracts local features from the multi-source data. Second, a gated recurrent unit (GRU) captures the time-series relationships in the CNN output. Third, an attention mechanism (AM) mines the latent relationships among the sequence features, assigning each feature a weight according to its importance. Finally, a three-layer fully connected network learns the features further and outputs the prediction results. To evaluate the proposed method, we conducted extensive experiments on two datasets: a real-world mobile bicycle dataset and a public shared-bicycle dataset. The experimental results show that the proposed model outperforms other prediction models, indicating significant potential social benefits.
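The AM stage between the GRU and the fully connected layers can be sketched as a standard temporal-attention pooling (the scoring scheme here, mean activation per time step, is a hypothetical placeholder; the paper learns its own scoring): each GRU hidden state gets a softmax weight, and the weighted sum becomes the context vector passed onward:

```python
import math

def temporal_attention(hidden_states):
    """Attention over GRU outputs: score each time step, softmax the scores
    into weights, and return (weights, context vector)."""
    # score each hidden state by its mean activation (placeholder scorer)
    scores = [sum(h) / len(h) for h in hidden_states]
    # numerically stable softmax over time steps
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # context vector: weighted sum of hidden states
    dim = len(hidden_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, hidden_states))
               for i in range(dim)]
    return weights, context
```

Time steps judged more important thus dominate the context vector, which is what lets the model emphasize, say, rush-hour patterns over quiet periods.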


SIMULATION ◽  
2021 ◽  
pp. 003754972199603
Author(s):  
Khatereh Davoudi ◽  
Parimala Thulasiraman

Breast cancer is the most frequently diagnosed cancer and the leading cause of cancer mortality among women worldwide. However, it can be controlled effectively by early diagnosis followed by effective treatment. Clinical specialists take advantage of computer-aided diagnosis (CAD) systems to make their diagnoses as accurate as possible. Deep learning techniques such as the convolutional neural network (CNN), with their capacity for classification from learned features and for working with complex images, have been widely adopted in CAD systems. The parameters of the network, including the weights of the convolution filters and of the fully connected layers, play a crucial role in the classification accuracy of any CNN model. Back-propagation is the most frequently used approach for training a CNN, but it has disadvantages, such as getting stuck in local minima. In this study, we propose optimizing the weights of the CNN using a genetic algorithm (GA). The work consists of designing a CNN model to facilitate the classification process, training the model using three different optimizers (mini-batch gradient descent, Adam, and the GA), and evaluating the model through various experiments on the BreakHis dataset. We show that the CNN model trained through the GA performs as well as with the Adam optimizer, with a classification accuracy of 85%.
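Weight optimization by GA replaces gradient steps with selection, crossover, and mutation over weight vectors, which is what lets it escape some local minima that trap back-propagation. The following is a generic minimal GA, not the paper's exact operators or hyperparameters:

```python
import random

def evolve_weights(loss_fn, dim, pop_size=20, generations=50, seed=0):
    """Minimal GA sketch for weight optimization: keep the fitter half
    (elitism), breed children by uniform crossover, perturb with Gaussian
    mutation, and return the best weight vector found."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=loss_fn)          # lower loss = fitter
        elite = scored[: pop_size // 2]            # survivors, kept unmutated
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # uniform crossover: each gene from either parent
            child = [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]
            # Gaussian mutation on every gene
            child = [w + rng.gauss(0.0, 0.1) for w in child]
            children.append(child)
        pop = elite + children
    return min(pop, key=loss_fn)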
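Weight optimization by GA replaces gradient steps with selection, crossover, and mutation over weight vectors, which is what lets it escape some local minima that trap back-propagation. The following is a generic minimal GA sketch, not the paper's exact operators or hyperparameters, evolving a real-valued weight vector against an arbitrary loss function:

```python
import random

def evolve_weights(loss_fn, dim, pop_size=20, generations=50, seed=0):
    """Minimal GA for weight optimization: keep the fitter half (elitism),
    breed children by uniform crossover, perturb with Gaussian mutation,
    and return the best weight vector found."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=loss_fn)          # lower loss = fitter
        elite = scored[: pop_size // 2]            # survivors, kept unmutated
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # uniform crossover: each gene taken from either parent
            child = [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]
            # Gaussian mutation on every gene
            child = [w + rng.gauss(0.0, 0.1) for w in child]
            children.append(child)
        pop = elite + children
    return min(pop, key=loss_fn)
```

In the paper the loss would be the CNN's classification loss on a training batch with the candidate weights loaded; here any differentiable or non-differentiable `loss_fn` works, which is precisely the GA's appeal.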


Symmetry ◽  
2021 ◽  
Vol 13 (8) ◽  
pp. 1352
Author(s):  
Semih Yavuzkilic ◽  
Abdulkadir Sengur ◽  
Zahid Akhtar ◽  
Kamran Siddique

Deepfakes, widely deemed harmful, are a form of image or video manipulation in which a person's face is altered or swapped with another person's face using artificial neural networks, and such manipulations can be produced with a variety of techniques and applications. The quintessential countermeasure is a deepfake detection method. Most existing detection methods perform well under symmetric data distributions but are still not robust to asymmetric dataset variations or to novel deepfake/manipulation types. In this paper, a new multi-stream deep learning algorithm is developed for identifying fake faces in videos: three streams are merged at the feature level by a fusion layer, after which fully connected, Softmax, and classification layers classify the data. The pre-trained VGG16 model is adopted for the first transferred CNN stream; in transfer learning, the weights of the pre-trained CNN model are reused to train the new classification problem. The second stream (transferred CNN2) uses the pre-trained VGG19 model, and the third stream uses the pre-trained ResNet18 model. This paper also introduces a new large-scale dataset, the World Politicians Deepfake Dataset (WPDD), to improve deepfake detection systems. The dataset was created by downloading videos of 20 different politicians from YouTube; over 320,000 frames were retrieved after dividing the downloaded videos into short segments and extracting the frames. Various manipulations were then applied to these frames, resulting in seven separate manipulation classes for men and women. In the experiments, three fake-face detection scenarios are investigated. First, discrimination between fake and real faces is studied.
Second, discrimination among seven face manipulations (age, beard, face swap, glasses, hair color, hairstyle, and smiling) and genuine faces is performed. Third, the performance of the deepfake detection system under a novel type of face manipulation is analyzed. The proposed strategy outperforms prior existing methods, with calculated performance metrics over 99%.
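The feature-level fusion step ahead of the classification layers can be sketched as a simple concatenation followed by a Softmax over class scores (vector sizes here are hypothetical; the real streams emit CNN feature maps from VGG16, VGG19, and ResNet18):

```python
import math

def fuse_streams(f1, f2, f3):
    """Feature-level fusion: concatenate the three streams' feature vectors
    before the fully connected / Softmax / classification layers."""
    return list(f1) + list(f2) + list(f3)

def softmax(logits):
    """Softmax layer mapping class scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```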

