Spatial attention model‐modulated bi‐directional long short‐term memory for unsupervised video summarisation

2021 ◽  
Vol 57 (6) ◽  
pp. 252-254
Author(s):  
Rui Zhong ◽  
Diyang Xiao ◽  
Shi Dong ◽  
Min Hu


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3678
Author(s):  
Dongwon Lee ◽  
Minji Choi ◽  
Joohyun Lee

In this paper, we propose a prediction algorithm that combines Long Short-Term Memory (LSTM) with an attention model to predict the vision coordinates of users watching 360-degree videos in a Virtual Reality (VR) or Augmented Reality (AR) system. Predicting vision coordinates during video streaming is important when the network condition degrades. However, traditional prediction models such as Moving Average (MA) and Autoregressive Moving Average (ARMA) are linear, so they cannot capture nonlinear relationships. Therefore, machine learning models based on deep learning have recently been used for nonlinear prediction. We use the LSTM and Gated Recurrent Unit (GRU) neural network methods, which originate from Recurrent Neural Networks (RNNs), to predict head position in 360-degree videos, and we add an attention model to the LSTM to obtain more accurate results. We also compare the performance of the proposed model with that of other machine learning models, such as the Multi-Layer Perceptron (MLP) and RNN, using the root mean squared error (RMSE) between predicted and real coordinates, and demonstrate that our model predicts the vision coordinates more accurately than the other models across various videos.
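The core idea can be sketched briefly. The following numpy sketch is a hypothetical illustration, not the authors' implementation: all weights are random untrained placeholders and the shapes are invented. An LSTM consumes past head coordinates, and a dot-product attention layer over its hidden states contributes to the next-coordinate prediction.

```python
# Hypothetical sketch: LSTM over past (yaw, pitch) head coordinates,
# plus attention over the hidden states, predicting the next coordinate.
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step; gates stacked as [i, f, o, g]."""
    z = W @ x + U @ h + b
    H = h.size
    i, f, o = (1 / (1 + np.exp(-z[k*H:(k+1)*H])) for k in range(3))
    g = np.tanh(z[3*H:])
    c = f * c + i * g
    return np.tanh(c) * o, c

def predict_next_coord(coords, H=8):
    """coords: (T, 2) past (yaw, pitch) samples -> predicted next pair."""
    T, D = coords.shape
    W = rng.normal(0, 0.1, (4*H, D))    # untrained placeholder weights
    U = rng.normal(0, 0.1, (4*H, H))
    b = np.zeros(4*H)
    h, c = np.zeros(H), np.zeros(H)
    states = []
    for t in range(T):
        h, c = lstm_step(coords[t], h, c, W, U, b)
        states.append(h)
    S = np.stack(states)                # (T, H)
    # dot-product attention with the last hidden state as query
    scores = S @ h
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    context = alpha @ S                 # (H,)
    W_out = rng.normal(0, 0.1, (D, 2*H))
    return W_out @ np.concatenate([context, h])

trace = np.cumsum(rng.normal(0, 1.0, (20, 2)), axis=0)  # fake head trace
pred = predict_next_coord(trace)
print(pred.shape)  # (2,)
```

In the real model the weights would be trained on recorded head-movement traces and the output compared to ground-truth coordinates via RMSE, as the abstract describes.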


2019 ◽  
Vol 20 (S16) ◽  
Author(s):  
Canlin Zhang ◽  
Daniel Biś ◽  
Xiuwen Liu ◽  
Zhe He

Abstract
Background: In recent years, deep learning methods have been applied to many natural language processing tasks to achieve state-of-the-art performance. In the biomedical domain, however, they have not outperformed supervised word sense disambiguation (WSD) methods based on support vector machines or random forests, possibly due to inherent similarities among medical word senses.
Results: In this paper, we propose two deep-learning-based models for supervised WSD: a model based on a bi-directional long short-term memory (BiLSTM) network, and an attention model based on a self-attention architecture. Our results show that the BiLSTM model with a suitable upper-layer structure performs even better than the existing state-of-the-art models on the MSH WSD dataset, while our attention model is 3 to 4 times faster than our BiLSTM model with good accuracy. In addition, we trained “universal” models that disambiguate all ambiguous words together; in these models, the embedding of the target ambiguous word is concatenated to the max-pooled vector, acting as a “hint”. Our universal BiLSTM model yielded about 90 percent accuracy.
Conclusion: Deep contextual models based on sequential information processing are able to capture relative contextual information from pre-trained input word embeddings and provide state-of-the-art results for supervised biomedical WSD tasks.
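The “hint” trick in the universal model can be sketched in a few lines. The numpy sketch below is illustrative only (random stand-in vectors and an untrained classification head, not the paper's parameters): the contextual vectors for a sentence are max-pooled, and the embedding of the target ambiguous word is concatenated to that pooled vector before sense classification.

```python
# Illustrative sketch of the "hint" concatenation in a universal WSD model.
import numpy as np

rng = np.random.default_rng(1)

def sense_logits(context_states, target_embedding, n_senses, W=None):
    """context_states: (T, H) BiLSTM outputs; target_embedding: (E,)."""
    pooled = context_states.max(axis=0)                  # max-pool over time
    hinted = np.concatenate([pooled, target_embedding])  # append the "hint"
    if W is None:                                        # placeholder head
        W = rng.normal(0, 0.1, (n_senses, hinted.size))
    return W @ hinted

states = rng.normal(size=(12, 16))  # fake BiLSTM outputs, T=12, H=16
target = rng.normal(size=8)         # fake embedding of the ambiguous word
logits = sense_logits(states, target, n_senses=2)
pred = int(np.argmax(logits))
print(pred)
```

Because the target-word embedding travels with every example, a single classifier can be trained across all ambiguous words instead of one model per word.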


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Kaicheng Feng ◽  
Xiaobing Liu

To improve movie box office prediction accuracy, this paper proposes an adaptive attention mechanism with a consumer sentinel (LSTM-AACS) for movie box office prediction. First, the factors influencing the movie box office are analyzed. To tackle the problem that existing prediction models ignore consumer groups, we add consumer features and then quantitatively analyze and normalize the box office influence factors. Second, we establish an LSTM (Long Short-Term Memory) box office prediction model and inject the attention mechanism to construct the adaptive attention with consumer sentinel. Finally, 10,398 movie records from a Kaggle competition dataset are used to compare the prediction results of the LSTM-AACS model, the LSTM-Attention model, and the LSTM model. The results show that the relative error of the LSTM-AACS prediction is 6.58%, lower than that of the other models used in the experiment.
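One common way to realize a "sentinel" in adaptive attention is to let a learned sentinel vector compete with the regular feature vectors for attention mass. The numpy sketch below illustrates that general mechanism under invented shapes and random placeholder vectors; it is not the paper's implementation: the fraction of attention assigned to the sentinel indicates how much the model leans on the consumer signal at a given step.

```python
# Illustrative sketch of sentinel-gated adaptive attention: a consumer
# sentinel vector is appended to the feature keys and competes for
# attention mass alongside the box-office influence factors.
import numpy as np

rng = np.random.default_rng(2)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def adaptive_attention(features, sentinel, query):
    """features: (N, H); sentinel, query: (H,).
    Returns the context vector and the sentinel's attention share."""
    keys = np.vstack([features, sentinel])  # (N+1, H)
    alpha = softmax(keys @ query)           # joint attention weights
    context = alpha @ keys
    return context, alpha[-1]               # alpha[-1] = sentinel share

feats = rng.normal(size=(5, 8))  # e.g. 5 influence-factor embeddings
sent  = rng.normal(size=8)       # consumer sentinel vector
q     = rng.normal(size=8)       # LSTM hidden state as query
ctx, beta = adaptive_attention(feats, sent, q)
print(ctx.shape, 0.0 <= beta <= 1.0)  # (8,) True
```

A trained model would learn the sentinel vector jointly with the LSTM, so `beta` becomes an interpretable gate between consumer and non-consumer evidence.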


Author(s):  
Xiaocheng Feng ◽  
Ming Liu ◽  
Jiahao Liu ◽  
Bing Qin ◽  
Yibo Sun ◽  
...  

 We focus on essay generation, a challenging task that generates a paragraph-level text covering multiple topics. Progress towards understanding different topics and expressing diversity in this task requires more powerful generators and richer training and evaluation resources. To address this, we develop a multi-topic-aware long short-term memory (MTA-LSTM) network. In this model, we maintain a novel multi-topic coverage vector, which learns the weight of each topic and is sequentially updated during the decoding process. This vector is then fed to an attention model to guide the generator. Moreover, we automatically construct two paragraph-level Chinese essay corpora, containing 305,000 essay paragraphs and 55,000 question-and-answer pairs, respectively. Empirical results show that our approach obtains a much better BLEU score than various baselines. Furthermore, human judgment shows that MTA-LSTM can generate essays that are not only coherent but also closely related to the input topics.
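The coverage idea can be sketched abstractly: a vector holds the remaining "budget" of each input topic, attention over topics is modulated by that budget at every decoding step, and the budget shrinks as attention is spent. The numpy sketch below is a hypothetical illustration with an invented update rule and random stand-in states, not the MTA-LSTM's actual equations.

```python
# Illustrative sketch of a multi-topic coverage vector during decoding:
# attention over topic embeddings is scaled by each topic's remaining
# coverage, and coverage is reduced by the attention just spent.
import numpy as np

rng = np.random.default_rng(3)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def decode_with_coverage(topics, steps, H=8):
    """topics: (K, H) topic embeddings. Returns final coverage (K,)."""
    K = topics.shape[0]
    coverage = np.ones(K)                   # each topic starts unexpressed
    h = rng.normal(size=H)                  # stand-in decoder state
    for _ in range(steps):
        scores = topics @ h
        alpha = softmax(scores) * coverage  # coverage-modulated attention
        alpha /= alpha.sum() + 1e-9
        coverage = np.clip(coverage - alpha / steps, 0.0, 1.0)
        h = np.tanh(h + alpha @ topics)     # stand-in state update
    return coverage

cov = decode_with_coverage(rng.normal(size=(3, 8)), steps=20)
print(cov.shape, np.all(cov <= 1.0))  # (3,) True
```

The intended effect is that topics already attended to lose weight, nudging the generator to express every input topic rather than dwelling on one.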


2020 ◽  
Author(s):  
Abdolreza Nazemi ◽  
Johannes Jakubik ◽  
Andreas Geyer-Schulz ◽  
Frank J. Fabozzi
