A Bayesian Network-Based Information Fusion Combined with DNNs for Robust Video Fire Detection

2021 ◽  
Vol 11 (16) ◽  
pp. 7624
Author(s):  
Byoungjun Kim ◽  
Joonwhoan Lee

Fire is an abnormal event that can cause significant damage to lives and property. Deep learning approaches have made great progress in vision-based fire detection. However, false detections remain a problem because some objects have fire-like visual properties such as color or texture. In the previous video-based approach, a Faster Region-based Convolutional Neural Network (R-CNN) detects the suspected regions of fire (SRoFs), and long short-term memory (LSTM) accumulates the local features within the bounding boxes to decide whether there is a fire in a short-term period. Majority voting over the short-term decisions then makes the decision reliable over a long-term period. To make the final fire decision more robust, this paper proposes using a Bayesian network to fuse various types of information. Because the appropriate Bayesian network varies with the situation or domain where fire detection is needed, we construct a simple Bayesian network as an example, which combines environmental information (e.g., humidity) with visual information, including the results of location recognition and smoke detection, and long-term video-based majority voting. Our experiments show that the Bayesian network improves fire detection accuracy compared with the previous video-based method, and state-of-the-art performance has been achieved on a public dataset. The proposed method also reduces the latency for perfect fire decisions compared with the previous video-based method.
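To illustrate the kind of information fusion this abstract describes, the following is a minimal sketch, not the authors' network. It assumes the simplest possible structure, in which each evidence source (humidity, location, smoke detection, long-term vote) is conditionally independent given the fire state, so the posterior follows from Bayes' rule. The function name `fuse_fire_evidence` and all probability values are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Tuple


def fuse_fire_evidence(prior_fire: float,
                       likelihoods: Iterable[Tuple[float, float]]) -> float:
    """Return P(fire | evidence) under a naive conditional-independence assumption.

    prior_fire  -- P(fire) before seeing any evidence (hypothetical value).
    likelihoods -- one (P(e | fire), P(e | no fire)) pair per observed evidence
                   item; these numbers are assumptions, not values from the paper.
    """
    if not 0.0 <= prior_fire <= 1.0:
        raise ValueError("prior_fire must be a probability")

    joint_fire = prior_fire          # running P(fire, evidence so far)
    joint_no_fire = 1.0 - prior_fire  # running P(no fire, evidence so far)
    for p_given_fire, p_given_no_fire in likelihoods:
        joint_fire *= p_given_fire
        joint_no_fire *= p_given_no_fire

    total = joint_fire + joint_no_fire
    if total == 0.0:
        raise ValueError("evidence is impossible under both hypotheses")
    return joint_fire / total


# Hypothetical evidence: low humidity, smoke detected, long-term vote says "fire".
evidence = [(0.7, 0.4), (0.8, 0.1), (0.9, 0.2)]
posterior = fuse_fire_evidence(prior_fire=0.1, likelihoods=evidence)
print(f"P(fire | evidence) = {posterior:.3f}")
```

A real Bayesian network for this task could encode dependencies between the evidence nodes; the naive structure here only shows how several weak cues can reinforce or override a video-based decision.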

2019 ◽  
Vol 9 (14) ◽  
pp. 2862 ◽  
Author(s):  
Byoungjun Kim ◽  
Joonwhoan Lee

Fire is an abnormal event that can cause significant damage to lives and property. In this paper, we propose a deep learning-based fire detection method using a video sequence, which imitates the human fire detection process. The proposed method uses a Faster Region-based Convolutional Neural Network (R-CNN) to detect the suspected regions of fire (SRoFs) and of non-fire based on their spatial features. Then, the summarized features within the bounding boxes in successive frames are accumulated by Long Short-Term Memory (LSTM) to classify whether there is a fire or not in a short-term period. The decisions for successive short-term periods are then combined by majority voting for the final decision in a long-term period. In addition, the areas of both flame and smoke are calculated, and their temporal changes are reported to interpret the dynamic fire behavior along with the final fire decision. Experiments show that the proposed long-term video-based method improves fire detection accuracy compared with the still image-based or short-term video-based methods by reducing both false detections and misdetections.
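The long-term decision step described above can be sketched as a plain majority vote over the short-term labels. This is a minimal illustration under assumptions: the label strings, the function name `majority_vote`, and the tie-breaking rule are hypothetical, since the abstract does not specify them.

```python
from collections import Counter
from typing import Sequence


def majority_vote(short_term_decisions: Sequence[str]) -> str:
    """Return the most frequent label among the short-term decisions.

    Ties go to the label that appeared first in the sequence; this tie rule
    is an assumption made for the sketch.
    """
    if not short_term_decisions:
        raise ValueError("need at least one short-term decision")
    counts = Counter(short_term_decisions)
    # most_common(1) yields a single (label, count) pair for the top label.
    return counts.most_common(1)[0][0]


# Hypothetical short-term outputs from successive LSTM windows.
decisions = ["fire", "non-fire", "fire", "fire", "non-fire"]
print(majority_vote(decisions))
```

Voting over several windows means that a single spurious short-term detection does not by itself change the long-term decision.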


2017 ◽  
Vol 26 (1) ◽  
pp. 3-9 ◽  
Author(s):  
Stephen Darling ◽  
Richard J. Allen ◽  
Jelena Havelka

Visuospatial bootstrapping is the name given to a phenomenon whereby performance on visually presented verbal serial-recall tasks is better when stimuli are presented in a spatial array rather than a single location. However, the display used has to be a familiar one. This phenomenon implies communication between cognitive systems involved in storing short-term memory for verbal and visual information, alongside connections to and from knowledge held in long-term memory. Bootstrapping is a robust, replicable phenomenon that should be incorporated in theories of working memory and its interaction with long-term memory. This article provides an overview of bootstrapping, contextualizes it within research on links between long-term knowledge and short-term memory, and addresses how it can help inform current working memory theory.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Yuhua Gao ◽  
Yong Mo ◽  
Heng Zhang ◽  
Ruiyin Huang ◽  
Zilong Chen

With the development of computer technology, video description, which combines key technologies from natural language processing and computer vision, has attracted growing attention from researchers. Within this field, describing high-speed, detailed sports videos objectively and efficiently is key to further progress. Existing video description methods lack language learning information, which leads to sentence errors and loss of visual information in the generated text. To address these problems, a multihead model combining the long short-term memory (LSTM) network and an attention mechanism is proposed for the intelligent description of volleyball video. With the attention mechanism, the model pays more attention to the significant areas in the video when generating sentences. Comparative experiments with different models show that the model with the attention mechanism can effectively reduce the loss of visual information. Compared with the LSTM and base models, the multihead model proposed in this paper, which combines the LSTM network and the attention mechanism, scores higher on all evaluation indexes and significantly improves the quality of the intelligent text description of volleyball video.


2016 ◽  
Vol 39 ◽  
Author(s):  
Mary C. Potter

Rapid serial visual presentation (RSVP) of words or pictured scenes provides evidence for a large-capacity conceptual short-term memory (CSTM) that momentarily provides rich associated material from long-term memory, permitting rapid chunking (Potter 1993; 2009; 2012). In perception of scenes as well as language comprehension, we make use of knowledge that briefly exceeds the supposed limits of working memory.


2020 ◽  
Vol 29 (4) ◽  
pp. 710-727
Author(s):  
Beula M. Magimairaj ◽  
Naveen K. Nagaraj ◽  
Alexander V. Sergeev ◽  
Natalie J. Benafield

Objectives: School-age children with and without parent-reported listening difficulties (LiD) were compared on auditory processing, language, memory, and attention abilities. The objective was to extend what is known so far in the literature about children with LiD by using multiple measures and selective novel measures across the above areas. Design: Twenty-six children who were reported by their parents as having LiD and 26 age-matched typically developing children completed clinical tests of auditory processing and multiple measures of language, attention, and memory. All children had normal-range pure-tone hearing thresholds bilaterally. Group differences were examined. Results: In addition to significantly poorer speech-perception-in-noise scores, children with LiD had reduced speed and accuracy of word retrieval from long-term memory, and poorer short-term memory, sentence recall, and inferencing ability. Statistically significant group differences were of moderate effect size; however, standard test scores of children with LiD were not clinically poor. No statistically significant group differences were observed in attention, working memory capacity, vocabulary, or nonverbal IQ. Conclusions: Mild signal-to-noise ratio loss, as reflected by the group mean of children with LiD, supported the children's functional listening problems. In addition, children's relative weakness in select areas of language performance, short-term memory, and long-term memory lexical retrieval speed and accuracy added to previous research on evidence-based areas that need to be evaluated in children with LiD, who almost always have heterogeneous profiles. Importantly, the functional difficulties faced by children with LiD in relation to their test results indicated, to some extent, that commonly used assessments may not be adequately capturing the children's listening challenges. Supplemental Material: https://doi.org/10.23641/asha.12808607


2020 ◽  
Author(s):  
John J Shaw ◽  
Zhisen Urgolites ◽  
Padraic Monaghan

Visual long-term memory has a large and detailed storage capacity for individual scenes, objects, and actions. However, memory for combinations of actions and scenes is poorer, suggesting difficulty in binding this information together. Sleep can enhance declarative memory of information, but whether sleep can also boost memory for binding information and whether the effect is general across different types of information is not yet known. Experiments 1 to 3 tested effects of sleep on binding actions and scenes, and Experiments 4 and 5 tested binding of objects and scenes. Participants viewed composites and were tested 12 hours later, after a delay consisting of sleep (9pm-9am) or wake (9am-9pm), on an alternative forced-choice recognition task. For action-scene composites, memory was relatively poor, with no significant effect of sleep. For object-scene composites, sleep did improve memory. Sleep can promote binding in memory, depending on the type of information to be combined.


2021 ◽  
Vol 13 (12) ◽  
pp. 6866
Author(s):  
Haoru Li ◽  
Jinliang Xu ◽  
Xiaodong Zhang ◽  
Fangchen Ma

Recently, subways have become an important part of public transportation and have developed rapidly in China. In the subway station setting, pedestrians mainly rely on visual short-term memory to obtain information on how to travel. This research aimed to explore the short-term memory capacities and the difference in short-term memory for different information for Chinese passengers regarding subway signs. Previous research has shown that people’s general short-term memory capacity is approximately four objects and that, the more complex the information, the lower people’s memory capacity. However, research on the short-term memory characteristics of pedestrians for subway signs is scarce. Hence, based on the STM theory and using 32 subway signs as stimuli, we recruited 120 subjects to conduct a cognitive test. The results showed that passengers had a different memory accuracy for different types of information in the signs. They were more accurate regarding line number and arrow, followed by location/text information, logos, and orientation. Meanwhile, information type, quantity, and complexity had significant effects on pedestrians’ short-term memory capacity. Finally, according to our results that outline the characteristics of short-term memory for subway signs, we put forward some suggestions for subway signs. The findings will be effective in helping designers and managers improve the quality of subway station services as well as promoting the development of pedestrian traffic in such a setting.


2021 ◽  
Vol 13 (2) ◽  
pp. 164
Author(s):  
Chuyao Luo ◽  
Xutao Li ◽  
Yongliang Wen ◽  
Yunming Ye ◽  
Xiaofeng Zhang

Precipitation nowcasting is an important task in operational weather forecasting, and radar echo map extrapolation plays a vital role in it. Recently, deep learning techniques such as Convolutional Recurrent Neural Network (ConvRNN) models have been designed to solve the task. These models, although they perform much better than conventional optical-flow-based approaches, share a common problem of underestimating the high echo value parts. The drawback is fatal to precipitation nowcasting, as these parts often correspond to heavy rains that may cause natural disasters. In this paper, we propose a novel interaction dual attention long short-term memory (IDA-LSTM) model to address the drawback. In the method, an interaction framework is developed for the ConvRNN unit to fully exploit the short-term context information by constructing a series of coupled convolutions on the input and hidden states. Moreover, a dual attention mechanism on channels and positions is developed to recall the forgotten information in the long term. Comprehensive experiments have been conducted on the CIKM AnalytiCup 2017 data sets, and the results show the effectiveness of the IDA-LSTM in addressing the underestimation drawback. The extrapolation performance of IDA-LSTM is superior to that of the state-of-the-art methods.

