Structured event memory: a neuro-symbolic model of event cognition

2019 ◽  
Author(s):  
Nicholas T. Franklin ◽  
Kenneth A. Norman ◽  
Charan Ranganath ◽  
Jeffrey M. Zacks ◽  
Samuel J. Gershman

Humans spontaneously organize a continuous experience into discrete events and use the learned structure of these events to generalize and organize memory. We introduce the Structured Event Memory (SEM) model of event cognition, which accounts for human abilities in event segmentation, memory, and generalization. SEM is derived from a probabilistic generative model of event dynamics defined over structured symbolic scenes. By embedding symbolic scene representations in a vector space and parametrizing the scene dynamics in this continuous space, SEM combines the advantages of structured and neural network approaches to high-level cognition. Using probabilistic reasoning over this generative model, SEM can infer event boundaries, learn event schemata, and use event knowledge to reconstruct past experience. We show that SEM can scale up to high-dimensional input spaces, producing human-like event segmentation for naturalistic video data, and accounts for a wide array of memory phenomena.
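The boundary-inference idea can be illustrated with a toy sketch. This is a deliberate simplification, not the SEM model: SEM performs probabilistic inference over learned neural event dynamics, whereas here each "event" is just a running mean of recent scene vectors and a boundary is opened when prediction error spikes.

```python
# Toy illustration of prediction-error-driven event segmentation.
# Simplified stand-in for SEM's probabilistic boundary inference.

def segment(scenes, threshold=1.0):
    """Return indices where a new event boundary is inferred."""
    boundaries = [0]
    mean = list(scenes[0])
    count = 1
    for i, scene in enumerate(scenes[1:], start=1):
        # Prediction error: distance between the scene and the
        # current event's running-mean prediction.
        error = sum((s - m) ** 2 for s, m in zip(scene, mean)) ** 0.5
        if error > threshold:
            boundaries.append(i)       # open a new event
            mean, count = list(scene), 1
        else:                          # update current event's statistics
            count += 1
            mean = [m + (s - m) / count for s, m in zip(scene, mean)]
    return boundaries
```

A sequence of similar scenes followed by a sudden change yields a single boundary at the change point, mirroring the intuition that surprising observations trigger new event representations.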

2020 ◽  
Vol 127 (3) ◽  
pp. 327-361 ◽  
Author(s):  
Nicholas T. Franklin ◽  
Kenneth A. Norman ◽  
Charan Ranganath ◽  
Jeffrey M. Zacks ◽  
Samuel J. Gershman

2021 ◽  
Vol 11 (9) ◽  
pp. 3730
Author(s):  
Aniqa Dilawari ◽  
Muhammad Usman Ghani Khan ◽  
Yasser D. Al-Otaibi ◽  
Zahoor-ur Rehman ◽  
Atta-ur Rahman ◽  
...  

After the September 11 attacks, security and surveillance measures have changed across the globe. Surveillance cameras are now installed almost everywhere to monitor video footage. Though quite handy, these cameras produce video in massive volumes. The major challenge faced by security agencies is analyzing the surveillance video data collected and generated daily. Problems related to these videos are twofold: (1) understanding the contents of video streams, and (2) converting the video contents to condensed formats, such as textual interpretations and summaries, to save storage space. In this paper, we propose a video description framework on a surveillance dataset. This framework is based on multitask learning of high-level features (HLFs) using a convolutional neural network (CNN) and natural language generation (NLG) through bidirectional recurrent networks. For each specific task, a parallel pipeline is derived from the base visual geometry group (VGG)-16 model. Tasks include scene recognition, action recognition, object recognition and human-face-specific feature recognition. Experimental results on the TRECViD, UET Video Surveillance (UETVS) and AGRIINTRUSION datasets show that the model outperforms state-of-the-art methods, with METEOR (Metric for Evaluation of Translation with Explicit ORdering) scores of 33.9%, 34.3%, and 31.2%, respectively. Our results show that our framework has distinct advantages over traditional rule-based models for the recognition and generation of natural language descriptions.
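The multitask structure — one shared backbone feeding parallel task-specific heads — can be sketched as follows. The feature extractor and head logic here are trivial placeholders, not the paper's VGG-16 pipelines; only the wiring (shared features, parallel heads) reflects the described design.

```python
# Minimal sketch of multitask prediction from shared features.
# shared_features stands in for a VGG-16 backbone; each head stands in
# for a task-specific classifier branch.

def shared_features(frame):
    # Placeholder backbone: trivial per-frame statistics.
    return [sum(frame) / len(frame), max(frame), min(frame)]

def make_head(label_names):
    # Placeholder head: picks the label indexed by the dominant feature.
    def head(features):
        idx = features.index(max(features)) % len(label_names)
        return label_names[idx]
    return head

HEADS = {
    "scene": make_head(["indoor", "outdoor", "road"]),
    "action": make_head(["walking", "running", "standing"]),
    "object": make_head(["car", "bag", "person"]),
}

def describe(frame):
    feats = shared_features(frame)
    # Every head consumes the same shared features in parallel.
    return {task: head(feats) for task, head in HEADS.items()}
```

In the real system, each head's symbolic outputs would then be passed to the bidirectional recurrent NLG stage to produce a textual description.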


Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4045
Author(s):  
Alessandro Sassu ◽  
Jose Francisco Saenz-Cogollo ◽  
Maurizio Agelli

Edge computing is the best approach for meeting the exponential demand and the real-time requirements of many video analytics applications. Since most of the recent advances in extracting information from images and video rely on computation-heavy deep learning algorithms, there is a growing need for solutions that allow the deployment and use of new models on scalable and flexible edge architectures. In this work, we present Deep-Framework, a novel open source framework for developing edge-oriented real-time video analytics applications based on deep learning. Deep-Framework has a scalable multi-stream architecture based on Docker and abstracts away from the user the complexity of cluster configuration, orchestration of services, and GPU resource allocation. It provides Python interfaces for integrating deep learning models developed with the most popular frameworks, and also provides high-level APIs based on standard HTTP and WebRTC interfaces for consuming the extracted video data on clients running in browsers or on any other web-based platform.
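The multi-stream model-integration idea can be sketched conceptually as below. The class and method names are illustrative assumptions, not Deep-Framework's actual Python interface; in the real framework each analyzer would run as an orchestrated Docker service rather than an in-process callable.

```python
# Conceptual sketch of a multi-stream analytics dispatcher: users
# register per-frame models, and the framework fans each stream's
# frames out to every registered model.

class StreamAnalytics:
    def __init__(self):
        self.models = []           # registered per-frame analyzers

    def register(self, name, fn):
        """Attach a model (here, any callable) to the pipeline."""
        self.models.append((name, fn))

    def process(self, stream_id, frames):
        """Run every registered model on each frame of one stream."""
        results = []
        for frame in frames:
            results.append({
                "stream": stream_id,
                "outputs": {name: fn(frame) for name, fn in self.models},
            })
        return results
```

A client would then consume `results` over the framework's HTTP/WebRTC APIs instead of reading them in-process as done here.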


Electronics ◽  
2020 ◽  
Vol 9 (8) ◽  
pp. 1275
Author(s):  
Changdao Du ◽  
Yoshiki Yamaguchi

Due to performance and energy requirements, FPGA-based accelerators have become a promising solution for high-performance computations. Meanwhile, with the help of high-level synthesis (HLS) compilers, FPGAs can be programmed using common programming languages such as C, C++, or OpenCL, thereby improving design efficiency and portability. Stencil computations are significant kernels in various scientific applications. In this paper, we introduce an architecture design for implementing stencil kernels on a state-of-the-art FPGA with high bandwidth memory (HBM). Traditional FPGAs are usually equipped with external memory, e.g., DDR3 or DDR4, which limits design space exploration in the spatial domain of stencil kernels. Therefore, many previous studies relied mainly on exploiting parallelism in the temporal domain to overcome the bandwidth limitations. In our approach, we scale up design performance by considering the spatial and temporal parallelism of the stencil kernel equally. We also discuss design portability among different HLS compilers. We use typical stencil kernels to evaluate our design on a Xilinx U280 FPGA board and compare the results with other existing studies. By adopting our method, developers can take broad parallelization strategies based on specific FPGA resources to improve performance.
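For readers unfamiliar with stencil kernels, here is a plain-Python functional reference for a 4-point (von Neumann) stencil, illustrating the two axes the paper parallelizes: spatial (multiple cells updated per pass, as HBM bandwidth allows) and temporal (multiple sweeps fused back-to-back). This is a behavioral sketch only; the paper's contribution is the HLS/FPGA dataflow architecture, not this loop nest.

```python
# Plain-Python reference semantics for a Jacobi-style 4-point stencil.

def stencil_step(grid):
    """One sweep averaging each interior cell with its 4 neighbours."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return out

def stencil_fused(grid, timesteps):
    """Temporal blocking: apply several sweeps in sequence. On an FPGA
    these become a pipeline of stencil stages streaming data between
    them, avoiding round trips to external memory."""
    for _ in range(timesteps):
        grid = stencil_step(grid)
    return grid
```

Spatial parallelism corresponds to unrolling the inner `j` loop across HBM channels; temporal parallelism corresponds to chaining instances of `stencil_step` as pipeline stages.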


2021 ◽  
Vol 11 (12) ◽  
pp. 1555
Author(s):  
Gianpaolo Alvari ◽  
Luca Coviello ◽  
Cesare Furlanello

The high level of heterogeneity in Autism Spectrum Disorder (ASD) and the lack of systematic measurements complicate predicting outcomes of early intervention and the identification of better-tailored treatment programs. Computational phenotyping may assist therapists in monitoring child behavior through quantitative measures and personalizing the intervention based on individual characteristics; still, real-world behavioral analysis is an ongoing challenge. For this purpose, we designed EYE-C, a system based on OpenPose and Gaze360 for fine-grained analysis of eye-contact episodes in unconstrained therapist-child interactions via a single video camera. The model was validated on video data varying in resolution and setting, achieving promising performance. We further tested EYE-C on a clinical sample of 62 preschoolers with ASD for spectrum stratification based on eye-contact features and age. By unsupervised clustering, three distinct sub-groups were identified, differentiated by eye-contact dynamics and a specific clinical phenotype. Overall, this study highlights the potential of Artificial Intelligence in categorizing atypical behavior and providing translational solutions that might assist clinical practice.
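The stratification step — unsupervised clustering of per-child feature vectors (e.g., eye-contact frequency, mean episode duration, age) — can be illustrated with a bare-bones k-means. This is a generic sketch with fixed initial centroids; the study's actual clustering pipeline and cluster count may differ.

```python
# Toy k-means for stratifying subjects by feature vectors.

def kmeans(points, centroids, iters=10):
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared distance).
        groups = [[] for _ in centroids]
        for p in points:
            d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            groups[d.index(min(d))].append(p)
        # Recompute centroids as group means (keep old one if group empty).
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids, groups
```

On well-separated feature vectors, the returned groups correspond to the kind of sub-groups the study characterizes by eye-contact dynamics.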


2021 ◽  
Vol 25 (5) ◽  
pp. 382-387
Author(s):  
S. Satyanarayana ◽  
V. Bhatia ◽  
P. P. Mandal ◽  
A. Kanchar ◽  
D. Falzon ◽  
...  

In September 2018, all countries made a commitment at the first ever United Nations High‐Level Meeting (UNHLM) on TB, to provide TB preventive treatment (TPT) to at least 30 million people at high‐risk of TB disease between 2018 and 2022. In the WHO South‐East Asia Region (SEA Region), which accounts for 44% of the global TB burden, only 1.2 million high‐risk individuals (household contacts and people living with HIV) were provided TPT (11% of the 10.8 million regional UNHLM TPT target) in 2018 and 2019. By 2020, almost all 11 countries of the SEA Region had revised their policies on TPT target groups and criteria to assess TPT eligibility, and had adopted at least one shorter TPT regimen recommended in the latest WHO TPT guidelines. The major challenges for TPT scale‐up in the SEA Region are resource shortages, knowledge and service delivery/uptake gaps among providers and service recipients, and the lack of adequate quantities of rifapentine for use in shorter TPT regimens. There are several regional opportunities to address these gaps and countries of the SEA Region must make use of these opportunities to scale up TPT services rapidly to reduce the TB burden in the SEA Region.


Author(s):  
Min Chen

The fast proliferation of video data archives has increased the need for automatic video content analysis and semantic video retrieval. Since temporal information is critical in conveying video content, in this chapter, an effective temporal-based event detection framework is proposed to support high-level video indexing and retrieval. The core is a temporal association mining process that systematically captures characteristic temporal patterns to help identify and define interesting events. This framework effectively tackles the challenges caused by loose video structure and class imbalance issues. One of the unique characteristics of this framework is that it offers strong generality and extensibility with the capability of exploring representative event patterns with little human interference. The temporal information and event detection results can then be input into our proposed distributed video retrieval system to support the high-level semantic querying, selective video browsing and event-based video retrieval.
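The core idea behind temporal association mining — counting how often one event type follows another within a time window and keeping pairs whose support clears a threshold — can be sketched as below. This is a simplified stand-in for the chapter's full mining process, which handles richer temporal patterns and the class imbalance issues mentioned above.

```python
# Simplified temporal pattern mining: frequent ordered pairs of event
# labels occurring within a fixed time window.

def temporal_pairs(events, window, min_support):
    """events: list of (timestamp, label) tuples, sorted by timestamp."""
    counts = {}
    for i, (t1, a) in enumerate(events):
        for t2, b in events[i + 1:]:
            if t2 - t1 > window:
                break                      # beyond the temporal window
            counts[(a, b)] = counts.get((a, b), 0) + 1
    # Keep only characteristic patterns with enough support.
    return {pair: n for pair, n in counts.items() if n >= min_support}
```

Surviving pairs serve as candidate signatures of interesting events, which downstream indexing and retrieval can then query.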



Vaccines ◽  
2020 ◽  
Vol 8 (4) ◽  
pp. 777
Author(s):  
Andrew Lees ◽  
Jackson F. Barr ◽  
Samson Gebretnsae

CDAP (1-cyano-4-dimethylaminopyridine tetrafluoroborate) is employed in the synthesis of conjugate vaccines as a cyanylating reagent. In the published method, which used pH 9 activation at 20 °C (Vaccine, 14:190, 1996), the rapid reaction made the process difficult to control. Here, we describe optimizing CDAP activation using dextran as a model polysaccharide. CDAP stability and reactivity were determined as a function of time, pH and temperature. While the rate of dextran activation was slower at lower pH and temperature, it was balanced by the increased stability of CDAP, which left more reagent available for reaction. Whereas maximal activation took less than 2.5 min at pH 9 and 20 °C, it took 10–15 min at 0 °C. At pH 7 and 0 °C, the optimal time increased to >3 h to achieve a high level of activation. Many buffers interfered with CDAP activation, but DMAP could be used to preadjust the pH of polysaccharide solutions so that the pH only needed to be maintained. We found that the stability of the activated dextran was relatively independent of pH over the range of pH 1–9, with the level of activation decreased by 40–60% over 2 h. The use of low temperature and a less basic pH, with an optimum reaction time, requires less CDAP, improving activation levels while making the process more reliable and easier to scale up.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ritaban Dutta ◽  
Cherry Chen ◽  
David Renshaw ◽  
Daniel Liang

Extraordinary shape recovery capabilities of shape memory alloys (SMAs) have made them a crucial building block for the development of next-generation soft robotic systems and associated cognitive robotic controllers. In this study, we sought to determine whether combining video data analysis with machine learning could yield a computer-vision-based system that accurately predicts the force generated by the movement of an SMA body capable of multi-point actuation. We found that rapidly capturing video of the bending movements of an SMA body under external electrical excitation, and feeding that computer-vision characterisation into a machine learning model, can accurately predict the amount of actuation force generated by the body. This is fundamental to achieving superior control of the actuation of SMA bodies. We demonstrate that a supervised machine learning framework, trained with Restricted Boltzmann Machine (RBM)-inspired features extracted from 45,000 digital thermal infrared video frames captured during excitation of various SMA shapes, can estimate and predict force and stress with 93% global accuracy, very low false negatives, and a high level of predictive generalisation.
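The supervised step — mapping per-frame features to a force value — can be sketched with a minimal linear regressor fitted by stochastic gradient descent. The features and data here are toy placeholders; the paper's system uses RBM-inspired features extracted from thermal infrared frames and a more capable model.

```python
# Minimal supervised regression sketch: per-frame features -> force.

def fit_linear(features, forces, lr=0.05, epochs=1000):
    """Fit weights and bias by per-sample gradient descent on squared error."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, forces):
            pred = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = pred - y
            # Gradient step on squared error for this sample.
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b
```

With a synthetic force proportional to a single feature, the fit recovers the proportionality and extrapolates to unseen frames.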

