Using temporal information in input features of neural networks

Using spatial-temporal ensembles of convolutional neural networks for lumen segmentation in ureteroscopy

International Journal of Computer Assisted Radiology and Surgery ◽

10.1007/s11548-021-02376-3 ◽

2021 ◽

Author(s):

Jorge F. Lazo ◽

Aldo Marzullo ◽

Sara Moccia ◽

Michele Catellani ◽

Benoit Rosa ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

State Of The Art ◽

Automatic Segmentation ◽

Temporal Information ◽

Invasive Technique ◽

Dice Similarity Coefficient ◽

Specular Reflections ◽

Lumen Segmentation ◽

Previous State

Abstract Purpose Ureteroscopy is an efficient endoscopic minimally invasive technique for the diagnosis and treatment of upper tract urothelial carcinoma. During ureteroscopy, the automatic segmentation of the hollow lumen is of primary importance, since it indicates the path that the endoscope should follow. In order to obtain an accurate segmentation of the hollow lumen, this paper presents an automatic method based on convolutional neural networks (CNNs). Methods The proposed method is based on an ensemble of 4 parallel CNNs to simultaneously process single and multi-frame information. Of these, two architectures are taken as core-models, namely U-Net based in residual blocks ($$m_1$$ m 1 ) and Mask-RCNN ($$m_2$$ m 2 ), which are fed with single still-frames I(t). The other two models ($$M_1$$ M 1 , $$M_2$$ M 2 ) are modifications of the former ones consisting on the addition of a stage which makes use of 3D convolutions to process temporal information. $$M_1$$ M 1 , $$M_2$$ M 2 are fed with triplets of frames ($$I(t-1)$$ I ( t - 1 ) , I(t), $$I(t+1)$$ I ( t + 1 ) ) to produce the segmentation for I(t). Results The proposed method was evaluated using a custom dataset of 11 videos (2673 frames) which were collected and manually annotated from 6 patients. We obtain a Dice similarity coefficient of 0.80, outperforming previous state-of-the-art methods. Conclusion The obtained results show that spatial-temporal information can be effectively exploited by the ensemble model to improve hollow lumen segmentation in ureteroscopic images. The method is effective also in the presence of poor visibility, occasional bleeding, or specular reflections.

Download Full-text

Accurate and Efficient Video De-Fencing Using Convolutional Neural Networks and Temporal Information

2018 IEEE International Conference on Multimedia and Expo (ICME) ◽

10.1109/icme.2018.8486522 ◽

2018 ◽

Cited By ~ 4

Author(s):

Chen Du ◽

Byeongkeun Kang ◽

Zheng Xu ◽

Ji Dai ◽

Truong Nguyen

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Temporal Information ◽

Efficient Video

Download Full-text

Distributed Deterministic Temporal Information Propagated by Feedforward Neural Networks

Lecture Notes in Computer Science - Artificial Neural Networks and Machine Learning – ICANN 2011 ◽

10.1007/978-3-642-21735-7_32 ◽

2011 ◽

pp. 258-265

Author(s):

Yoshiyuki Asai ◽

Alessandro E. P. Villa

Keyword(s):

Neural Networks ◽

Feedforward Neural Networks ◽

Temporal Information

Download Full-text

Convolutional Nonlinear Differential Recurrent Neural Networks for Crowd Scene Understanding

International Journal of Semantic Computing ◽

10.1142/s1793351x18400196 ◽

2018 ◽

Vol 12 (04) ◽

pp. 481-500 ◽

Cited By ~ 1

Author(s):

Naifan Zhuang ◽

The Duc Kieu ◽

Jun Ye ◽

Kien A. Hua

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Short Term Memory ◽

Scene Understanding ◽

Image Data ◽

High Density ◽

Temporal Information ◽

Deep Model ◽

End To End ◽

The Individual

With the growth of crowd phenomena in the real world, crowd scene understanding is becoming an important task in anomaly detection and public security. Visual ambiguities and occlusions, high density, low mobility, and scene semantics, however, make this problem a great challenge. In this paper, we propose an end-to-end deep architecture, convolutional nonlinear differential recurrent neural networks (CNDRNNs), for crowd scene understanding. CNDRNNs consist of GoogleNet Inception V3 convolutional neural networks (CNNs) and nonlinear differential recurrent neural networks (RNNs). Different from traditional non-end-to-end solutions which separate the steps of feature extraction and parameter learning, CNDRNN utilizes a unified deep model to optimize the parameters of CNN and RNN hand in hand. It thus has the potential of generating a more harmonious model. The proposed architecture takes sequential raw image data as input, and does not rely on tracklet or trajectory detection. It thus has clear advantages over the traditional flow-based and trajectory-based methods, especially in challenging crowd scenarios of high density and low mobility. Taking advantage of CNN and RNN, CNDRNN can effectively analyze the crowd semantics. Specifically, CNN is good at modeling the semantic crowd scene information. On the other hand, nonlinear differential RNN models the motion information. The individual and increasing orders of derivative of states (DoS) in differential RNN can progressively build up the ability of the long short-term memory (LSTM) gates to detect different levels of salient dynamical patterns in deeper stacked layers modeling higher orders of DoS. Lastly, existing LSTM-based crowd scene solutions explore deep temporal information and are claimed to be “deep in time.” Our proposed method CNDRNN, however, models the spatial and temporal information in a unified architecture and achieves “deep in space and time.” Extensive performance studies on the Violent-Flows, CUHK Crowd, and NUS-HGA datasets show that the proposed technique significantly outperforms state-of-the-art methods.

Download Full-text

Transformation of Spatio-temporal Information by Somatosensory Neural Networks

Information Processing in the Somatosensory System ◽

10.1007/978-1-349-11597-6_23 ◽

1991 ◽

pp. 319-327

Author(s):

Susan Warren ◽

Heikki A. Hämäläinen ◽

Claude I. Palmer ◽

Esther P. Gardner

Keyword(s):

Neural Networks ◽

Temporal Information ◽

Spatio Temporal

Download Full-text

Hierarchical dynamic depth projected difference images–based action recognition in videos with convolutional neural networks

International Journal of Advanced Robotic Systems ◽

10.1177/1729881418825093 ◽

2019 ◽

Vol 16 (1) ◽

pp. 172988141882509 ◽

Cited By ~ 3

Author(s):

Hanbo Wu ◽

Xin Ma ◽

Yibin Li

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Temporal Information ◽

Superior Performance ◽

Video Sequences ◽

Depth Video ◽

Difference Images

Temporal information plays a significant role in video-based human action recognition. How to effectively extract the spatial–temporal characteristics of actions in videos has always been a challenging problem. Most existing methods acquire spatial and temporal cues in videos individually. In this article, we propose a new effective representation for depth video sequences, called hierarchical dynamic depth projected difference images that can aggregate the action spatial and temporal information simultaneously at different temporal scales. We firstly project depth video sequences onto three orthogonal Cartesian views to capture the 3D shape and motion information of human actions. Hierarchical dynamic depth projected difference images are constructed with the rank pooling in each projected view to hierarchically encode the spatial–temporal motion dynamics in depth videos. Convolutional neural networks can automatically learn discriminative features from images and have been extended to video classification because of their superior performance. To verify the effectiveness of hierarchical dynamic depth projected difference images representation, we construct a hierarchical dynamic depth projected difference images–based action recognition framework where hierarchical dynamic depth projected difference images in three views are fed into three identical pretrained convolutional neural networks independently for finely retuning. We design three classification schemes in the framework and different schemes utilize different convolutional neural network layers to compare their effects on action recognition. Three views are combined to describe the actions more comprehensively in each classification scheme. The proposed framework is evaluated on three challenging public human action data sets. Experiments indicate that our method has better performance and can provide discriminative spatial–temporal information for human action recognition in depth videos.

Download Full-text

A Stochastic Multivariate Irregularly Sampled Time Series Imputation Method for Electronic Health Records

BioMedInformatics ◽

10.3390/biomedinformatics1030011 ◽

2021 ◽

Vol 1 (3) ◽

pp. 166-181

Author(s):

Muhammad Adib Uz Zaman ◽

Dongping Du

Keyword(s):

Neural Networks ◽

Time Series ◽

Electronic Health Records ◽

Missing Values ◽

Time Series Data ◽

Temporal Information ◽

Series Data ◽

High Dimensional ◽

Health Records ◽

Electronic Health

Electronic health records (EHRs) can be very difficult to analyze since they usually contain many missing values. To build an efficient predictive model, a complete dataset is necessary. An EHR usually contains high-dimensional longitudinal time series data. Most commonly used imputation methods do not consider the importance of temporal information embedded in EHR data. Besides, most time-dependent neural networks such as recurrent neural networks (RNNs) inherently consider the time steps to be equal, which in many cases, is not appropriate. This study presents a method using the gated recurrent unit (GRU), neural ordinary differential equations (ODEs), and Bayesian estimation to incorporate the temporal information and impute sporadically observed time series measurements in high-dimensional EHR data.

Download Full-text

Human Action Recognition by Fusion of Convolutional Neural Networks and spatial-temporal Information

Proceedings of the International Conference on Internet Multimedia Computing and Service - ICIMCS'16 ◽

10.1145/3007669.3007702 ◽

2016 ◽

Cited By ~ 1

Author(s):

Weisheng Li ◽

Yahui Ding

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Temporal Information

Download Full-text

Computing of temporal information in spiking neural networks with ReRAM synapses

Faraday Discussions ◽

10.1039/c8fd00097b ◽

2019 ◽

Vol 213 ◽

pp. 453-469 ◽

Cited By ~ 14

Author(s):

W. Wang ◽

G. Pedretti ◽

V. Milo ◽

R. Carboni ◽

A. Calderoni ◽

...

Keyword(s):

Neural Networks ◽

Spike Timing ◽

Temporal Information ◽

Spiking Neural Networks ◽

Spike Timing Dependent Plasticity ◽

Dependent Plasticity ◽

Neural Spikes

This work addresses the methodology and implementation of a neuromorphic SNN system to compute the temporal information among neural spikes using ReRAM synapses capable of spike-timing dependent plasticity (STDP).

Download Full-text

Brundlefly at SemEval-2016 Task 12: Recurrent Neural Networks vs. Joint Inference for Clinical Temporal Information Extraction

10.18653/v1/s16-1198 ◽

2016 ◽

Cited By ~ 11

Author(s):

Jason Fries

Keyword(s):

Neural Networks ◽

Information Extraction ◽

Recurrent Neural Networks ◽

Temporal Information ◽

Joint Inference ◽

Temporal Information Extraction

Download Full-text