Convolutional Gated Recurrent Networks for Video Segmentation [article]

Mennatullah Siam, Sepehr Valipour, Martin Jagersand, Nilanjan Ray
2016 arXiv   pre-print
To our knowledge, no prior work has made use of temporal video information in a recurrent network.  ...  Convolutional gated recurrent networks are used for the recurrent part to preserve spatial connectivities in the image. Our proposed method can be applied in both online and batch segmentation.  ...  In [2] convolutional GRU is introduced for learning spatio-temporal features from videos and used for video captioning and action recognition.  ... 
arXiv:1611.05435v2 fatcat:hhlpubj2mzazzjqt6r77frmgcy

Recurrent Fully Convolutional Networks for Video Segmentation [article]

Sepehr Valipour, Mennatullah Siam, Martin Jagersand, Nilanjan Ray
2016 arXiv   pre-print
The network is built from a fully convolutional element and a recurrent unit that works on a sliding window over the temporal data.  ...  We also introduce a novel convolutional gated recurrent unit that preserves the spatial information and reduces the parameters learned.  ...  Then a convolutional gated recurrent unit is used, followed by one convolutional layer and then deconvolution for up-sampling.  ... 
arXiv:1606.00487v3 fatcat:5uh63wn37rbshk6nriec3iu23y

Recurrent Convolutions for Causal 3D CNNs [article]

Gurkirt Singh, Fabio Cuzzolin
2019 arXiv   pre-print
the temporal reasoning horizon to the size of the temporal convolution kernel, and are not temporal resolution-preserving for video sequence-to-sequence modelling, as, for instance, in action detection  ...  Recently, three dimensional (3D) convolutional neural networks (CNNs) have emerged as dominant methods to capture spatiotemporal representations in videos, by adding to pre-existing 2D CNNs a third, temporal  ...  Recurrent Convolutional Unit A pictorial illustration of our proposed Recurrent Convolutional Unit (RCU) is given in Figure 1 (c).  ... 
arXiv:1811.07157v2 fatcat:z3owwyj4ijdi5gcaypcjj6nx5y

Recurrent Convolutions for Causal 3D CNNs

Gurkirt Singh, Fabio Cuzzolin
2019 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)  
Such 3D CNNs, however, are anti-causal (i.e., they exploit information from both the past and the future frames to produce feature representations, thus preventing their use in online settings), constrain the temporal reasoning  ...  horizon to the size of the temporal convolution kernel, and are not temporal resolution-preserving for video sequence-to-sequence modelling, as, for instance, in action detection.  ...  Recurrent Convolutional Unit A pictorial illustration of our proposed Recurrent Convolutional Unit (RCU) is given in Figure 1 (c).  ... 
doi:10.1109/iccvw.2019.00183 dblp:conf/iccvw/SinghC19 fatcat:immlxkuktvfohibr7czex52q4q

Delving Deeper into Convolutional Networks for Learning Video Representations [article]

Nicolas Ballas, Li Yao, Chris Pal, Aaron Courville
2016 arXiv   pre-print
We propose an approach to learn spatio-temporal features in videos from intermediate visual representations we call "percepts" using Gated-Recurrent-Unit Recurrent Networks (GRUs). Our method relies on  ...  Using low-level percepts can lead to high-dimensionality video representations.  ...  GRU: GATED RECURRENT UNIT NETWORKS In this section, we review Gated-Recurrent-Unit (GRU) networks, which are a particular type of RNN.  ... 
arXiv:1511.06432v4 fatcat:sdta2srvbjddtgbgztjeswwbt4

Semantic Video Segmentation by Gated Recurrent Flow Propagation [article]

David Nilsson, Cristian Sminchisescu
2017 arXiv   pre-print
Our model combines a convolutional architecture and a spatio-temporal transformer recurrent layer that are able to temporally propagate labeling information by means of optical flow, adaptively gated based  ...  The temporal, gated recurrent flow propagation component of our model can be plugged into any static semantic segmentation architecture and turn it into a weakly supervised video processing one.  ...  Acknowledgements: This work was funded in part by the European Research Council, ERC, Consolidator Grant SEED.  ... 
arXiv:1612.08871v2 fatcat:mcef73ca4zdufbkbzdqsa6siwq

PredCNN: Predictive Learning with Cascade Convolutions

Ziru Xu, Yunbo Wang, Mingsheng Long, Jianmin Wang
2018 Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence  
Mainstream recurrent models suffer from huge memory usage and computation cost, while convolutional models are unable to effectively capture the temporal dependencies between consecutive video frames.  ...  Predicting future frames in videos remains an unsolved but challenging problem.  ...  Different from these CNN architectures, our proposed PredCNN model exploits a novel gated cascade convolutional structure to capture temporal dependencies underlying video frames in a logical way.  ... 
doi:10.24963/ijcai.2018/408 dblp:conf/ijcai/XuWLW18 fatcat:5ccca7yghfhxlppl4js754h4cm

Comparative Analysis of CNN-based Spatiotemporal Reasoning in Videos [article]

Okan Köpüklü, Fabian Herzog, Gerhard Rigoll
2021 arXiv   pre-print
Understanding actions and gestures in video streams requires temporal reasoning of the spatial content from different time instants, i.e., spatiotemporal (ST) modeling.  ...  The common characteristic of these two benchmarks is that the designed architectures need to capture the full temporal content of videos in order to correctly classify actions/gestures.  ...  techniques such as vanilla RNN, gated recurrent unit (GRU), long short-term memory (LSTM) and bidirectional LSTM (B-LSTM) techniques, and finally fully convolutional technique.  ... 
arXiv:1909.05165v2 fatcat:szfegicdfrd63jgeh5g5dmi76a

Combining Sequential Geometry and Texture Features for Distinguishing Genuine and Deceptive Emotions

Liandong Li, Tadas Baltrusaitis, Bo Sun, Louis-Philippe Morency
2017 2017 IEEE International Conference on Computer Vision Workshops (ICCVW)  
To utilize the temporal information, we introduce a temporal attention gated model for this emotion recognition task.  ...  In this paper, we explore a new type of automatic emotion recognition task: distinguishing genuine and deceptive emotions from video clips.  ...  The TAGM has a temporal attention module and a recurrent attention-gated unit.  ... 
doi:10.1109/iccvw.2017.372 dblp:conf/iccvw/LiBSM17 fatcat:nzgb2ztd2vhvnkssdk2k2nadey

Adaptive Detrending to Accelerate Convolutional Gated Recurrent Unit Training for Contextual Video Recognition [article]

Minju Jung, Haanvid Lee, Jun Tani
2017 arXiv   pre-print
gated recurrent unit (ConvGRU).  ...  To address contextual video recognition, we use convolutional recurrent neural networks (ConvRNNs) having a rich spatio-temporal information processing capability, but ConvRNNs require extensive computation  ...  Convolutional Gated Recurrent Unit The convolutional gated recurrent unit (ConvGRU) is naturally extended from the GRU by following the convolutional property of CNNs, defined as follows: r_t = σ(W_r * x_t + U  ... 
arXiv:1705.08764v1 fatcat:i6x6serafndpdm67ac2pyvt5ey

2D CNN and Gated Recurrent Network for Dynamic Hand Gesture Recognition with A Fusion of RGB-D and Optical Flow Data

We have also added a recent gated recurrent network for temporal recognition of frames, minimizing training time with improved accuracy.  ...  To obtain enough useful information, we have converted each RGB-D video to 30-frame and 45-frame inputs.  ...  Temporal recognition using gated recurrent unit The input of this layer is the fused features of size 4096-dim after the convolution.  ... 
doi:10.35940/ijitee.j9185.0881019 fatcat:kqx3cymemjb3lkt2ix7lmndogm

Automated Bridge Component Recognition using Video Data [article]

Yasutaka Narazaki, Vedhus Hoskere, Tu A. Hoang, Billie F. Spencer Jr
2018 arXiv   pre-print
Inspired by the significant progress in video processing techniques, this study investigates automated bridge component recognition using video data, where the information from the past frames is used  ...  Then, convolutional neural networks (CNNs) with recurrent architectures are designed and applied to implement the automated bridge component recognition task.  ...  [15] used a similar recurrent unit (Gated Recurrent Unit [14]) with FCNs to get improved semantic segmentation of video data.  ... 
arXiv:1806.06820v2 fatcat:xcbnkm6w4vcrhicuu3neavkfvy

HetEmotionNet: Two-Stream Heterogeneous Graph Recurrent Neural Network for Multi-modal Emotion Recognition [article]

Ziyu Jia, Youfang Lin, Jing Wang, Zhiyang Feng, Xiangheng Xie, Caijie Chen
2021 arXiv   pre-print
Each stream is composed of the graph transformer network for modeling the heterogeneity, the graph convolutional network for modeling the correlation, and the gated recurrent unit for capturing the temporal  ...  Specifically, HetEmotionNet consists of the spatial-temporal stream and the spatial-spectral stream, which can fuse spatial-spectral-temporal domain features in a unified framework.  ...  Graph recurrent neural network is composed of graph convolutional network and gated recurrent unit. Graph Transformer Network.  ... 
arXiv:2108.03354v1 fatcat:o3esloogcfewddsmyzu2gv3tu4

Recurrent Residual Learning for Action Recognition [article]

Ahsan Iqbal, Alexander Richard, Hilde Kuehne, Juergen Gall
2017 arXiv   pre-print
as well as limiting the temporal context to a reasonable local range around each frame.  ...  Given pre-segmented videos, the task is to recognize actions happening within videos. Historically, hand crafted video features were used to address the task of action recognition.  ...  Similarly, in the other set, we use a recurrent neural network with 128 gated recurrent units (GRUs) in order to evaluate the performance of a classical recurrent network.  ... 
arXiv:1706.08807v1 fatcat:rhmcnmnogbda7kotzzwrtoh3ey

Long-term recurrent convolutional networks for visual recognition and description

Jeff Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Trevor Darrell, Kate Saenko
2015 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
In contrast to current models which assume a fixed spatio-temporal receptive field or simple temporal averaging for sequential processing, recurrent convolutional models are "doubly deep" in that they  ...  can be compositional in spatial and temporal "layers".  ...  This work was supported in part by DARPA's MSEE and SMISC programs, NSF awards IIS-1427425, and IIS-1212798, IIS-1116411, Toyota, and the Berkeley Vision and Learning Center.  ... 
doi:10.1109/cvpr.2015.7298878 dblp:conf/cvpr/DonahueHGRVDS15 fatcat:5w4eeyesm5hipiieav2nzc4et4