167 Hits in 4.7 sec

3D Layout Propagation to Improve Object Recognition in Egocentric Videos [chapter]

Alejandro Rituerto, Ana C. Murillo, José J. Guerrero
2015 Lecture Notes in Computer Science  
A common initial step in those settings is the estimation of the 3D layout of the scene.  ...  Our experiments demonstrate how this layout information can be used to improve detection tasks useful for a human user, in particular sign detection, by easily rejecting false positives.  ...  Method evaluation: Improving object recognition tasks. This subsection shows results on object recognition tasks, poster detection in this case, using an egocentric vision dataset.  ... 
doi:10.1007/978-3-319-16199-0_58 fatcat:skj2nuyddvc27gx7jxtcqzz35m
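
The entry above uses the scene's estimated 3D layout to reject false-positive detections. A minimal sketch of that filtering idea, assuming a `wall_mask` derived from the layout estimate (the function and parameter names are hypothetical, not the authors' code):

```python
import numpy as np

def reject_off_wall_detections(detections, wall_mask, min_overlap=0.8):
    """Keep detections whose boxes lie mostly on estimated wall regions.

    detections: iterable of (x1, y1, x2, y2, score) in image coordinates.
    wall_mask:  HxW boolean array, True where the layout predicts a wall.
    Both names are assumptions made for this sketch.
    """
    kept = []
    for (x1, y1, x2, y2, score) in detections:
        region = wall_mask[int(y1):int(y2), int(x1):int(x2)]
        if region.size and region.mean() >= min_overlap:
            kept.append((x1, y1, x2, y2, score))
    return kept

# toy usage: one box on the wall half of the image, one off it
mask = np.zeros((100, 100), dtype=bool)
mask[:, :50] = True
print(reject_off_wall_detections([(5, 5, 40, 40, 0.9),
                                  (60, 5, 95, 40, 0.8)], mask))
```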

Egocentric Activity Recognition and Localization on a 3D Map [article]

Miao Liu, Lingni Ma, Kiran Somasundaram, Yin Li, Kristen Grauman, James M. Rehg, Chao Li
2021 arXiv   pre-print
We believe our work points to an exciting research direction in the intersection of egocentric vision and 3D scene understanding.  ...  To evaluate our model, we conduct extensive experiments on a newly collected egocentric video dataset, in which both human naturalistic actions and photo-realistic 3D environment reconstructions are captured.  ...  This is the first activity recognition dataset to include both egocentric videos and high-quality 3D environment reconstructions.  ... 
arXiv:2105.09544v2 fatcat:vcil5wq36bavzbne6hmfysb26a

ECO: Egocentric Cognitive Mapping [article]

Jayant Sharma, Zixing Wang, Alberto Speranzon, Vijay Venkataraman, Hyun Soo Park
2018 arXiv   pre-print
To enable such a capability, we design a new egocentric representation, which we call ECO (Egocentric COgnitive map).  ...  As a proof-of-concept, we use ECO to localize a camera within real-world scenes---various grocery stores---and demonstrate performance improvements when compared to existing semantic localization approaches.  ...  Since the bounding box's corners are specified in 3D, these labels are easily propagated to the rest of the images in the reconstructed batch. Thus, 1 image labels 200 others.  ... 
arXiv:1812.00312v1 fatcat:4xmcfbgtc5gqrn2q4jj6w2xbbi
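
The snippet describes propagating labels through a 3D reconstruction: once a box is annotated in 3D, its corners can be reprojected into every other calibrated image of the batch. A minimal sketch of that reprojection step, assuming per-image poses from the reconstruction (names are illustrative):

```python
import numpy as np

def project_box(corners_3d, K, R, t):
    """Project 8 labeled 3D box corners (world frame, 8x3 array) into
    another calibrated view; returns the induced 2D bounding box.

    K: 3x3 intrinsics; R, t: world-to-camera pose of the target image,
    assumed available from the reconstruction.
    """
    cam = corners_3d @ R.T + t        # world -> camera coordinates
    uv = cam @ K.T                    # pinhole projection
    uv = uv[:, :2] / uv[:, 2:3]       # perspective divide
    (x1, y1), (x2, y2) = uv.min(axis=0), uv.max(axis=0)
    return x1, y1, x2, y2
```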

Learning 3D-aware Egocentric Spatial-Temporal Interaction via Graph Convolutional Networks [article]

Chengxi Li, Yue Meng, Stanley H. Chan, Yi-Ting Chen
2020 arXiv   pre-print
Second, objects' 3D locations are explicitly incorporated into GCN to better model egocentric interactions.  ...  Third, to implement ego-stuff interaction in GCN, we propose a MaskAlign operation to extract features for irregular objects.  ...  Given a video segment, our model applies 3D convolutions to extract visual features followed by two branches: RoIAlign is employed to extract object features from object bounding boxes and MaskAlign is  ... 
arXiv:1909.09272v3 fatcat:treyx4u7ojcjbfc2yrwnn34w2q
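
The abstract mentions incorporating objects' 3D locations directly into a GCN. A toy sketch of one way to do this, concatenating each node's 3D location onto its appearance feature before a graph-convolution step (dimensions and names are assumptions, not the paper's architecture):

```python
import torch
import torch.nn as nn

class Spatial3DGCNLayer(nn.Module):
    """One graph-convolution step over object nodes whose input features
    include explicit 3D locations."""

    def __init__(self, feat_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(feat_dim + 3, out_dim)  # append (x, y, z)

    def forward(self, feats, locs_3d, adj):
        # feats: (N, feat_dim) appearance features per object node
        # locs_3d: (N, 3) object 3D locations; adj: (N, N) row-normalized
        x = torch.cat([feats, locs_3d], dim=-1)
        return torch.relu(adj @ self.proj(x))         # aggregate neighbors

# usage sketch on a toy fully connected graph of 5 objects
layer = Spatial3DGCNLayer(feat_dim=256, out_dim=128)
feats, locs = torch.randn(5, 256), torch.randn(5, 3)
adj = torch.full((5, 5), 0.2)
out = layer(feats, locs, adj)                         # (5, 128)
```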

Modeling the environment with egocentric vision systems

Alejandro Rituerto
2015 ELCVIA Electronic Letters on Computer Vision and Image Analysis  
Improving object recognition tasks: this subsection shows results on object recognition tasks, poster detection in this case, using an egocentric vision dataset.  ...  Additionally, we have shown how a basic 3D spatial layout can improve the results in tasks such as object recognition, particularly for poster detection.  ... 
doi:10.5565/rev/elcvia.739 fatcat:ts63ewdowrat7a2lqvp7h2uzeq

Revisiting spatio-temporal layouts for compositional action recognition [article]

Gorjan Radevski, Marie-Francine Moens, Tinne Tuytelaars
2021 arXiv   pre-print
Motivated by this hypothesis, in this work, we take an object-centric approach to action recognition.  ...  On the Something-Else and Action Genome datasets, we demonstrate (i) how to extend multi-head attention for spatio-temporal layout-based action recognition, (ii) how to improve the performance of appearance-based  ...  plug-in components to improve the temporal reasoning [58].  ... 
arXiv:2111.01936v1 fatcat:q3l3m7nj7jadflecmeulmwefcy
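
Point (i) extends multi-head attention to spatio-temporal layouts. A minimal sketch of the general idea, treating per-frame object boxes as tokens for self-attention (all dimensions are illustrative; this is not the paper's exact model):

```python
import torch
import torch.nn as nn

# Encode object boxes as tokens and mix them with self-attention.
num_objects, num_frames, d_model = 4, 8, 64
box_proj = nn.Linear(4, d_model)                    # (x1, y1, x2, y2) -> token
attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

boxes = torch.rand(1, num_frames * num_objects, 4)  # flattened layout tokens
tokens = box_proj(boxes)
ctx, _ = attn(tokens, tokens, tokens)               # spatio-temporal mixing
clip_repr = ctx.mean(dim=1)                         # pooled clip descriptor
```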

LSTA: Long Short-Term Attention for Egocentric Action Recognition

Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz
2019 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
Egocentric activity recognition is one of the most challenging tasks in video analysis. It requires a fine-grained discrimination of small objects and their manipulation.  ...  In this paper we propose LSTA as a mechanism to focus on features from relevant spatial parts while attention is being tracked smoothly across the video sequence.  ...  However, the performance of deep learning action recognition from videos is still not comparable to the advances made in object recognition from still images [12].  ... 
doi:10.1109/cvpr.2019.01019 dblp:conf/cvpr/SudhakaranEL19 fatcat:numtqwnpdjgijhao3s4e7optii
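
LSTA's core idea is attention that is tracked smoothly across frames. The toy module below illustrates that notion by blending each frame's spatial attention map with the previous one; it is only a sketch of the smoothing idea, not LSTA's actual gating and memory design:

```python
import torch
import torch.nn as nn

class SmoothedSpatialAttention(nn.Module):
    """Recurrent spatial attention: the per-frame map is blended with the
    previous one so the focus drifts smoothly across the video."""

    def __init__(self, channels, momentum=0.7):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)
        self.momentum = momentum

    def forward(self, frames):                  # frames: (T, C, H, W)
        prev, outputs = None, []
        for x in frames:                        # x: (C, H, W)
            a = torch.softmax(self.score(x[None]).flatten(), dim=0)
            a = a.view(1, 1, *x.shape[1:])
            if prev is not None:                # temporal smoothing
                a = self.momentum * prev + (1 - self.momentum) * a
            prev = a
            outputs.append((x[None] * a).sum(dim=(2, 3)))  # attended feature
        return torch.stack(outputs)             # (T, 1, C)

feats = torch.randn(6, 16, 7, 7)                # 6 frames of toy features
out = SmoothedSpatialAttention(16)(feats)       # (6, 1, 16)
```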

HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction [article]

Yunze Liu, Yun Liu, Che Jiang, Kangbo Lyu, Weikang Wan, Hao Shen, Boqiang Liang, Zhoujie Fu, He Wang, Li Yi
2022 arXiv   pre-print
We present HOI4D, a large-scale 4D egocentric dataset with rich annotations, to catalyze research on category-level human-object interaction.  ...  HOI4D consists of 2.4M RGB-D egocentric video frames over 4000 sequences collected by 4 participants interacting with 800 different object instances from 16 categories over 610 different indoor rooms.  ...  In a simple scene, the object being interacted with is not occluded by surrounding objects, and the variety of camera views in the video is kept small to improve consistency across frames.  ... 
arXiv:2203.01577v3 fatcat:kkwisjhrkbgzfp764bt26hd2ra

Egocentric scene context for human-centric environment understanding from video [article]

Tushar Nagarajan, Santhosh Kumar Ramakrishnan, Ruta Desai, James Hillis, Kristen Grauman
2022 arXiv   pre-print
We train such models using videos from agents in simulated 3D environments where the environment is fully observable, and test them on real-world videos of house tours from unseen environments.  ...  We present an approach that links egocentric video and camera pose over time by learning representations that are predictive of the camera-wearer's (potentially unseen) local surroundings to facilitate  ...  For egocentric video, prior work has used structure from motion (SfM) to map people and objects for trajectory forecasting [55], activity forecasting [29], and action grounding in 3D [62, 14].  ... 
arXiv:2207.11365v1 fatcat:qhcezflvxzekdoagfzo2oyzpq4
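
The snippet notes that prior work uses SfM to map people and objects from egocentric video. A minimal sketch of the underlying step, lifting per-frame egocentric observations into a shared world frame using recovered camera poses (toy data and names):

```python
import numpy as np

def to_world(points_cam, R, t):
    """Map egocentric 3D points into a shared world frame.

    R (3x3), t (3,): camera-to-world pose, e.g. recovered by SfM.
    """
    return points_cam @ R.T + t

# toy usage: accumulate per-frame observations into one environment map
frames = [(np.random.rand(10, 3), np.eye(3), np.zeros(3)) for _ in range(5)]
world_map = np.concatenate([to_world(p, R, t) for p, R, t in frames])
```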

LSTA: Long Short-Term Attention for Egocentric Action Recognition [article]

Swathikiran Sudhakaran and Sergio Escalera and Oswald Lanz
2019 arXiv   pre-print
Egocentric activity recognition is one of the most challenging tasks in video analysis. It requires a fine-grained discrimination of small objects and their manipulation.  ...  In this paper we propose LSTA as a mechanism to focus on features from spatially relevant parts while attention is being tracked smoothly across the video sequence.  ...  However, the performance of deep learning action recognition from videos is still not comparable to the advances made in object recognition from still images [12].  ... 
arXiv:1811.10698v3 fatcat:ix72yqsyfnhhfcwcsvffr4xnxq

A Sequential Classifier for Hand Detection in the Framework of Egocentric Vision

Alejandro Betancourt
2014 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops  
Hand detection is one of the most explored areas in Egocentric Vision Video Analysis for wearable devices.  ...  Experimental results show a considerable improvement in the detection of true negatives, without compromising performance on true positives.  ...  According to the seminal work proposed in [16], known for being the first public dataset for egocentric object recognition, existing methods for detecting hands in a scene can be divided into two groups  ... 
doi:10.1109/cvprw.2014.92 dblp:conf/cvpr/Betancourt14 fatcat:drxt6iamrzg2vdr3gg7zyqggva
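
The paper reports that a sequential classifier improves true negatives over per-frame hand detection. One hypothetical way to realize such sequential smoothing is a sticky two-state Viterbi filter over per-frame detector scores, sketched below (an illustration, not the paper's exact model):

```python
import numpy as np

def sequential_hand_filter(scores, p_stay=0.9):
    """Viterbi smoothing of per-frame hand/no-hand scores.

    scores: per-frame P(hand) from a frame-level detector, values in [0, 1].
    A sticky two-state transition model suppresses isolated false positives.
    """
    trans = np.log(np.array([[p_stay, 1 - p_stay],
                             [1 - p_stay, p_stay]]))
    emit = np.log(np.stack([1 - np.asarray(scores),
                            np.asarray(scores)], 1).clip(1e-6))
    path = np.zeros((len(scores), 2), dtype=int)
    v = emit[0]
    for i in range(1, len(scores)):
        cand = v[:, None] + trans                 # (prev_state, state)
        path[i] = cand.argmax(0)
        v = cand.max(0) + emit[i]
    states = [int(v.argmax())]
    for i in range(len(scores) - 1, 0, -1):       # backtrack best path
        states.append(int(path[i][states[-1]]))
    return states[::-1]                           # 1 = hand present

print(sequential_hand_filter([0.1, 0.2, 0.8, 0.1, 0.1]))  # lone spike removed
```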

cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey [article]

Hirokatsu Kataoka and Yudai Miyashita and Tomoaki Yamabe and Soma Shirakabe and Shin'ichi Sato and Hironori Hoshino and Ryo Kato and Kaori Abe and Takaaki Imanari and Naomichi Kobayashi and Shinichiro Morita and Akio Nakamura
2016 arXiv   pre-print
(TDU), and Univ. of Tsukuba that aims to systematically summarize papers on computer vision, pattern recognition, and related fields.  ...  For this particular review, we focused on reading all 602 conference papers presented at CVPR2015, the premier annual computer vision event held in June 2015, in order to grasp the trends in the  ...  to improvements in object detection accuracy.  ... 
arXiv:1605.08247v1 fatcat:cd4mc7uor5f2rpu3aketd6naf4

2020 Index IEEE Transactions on Image Processing Vol. 29

2020 IEEE Transactions on Image Processing  
Fan, B., +, TIP 2020 8120-8133. Scene Recognition With Prototype-Agnostic Scene Layout, Chen, G., +, TIP 2020 5877-5888. Semantics-Preserving Graph Propagation for Zero-Shot Object Detection.  ...  Screen Content Video Quality Assessment: Subjective and Objective Study.  ... 
doi:10.1109/tip.2020.3046056 fatcat:24m6k2elprf2nfmucbjzhvzk3m

A Metaverse: taxonomy, components, applications, and open challenges

Sang-Min Park, Young-Gab Kim
2022 IEEE Access  
Furthermore, we describe essential methods based on three components and techniques, relating them to representative Metaverse examples (Ready Player One, Roblox, and Facebook research) in the domain of films, games, and studies  ...  access to connectivity with reality using virtual currency.  ...  Most video processing uses third-person video datasets, so egocentric video data remains scarce.  ... 
doi:10.1109/access.2021.3140175 fatcat:fnraeaz74vh33knfvhzrynesli

2021 Index IEEE Transactions on Image Processing Vol. 30

2021 IEEE Transactions on Image Processing  
-that appeared in this periodical during 2021, and items from previous years that were commented upon or corrected in 2021.  ...  Note that the item title is found only under the primary entry in the Author Index.  ...  TIP 2021 207-219: Together Recognizing, Localizing and Summarizing Actions in Egocentric Videos.  ... 
doi:10.1109/tip.2022.3142569 fatcat:z26yhwuecbgrnb2czhwjlf73qu
Showing results 1 — 15 out of 167 results