3D Layout Propagation to Improve Object Recognition in Egocentric Videos
[chapter]
2015
Lecture Notes in Computer Science
A common initial step in those settings is the estimation of the 3D layout of the scene. ...
Our experiments demonstrate how this layout information can be used to improve detection tasks useful for a human user, in particular sign detection, by easily rejecting false positives. ...
Method evaluation
Improving object recognition tasks This subsection shows results on object recognition tasks, poster detection in this case, using an egocentric vision dataset. ...
doi:10.1007/978-3-319-16199-0_58
fatcat:skj2nuyddvc27gx7jxtcqzz35m
Egocentric Activity Recognition and Localization on a 3D Map
[article]
2021
arXiv
pre-print
We believe our work points to an exciting research direction in the intersection of egocentric vision, and 3D scene understanding. ...
To evaluate our model, we conduct extensive experiments on a newly collected egocentric video dataset, in which both human naturalistic actions and photo-realistic 3D environment reconstructions are captured ...
This is the first activity recognition dataset to include both egocentric videos and high-quality 3D environment reconstructions. ...
arXiv:2105.09544v2
fatcat:vcil5wq36bavzbne6hmfysb26a
ECO: Egocentric Cognitive Mapping
[article]
2018
arXiv
pre-print
To enable such a capability, we design a new egocentric representation, which we call ECO (Egocentric COgnitive map). ...
As a proof-of-concept, we use ECO to localize a camera within real-world scenes (various grocery stores) and demonstrate performance improvements when compared to existing semantic localization approaches ...
Since the bounding box's corners are specified in 3D, these labels are easily propagated to the rest of the images in the reconstructed batch. Thus, 1 image labels 200 others. ...
arXiv:1812.00312v1
fatcat:4xmcfbgtc5gqrn2q4jj6w2xbbi
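The propagation step this abstract describes, projecting box corners specified in 3D into the other images of a reconstructed batch, amounts to a standard pinhole projection per camera. A minimal sketch follows; the intrinsics, pose, and corner coordinates are hypothetical placeholders, not values from the paper:

```python
import numpy as np

def project_points(K, R, t, pts3d):
    """Project Nx3 world points into pixel coordinates for a camera
    with intrinsics K and pose (R, t) mapping world -> camera."""
    cam = pts3d @ R.T + t             # world -> camera coordinates
    pix = cam @ K.T                   # apply intrinsics
    return pix[:, :2] / pix[:, 2:3]   # perspective divide

# Hypothetical setup: one labeled 3D box corner and a second camera pose.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0,   0.0,   1.0]])
corner = np.array([[0.5, 0.2, 4.0]])               # labeled 3D corner
R_b = np.eye(3)
t_b = np.array([0.1, 0.0, 0.0])                    # view B: small shift
uv = project_points(K, R_b, t_b, corner)           # pixel location in view B
```

Repeating this for every camera in the reconstructed batch is what lets one labeled image annotate the remaining ~200.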
Learning 3D-aware Egocentric Spatial-Temporal Interaction via Graph Convolutional Networks
[article]
2020
arXiv
pre-print
Second, objects' 3D locations are explicitly incorporated into GCN to better model egocentric interactions. ...
Third, to implement ego-stuff interaction in GCN, we propose a MaskAlign operation to extract features for irregular objects. ...
Given a video segment, our model applies 3D convolutions to extract visual features followed by two branches: RoIAlign is employed to extract object features from object bounding boxes and MaskAlign is ...
arXiv:1909.09272v3
fatcat:treyx4u7ojcjbfc2yrwnn34w2q
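The two-branch readout this abstract describes (RoIAlign for box-shaped objects, MaskAlign for irregular "stuff" regions) can be illustrated with a simplified sketch. The feature map, box, and mask below are made up, and the object branch is reduced to a crop-and-average rather than true bilinear RoIAlign:

```python
import numpy as np

rng = np.random.default_rng(0)
feat = rng.random((16, 32, 32))        # hypothetical (C, H, W) feature map

# Object branch: pool features inside a bounding box (RoIAlign, simplified
# here to a crop followed by average pooling).
x1, y1, x2, y2 = 5, 10, 15, 30
obj_feat = feat[:, y1:y2, x1:x2].mean(axis=(1, 2))      # -> (16,)

# Stuff branch: MaskAlign-style pooling, a mask-weighted average over an
# irregular binary region that a rectangular box would fit poorly.
mask = np.zeros((32, 32))
mask[10:30, 5:15] = 1.0                # placeholder irregular region
stuff_feat = (feat * mask).sum(axis=(1, 2)) / max(mask.sum(), 1.0)
```

Both branches yield one fixed-size feature vector per region, so object and stuff nodes can feed the same graph convolutional network.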
Modeling the environment with egocentric vision systems
2015
ELCVIA Electronic Letters on Computer Vision and Image Analysis
Improving object recognition tasks This subsection shows results on object recognition tasks, poster detection in this case, using an egocentric vision dataset. ...
Additionally, we have shown how a basic 3D spatial layout can improve the results in tasks such as object recognition, particularly for poster detection. ...
doi:10.5565/rev/elcvia.739
fatcat:ts63ewdowrat7a2lqvp7h2uzeq
Revisiting spatio-temporal layouts for compositional action recognition
[article]
2021
arXiv
pre-print
Motivated by this hypothesis, in this work, we take an object-centric approach to action recognition. ...
On the Something-Else and Action Genome datasets, we demonstrate (i) how to extend multi-head attention for spatio-temporal layout-based action recognition, (ii) how to improve the performance of appearance-based ...
plug-in components to improve the temporal reasoning [58] . ...
arXiv:2111.01936v1
fatcat:q3l3m7nj7jadflecmeulmwefcy
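The layout-based attention idea in this abstract, attending over object-box tokens rather than pixels, reduces to scaled dot-product self-attention over per-box embeddings. A single-head sketch follows; the token count, embedding dimension, and random embeddings are placeholders, not the paper's configuration:

```python
import numpy as np

def self_attention(X):
    """Single-head scaled dot-product self-attention over layout tokens."""
    scores = X @ X.T / np.sqrt(X.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)   # each row is a softmax over tokens
    return w @ X

# Hypothetical tokens: one per object box per frame, each an embedding of
# (frame index, x1, y1, x2, y2, class); 12 boxes, 32-dim embeddings.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((12, 32))
out = self_attention(tokens)             # boxes attend to all other boxes
```

Multi-head attention stacks several such maps with learned projections, which is the extension point the paper investigates.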
LSTA: Long Short-Term Attention for Egocentric Action Recognition
2019
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Egocentric activity recognition is one of the most challenging tasks in video analysis. It requires a fine-grained discrimination of small objects and their manipulation. ...
In this paper we propose LSTA as a mechanism to focus on features from relevant spatial parts while attention is being tracked smoothly across the video sequence. ...
However, the performance of deep learning action recognition from videos is still not comparable to the advances made in object recognition from still images [12] . ...
doi:10.1109/cvpr.2019.01019
dblp:conf/cvpr/SudhakaranEL19
fatcat:numtqwnpdjgijhao3s4e7optii
HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction
[article]
2022
arXiv
pre-print
We present HOI4D, a large-scale 4D egocentric dataset with rich annotations, to catalyze the research of category-level human-object interaction. ...
HOI4D consists of 2.4M RGB-D egocentric video frames over 4000 sequences collected by 4 participants interacting with 800 different object instances from 16 categories over 610 different indoor rooms. ...
In a simple scene, the object interacted with is not obscured by surrounding objects, and the variety of camera views in the video is kept small to improve consistency across frames. ...
arXiv:2203.01577v3
fatcat:kkwisjhrkbgzfp764bt26hd2ra
Egocentric scene context for human-centric environment understanding from video
[article]
2022
arXiv
pre-print
We train such models using videos from agents in simulated 3D environments where the environment is fully observable, and test them on real-world videos of house tours from unseen environments. ...
We present an approach that links egocentric video and camera pose over time by learning representations that are predictive of the camera-wearer's (potentially unseen) local surroundings to facilitate ...
For egocentric video, prior work has used structure from motion (SfM) to map people and objects for trajectory [55] and activity forecasting [29] and action grounding in 3D [62, 14] . ...
arXiv:2207.11365v1
fatcat:qhcezflvxzekdoagfzo2oyzpq4
LSTA: Long Short-Term Attention for Egocentric Action Recognition
[article]
2019
arXiv
pre-print
Egocentric activity recognition is one of the most challenging tasks in video analysis. It requires a fine-grained discrimination of small objects and their manipulation. ...
In this paper we propose LSTA as a mechanism to focus on features from relevant spatial parts while attention is being tracked smoothly across the video sequence. ...
However, the performance of deep learning action recognition from videos is still not comparable to the advances made in object recognition from still images [12] . ...
arXiv:1811.10698v3
fatcat:ix72yqsyfnhhfcwcsvffr4xnxq
A Sequential Classifier for Hand Detection in the Framework of Egocentric Vision
2014
2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops
Hand detection is one of the most explored areas in Egocentric Vision Video Analysis for wearable devices. ...
Experimental results show a considerable improvement in the detection of true negatives, without compromising the performance of the true positives. ...
According to the seminal work proposed in [16] , known for being the first public dataset for egocentric object recognition, existing methods for detecting hands in a scene can be divided into two groups ...
doi:10.1109/cvprw.2014.92
dblp:conf/cvpr/Betancourt14
fatcat:drxt6iamrzg2vdr3gg7zyqggva
cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey
[article]
2016
arXiv
pre-print
(TDU), and Univ. of Tsukuba that aims to systematically summarize papers on computer vision, pattern recognition, and related fields. ...
For this particular review, we focused on reading all 602 conference papers presented at CVPR2015, the premier annual computer vision event held in June 2015, in order to grasp the trends in the ...
to improvement of accuracy in object detection. ...
arXiv:1605.08247v1
fatcat:cd4mc7uor5f2rpu3aketd6naf4
2020 Index IEEE Transactions on Image Processing Vol. 29
2020
IEEE Transactions on Image Processing
Fan, B., +, TIP 2020 8120-8133 ...
Scene Recognition With Prototype-Agnostic Scene Layout. Chen, G., +, TIP 2020 5877-5888 ...
Semantics-Preserving Graph Propagation for Zero-Shot Object Detection. ...
Screen Content Video Quality Assessment: Subjective and Objective Study. ...
doi:10.1109/tip.2020.3046056
fatcat:24m6k2elprf2nfmucbjzhvzk3m
A Metaverse: taxonomy, components, applications, and open challenges
2022
IEEE Access
Furthermore, we describe essential methods based on three components, and techniques used in representative Metaverse examples (Ready Player One, Roblox, and Facebook research) in the domains of film, games, and studies ...
access to connectivity with reality using virtual currency. ...
Most video processing uses third-person video datasets, so egocentric video data remain scarce. ...
doi:10.1109/access.2021.3140175
fatcat:fnraeaz74vh33knfvhzrynesli
2021 Index IEEE Transactions on Image Processing Vol. 30
2021
IEEE Transactions on Image Processing
... that appeared in this periodical during 2021, and items from previous years that were commented upon or corrected in 2021. ...
Note that the item title is found only under the primary entry in the Author Index. ...
., +, TIP 2021 207-219 Together Recognizing, Localizing and Summarizing Actions in Egocentric Videos. ...
doi:10.1109/tip.2022.3142569
fatcat:z26yhwuecbgrnb2czhwjlf73qu
Showing results 1–15 of 167 results