Multiview RGB-D Dataset for Object Instance Detection [article]

Georgios Georgakis, Md Alimoor Reza, Arsalan Mousavian, Phi-Hung Le, Jana Kosecka
2016 arXiv   pre-print
for object detection and recognition.  ...  Finally, we compare the performance of the object proposals and a detection baseline to the Washington RGB-D Scenes (WRGB-D) dataset and demonstrate that our Kitchen scenes dataset is more challenging  ...  Conclusions: We have presented a new RGB-D multi-view dataset for object instance detection and recognition of commonly encountered household objects in realistic settings.  ...
arXiv:1609.07826v1 fatcat:hcn6tpj5xvgdbabor5qwqh6som

Indoor Scene Change Captioning Based on Multimodality Data

Yue Qiu, Yutaka Satoh, Ryota Suzuki, Kenji Iwata, Hirokatsu Kataoka
2020 Sensors  
for datasets with high complexity.  ...  Most existing scene change captioning methods detect scene changes from single-view RGB images, neglecting the underlying three-dimensional structures.  ...  We list the object class and instances in Table 1. Table 1. Set-ups of object classes and instances for datasets used in this study.  ...
doi:10.3390/s20174761 pmid:32842516 fatcat:6lkcmnipvnhenba7o2ae2s6peq

MISD-SLAM: Multimodal Semantic SLAM for Dynamic Environments

Yingxuan You, Peng Wei, Jialun Cai, Weibo Huang, Risheng Kang, Hong Liu, Chi-Hua Chen
2022 Wireless Communications and Mobile Computing  
Finally, we evaluate MISD-SLAM by comparing to ORB-SLAM3 and the state-of-the-art dynamic SLAM systems in TUM RGB-D datasets and real-world dynamic indoor environments.  ...  An instance segmentation network is used to provide semantic knowledge of surrounding environments in instance level. The ORB features located on the predefined dynamic objects are removed directly.  ...  The GPU is only used for instance segmentation. Experiments in TUM RGB-D Datasets.  ... 
doi:10.1155/2022/7600669 fatcat:cdf5iu5jezdxrpyiu3gxtshy5y

3SGAN: 3D Shape Embedded Generative Adversarial Networks

Fengdi Che, Xiru Zhu, Tianzi Yang, Tzu-Yang Yu
2019 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)  
Furthermore, we utilized an existing RGB-D dataset, NYU Depth V2 with edges learned by the Holistically-nested Edge Detection model.  ...  To evaluate our approach, we generated an RGB-D dataset with edge contours from ShapeNet models.  ...  Experiment setting Datasets and pre-processing For our training dataset, we generated our RGB-D data using ShapeNet CAD models [4].  ...
doi:10.1109/iccvw.2019.00412 dblp:conf/iccvw/CheZYY19 fatcat:skft4ncgtnagjmjcw22hijapaa

Hand Keypoint Detection in Single Images using Multiview Bootstrapping [article]

Tomas Simon, Hanbyul Joo, Iain Matthews, Yaser Sheikh
2017 arXiv   pre-print
The noisy detections are then triangulated in 3D using multiview geometry or marked as outliers. Finally, the reprojected triangulations are used as new labeled training data to improve the detector.  ...  The method is used to train a hand keypoint detector for single images. The resulting keypoint detector runs in realtime on RGB images and has accuracy comparable to methods that use depth sensors.  ...  We demonstrate that multiview bootstrapping produces hand keypoint detectors for RGB images that rival the performance of RGB-D hand keypoint detectors.  ... 
arXiv:1704.07809v1 fatcat:vjzfmkwvgfgrvgiifblekf4iuq

3DQ-Nets: Visual Concepts Emerge in Pose Equivariant 3D Quantized Neural Scene Representations

Mihir Prabhudesai, Shamit Lal, Hsiao-Yu Fish Tung, Adam W. Harley, Shubhankar Potdar, Katerina Fragkiadaki
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)  
Our real-world desk scenes dataset training setup consists of 8 Microsoft Kinect Azure sensors surrounding the table to capture multiview RGB-D data.  ...  For our CLEVR veggie dataset, we build upon the CLEVR Blender simulator (5) and add 17 vegetable object models bought from Turbosquid. Each scene is recorded by 28 RGB-D cameras.  ...
doi:10.1109/cvprw50498.2020.00202 dblp:conf/cvpr/PrabhudesaiLTHP20 fatcat:s3lonyu6r5dafirlk5mcy2sjra

Multiview Detection with Feature Perspective Transformation [article]

Yunzhong Hou, Liang Zheng, Stephen Gould
2021 arXiv   pre-print
Incorporating multiple camera views for detection alleviates the impact of occlusions in crowded scenes.  ...  To address these questions, we propose a novel multiview detection system, MVDet.  ...  The authors thank all anonymous reviewers and ACs for their constructive comments.  ... 
arXiv:2007.07247v2 fatcat:4d2m64oigzewjd3qyfcrzopdd4

ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language [article]

Dave Zhenyu Chen, Angel X. Chang, Matthias Nießner
2020 arXiv   pre-print
We introduce the task of 3D object localization in RGB-D scans using natural language descriptions.  ...  We also introduce the ScanRefer dataset, containing 51,583 descriptions of 11,046 objects from 800 ScanNet scenes.  ...  their efforts in building the ScanRefer dataset, and Akshit Sharma for helping with statistics and figures.  ... 
arXiv:1912.08830v3 fatcat:uf7uq5wnhrcknlv4jx64weilw4

From Pose to Activity: Surveying Datasets and Introducing CONVERSE [article]

Michael Edwards, Jingjing Deng, Xianghua Xie
2015 arXiv   pre-print
We categorize datasets regarding several key properties for usage as a benchmark dataset; including the number of class labels, ground truths provided, and application domain they occupy.  ...  The survey identifies key appearance and pose based datasets, noting a tendency for simplistic, emphasized, or scripted action classes that are often readily definable by a stable collection of sub-action  ...  RGB-D [22] [23] 2 CAD60 RGB-D [22] [24] 2 CASIA RGB [25] [26] 3 CAVIAR RGB [27] [28] 1, 2 CMU MMAC RGB, MoCap, IMU [29] [30] 6 CMU MoCap MoCap [7] - 1 CONVERSE RGB-D [1]  ... 
arXiv:1511.05788v2 fatcat:3ayno5for5ao7kdaot2zexgovy

RGB-D-based action recognition datasets: A survey

Jing Zhang, Wanqing Li, Philip O. Ogunbona, Pichao Wang, Chang Tang
2016 Pattern Recognition  
To address this issue, this paper provides a comprehensive review of the most commonly used action recognition related RGB-D video datasets, including 27 single-view datasets, 10 multi-view datasets, and  ...  and the other for detection of objects).  ...
doi:10.1016/j.patcog.2016.05.019 fatcat:sdm6dgp745fdfkotrfksretrsu

RGB-D-based Action Recognition Datasets: A Survey [article]

Jing Zhang, Wanqing Li, Philip O. Ogunbona, Pichao Wang, Chang Tang
2016 arXiv   pre-print
To address this issue, this paper provides a comprehensive review of the most commonly used action recognition related RGB-D video datasets, including 27 single-view datasets, 10 multi-view datasets, and  ...  Human action recognition from RGB-D (Red, Green, Blue and Depth) data has attracted increasing attention since the first work reported in 2010.  ...  This review has highlighted the need for comprehensive statistically significant evaluation protocols as part of algorithm development and testing.  ... 
arXiv:1601.05511v1 fatcat:ekz5c5awkbgl7gxi2hvamcz63m

FroDO: From Detections to 3D Objects

Martin Runz, Kejie Li, Meng Tang, Lingni Ma, Chen Kong, Tanner Schmidt, Ian Reid, Lourdes Agapito, Julian Straub, Steven Lovegrove, Richard Newcombe
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
Figure 1: Given a localized input RGB sequence, FroDO detects objects and infers their pose and a progressively fine-grained and expressive object shape representation.  ...  SLAM++ [39] demonstrated one of the first RGB-D object-based mapping systems where a set of previously known object instances were detected and mapped using an object pose graph.  ...  From Detections to 3D Objects Object Detection and Data Association We use a standard instance segmentation network [15] to detect object bounding boxes bb 2 i and object masks M in the input RGB video  ...
doi:10.1109/cvpr42600.2020.01473 dblp:conf/cvpr/RunzLTMKS0ASLN20 fatcat:rax2vea64nbdhlreuauajyqebi

Disentangling 3D Prototypical Networks For Few-Shot Concept Learning [article]

Mihir Prabhudesai, Shamit Lal, Darshan Patil, Hsiao-Yu Tung, Adam W Harley, Katerina Fragkiadaki
2021 arXiv   pre-print
We present neural architectures that disentangle RGB-D images into objects' shapes and styles and a map of the background scene, and explore their applications for few-shot 3D object detection and few-shot  ...  We show that classifiers for object categories, color, materials, and spatial relationships trained over the disentangled 3D feature sub-spaces generalize better with dramatically fewer examples than the  ...  multiview RGB-D videos of static scenes.  ... 
arXiv:2011.03367v3 fatcat:bqxpkyamcnf53cvbgrravlgk4m

A Multiview Multimodal System for Monitoring Patient Sleep

Carlos Torres, Jeffrey C. Fried, Kenneth Rose, Bangalore S. Manjunath
2018 IEEE transactions on multimedia  
MASH uses three RGB-D cameras to monitor patients in a medical Intensive Care Unit (ICU) room.  ...  This study introduces Multimodal, Multiview Motion Analysis and Summarization for healthcare (MASH).  ...  Leilani Price from Santa Barbara Cottage Hospital for their support. Special thanks to Professor Victor Fragoso and Archith John Bency for their feedback.  ... 
doi:10.1109/tmm.2018.2829162 fatcat:6douh7seerhtnpkrllsiaj7tha

From pose to activity: Surveying datasets and introducing CONVERSE

Michael Edwards, Jingjing Deng, Xianghua Xie
2016 Computer Vision and Image Understanding  
We categorize datasets regarding several key properties for usage as a benchmark dataset; including the number of class labels, ground truths provided, and application domain they occupy.  ...  The survey identifies key appearance and pose based datasets, noting a tendency for simplistic, emphasized, or scripted action classes that are often readily definable by a stable collection of subaction  ...  RGB [110] [111] 1 MSR Action-I RGB [112] [113] 1 MSR Action-II RGB [112] [114] 1 MSR Action3D RGB-D [112] [40] 1 MSR DA3D RGB-D [112] [41] 1 MSR Gesture3D RGB-D [112] [115]  ... 
doi:10.1016/j.cviu.2015.10.010 fatcat:dr6riozhe5bsjfz5iqzcpizufm