Hand pose estimation from monocular RGB inputs is a highly challenging task. Many previous works for monocular settings only used RGB information for training despite the availability of corresponding data in other modalities such as depth maps. In this work, we propose to learn a joint latent representation that leverages other modalities as weak labels to improve RGB-based hand pose estimation. By design, our architecture is highly flexible in embedding various diverse modalities such as heat

doi:10.1109/iccv.2019.00242
dblp:conf/iccv/YangLLY19