BTranspose: Bottleneck Transformers for Human Pose Estimation with Self-Supervised Pre-Training
[article]
2022
arXiv
pre-print
The task of 2D human pose estimation is challenging because the number of keypoints is typically large (~17), which necessitates robust neural network architectures and training pipelines that can capture the relevant features from the input image. These features are then aggregated to make accurate heatmap predictions, from which the final keypoints of human body parts can be inferred. Many papers in the literature use CNN-based architectures for the backbone, and/or combine it with a
arXiv:2204.10209v1
fatcat:zh3u2cg2orapdp5f5odsjpf5fa