BTranspose: Bottleneck Transformers for Human Pose Estimation with Self-Supervised Pre-Training [article]

Kaushik Balakrishnan, Devesh Upadhyay
2022-04-21 · arXiv · pre-print
apply it to the task of 2D human pose estimation. ... We consider different backbone architectures and pre-train them using the DINO self-supervised learning method [3]; this pre-training is found to improve the overall prediction accuracy. ... Conclusion: For the task of 2D human pose estimation, we explored a model, BTranspose, by combining Bottleneck Transformers with the vanilla Transformer Encoder (TE). ...
arXiv:2204.10209v1 · fatcat:zh3u2cg2orapdp5f5odsjpf5fa
