Walking and talking: A bilinear approach to multi-label action recognition

Sameh Khamis, Larry S. Davis
2015. IEEE. 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
Action recognition is a fundamental problem in computer vision. However, all current approaches pose the problem in a multi-class setting, where each actor is modeled as performing a single action at a time. In this work we pose action recognition as a multi-label problem, i.e., an actor can be performing any plausible subset of actions. Determining which subsets of labels can co-occur is usually treated as a separate problem, with the co-occurrences typically modeled sparsely or fixed a priori to label correlation coefficients. In contrast, we formulate multi-label training and label correlation estimation as a joint max-margin bilinear classification problem. Our joint approach effectively trains discriminative bilinear classifiers that leverage label correlations. To evaluate our approach, we relabeled the UCLA Courtyard dataset for the multi-label setting. We demonstrate that our joint model outperforms baselines on the same task and report state-of-the-art per-label accuracies on the dataset.
doi:10.1109/cvprw.2015.7301277 (https://doi.org/10.1109/cvprw.2015.7301277), dblp:conf/cvpr/KhamisD15 (https://dblp.org/rec/conf/cvpr/KhamisD15.html), fatcat:ueal4yqlerhrrn3mknqp45iwhy (https://fatcat.wiki/release/ueal4yqlerhrrn3mknqp45iwhy)
