A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2016; you can also visit <a rel="external noopener" href="http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Wu_Deep_Multiple_Instance_2015_CVPR_paper.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
<a target="_blank" rel="noopener" href="https://fatcat.wiki/container/ilwxppn4d5hizekyd3ndvy2mii" style="color: black;">2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</a>
Recent progress in learning deep representations has found wide application in traditional vision tasks such as classification and detection. However, there has been little investigation of how to build a deep learning framework in a weakly supervised setting. In this paper, we attempt to model deep learning in a weakly supervised (multiple instance learning, MIL) framework. In our setting, each image follows a dual multi-instance assumption, where its object proposals and possible text annotations can be regarded as two instance sets. We thus design effective systems to exploit the MIL property with deep learning strategies from the two ends; we also try to jointly learn the relationship between object and annotation proposals. We conduct extensive experiments and show that our weakly supervised deep learning framework not only achieves convincing performance in vision tasks including classification and image annotation, but also extracts reasonable region-keyword pairs with little supervision, both on widely used benchmarks such as PASCAL VOC and MIT Indoor Scene 67 and on a dataset for image- and patch-level annotations.<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/cvpr.2015.7298968">doi:10.1109/cvpr.2015.7298968</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/cvpr/WuYHY15.html">dblp:conf/cvpr/WuYHY15</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/x4b4cmcuvzbdphuevi6f2upy2a">fatcat:x4b4cmcuvzbdphuevi6f2upy2a</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20160528042120/http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Wu_Deep_Multiple_Instance_2015_CVPR_paper.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext">Web Archive [PDF]</a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/cvpr.2015.7298968">ieee.com</a>