A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit <a rel="external noopener" href="https://pure.aber.ac.uk/portal/files/42036032/04_Employing_Bilinear_Fusion_and_Saliency_Prior_Information_for_RGB_D_Salient_Object_Detection.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
<i title="Institute of Electrical and Electronics Engineers (IEEE)">
<a target="_blank" rel="noopener" href="https://fatcat.wiki/container/sbzicoknnzc3tjljn7ifvwpooi" style="color: black;">IEEE transactions on multimedia</a>
Multi-modal feature fusion and saliency reasoning are two core sub-tasks of RGB-D salient object detection. However, most existing models employ linear fusion strategies (e.g., concatenation) for multi-modal feature fusion and use a simple coarse-to-fine structure for saliency reasoning. Despite their simplicity, such strategies can neither fully capture the cross-modal complementary information nor exploit the multi-level complementary information among the cross-modal features at different levels. To address these issues, a novel RGB-D salient object detection model is presented that pays special attention to these two sub-tasks. Concretely, a multi-modal feature interaction module is first presented to explore richer interactions between the unimodal RGB and depth features; it captures their cross-modal complementary information by jointly using simple linear fusion strategies and bilinear ones. Then, a saliency prior information guided fusion module is presented to exploit the multi-level complementary information among the fused cross-modal features at different levels. Finally, instead of employing a single convolutional layer for the final saliency prediction, a saliency refinement and prediction module is designed to better exploit the extracted multi-level cross-modal information for RGB-D saliency detection. Experimental results on several benchmark datasets verify the effectiveness and superiority of the proposed framework over several state-of-the-art methods. Index Terms: RGB-D salient object detection, bilinear fusion strategy, saliency prior information guided fusion, saliency refinement and prediction.<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/tmm.2021.3069297">doi:10.1109/tmm.2021.3069297</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/cvspnkbza5al7nvowh3zf6t7e4">fatcat:cvspnkbza5al7nvowh3zf6t7e4</a> </span>
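The distinction the abstract draws between linear and bilinear fusion can be made concrete with a minimal NumPy sketch. This is not the paper's implementation; the feature vectors, channel dimension, and variable names below are hypothetical. It only illustrates that concatenation keeps the two modalities' channels side by side, while a bilinear (outer-product) fusion produces explicit pairwise interactions between every RGB channel and every depth channel.

```python
import numpy as np

rng = np.random.default_rng(0)
C = 8  # hypothetical per-modality channel dimension

# Stand-ins for unimodal RGB and depth feature vectors at one spatial location
rgb_feat = rng.standard_normal(C)
depth_feat = rng.standard_normal(C)

# Linear fusion: concatenation; channels remain independent, size 2*C
linear_fused = np.concatenate([rgb_feat, depth_feat])  # shape (16,)

# Bilinear fusion: outer product; every RGB channel interacts with
# every depth channel, giving C*C second-order terms
bilinear_fused = np.outer(rgb_feat, depth_feat).ravel()  # shape (64,)
```

The quadratic growth of the bilinear output (C*C vs 2*C) is why practical bilinear-fusion models typically follow it with a projection or pooling step; the sketch omits that for clarity.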
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210715170714/https://pure.aber.ac.uk/portal/files/42036032/04_Employing_Bilinear_Fusion_and_Saliency_Prior_Information_for_RGB_D_Salient_Object_Detection.pdf" title="fulltext PDF download">Web Archive [PDF]</a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/tmm.2021.3069297">ieee.com</a>