Learning to Make Better Mistakes

Hui Wu, Michele Merler, Rosario Uceda-Sosa, John R. Smith
2016 Proceedings of the 2016 ACM on Multimedia Conference - MM '16  
We propose a visual food recognition framework that integrates the inherent semantic relationships among fine-grained classes. Our method learns semantics-aware features by formulating a multi-task loss function on top of a convolutional neural network (CNN) architecture. It then refines the CNN predictions using a random walk based smoothing procedure, which further exploits the rich semantic information. We evaluate our algorithm on a large "food-in-the-wild" benchmark [3], as well as a
more » ... nging dataset of restaurant food dishes with very few training images. The proposed method achieves higher classification accuracy than a baseline which directly fine-tunes a deep learning network on the target dataset. Furthermore, we analyze the consistency of the learned model with the inherent semantic relationships among food categories. Results show that the proposed approach provides more semantically meaningful results than the baseline method, even in cases of mispredictions.
doi:10.1145/2964284.2967205 dblp:conf/mm/WuMUS16 fatcat:hkruy7mbhbes7f3j527bmq7nj4