von Mises-Fisher Loss: An Exploration of Embedding Geometries for Supervised Learning [article]

Tyler R. Scott and Andrew C. Gallagher and Michael C. Mozer
2021 arXiv pre-print
Recent work has argued that classification losses utilizing softmax cross-entropy are superior not only for fixed-set classification tasks but also for open-set tasks, including few-shot learning and retrieval, where they outperform losses developed specifically for those tasks. Softmax classifiers have been studied using different embedding geometries (Euclidean, hyperbolic, and spherical), and claims have been made about the superiority of one or another, but they have not been systematically compared with careful controls. We conduct an empirical investigation of embedding geometry on softmax losses for a variety of fixed-set classification and image retrieval tasks. An interesting property observed for the spherical losses leads us to propose a probabilistic classifier based on the von Mises-Fisher distribution, and we show that it is competitive with state-of-the-art methods while producing improved out-of-the-box calibration. We provide guidance regarding the trade-offs between losses and how to choose among them.
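
To make the notion of "embedding geometry" concrete, here is a minimal sketch of how the logits fed to softmax cross-entropy might differ between a Euclidean and a spherical geometry. It is an illustration under assumptions, not the formulation used in the paper: the function names, the negative-squared-distance Euclidean logit, the cosine-similarity spherical logit, and the `scale` value of 10.0 are all chosen here for the example. The hyperbolic case and the von Mises-Fisher classifier are omitted.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(v):
    """Project a nonzero vector onto the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def euclidean_logits(z, class_vectors):
    """Assumed Euclidean logit: negative squared distance to each class vector."""
    return [-sum((a - b) ** 2 for a, b in zip(z, w)) for w in class_vectors]


def spherical_logits(z, class_vectors, scale=10.0):
    """Assumed spherical logit: scaled cosine similarity (hypothetical scale)."""
    z_hat = l2_normalize(z)
    return [
        scale * sum(a * b for a, b in zip(z_hat, l2_normalize(w)))
        for w in class_vectors
    ]


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example with an integer label."""
    return -math.log(softmax(logits)[label])


# Toy 2-D embedding and three class vectors (made-up values).
embedding = [2.0, 0.5]
class_vectors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

loss_euclidean = cross_entropy(euclidean_logits(embedding, class_vectors), 0)
loss_spherical = cross_entropy(spherical_logits(embedding, class_vectors), 0)
```

In this sketch the spherical logits depend only on the direction of the embedding, so rescaling `embedding` leaves them unchanged, whereas the Euclidean logits change with its norm.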
arXiv:2103.15718v4 fatcat:brsxb3so5ngchemwj3pffrvkeq