Semantic audio content-based music recommendation and visualization based on user preference examples

Dmitry Bogdanov, Martín Haro, Ferdinand Fuhrmann, Anna Xambó, Emilia Gómez, Perfecto Herrera
2013 Information Processing & Management  
Preference elicitation is a challenging fundamental problem when designing recommender systems. In the present work we propose a content-based technique to automatically generate a semantic representation of the user's musical preferences directly from audio. Starting from an explicit set of music tracks provided by the user as evidence of his/her preferences, we infer high-level semantic descriptors for each track, obtaining a user model. To prove the benefits of our proposal, we present two applications of our technique. In the first one, we consider three approaches to music recommendation, two of them based on a semantic music similarity measure, and one based on a semantic probabilistic model. In the second application, we address the visualization of the user's musical preferences by creating a humanoid cartoon-like character, the Musical Avatar, automatically inferred from the semantic representation. We conducted a preliminary evaluation of the proposed technique in the context of these applications with 12 subjects. The results are promising: the recommendations were positively evaluated and close to those coming from state-of-the-art metadata-based systems, and the subjects judged the generated visualizations to capture their core preferences. Finally, we highlight the advantages of the proposed semantic user model for enhancing the user interfaces of information filtering systems.

Please cite this article in press as: Bogdanov, D., et al. Semantic audio content-based music recommendation and visualization based on user preference examples. Information Processing and Management (2012), http://dx.

... Shapira, and Shoval (2001) identified two main strategies: explicit and implicit user preference inference. The former relies on user surveys in order to obtain qualitative statements and ratings about particular items or more general semantic properties of the data. In contrast, the latter relies on information inferred implicitly from user behavior and, in particular, consumption statistics. In the present work, we focus on music recommender systems and consider explicit strategies to infer musical preferences of a user directly from the music audio data. When considering digital music libraries, current major Internet stores contain millions of tracks. This situation complicates the user's search, retrieval, and discovery of relevant music.
At present, the majority of industrial systems provide means for manual search (Nanopoulos, Rafailidis, Ruxanda, & Manolopoulos, 2009). This type of search is based on metadata: information about artist names, album or track titles, and additional semantic properties, which are mostly limited to genres. Music collections are then queried by tags or textual input using this information. Moreover, current systems also provide basic means for music recommendation and personalization that rely on metadata rather than on the audio content. Such systems obtain a user's profile by monitoring music consumption and listening statistics, user ratings, or other types of behavioral information, decoupled from the actual music data (Baltr-
doi:10.1016/j.ipm.2012.06.004 fatcat:y6bcjarzandwljl5qxqruv5yh4