The Problem of Limited Inter-rater Agreement in Modelling Music Similarity
2016
Journal of New Music Research
One of the central tasks in the annual MIREX evaluation campaign is the "Audio Music Similarity and Retrieval (AMS)" task. Songs that algorithms rank as highly similar are assessed by human graders, who rate the similarity according to their subjective judgment. By analyzing results from the AMS tasks of the years 2006 to 2013, we demonstrate that: (i) due to low inter-rater agreement, there exists an upper bound of performance in terms of subjective gradings; (ii) this upper
doi:10.1080/09298215.2016.1200631
pmid:28190932
pmcid:PMC5256035
fatcat:h6s6h3hikjayhpzkcnsaxfvnoq