DALI is a large dataset of time-aligned symbolic vocal melody notations (notes) and lyrics at four levels of granularity. DALI contains 5358 songs in its first version and 7756 in its second. In this article, we present the dataset, explain the tools developed to work with the data, and detail the approach used to build it. Our method is motivated by active learning and the teacher-student paradigm. We establish a loop whereby dataset creation and model learning interact, benefiting ...
doi:10.5334/tismir.30 fatcat:f4b7y65oijboxodfovest6ry6u