Saraga: Open Datasets for Research on Indian Art Music

Ajay Srinivasamurthy, Sankalp Gulati, Rafael Caro Repetto, Xavier Serra
2021 Empirical Musicology Review  
We introduce two large open data collections of Indian Art Music, both its Carnatic and Hindustani traditions, comprising audio from vocal concerts, editorial metadata, and time-aligned melody, rhythm, and structure annotations. Shared under Creative Commons licenses, they currently form the largest annotated data collections available for computational analysis of Indian Art Music. The collections are intended to provide audio and ground truth for several music information research tasks and large-scale data-driven analysis in musicological studies. A part of the Saraga Carnatic collection also has multitrack recordings, making it a valuable collection for research on melody extraction, source separation, automatic mixing, and performance analysis. We describe the tenets and the process of collection, annotation, and organization of the data. We provide easy access to the audio, metadata, and the annotations in the collections through an API, along with a companion website that has example scripts to facilitate access and use of the data. To sustain and grow the collections, we provide a mechanism for both the research and music community to contribute additional data and annotations to the collections. We also present applications with the collections for music education, understanding, exploration, and discovery.
doi:10.18061/emr.v16i1.7641 fatcat:2sdxjpapivahblj7fmb76mfcji