Saraga: Open Datasets for Research on Indian Art Music
Keywords: Open annotated datasets, Indian Art Music, computational analysis
Abstract: We introduce two large open data collections of Indian Art Music, covering both its Carnatic and Hindustani traditions, comprising audio from vocal concerts, editorial metadata, and time-aligned melody, rhythm, and structure annotations. Shared under Creative Commons licenses, they currently form the largest annotated data collections available for computational analysis of Indian Art Music. The collections are intended to provide audio and ground truth for several music information research tasks and for large-scale data-driven analysis in musicological studies. Part of the Saraga Carnatic collection also includes multitrack recordings, making it a valuable resource for research on melody extraction, source separation, automatic mixing, and performance analysis. We describe the tenets and the process of collecting, annotating, and organizing the data. We provide easy access to the audio, metadata, and annotations in the collections through an API, along with a companion website that has example scripts to facilitate access to and use of the data. To sustain and grow the collections, we provide a mechanism for both the research and music communities to contribute additional data and annotations. We also present applications of the collections for music education, understanding, exploration, and discovery.
Copyright (c) 2021 Ajay Srinivasamurthy, Sankalp Gulati, Rafael Caro Repetto, Xavier Serra
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.