Saraga: Open Datasets for Research on Indian Art Music

Authors

  • Ajay Srinivasamurthy, Music Technology Group, Universitat Pompeu Fabra, Barcelona
  • Sankalp Gulati, Music Technology Group, Universitat Pompeu Fabra, Barcelona
  • Rafael Caro Repetto, Music Technology Group, Universitat Pompeu Fabra, Barcelona
  • Xavier Serra, Music Technology Group, Universitat Pompeu Fabra, Barcelona

DOI:

https://doi.org/10.18061/emr.v16i1.7641

Keywords:

Open annotated datasets, Indian Art Music, computational analysis

Abstract

We introduce two large open data collections of Indian Art Music, covering both its Carnatic and Hindustani traditions. They comprise audio from vocal concerts, editorial metadata, and time-aligned melody, rhythm, and structure annotations. Shared under Creative Commons licenses, they currently form the largest annotated data collections available for computational analysis of Indian Art Music. The collections are intended to provide audio and ground truth for several music information research tasks and for large-scale data-driven analysis in musicological studies. Part of the Saraga Carnatic collection also includes multitrack recordings, making it valuable for research on melody extraction, source separation, automatic mixing, and performance analysis. We describe the tenets and the process of collecting, annotating, and organizing the data. We provide easy access to the audio, metadata, and annotations in the collections through an API, along with a companion website that has example scripts to facilitate access to and use of the data. To sustain and grow the collections, we provide a mechanism for both the research and music communities to contribute additional data and annotations. We also present applications of the collections in music education, understanding, exploration, and discovery.

Published

2021-12-10