An Annotated Corpus of Tonal Piano Music from the Long 19th Century
DOI:
https://doi.org/10.18061/emr.v18i1.8903
Keywords:
corpora, harmony, phrase, cadence, piano, 19th century
Abstract
We present a dataset of 264 annotated piano pieces by nine composers, composed in the long 19th century (https://doi.org/10.5281/zenodo.7483349). Annotations adhere to the DCML harmony annotation standard and include Roman numerals, phrase boundaries, and cadence types. The scores are encoded in the XML-based MuseScore 3 format, with the annotations embedded in the MuseScore files. In addition, all harmony information, alongside key features of the encoded measure and note objects, is provided as plaintext TSV tables for increased interoperability with other datasets and analysis tools. Annotations were collaboratively created and reviewed by a pool of trained music theorists. Collaboration took place asynchronously online via a semi-automated GitHub-based workflow designed for quality assurance, allowing cycles of revision and review until consensus was reached. The full revision history is retained, providing data for further empirical research on inter-annotator agreement and related topics. We also present descriptive statistics about the nine corpora and the dataset as a whole, including comparisons of pitch-class contents, phrase lengths, modulations, and cadence types. We conclude with a discussion of our musicological principles for corpus building and considerations of representability.
License
Copyright (c) 2024 Johannes Hentschel, Yannis Rammos, Fabian C. Moss, Markus Neuwirth, Martin Rohrmeier
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.