Quantifying Shapes: Mathematical Techniques for Analysing Visual Representations of Sound and Music

Genevieve L. Noyce; Mats B. K; Peter Sollich

doi:10.18061/emr.v8i2.3932

Keywords:

cross-modal, real-time drawings, musical training, Gaussian processes

Abstract

Research on auditory-visual correspondences has a long tradition, but innovative experimental paradigms and analytic tools are sparse. In this study, we explore different ways of analysing real-time visual representations of sound and music drawn by both musically trained and untrained individuals. To that end, participants’ drawing responses, captured by an electronic graphics tablet, were analysed using various regression, clustering, and classification techniques. Results revealed that a Gaussian process (GP) regression model with a linear plus squared-exponential covariance function modelled the data adequately, whereas a simpler GP did not fit well. Spectral clustering performed best among the clustering techniques tested, though no strong groupings were apparent in these data. This was confirmed by variational Bayes analysis, which fitted only a single Gaussian to the dataset. Slight trends in the optimised hyperparameters between musically trained and untrained individuals allowed us to build a successful GP classifier that differentiated between the two groups. In conclusion, this set of techniques provides useful mathematical tools for analysing real-time visualisations of sound and can be applied to similar datasets.
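As an illustration of the regression model named in the abstract, the following is a minimal sketch of GP regression with a linear plus squared-exponential covariance function. It is not the authors' implementation: the function names, the hyperparameter values (sigma_lin, sigma_se, length, noise), and the synthetic "drawing trace" are all assumptions made for this example, and the hyperparameters are fixed by hand rather than optimised as in the study.

```python
import math


def kernel(x1, x2, sigma_lin=1.0, sigma_se=1.0, length=1.0):
    """Linear plus squared-exponential covariance between two scalar inputs."""
    linear = sigma_lin ** 2 * x1 * x2
    sq_exp = sigma_se ** 2 * math.exp(-0.5 * ((x1 - x2) / length) ** 2)
    return linear + sq_exp


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_posterior_mean(x_train, y_train, x_test, noise=0.1, **hyper):
    """Posterior mean of a zero-mean GP with Gaussian observation noise."""
    n = len(x_train)
    # Gram matrix of the training inputs, with noise variance on the diagonal.
    gram = [
        [kernel(x_train[i], x_train[j], **hyper) + (noise ** 2 if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    alpha = solve_cholesky(cholesky(gram), y_train)
    # Mean at each test input: k(x*, X) @ alpha.
    return [
        sum(kernel(xs, x_train[i], **hyper) * alpha[i] for i in range(n))
        for xs in x_test
    ]


# Synthetic stand-in for one drawing trace: pen height over time,
# built from a linear trend plus an oscillation (hypothetical data).
times = [0.5 * i for i in range(13)]
heights = [0.5 * t + math.sin(2.0 * t) for t in times]

# Predict at two unseen time points and at the training points themselves.
pred = gp_posterior_mean(times, heights, [1.25, 3.75], noise=0.05)
fitted = gp_posterior_mean(times, heights, times, noise=0.05)
```

The linear term lets the model follow an overall drift in the trace, while the squared-exponential term captures smooth local deviations from it; dropping either term gives one of the simpler covariance functions that a comparison like the one in the abstract would be made against.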