Given the scarcity of direct historical evidence, it seems natural for researchers in the field of music and evolution to extrapolate principles from present-day, observable phenomena of musical experience when trying to understand the origins and functions of music at the dawn of humanity, one of the main topics of Jacques Launay's paper. One such trans-historical element of music could arguably be body motion, which is involved in most (if not all) aspects of musical experience: we may readily observe body motion both in the production of musical sound (e.g. hitting, scraping, plucking, blowing, and so on) and in listening to music (e.g. waving the hands, moving the head, torso, or whole body) (see Godøy & Leman, 2010 for an overview), and we have considerable evidence that body motion sensations are also involved in mental imagery of music (e.g. Godøy & Jørgensen, 2001). All such instances of music-related body motion can be seen as based on what we call "motor cognition," and can also be considered ubiquitous in musical experience, as suggested by the title of this review.

The contention here is that motor cognition is a fundamental element in human behavior, in all probability also universal in its manifestation in the evolution of music, and that it could help us understand a number of musical features as shared experiences, provided we agree that humans are basically quite similar in their motor faculties. This motor cognition perspective would not decide between a group or individual origin of music, but would rather suggest that, regardless of origin, musical experiences may be shared between humans because they share a number of very basic motor sensations: music originating in a group setting (e.g. singing in group work or group dance) would use readily perceivable body motion, as would music originating in a solitary setting (e.g. the lonely shepherd playing a bone flute). Drawing on most people's life-long and vast experience of sound-producing body motion, sound becomes an efficient transducer of body motion sensations, something that is at the very core of the so-called "motor theory" of perception.

To substantiate such a view of ubiquitous motor cognition in musical experience, it could be useful first of all to make a brief review of the motor theory of perception, before looking at some sound-motion feature couplings in music. Also, musical experience may be highly composite with multiple parallel layers of meaning, making it useful to discuss some differentiations of meaning in relation to motor cognition.

Motor Theory

The motor theory (or motor theories, as there are some variants here) of perception states that we interpret sensory experiences by mentally (and sometimes also overtly) simulating the body motion that we believe is at the source of whatever it is that we are perceiving, or the body motion that we project onto whatever it is that we are perceiving. For instance, in listening to speech, motor theory suggests that we not only somehow decipher acoustic signals, but that we also actively mentally simulate the vocal apparatus motion and shapes that we believe generate the sound we are hearing (Liberman & Mattingly, 1985; Galantucci, Fowler, & Turvey, 2006). In the case of visual perception, it has been suggested that we have a similar active tracing of salient features in what we are seeing (Berthoz, 1997). Additionally, in interacting with other people, it seems that we actively mentally simulate their body motion, and through this have a basis for social interaction and empathy (Wilson & Knoblich, 2005; Jeannerod, 2006). In other words, there is converging evidence that we understand other people and their expressions, including music, by mentally simulating their body motion.

To further explore how motor theory applies to music, we have, during the last decade, carried out various studies of people's spontaneous body motion in listening to music. These include so-called "air instrument" studies, where we wanted to find out what knowledge of sound-producing body motion people with different levels of musical training possessed (Godøy, Haga, & Jensenius, 2006). Related to this, we also conducted a number of studies of so-called "sound-tracing," in which listeners were asked to make spontaneous tracings of the salient features of sound excerpts, either on a digital tablet or in three-dimensional space with their hands (Nymoen et al., 2013). In addition, we studied what we called "free dance," that is, spontaneous whole-body motion to a given sound excerpt (Haga, 2008). What these studies suggested was that listeners, even those with no musical training, seemed readily able to reproduce salient features of musical sound, in particular pitch contours and dynamic envelopes. More direct, sound-producing body motion was also well reproduced, although there were of course variations in level of detail that correlated with level of musical training.

From our own and other studies, it seems reasonable, then, to conclude that most listeners have (albeit variably so) more or less clear mental images of the "sound-producing" body motion at the source of whatever it is that they are hearing. Also, it seemed that most listeners were able to actively trace the evolution of sound patterns and features, as well as to move the whole body, or parts of the body, to musical sound, in what we call "sound-accompanying" body motion (Godøy, 2010).

In a motor theory perspective, all sound is included in some kind of sound-producing body motion trajectory, and alternatively, all sound features may be traced, hence the idea of ubiquitous motor cognition in music. However, one important feature of motor theory is that of "variable acuity." Applied for instance to language perception, a person (like me) may not understand either Chinese or Russian, yet may be able to discern two very different sets of so-called "phonological gestures" (vocal tract motions and shapes). Likewise, in music-related body motion, we found that some renderings were rather sketchy or approximate, whereas others were quite detailed, something that is of course dependent on expertise (prior training or experience). Yet our assessment here is that just having some sketchy or low-resolution rendering of what goes on is better than having none (i.e. we should appreciate the value of approximate information in motor cognition).

Sound-Motion Features

The basic tenet here is that most features of musical sound are related to some kind of body motion, and hence that we can speak of sound-motion relationships. This means that the "purely" acoustic features (i.e. spectral-temporal features) have correlates in body motion features, either directly through sound production or more indirectly through the fact that any feature (e.g. vibrato, tremolo, texture, pitch contour, timbral envelope, and so on) can be traced overtly as a shape by our fingers, hands, and other body parts, or covertly as mental images of shapes. To keep track of these various features, it could be useful to make an overview of music-related body motion, first sound-producing and then sound-accompanying, applicable to any feature whatsoever.

Sound-producing motion may be classified as follows (see various contributions in Godøy & Leman, 2010):

  • Excitatory motion: directly transferring energy from the body by hitting, stroking, blowing, and so on;
  • Modulatory motion: motion that modifies the sound (e.g. vibrato, timbral changes, and so on);
  • Ancillary motion: various facilitating (i.e. ergonomic) and expressive kinds of motion;
  • Communicative motion: giving cues to other performers in an ensemble, or creating more theatrical effects for the audience.

There are a number of biomechanical and motor control constraints on sound-producing body motion, such as maximal speed, maximal effort, ergonomics, need to avoid strain injury, and so on. Also, there are a number of constraints that contribute to shape the resultant sound, in particular the phenomenon of "coarticulation," meaning the fusion of otherwise distinct sound and body motion events into more coherent and superordinate chunks (Godøy, Jensenius, & Nymoen, 2010; Godøy, 2014).

Also, we should keep in mind that music-related body motion may often be multi-functional. For example, a pianist's upbeat motion of the hands may be part of an excitatory motion, as well as a communicative motion to the rest of the ensemble and the audience. As for sound-accompanying body motion (i.e. motion made by listeners), this may variably be:

  • Reflecting sound production, cf. the categories above;
  • Tracing some sonic feature (i.e. melodic, rhythmic, textural, and timbral features, etc.);
  • An entrainment to the music, moving to the beat or some other salient feature;
  • Adding a new, "contrapuntal," element to the music (i.e. making body motion that is somehow related to the overall affective and/or aesthetic elements of the music).

Sound-accompanying body motion to the same excerpt of music may vary between listeners, hence the idea of rich motion affordances of musical sound (Godøy, 2010). Yet, in spite of the multitude of features that may be manifest in any musical excerpt, these may all be seen as having some kind of shape (e.g. overall dynamic envelope, various dynamic-, timbre-, or pitch-related fluctuations, rhythmical and textural patterns, and so on), that in turn may all be reflected in body motion patterns. In other words, it seems that musical sound is an effective transducer of body motion and associated sensations of effort, mood, affect, and so on, and this is the basis for claiming the ubiquity of motor cognition in musical experience.

Multiple Significations

Clearly, music is a cultural phenomenon, in each instance embedded in a web of significations that may be unknown to an outsider. More than a century of ethnomusicology research has taught us to be careful not to impose our Western musical concepts (e.g. notions of high and low in pitch, of tuning, of pitch nuances, and so on), or our more high-level notions of aesthetics and meaning in music, on other cultures.

A motor cognition perspective on musical experience should of course recognize that any instance of music has multiple significations, and, notably, significations that may escape us as outsiders. Yet it could also be useful to distinguish between the more local, culture-specific, and the more universal, basic motor-related elements in musical experience. Pierre Schaeffer's model of listening could be useful here, as this model distinguishes between meaning ascribed to any sound object and the sound object features in a more general sense (Schaeffer, 1966; Chion, 1983): the sound of a door squeaking may tell me that someone is entering the room, particularly if I am expecting a visitor. Yet this same door squeaking could also be interesting for me as a sonic object (a metallic, upward glissando-type sound). Schaeffer called such focus on sonic features "reduced listening" to indicate a mode of listening where everyday significance was disregarded by an active, intentional focus on the sonic feature of the door squeak. Again, this shift of attention is related to expertise, yet it could also be seen as a shift toward a more general motion-related mode of listening: the upward glissando of the squeak in this case could be traced as a motion shape, as a general energy and pitch envelope. In fact, it could be understood as a motor cognition element.

Re-evaluating the theories of Schaeffer and colleagues, it seems clear that most of the basic sonic feature categories relate very nicely to body motion categories and, hence, to motor cognition (Godøy, 2006). Schaeffer's approach is one of top-down feature differentiation by actively tracing shapes, beginning with the overall envelopes (i.e. the so-called "typology") and continuing to the content of the sonic objects (i.e. the so-called "morphology"). This approach, which enhances our knowledge of musical sound step by step by actively tracing its perceptually salient features, is applicable to any kind of musical sound, regardless of origin (instrumental, vocal, environmental, or electronic). The typology comprises the overall dynamic envelopes:

  1. Impulsive: an abrupt peak of effort immediately followed by relaxation (e.g. as in struck percussion instruments, plucked string instruments, or piano tones);
  2. Sustained: having a more continuous energy transfer from the body to the instrument or the vocal apparatus (e.g. in bowing, blowing, or singing);
  3. Iterative: meaning a fast back and forth motion such as in a tremolo.
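
As a purely illustrative sketch, and not a procedure given by Schaeffer or used in the studies cited here, the three dynamic envelope types could be told apart from a sampled amplitude envelope roughly as follows. The function name classify_envelope, the half-of-maximum reference level, and both thresholds (min_peaks, short_fraction) are assumptions chosen for the example; real recordings would need smoothing first, since noise can create spurious peaks.

```python
from typing import Sequence


def classify_envelope(env: Sequence[float],
                      min_peaks: int = 4,
                      short_fraction: float = 0.25) -> str:
    """Label an amplitude envelope 'impulsive', 'sustained', or 'iterative'.

    All thresholds are hypothetical defaults, not values from the literature.
    """
    if len(env) < 3:
        raise ValueError("need at least three envelope samples")
    peak = max(env)
    if peak <= 0:
        raise ValueError("envelope must contain positive values")
    level = 0.5 * peak  # assumed reference level: half of the maximum

    # Count local maxima that rise above the reference level.
    prominent_peaks = sum(
        1 for i in range(1, len(env) - 1)
        if env[i] > env[i - 1] and env[i] >= env[i + 1] and env[i] > level
    )
    # Many repeated peaks: fast back-and-forth energy input (e.g. tremolo).
    if prominent_peaks >= min_peaks:
        return "iterative"

    # Otherwise decide by how much of the duration stays above the level:
    # a brief burst is impulsive, a long plateau is sustained.
    loud_fraction = sum(1 for v in env if v > level) / len(env)
    return "impulsive" if loud_fraction < short_fraction else "sustained"
```

For instance, a sharply decaying envelope would be expected to come out as impulsive, a long plateau as sustained, and a rapid alternation of loud and soft frames as iterative.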

Additionally, there are the overall pitch-related envelopes as follows:

  1. Pitched: meaning stable, determinate pitch;
  2. Variable: meaning a determinate pitch that varies (e.g. a glissando);
  3. Complex: meaning a non-pitched or highly inharmonic sound.

The morphology comprises a fairly large number of internal sonic features, all of which may be correlated with body motion and/or postures, so here are just two of the most important ones:

  • Grain: denoting fast fluctuations in the sound, such as in a tremolo, vibrato, or fast fluctuations in timbral content;
  • Gait: denoting slower fluctuations in the sound, typically manifest as various more or less regular rhythmical patterns, as in dance.
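
The distinction between grain and gait is one of fluctuation rate, so a rough sketch of how it might be operationalized is to estimate the rate of a feature curve (e.g. loudness or pitch over time) and compare it with a boundary. Everything here is an assumption made for illustration: the function names, the mean-crossing estimate, and especially the 4 Hz boundary, which is an arbitrary placeholder and not a value proposed by Schaeffer.

```python
from typing import Sequence


def fluctuation_rate(curve: Sequence[float], sample_rate: float) -> float:
    """Estimate the fluctuation rate (Hz) of a feature curve.

    Counts sign changes around the mean; two crossings make one cycle.
    """
    if len(curve) < 2 or sample_rate <= 0:
        raise ValueError("need at least two samples and a positive rate")
    mean = sum(curve) / len(curve)
    centered = [v - mean for v in curve]
    crossings = sum(
        1 for a, b in zip(centered, centered[1:]) if (a < 0) != (b < 0)
    )
    duration = (len(curve) - 1) / sample_rate  # seconds spanned by samples
    return crossings / (2.0 * duration)


def label_fluctuation(rate_hz: float, boundary_hz: float = 4.0) -> str:
    """Label a rate as 'grain' (fast) or 'gait' (slow).

    boundary_hz is a hypothetical cut-off chosen only for this example.
    """
    return "grain" if rate_hz >= boundary_hz else "gait"
```

A vibrato-like curve fluctuating several times per second would then be labeled grain, whereas a curve following a slow dance pulse would be labeled gait. In practice the boundary is unlikely to be sharp, and the amplitude and regularity of the fluctuations would need separate measures.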

These basic features may in turn be further differentiated (e.g. we may trace the amplitude and rate of the grain fluctuations, their regularity vs. irregularity, and so on), and, taken together with all the other morphological features, sub-features, sub-sub-features, and so on, constitute a highly nuanced system for qualifying perceptually salient features as motor cognition elements. Furthermore, these typological and morphological features may be combined in more complex sonic objects, and these sonic objects may in turn be combined into longer passages of musical sound and motion, in effect constituting prolonged action scripts also entailing various affective features such as sense of effort, calm, agitation, and so on.

Conclusions and Prospects

It seems that various strands of research in the cognitive sciences converge in documenting the fundamental role of motor cognition in human behavior in general, and that we now also see growing support for the crucial role of motor cognition in music. Additionally, possibilities for doing research on motor cognition in music have, in recent years, been greatly enhanced due to advances in motion capture methods, methods for analysis of motion-sound correspondences, and advances in understanding of multi-sensory integration in the human mind. In view of ongoing discussions on topics such as music and evolution, music and social functions, as well as music and empathy, taking motor cognition into consideration could have the following advantages:

  • Universality (i.e. probably valid for most, perhaps all, musical cultures, styles, genres, etc.), and intersubjectivity (i.e. provided we assume that most people are similar with respect to basic motor functions and body motion sensations);
  • Applicable to most (or perhaps all) features of music, in particular perceptually salient features such as dynamical, timbral, and pitch-related contours, as well as rhythmical and textural patterns;
  • Documentable, increasingly so, with presently available methods and technologies for capturing details of musical sound and music-related body motion.

But needless to say, there are a number of substantial challenges here:

  • We need more studies on music-related body motion in general, and in particular in various non-Western musical cultures.
  • We need better methods for correlating detail features of sound with detail features of body motion.
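
To make the second challenge concrete, the simplest conceivable baseline for relating a sound feature to a motion feature is a linear correlation between two time-aligned series, for example a pitch contour and the vertical position of a tracing hand. The sketch below is offered only as such a baseline, under the assumption that both series have already been resampled to the same frame rate; the function name pearson and the numbers in the usage note are invented for illustration and do not come from any study.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long, time-aligned series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least two")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

With a made-up rising pitch contour such as [60, 62, 64, 65, 67] and made-up hand heights such as [0.10, 0.18, 0.31, 0.33, 0.45], the coefficient is close to 1, reflecting a hand that rises with the pitch. A single coefficient of this kind ignores time lags, nonlinear mappings, and the many-to-many relations between sound and motion features, which is part of why better methods are needed.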

Finally, in a direct response to one of the main issues of Launay's paper: listening to music alone could be regarded as social in so far as music reflects shared motor cognition experiences—intersubjective experiences of how we all may relate body motion sensations to our private and often affective experiences.


  • Berthoz, A. (1997). Le sens du mouvement. Paris: Odile Jacob.
  • Chion, M. (1983). Guide des objets sonores. Paris: INA/GRM Buchet/Chastel.
  • Galantucci, B., Fowler, C. A., & Turvey, M. T. (2006). The motor theory of speech perception reviewed. Psychonomic Bulletin and Review, 13(3), 361-377.
  • Godøy, R. I. (2006). Gestural-Sonorous objects: Embodied extensions of Schaeffer's conceptual apparatus. Organised Sound, 11(2), 149-157.
  • Godøy, R. I. (2010). Gestural affordances of musical sound. In R. I. Godøy & M. Leman (Eds.), Musical gestures: Sound, movement, and meaning (pp. 103-125). New York: Routledge.
  • Godøy, R. I. (2014). Understanding coarticulation in musical experience. In M. Aramaki, M. Derrien, R. Kronland-Martinet & S. Ystad (Eds.), Sound, music, and motion. Lecture notes in computer science (pp. 535-547). Berlin: Springer.
  • Godøy, R. I., Haga, E., & Jensenius, A. (2006). Playing 'air instruments': Mimicry of sound-producing gestures by novices and experts. In S. Gibet, N. Courty & J.-F. Kamp (Eds.), GW2005, LNAI 3881 (pp. 256-267). Berlin, Heidelberg: Springer-Verlag.
  • Godøy, R. I., & Jørgensen, H. (Eds.) (2001). Musical imagery. Lisse (Holland): Swets & Zeitlinger.
  • Godøy, R. I., Jensenius, A. R., & Nymoen, K. (2010). Chunking in music by coarticulation. Acta Acustica united with Acustica, 96(4), 690-700.
  • Godøy, R. I., & Leman, M. (Eds.) (2010). Musical gestures: Sound, movement, and meaning. New York: Routledge.
  • Haga, E. (2008). Correspondences between music and body movement. PhD thesis, University of Oslo.
  • Jeannerod, M. (2006). Motor cognition. Oxford: Oxford University Press.
  • Liberman, A. M., & Mattingly, I. G. (1985). The motor theory of speech perception revised. Cognition, 21(1), 1-36.
  • Nymoen, K., Godøy, R. I., Jensenius, A. R., & Tørresen, J. (2013). Analyzing correspondence between sound objects and body motion. ACM Transactions on Applied Perception (TAP), 10(2), article no. 9.
  • Schaeffer, P. (1966). Traité des objets musicaux. Paris: Éditions du Seuil.
  • Wilson, M., & Knoblich, G. (2005). The case for motor involvement in perceiving conspecifics. Psychological Bulletin, 131(3), 460-473.