I am very sympathetic with Kersten's argument for an extended view of music perception. In this commentary, I focus on Kersten's helpful criticism of my approach to the musically extended mind in Krueger (2014). I address Kersten's worry that I don't say enough about how we manipulate—and in so doing, integrate with—music when we become part of a musically extended cognitive system. Along the way, I also express some reservations about Kersten's neglect of emotions in his account.

Hypothesis of Extended Cognition and Musical Emotions

In Krueger (2014), I apply a version of the hypothesis of extended cognition (HEC) to music perception. According to HEC, the physical basis of cognition need not be confined to the head (Clark & Chalmers, 1998; Hurley, 1998; Menary, 2010). Some cognitive processes are comprised not only of brain or even bodily processes but also structures and processes in the surrounding environment.

What about emotions? There seem to be many real-world cases where emotions are supported and driven, at multiple timescales, by features of our environment (Colombetti & Roberts, forthcoming; Slaby, 2014). Consider being drawn into and swallowed up by the collective euphoria at a concert, the focused rage of a political protest, or the pervasive serenity of a natural setting or space of worship. Or imagine a skilled musician using her instrument to summon and work through the nuances of an emotional episode.

In Krueger (2014), I argue that music is an especially powerful environmental resource for extending emotions. We primarily listen to music to regulate our emotions and behavior and, when others are present, to intensify our social experiences (Balkwill & Thompson, 1999; Krumhansl, 2002). We even appear to make similar judgments about the emotional character of specific pieces of music: the recognition of basic emotional expressions in music (happy, sad, scared/fearful, etc.) is culturally universal, including when listening to music that comes from an unfamiliar culture or tradition (Fritz et al., 2009). Neuroimaging studies suggest that our emotional responses to music recruit core brain structures involved in initiating, generating, detecting, maintaining, and regulating emotions (Koelsch, 2010; see also Blood et al., 1999; Koelsch, 2014; Overy & Molnar-Szakacs, 2009). Music perception and emotion experience are thus closely coupled processes.

I dwell on this point at the outset because I find Kersten's omission of emotions in his otherwise rich account of music perception surprising. As this brief survey indicates, our capacity to perceive and respond to musical stimuli involves many of the same brain regions and physiological responses involved in the production and experience of emotions. And at the experiential level, considering the character of our perceptual engagement with music without saying anything about our emotional responsiveness is a significant phenomenological omission; it neglects a central part of what makes musical engagement so compelling. Considerations such as these are what led me to focus my analysis in Krueger (2014) on the musically extended emotional mind.

Forming a (Musical) Functionally Integrated Gainful System

Few would dispute that music exerts a strong emotional power over us. But how might music function as a mind-extending resource? My argument in Krueger (2014) rests on reading HEC as a theory of access. Without the ongoing, active contribution of certain external resources, we cannot access the cognitive functions they set up, drive, and regulate. Biological memory, for example, is greatly augmented by exploiting different features of our material and social environments; external tools, artifacts, and symbol systems offer a representational format, storage capacity, stability, and flexibility of access unavailable to the unaided brain. By routinely exploiting these external resources, "the human organism is linked with an external entity in a two-way interaction, creating a coupled system that can be seen as a cognitive system in its own right" (Clark & Chalmers, 1998, p.8). We become part of what Kersten, following Wilson (2010), refers to as a functionally integrated gainful system (hereafter, FIG).

Something similar can happen with music. It is an external resource that allows us to cultivate, refine, and explore familiar emotional experiences in new ways—or even develop emotional experiences we may not otherwise have. My thesis in Krueger (2014) is that it does so by integrating with, and subsequently enhancing, the functional complexity of various endogenous processes responsible for generating and sustaining emotional experience. In some instances, we use the music as an emotion extending tool. We offload some of the regulatory and emotional work onto the music—much the way we offload part of the remembering process onto our social and material environments—and music becomes part of the physical vehicle needed to access certain emotional experiences.

Establishing the formation of a FIG is key to establishing whether or not a given external resource gets to be counted as part of a spatially extended cognitive system. However, Kersten says that I don't do enough to justify my claim that we actively manipulate, and in so doing integrate with, the music that I suggest potentially becomes part of a music-listener FIG. I turn to that task now—and along the way, show why Kersten's analysis might be enriched by adopting my perspective on the artifactual nature of music.

Manipulating Emotions

We often play an active role in manipulating emotional dynamics such as latency, rise time, persistence, range, and intensity (Thompson, 1994). We employ various strategies that influence the emotions we have and how they are experienced and expressed (Gross, 1998; Gross et al., 2006). If I am seated next to a fussy child on a long flight, I can manipulate my rising irritation and anger by redirecting my attention to the serene view outside my window, telling myself that the child is probably suffering from ear discomfort and thereby transforming my irritation into sympathy, softening my tense facial expressions and clenched fists, or moving to a different part of the plane. These strategies involve subject-centered resources (attention and behavior modification) that regulate the character of the emotion as it unfolds.

Sometimes our manipulative strategies exploit resources beyond the individual, such as music. We can use music to mask or occlude ambient noise. But we can also interact with music, and manipulate both the music and our emotional responses to it, in a more direct and ongoing way. Tia DeNora (2000) speaks of using music as an aesthetic technology for emotional "venting". One woman DeNora interviewed says that playing specifically-chosen music while sad is like "looking at yourself in a mirror being sad"; by augmenting this individual's sadness, the music guides her into a qualitatively deepened state before then slowly leading her out of it (DeNora, 2000, p.57). Another woman reports using music to shape various felt dimensions of her grief: "The Verdi Requiem is one of my favorites. That is associated with losing a baby. And I'd got to know it through my husband and it was really quite a way of grieving—I'd shut myself away in a room [she begins to cry]…It's cathartic, I think" (DeNora, 2000, p.58). These reports are not uncommon. As DeNora summarizes, venting with music is to use music "as a virtual means of expressing or constructing emotion…to define the temporal and qualitative structure of that emotion, to play it out in real time and then move on" (DeNora, 2000, p.58).

These reports highlight a crucial functional gain we realize when engaging with music: an expanded phenomenological repertoire. When we listen to music this way, we manipulate emotional dynamics and gain access to an expanded realm of feeling states and modes of expression largely inaccessible outside of a musical context. 1 This is because music is constituted by expressive dynamics that are more agile, evocative, and nuanced than are their behavioral counterparts. In contrast to the somewhat coarse-grained expressive profile of a facial expression or gesture, say, musical expressions exhibit increased complexity, temporal range, subtlety, and force (Cochrane, 2008, p.338). Part of the mystery and allure of music is that it can convey a nearly infinite spectrum of emotional expressions in ways that seem both familiar and alien. When we vent with music, then, we gain access to a richer emotional and expressive palette, much the way that a novice dancer can see her own performance elevated when supported and guided by the advanced moves of a more skilled partner.

Musical Manipulations

But how does this happen? How do we actively manipulate and, in so doing, integrate with music? This is the heart of Kersten's worry when he claims that I fail to adequately specify the integrative conditions under which music and listener might be said to form a FIG.

Recall that a FIG consists of processes that are (1) coupled (linked by reliable causal connections); (2) integrated, in that they are mutually-influencing and working together as one; and (3) functionally gainful, in that they manifest novel functions relative to the individual processes (Kersten, 2014). It is easy enough to see how (1) can be satisfied. With the advent of portable listening technologies (MP3 players, streaming services via smartphones and computers, etc.), listeners can be permanently coupled with their personally curated soundtracks. And I briefly touched on (3) above in discussing music's capacity to support the realization of an expanded phenomenological repertoire.

However, (2) initially seems trickier. When Otto records new information in his notebook, he effects a material change in that external artifact, a change that in turn impacts his subsequent reasoning and behavior (Clark & Chalmers, 1998). 2 But one might worry that similar manipulation-cum-integration is impossible in the case of music. Unlike with material artifacts, the worry goes, we don't strictly speaking change the music by listening to it, or it us. And if so, this lack of causal reciprocity threatens the prospect for genuine extension. Recall that functional gain for Kersten requires that the component processes of a FIG work together as one (i.e., in a causally integrated, bi-directional way) in order to manifest novel functions. Kersten argues that, by failing to motivate this "manipulation thesis", I provide inadequate justification for the various levels (neural, physiological, behavioral) of music-listener integration I discuss in some detail (Krueger, 2014, pp.5-8).

This is an important point. But I think Kersten's worry can be dealt with in a relatively straightforward way while simultaneously highlighting a possible lacuna in his approach. Kersten's analysis is exclusively concerned with the computational and information-processing nature of music perception. But by thinking of music purely in terms of sonic information, there is a danger, I propose, of losing touch with its artifactual nature. Music is, after all, an artifact, an aesthetic technology; it is something we do things with (Krueger, 2011; Small, 1998). This subtle shift of emphasis enables us to bring into sharper relief the array of musical manipulations, the back-and-forth causal interplay, which facilitates the integration needed to form a music-listener FIG.

Consider a live music setting such as a rock concert or DJ mixset. In these contexts, listeners and performers are causally interrelated. The audience's emotional responses and reactive behavior play a significant role in determining both what music is played as well as how it is played—this is especially true for DJs, who adapt fluidly to what is happening on the dance floor—and the music, in turn, drives and regulates the listeners' emotions and behavior. Listeners and performers thus both exert their own distinctive modulatory force over the other. A similar bi-directional interplay can be said to characterize the internal dynamic of a musical group such as a jazz trio.

Even in everyday cases of music listening, technological mediation means that we can freely manipulate the structure of the music in real time, and in so doing manipulate our emotions and reactive behavior. For example, listening technologies make it easy to play with the mix of a given piece. We can dramatically enhance the bass of an up-tempo song for a more visceral, gut-level reaction—which lifts our mood, and compels us to move and dance—or boost the treble for a brighter, cleaner sound, which evokes a different array of emotional motor responses. We can also quickly repeat sections we find especially moving (e.g., the swell of a stirring chorus) or fast-forward through less compelling sections. We can edit, chop up, and rearrange song elements with relative ease, or recontextualize tracks by embedding them within personally curated playlists in order to manipulate their emotional impact. Our listening tools play a central role in this process. We use headphones to attend to the finer details of a piece, its subtle movements and tonal shadings—and then blast the same piece from our stereo at home, losing ourselves in the music as it consumes the sonic space around us. These environmental manipulations are not uncommon. They are part of our repertoire of everyday listening practices.

Our musical manipulations are not necessarily constrained by our listening technologies. Even without their direct intervention, we still have a great deal of autonomy in terms of how we actively construct our perceptual experience (Krueger, 2009). This is because, when we listen to music, we don't simply passively register acoustic properties of the piece. Rather, we engage with it. Our auditory experience is constituted by features of our agency; it is structured and organized by an ongoing process of probing, exploring, selecting, modifying, focusing, and re-focusing our attention on different aspects of the musical object, as well as the larger musical context (Reybrouck, 2005, p.252). At times, Kersten's computational analysis appears to downplay or overlook the extent to which our auditory experience is enacted in this manner. But once again, this omission runs the risk of mischaracterizing the phenomenology of how it is that we actually engage with, respond to, and use music in everyday listening contexts. We are all, to a certain extent, perceptual DJs, manipulating our self-selected soundtracks in real time in order to maximize their emotional and behavioral impact. As a material artifact in the world, music affords this manipulation. 3

In light of our technologically-mediated listening habits, then, as well as the enactive autonomy we have in terms of what we do with the piece, perceptually speaking, we can see that we do very often change the music. Much like Otto's notebook, the musical object is not a static object with a fixed structure. Rather, it is something that can be manipulated in all sorts of user-specific ways and, crucially, with a downstream modulatory impact on our ongoing patterns of emotional responsiveness and reactive behavior. The music we manipulate loops back onto us, shaping our future manipulations. And to return to Kersten's worry: if we accept that we can become reliably coupled with music (e.g., via ever-present listening technologies) and that we can actively manipulate it, and in so doing integrate with the music in a mutually-modulatory way, then it is fairly easy to see how this music-listener integration potentially seeds the different sorts of functional gain whose details I spend considerable time spelling out in Krueger (2014).

In sum, there is much to like in Kersten's analysis of extended music perception. For the reasons I mention above, I think it would be enriched by paying more attention to the emotional character of our musical manipulations, as well as the artifactual nature of the music we manipulate. But I also think that Kersten has made an important contribution, both in bringing extended cognition approaches to the domain of music cognition and in demonstrating why those interested in 4E (embodied, embedded, enacted, extended) approaches to cognition ought to look to music cognition for further inspiration.


  1. This is even the case for newborns and preterm infants. See Adachi and Trehub (2012) and Krueger (2013).
  2. Clark elsewhere refers to this mutual reciprocity as "continuous reciprocal causation" (Clark, 1997, p.165).
  3. Although not for everyone. See my discussion of amusia in Krueger (2014, p.6).


  • Adachi, M., & Trehub, S. E. (2012). Musical lives of infants. In G. E. McPherson & G. F. Welch (Eds.), The Oxford handbook of music education (Vol. 1, pp. 229-247). New York: Oxford University Press.
  • Balkwill, L.-L., & Thompson, W. F. (1999). A cross-cultural investigation of the perception of emotion in music: psychophysical and cultural cues. Music Perception, 17(1), 43-64.
  • Blood, A. J., Zatorre, R. J., Bermudez, P., & Evans, A. C. (1999). Emotional responses to pleasant and unpleasant music correlate with activity in paralimbic brain regions. Nature Neuroscience, 2(4), 382-387.
  • Clark, A. (1997). Being there: putting brain, body and world together again. Cambridge: MIT Press.
  • Clark, A., & Chalmers, D. (1998). The extended mind. Analysis, 58(1), 7-19.
  • Cochrane, T. (2008). Expression and extended cognition. The Journal of Aesthetics and Art Criticism, 66(4), 329-340.
  • Colombetti, G., & Roberts, T. (forthcoming). Extending the extended mind: the case for extended affectivity. Philosophical Studies.
  • DeNora, T. (2000). Music in everyday life. Cambridge: Cambridge University Press.
  • Fritz, T., Jentschke, S., Gosselin, N., Sammler, D., Peretz, I., Turner, R., Friederici, A., & Koelsch, S. (2009). Universal recognition of three basic emotions in music. Current Biology, 19(7), 573-576.
  • Gross, J. J. (1998). The emerging field of emotion regulation: An integrative review. Review of General Psychology, 2(3), 271-299.
  • Gross, J. J., Richards, J. M., & John, O. P. (2006). Emotion regulation in everyday life. In D. K. Snyder, J. A. Simpson, & J. N. Hughes (Eds.), Emotion regulation in couples and families: Pathways to dysfunction and health (Vol. 2006, pp. 13-35). Washington, D.C.: American Psychological Association.
  • Hurley, S. (1998). Consciousness in action. Cambridge: Harvard University Press.
  • Kersten, L. (2014). Music and cognitive extension. Empirical Musicology Review, 9(3), 193-202.
  • Koelsch, S. (2010). Towards a neural basis of music-evoked emotions. Trends in Cognitive Sciences, 14(3), 131-137.
  • Koelsch, S. (2014). Brain correlates of music-evoked emotions. Nature Reviews Neuroscience, 15(3), 170-180.
  • Krueger, J. (2009). Enacting musical experience. Journal of Consciousness Studies, 16(2-3), 98-123.
  • Krueger, J. (2011). Doing things with music. Phenomenology and the Cognitive Sciences, 10(1), 1-22.
  • Krueger, J. (2013). Empathy, enaction, and shared musical experience: Evidence from infant cognition. In T. Cochrane, B. Fantini, & K. Scherer (Eds.), The emotional power of music: Multidisciplinary perspectives on musical expression, arousal, and social control (pp. 177-196). Oxford: Oxford University Press.
  • Krueger, J. (2014). Affordances and the musically extended mind. Frontiers in Psychology, 4(1003), 1-13.
  • Krumhansl, C. L. (2002). Music: A link between cognition and emotion. Current Directions in Psychological Science, 11(2), 45-50.
  • Menary, R. (Ed.). (2010). The extended mind. Cambridge: MIT Press.
  • Overy, K., & Molnar-Szakacs, I. (2009). Being together in time: Musical experience and the mirror neuron system. Music Perception: An Interdisciplinary Journal, 26(5), 489-504.
  • Reybrouck, M. (2005). A biosemiotic and ecological approach to music cognition: Event perception between auditory listening and cognitive economy. Axiomathes, 15(2), 229-266.
  • Slaby, J. (2014). Emotions and the extended mind. In M. Salmela & C. Von Scheve (Eds.), Collective emotions (pp. 32-46). Oxford: Oxford University Press.
  • Small, C. (1998). Musicking. Middletown, CT: Wesleyan University Press.
  • Thompson, R. A. (1994). Emotion regulation: A theme in search of definition. Monographs of the Society for Research in Child Development, 59(2/3), 25-52.
  • Wilson, R. A. (2010). Extended vision. In N. Gangopadhyay, M. Madary, & F. Spicer (Eds.), Perception, action, and consciousness: Sensorimotor dynamics and two visual systems (pp. 277-290). Oxford: Oxford University Press.