In an ambitious paper, Athanasopoulos and Moran (2013) ask: What is the effect of cultural background on an individual's two-dimensional representation of musical sound? To address this, they played brief excerpts of synthesized flute sequences, varying in pitch to express (what may be described as) rising, falling, peak, and valley contours, and asked participants to "represent the sound on paper in such a way that if another member of their community saw their marks they would be able to connect them to the sounds". The participants were experienced musicians in the sense that they had all played and performed on musical instruments for many years. They were drawn from three culturally distinct groups: British participants living in Edinburgh (Scotland) who were all familiar with western standard notation (WSN), Japanese participants in Tokyo and Kyoto familiar with Japanese standard notation (JSN) though with some exposure to WSN, and participants in the BenaBena tribe living in Papua New Guinea who were unfamiliar with any literary or notational script.

Among the main findings: Almost all British participants (24/25) and the majority of Japanese participants (19/24) used representations that expressed time on the horizontal axis and variations in pitch on the vertical axis, in line with the conventions of WSN. (Further, left-to-right depiction of time was employed by 46 of the 48 British and Japanese participants who provided time-ordered representations, despite JSN's conventional top-to-bottom organization of time.) In contrast, only 4 out of 26 BenaBena participants' visual representations followed this scheme. Indeed, neither dimension used by the majority of the British and Japanese participants appeared to be a focus in most BenaBena participants' representations: neither (i) linear chronological ordering of the sounds nor (ii) variation in pitch apparently served as a salient dimension. These are intriguing findings in an area for which we have as yet little cross-cultural empirical work.

The challenges of conducting a cross-cultural study in three countries are significant, especially where remote fieldwork is involved. The authors are to be commended for thoughtfully adapting the procedure in culturally sensitive ways (careful selection of instrument and pitch relations for stimuli, avoiding the use of headphones, using etching to link to an unfamiliar task for the BenaBena group, practice with pens, and emphasis on ecological validity). The inclusion of post-procedure interviews added a layer of information not often reported in standard research studies, and yet important to any investigation about meaning-making. However, some limitations in methodology should be noted:

(i) Average ages of participants varied widely: 23.5 years for the British group, and 47.6 and 57.1 years respectively for the Japanese and BenaBena groups. Cultural differences may therefore be confounded with cohort effects and/or with age effects (such as variation in fine motor and sensory capacities, cognitive abilities, or other developmental differences).

(ii) It is not clear whether participants were run individually or whether groups A, B, and C were run as three whole groups, which raises the question of possible order effects.

(iii) There is also the question of "demand characteristics" (cues participants may use to infer what the experimenter is interested in). While the drawing task was quite open, the musical stimuli were carefully controlled to call participants' attention to what the authors considered to be a salient dimension. This assumption may have led to two different tasks for the various groups: most British and Japanese participants may have read the scenario as a melodic dictation task, whereas the BenaBena participants' lack of familiarity with this context may have allowed them to approach the task in a more open-ended way, selecting their own salient parameters of sound to represent.

Given that the participants were instructed to provide representations that would enable members of their community to connect them to the sounds, it would have been interesting to pursue the efficacy of the markings for matching to the particular pitch contours. What if participants from each group had been given examples of the most typical notations of each group? How would they have matched the markings to each contour in the recorded excerpts? Are principles of representation necessarily the same as those used in deciphering others' notation? Performing the sounds would also add another layer. What if participants had been given several samples such as the ones shown in Figures 3 to 5, and asked to play them? For instance: "This was drawn by somebody else listening to the same music. Play (or sing) these marks in one or more ways." Such exploration may lead to discoveries about what dimensions of musical sound may receive more or less attention in the inevitable gain and loss that comes with the narrowing of abilities accompanying enculturation into any particular musical tradition, as discussed in the next section.


In an influential essay published in 1987, developmental theorist Paul Baltes offered a set of propositions to characterize the nature of human development across the life span. Two closely interrelated propositions focused on the "multidirectional" nature of development and the idea of development as "a dynamic and continuous interplay of gain and loss" (p. 611). These propositions challenged the idea of human development as a simple linear progression of incremental gains in favor of a more complex view of multiple directions of change, such that advances in development actually consist of simultaneous expression of both gain and loss. In the words of Baltes (1987), "It is assumed that any developmental progression displays at the same time new adaptive capacity as well as the loss of previously existing capacity. No developmental change during the life course is pure gain" (p. 616).

For example, as they approach one year, infants become increasingly sensitive to fine differences in the speech sounds of the language(s) to which they are frequently exposed, as reflected by their babbling which increasingly resembles the sounds of what will become their native language(s). In tandem, however, by the end of the first year, they are no longer able to discern fine differences between some phonemes not common in the language(s) around them (such as /l/ and /r/ for infants in Japanese-speaking homes [Kuhl, Stevens, Hayashi, Deguchi, Kiritani, & Iverson, 2006]). To extend to a musical example, young western infants discern mistuned pitches equally well in melodies based on the Javanese (Indonesian) pélog scale or western major/minor scale (Lynch, Eilers, Oller, & Urbano, 1990). However, by age one year western infants show greater sensitivity to mistuning in melodies based on major/minor scales (Lynch & Eilers, 1992) while becoming less sensitive to mistuning of melodies in other scales.

Further, this multidirectional interplay of gain and loss is also expressed as individuals begin to master the representational tools of their culture, because any representational system is selective and incomplete in capturing what it represents or communicates. All forms of notation necessarily select some aspects for attention over others and exclude many more. Western standard notation (WSN) eventually came to represent precise pitch and time relations by way of diastematic and mensural notation (Randel, 2003), along with looser (non-calibrated) dynamic and tempo designations; other parameters of musical sound and performance are not explicitly represented. [1] To adapt a statement from David Olson's (1994) provocative book on the conceptual and cognitive implications of writing and reading script to the realm of music: "The development of a functional way of communicating with visible marks…[is] a discovery of the representable structures of" music (p. 68). [2]

Viewed through this frame, Athanasopoulos and Moran's (2013) study is less about comparing visual representations of music against a "standard" notation, or solely about comparing groups, than about opening up the question of what we may discover from different ways of engaging with music, and which rich parameters of music receive less attention in the inevitable interplay of gain and loss that comes with the narrowing of abilities accompanying specialization.


Sensitivity to pitch contour is apparent very early in life, in infants' responsiveness to "prosody" (i.e., the musical qualities of speech, including melody, rhythm, intonation, phrasing, etc.). Language researchers (e.g., Papoušek, Papoušek, & Symmes, 1991) observed that rising contours in speech tend to capture infants' attention, falling contours tend to soothe or signal the end of an interactive sequence, and bell-shaped contours hold infants' interest and express approval, and that these contours are used similarly across many tonal and nontonal languages in speech directed to infants. Thus the specific stimuli selected by Athanasopoulos and Moran (2013) were aptly chosen. Musical notation systems developing in the west also reflect special attention to melodic contour (Randel, 2003). Even treatises on notation of musical performance in non-western musical cultures for ethnomusicological purposes (e.g., Abraham & Hornbostel, 1909) devoted significantly more space to discussions of how to represent musical pitch than to other parameters such as rhythm, tempo, and duration.

Yet researchers (e.g., Walker, 1997) have also found that "in some cultures studied, musical pitch may not be a predominant feature in the musical [e.g.] vocal behavior examined" (p. 315). Indeed, Walker found that Australian Aboriginal singing often employed only a single pitch, but exhibited rich variety in the formants. Thus the essential information in Aboriginal singing is focused on frequency spectra, in contrast to attention to variations in pitch while holding timbre constant (i.e., maintaining the steady state) as in much of western opera singing. Athanasopoulos and Moran (2013) made a parallel observation about some BenaBena participants' focus on the qualities of the flute sound (e.g., the tone color and fluctuations in loudness) as opposed to pitch contour in their graphic representations of the brief flute excerpts. Although he does not ground his approach in discourse on multidirectional gain/loss, Walker (1997) adopts this spirit, concluding his study with a focus on what we have to learn from this group's engagement with music and by extracting specific practical implications for music training and performance: "Notations based on visual metaphors for tone qualities or vowel pitch [features not notated in WSN] may be useful perceptual and conceptual aids in the education of musical performers" (p. 343, brackets added). So too, we have much to learn from Athanasopoulos and Moran's observations of the BenaBena group's diverse responses to the task.

Athanasopoulos and Moran conclude that "While WSN is ubiquitous, we are reminded that the underlying principles with which it is associated are not universal". In a previous study, my colleagues and I examined 50 American college students who had never received any formal or informal training in western musical notation to study their intuitions about how to read WSN (Tan, Wakefield, & Jeffries, 2009; see also Tan, 2002, for a related study). Most thought that pitch is depicted on the vertical axis though many believed it is denoted by both note-head and stem. All 50 participants assumed music is read from left to right, and a majority (41/50) thought that notes spaced closer together are played faster than notes placed further apart. Participants commonly assumed that a "note" must have a filled circle and stem, and only 21/50 participants knew a whole note (semibreve) is also a "note", as most thought it signified silence or absence of sound. The majority (39/50) thought duration of note length progressed in this fashion from shortest to longest: whole, half, quarter, eighth notes etc. (semibreve, minim, crotchet, quaver etc.) — corresponding with the symbol's complexity — while actually it is the reverse order. These findings dovetail with Athanasopoulos and Moran's research as the Tan (2002) and Tan et al. (2009) studies suggest that many basic conventions of western standard notation are not intuitive, even for college-aged students immersed in western culture and exposed to many other western notational forms.


Overall, I found Athanasopoulos and Moran's study (part of a more extensive work) to be ambitious and interesting, yielding intriguing findings in its main analysis and more informal observations. Its participant pools make a valuable contribution to a growing empirical area that has mainly focused on children's invented notations of music. The methodological issues I raised are not uncommon in fieldwork and studies that extend their reach beyond the typical convenience samples of college students. These issues are less problematic if the focus is not primarily on comparison to WSN or between groups, but on discovering the breadth of possibilities of human engagement with musical sound. In contrast to a simpler linear assumption, the perspective of specialization as involving simultaneous expression of both gain and loss sheds light on the complexities of the shape of musical enculturation and, more broadly, of human development.


  1. For a cogent discussion of the printed musical score as a performance guide and mediator of meaning, see Hultberg (2002).
  2. The final word in Olson's quotation in the original context was "speech".


  • Abraham, O., & von Hornbostel, E.M. (1909). Translated by List, G. & E. (1994). Suggested methods for the transcription of exotic music. Ethnomusicology, Vol. 38, No. 3, pp. 202-216.
  • Baltes, P.B. (1987). Theoretical propositions of life-span developmental psychology: On the dynamics between growth and decline. Developmental Psychology, Vol. 23, No. 5, pp. 611-626.
  • Hultberg, C. (2002). Approaches to musical notation: the printed score as a mediator of meaning in western tonal tradition. Music Education Research, Vol. 4, No. 2, pp. 185-197.
  • Kuhl, P.K., Stevens, E., Hayashi, A., Deguchi, T., Kiritani, S., & Iverson, P. (2006). Infants show a facilitation effect for native language phonetic perception between 6 and 12 months. Developmental Science, Vol. 9, No. 2, pp. F13-F21.
  • Lynch, M.P., & Eilers, R.E. (1992). A study of perceptual development for musical tuning. Perception & Psychophysics, Vol. 52, No. 6, pp. 599-608.
  • Lynch, M.P., Eilers, R.E., Oller, D.K., & Urbano, R.C. (1990). Innateness, experience, and music perception. Psychological Science, Vol. 1, No. 4, pp. 272-276.
  • Olson, D.R. (1994). The world on paper: The conceptual and cognitive implications of writing and reading. Cambridge: Cambridge University Press.
  • Papoušek, M., Papoušek, H., & Symmes, D. (1991). The meanings of melodies in motherese in tone and stress languages. Infant Behavior and Development, Vol. 14, No. 4, pp. 415-440.
  • Randel, D.M. (2003). Notation. In: D.M. Randel (Ed.), The Harvard dictionary of music. Cambridge, MA: Belknap, pp. 565-571.
  • Tan, S.-L. (2002). Beginners' intuitions about musical notation. College Music Symposium, Vol. 42, pp. 131-141.
  • Tan, S.-L., Wakefield, E.M., & Jeffries, W.P. (2009). Musically untrained college students' interpretations of musical notation: Sound, silence, loudness, duration, and temporal order. Psychology of Music, Vol. 37, No. 1, pp. 5-24.
  • Walker, R. (1997). Visual metaphors as music notations for sung vowel spectra in different cultures. Journal of New Music Research, Vol. 26, No. 4, pp. 315-345.