Taken at face value, some instruments seem better suited than others to certain kinds of tasks (Huron & Berec, 2009). For example, some instruments seem more able to convey or express sadness. The famous comedian and talented banjo player, Steve Martin, frequently noted: "You can't play a sad song on the banjo" (e.g., Goodreads, 2012). Martin's claim is directly echoed in the Willie Nelson song entitled You Just Can't Play a Sad Song on a Banjo. In contrast, instruments like the 'cello and the cor anglais are commonly regarded as better suited for conveying nominally sad affect. The effect of instrumentation on perceived affect is evident in experiments carried out by Hailstone et al. (2009). Happy, sad, angry and fearful melodies were played on different instruments (piano, violin, trumpet and synthesizer). The authors found that the ability to recognize the intended affect was greatly enhanced by matching the melody with a particular instrument. For example, the odds of recognizing a sad melody were greatest when played on the violin, while the odds of recognizing a happy melody were least when played on the violin. This raises the question of what features of a musical instrument make it especially suitable for sad music.

Research in speech prosody may provide a useful starting point for understanding musical sadness. The first detailed description of sad speech was published in the late nineteenth century by the pioneering German psychiatrist, Emil Kraepelin (1899/1921). Kraepelin identified five characteristics of sad speech: quieter voice, slow speaking rate, lower overall pitch, narrower pitch movements (i.e., more monotone), and mumbled articulation. Modern research adds a sixth characteristic of sad speech, namely darker timbre (for a review, see Murray & Arnott, 1993).

Kraepelin's observations were made in a clinical setting. Over the past century, dozens of formal experiments have confirmed all of the features he described. Sad speech exhibits a quieter dynamic level (Skinner, 1935; Scherer, 1986; Siegman & Boyle, 1993; Banse & Scherer, 1996; Cowie et al., 2001); sad speech is slower in tempo (Siegman & Boyle, 1993; Breitenstein, van Lancker & Daum, 2001); sad speech is lower in pitch (Fairbanks & Pronovost, 1939; Lieberman & Michaels, 1962); sad speech is more monotone (Skinner, 1935; Fairbanks & Pronovost, 1939; Eldred & Price, 1958; Davitz, 1964; Huttar, 1968; Williams & Stevens, 1972; Bergmann, Goldbeck & Scherer, 1988; Banse & Scherer, 1996; Sobin & Alpert, 1999; Breitenstein, van Lancker & Daum, 2001); sad speech exhibits more mumbled articulation (Dalla Bella, Peretz, Rousseau, & Gosselin, 2001); and sad speech displays a darker timbre (Ohala, 1980, 1994; Tartter, 1980; Scherer, Johnstone, & Klasmeyer, 2003, Table 23.2; Schwartz, Howe, & Purves, 2003; Ross, Choi, & Purves, 2007).

Many of these same features have also been observed in sad music. For example, Turner and Huron (2008) showed that the dynamic levels are lower for nominally sad music. Post and Huron (2009) found that nominally sad music is slower in tempo. Similarly, Johnson (2010) found that music played on the banjo exhibits much greater density of notes compared with renditions of the same work played on the guitar. Huron (2008) showed that nominally sad music is lower in overall pitch, and also displays a slightly smaller melodic interval size. Finally, Schutz et al. (2008) found evidence consistent with the claim that darker instrumental timbre is associated with sadness.

In light of the speech research and the extant music-related research, we might suspect that what makes an instrument especially well-suited for conveying or expressing nominally sad music is that it exhibits the same six features that we have just described. One might suppose, for example, that a piccolo is poorly suited to playing sad music because it is high in pitch and typically produces a bright sound. Similarly, a banjo exhibits a relatively bright timbre, and the short sustain of its plucked strings makes it ill-suited to playing slowly. Conversely, a 'cello has a generally low tessitura, and can play relatively quietly using sustained pitches that allow for very slow tempos.

In this study we address this question using a correlational approach. In brief, musician participants were recruited for two surveys. One survey asked musicians to judge the relative sadness of familiar Western instruments. A second, independent survey asked respondents to judge five of the six acoustic properties (described above) for the same selection of instruments. The sixth variable (pitch range) was determined from orchestration texts. Finally, we measured the degree to which the deemed acoustical properties could be used to predict the deemed sadness of the various instruments. To anticipate our results, we will see a partial convergence between these two surveys, where those instruments deemed most sad were also those deemed most capable of generating the acoustical features implicated in the conveying of sad affect in speech.


Formally, our hypothesis may be stated as follows:

H. Those musical instruments deemed most capable of expressing or representing sadness are also judged better able to generate the acoustic features needed to convey sad affect in speech. Specifically, instruments characterized as "sad instruments" are deemed better able to play (i) more quietly, (ii) slower, (iii) with smaller pitch movements, (iv) with relatively lower pitch, (v) with a more "mumble-like" articulation, and (vi) with a darker timbre.


As noted above, we conducted two independent surveys. The first survey collected assessments of the sadness for 44 familiar musical instruments. The second survey collected assessments of five acoustical properties for these same 44 instruments.


Twenty-one musician participants were recruited from among graduate music students and faculty at three music educational programs: Ohio State University, University of Arkansas, and Harvard University. None of the participants were familiar with the purpose of the study. Participants were given a list of 44 Western musical instruments and asked to make two judgments on a 7-point scale. The two questions were addressed separately rather than concurrently.


For the first question, participants received the following instructions:

In light of your past music listening experience, for each of the following instruments, rate how commonly or frequently you think this instrument is used to convey sadness in music.

Please note that by "sadness," we mean a sort of depressed empty feeling, rather than a wailing sort of grief.


Not common           Very common
  1 2 3 4 5 6   7

The same question was asked for each of the 44 instruments. Having completed the first task, the same participants continued with a second task, with the following instructions:

In the previous task you were asked to identify how commonly or frequently an instrument is used to convey sadness. In this task, we are looking for something different. For each instrument, rate how well the instrument is able to convey sadness.

That is, an instrument may be rarely used to convey sadness in music, even though you may think that it is very well suited to convey sadness. For this task, we want you to think in terms of ability or capacity rather than common usage. Note also that any given instrument may be suited to a diverse range of emotions. For example, apart from sadness, an instrument may be well suited to conveying happiness. How well an instrument can convey happiness should have no bearing on your judgment of the instrument's suitability for conveying sadness.

In other words, when called upon to convey sadness, how well is this instrument able to produce a sad sound? Once again, by "sadness," we mean a sort of depressed empty feeling, rather than a wailing sort of grief.


Not capable of sadness           Very capable of sadness
  1 2 3 4 5 6   7

Following completion of the survey, participants were debriefed in order to alert the experimenters to the possibility of unwanted demand characteristics, or possible confusion in completing the survey.


Before presenting our results, it is appropriate to measure the intersubjective agreement between the various participants. Accordingly, we calculated the correlation between responses for all 21 respondents. This produced some 210 paired comparisons for each of the survey questions. A priori we established an exclusionary criterion that any subject whose data correlated less than an average of +.1 with all other subjects would be excluded. For the first question (frequency of use of instrument for sadness) the average correlation across all instruments between all subjects was +.53. The lowest correlation was +.16. For the second question (capacity for an instrument to convey sadness) the average correlation between subjects was +.58. The lowest correlation was +.15. Accordingly, none of our participants' data were excluded. In light of the broad agreement between subjects, all of the data were pooled together.
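The agreement check described above can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used in the study: the function names (`pearson`, `mean_agreement`, `excluded`) and the four-rater `demo` data are hypothetical, and the sketch assumes that each respondent's ratings are stored as a list in a fixed instrument order.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length rating vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_agreement(ratings):
    """For each rater, average the correlation with every other rater.

    `ratings` maps a rater id to that rater's list of instrument ratings.
    """
    totals = {rater: 0.0 for rater in ratings}
    for a, b in combinations(ratings, 2):
        r = pearson(ratings[a], ratings[b])
        totals[a] += r
        totals[b] += r
    others = len(ratings) - 1
    return {rater: total / others for rater, total in totals.items()}


def excluded(ratings, threshold=0.1):
    """Raters whose mean correlation with the others is below the threshold."""
    return [r for r, m in mean_agreement(ratings).items() if m < threshold]


# Hypothetical ratings for four raters over five instruments (not study data).
demo = {
    "r1": [7, 6, 5, 2, 1],
    "r2": [6, 7, 5, 3, 1],
    "r3": [7, 5, 6, 1, 2],
    "r4": [4, 1, 7, 2, 5],
}

# Number of distinct rater pairs among 21 respondents.
n_pairs = len(list(combinations(range(21), 2)))
```

In the hypothetical data, raters r1 to r3 broadly agree while r4 does not, so only r4 falls below the +.1 criterion. With 21 respondents the pair count is 210, matching the number of paired comparisons reported above.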

Table 1 shows the mean ratings and standard deviations for both frequency and capacity judgments for the 44 instruments tested. The table is ordered according to the latter ratings. The middle column shows the estimated frequency of use for sadness and the right-most column shows the estimated capacity for sadness. Participants judged the human voice as the most frequently used sad instrument, with the 'cello, viola, violin and piano completing the top five. The triangle was judged the instrument least used for sadness, along with the cymbal, tambourine, wood block, and glockenspiel. Regarding the judged capacity for generating a sad sound, the 'cello edged out the voice for top spot, but the top five instruments remained the same. The rank ordering of the instruments judged least capable of producing sad sounds was similar to that of the frequency data, except that the snare drum replaced the glockenspiel.

Table 1: Estimated Frequency of Use and Capacity for Sadness (Ordered by Sadness Capacity)

Instrument   Mean Sadness Frequency (SD)   Mean Sadness Capacity (SD)
Violoncello 6.38 (0.92) 7.00 (0.00)
Voice 6.52 (0.87) 6.91 (0.44)
Violin 6.00 (1.30) 6.86 (0.48)
Viola 6.05 (0.87) 6.81 (0.40)
Piano 5.91 (1.26) 6.71 (0.64)
Oboe 5.29 (1.71) 6.62 (0.74)
English Horn 5.24 (1.55) 6.43 (0.87)
B-flat Clarinet 4.86 (1.32) 6.24 (1.00)
Acoustic Guitar 5.19 (1.54) 6.14 (1.24)
Bassoon 4.40 (1.76) 6.05 (1.28)
Flute 4.00 (1.92) 5.95 (1.43)
Horn 4.24 (1.67) 5.81 (1.47)
Harp 4.48 (1.83) 5.71 (1.59)
Alto Saxophone 4.38 (1.50) 5.62 (1.60)
Tenor saxophone 4.65 (1.63) 5.57 (1.66)
Double Bass 3.80 (1.96) 5.55 (1.67)
Electric Guitar 3.33 (1.65) 5.38 (2.06)
Bass Clarinet 3.90 (1.60) 5.38 (1.69)
Trombone 4.00 (1.41) 5.38 (1.32)
Alto Recorder 3.61 (1.85) 5.37 (1.89)
Trumpet 3.33 (1.53) 4.95 (1.53)
Soprano saxophone 3.40 (1.85) 4.81 (1.91)
Soprano recorder 2.74 (1.85) 4.65 (1.87)
Flugelhorn 3.39 (1.79) 4.61 (1.88)
Marimba 3.10 (1.77) 4.40 (1.64)
Baritone Saxophone 3.67 (1.94) 4.39 (1.85)
Vibraphone 3.35 (1.66) 4.35 (1.95)
Chimes 3.05 (1.69) 4.33 (2.08)
Bass Guitar 2.50 (1.67) 4.33 (2.03)
Tuba 3.10 (1.55) 4.24 (1.81)
Contrabassoon 3.17 (1.95) 4.11 (1.97)
Celeste 2.68 (1.46) 4.10 (2.13)
Timpani 2.62 (1.72) 3.71 (2.00)
Banjo 2.29 (1.38) 3.57 (1.63)
Xylophone 2.19 (1.50) 3.57 (1.63)
Piccolo 1.76 (1.45) 3.33 (1.74)
Gong 2.43 (1.57) 3.00 (1.76)
Bass Drum 2.81 (2.14) 2.95 (2.04)
Glockenspiel 1.68 (1.06) 2.90 (1.56)
Triangle 1.43 (0.81) 2.48 (1.83)
Wood block 1.52 (0.98) 2.38 (1.66)
Cymbal 1.48 (0.98) 2.38 (1.43)
Tambourine 1.48 (0.93) 2.10 (1.45)
Snare drum 2.00 (1.38) 1.51 (1.36)

In general, there is considerable concordance between the capacity and frequency judgments. Using a Spearman rank order correlation, the correlation between the two sets of data is rs=+.97, df=42, p<.001. This is consistent with the notion that instruments judged most capable of producing sad sounds are likely to be employed for that purpose. Of course, we should hasten to add that despite our instructions, participants may well have been influenced by their musical exposure when engaged in the second task of assessing the capacity of various instruments for producing sad sounds. Table 1 suggests that the voice and stringed instruments are deemed both most frequently used to convey sadness in music and also most capable of conveying sadness.
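For readers unfamiliar with the rank-order statistic, the following Python sketch shows one common way to compute Spearman's coefficient: convert each column to ranks (averaging tied ranks) and take the Pearson correlation of the ranks. The helper names are hypothetical, and the five rows are an illustrative subset of Table 1, so the resulting value is not the rs=+.97 reported for all 44 instruments.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)


# Five rows from Table 1 (frequency mean, capacity mean), in this order:
# Violoncello, Voice, Piano, Banjo, Snare drum.
frequency = [6.38, 6.52, 5.91, 2.29, 2.00]
capacity = [7.00, 6.91, 6.71, 3.57, 1.51]
rho_subset = spearman(frequency, capacity)  # subset only, not the full table
```

Averaging tied ranks matters here because Table 1 contains tied means (for example, three instruments share a capacity mean of 5.38).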


The goal of our second survey was to solicit impressions of the capacity of different instruments to produce certain acoustical effects consistent with prosodic cues for sad speech. Twenty-three participants were recruited from the Ohio State University School of Music subject pool. This study was one of several that undergraduate students could choose in order to receive partial course credit. The participants completed an online questionnaire in which they were asked to judge "how easy" it is for an instrument to produce some effect. Specifically, the surveys posed five questions for each of 44 instruments.


The instructions are given below:

In this survey we will ask a series of questions related to familiar musical instruments. For each instrument, rate how well the instrument is able to produce a certain sound (such as playing quietly). Note that an instrument may be rarely required to play quietly, even though you may think that it is very well suited to playing quietly. That is, we want you to think in terms of ability or capacity rather than common usage. Note also that any given instrument may be suited to a diverse range of sounds. For example, an instrument might be capable of playing both very loudly and very softly. The fact that an instrument can be very loud (and even commonly does play very loudly) does not necessarily mean it cannot play very quietly. How well an instrument generates a loud sound should have no bearing on your judgment of the instrument's suitability for generating a quiet sound.

In summary: When called upon to produce a certain acoustic feature, how capable are the following instruments to generate that acoustic feature?

In order to ensure that participants understood the instructions, they were asked to answer the following comprehension question before proceeding with the survey:

Based on the instructions above, we would like you to rate instruments according to (select all that apply):

□ their common usage

□ their ability/capacity

All twenty-three participants selected the correct response for the comprehension question. For each of 44 instruments, we posed five questions:

1. How easy is it on this instrument to play very quietly?

Very easy           Not easy at all
  1 2 3 4 5 6   7

2. How easy is it on this instrument to play very slowly?

Very easy           Not easy at all
  1 2 3 4 5 6   7

3. How easy is it on this instrument to bend the pitch? [to play small intervals]

Very easy           Not easy at all
  1 2 3 4 5 6   7

4. How easy is it on this instrument to make it sound like it's mumbling?

Very easy           Not easy at all
  1 2 3 4 5 6   7

5. How easy is it on this instrument to make a dark timbre?

Very easy           Not easy at all
  1 2 3 4 5 6   7

In addition, we asked participants to indicate whether they were familiar or unfamiliar with each instrument.

Having designed the survey, we became concerned that it was too long to sustain participants' interest. Accordingly, rather than asking all five of the above questions for all 44 instruments, the survey was split into two, each containing a subset of the five questions (henceforth Survey #2A and Survey #2B). Survey #2A consisted of questions 1-3, and was completed by 12 participants; Survey #2B consisted of questions 3-5, and was completed by 11 participants. Notice that question 3 (pitch bending) was asked of all 23 participants.

Both Surveys #2A and #2B were implemented on a password-protected web site. Participants received e-mail reminders containing pointers to the appropriate web pages. Participants were randomly assigned to either of the two surveys. Following the main survey questions, participants were asked to respond to the following debriefing questions:

i)    About how long did the survey take you to complete?
ii)   Please list any instruments with which you are proficient below. If you are not proficient with any instrument please type "none."
iii)  If there were any questions you didn't understand, please explain the confusion below.
iv)  Please briefly describe your experience with the survey.
v)   What strategies did you use to answer the questions?

Most respondents (58%) reported taking between 10 and 20 minutes to complete the survey; no respondent reported taking more than 30 minutes.

With regard to reported instrument familiarity, 97% of responses indicated that the participants were familiar with the instruments in question. Of the small number of reported unfamiliar instruments, the celesta was least familiar, followed by the soprano recorder, the glockenspiel, and the alto recorder. Nevertheless, most of the participants reported being familiar with all of the instruments in the survey.

In response to the question pertaining to confusion, eight (of 12) respondents expressed some degree of confusion about what it means for an instrument to sound like it's mumbling. Similarly, eight (of 12) respondents expressed confusion about what it means for an instrument to play very slowly. For the one question shared in common, eight (of 23) respondents expressed some degree of confusion about what it means for an instrument to bend the pitch or play small intervals.


In order to determine the intersubjective reliability we carried out paired correlations for all responses between all pairs of participants. For participants answering Survey #2A (questions 1-3), the mean correlation was +.28. However, two of the participants exhibited mean intersubjective correlations below +.1, which was our a priori threshold for data inclusion. After eliminating these two respondents, the mean intersubjective correlation was +.39. In the case of Survey #2B (questions 3-5), the mean correlation was +.51. All of the participants in Survey #2B produced an average intersubjective correlation above +.1 so none of the participants were excluded from further data analysis.

Recall that we considered a five-question survey to be excessively long, and so broke up Study #2 into two separate surveys. If we propose to combine the two surveys, it is reasonable to ask whether the participants were drawn from the same population and were behaving in similar ways. Since question #3 appeared in both surveys, it affords an opportunity to compare the within-survey agreement for question #3 to the between-survey agreement for question #3. (Recall that question #3 was answered for each of 44 instruments.) Accordingly, we took each individual respondent for Survey #2A, and calculated the average intersubjective correlation for question #3 with all of the participants for Survey #2B. Similarly, for each respondent for Survey #2B, we calculated the average intersubjective correlation for question #3 with all of the participants for Survey #2A.

A priori we resolved to exclude from analysis any participant whose responses produced an average correlation less than +.1 with the responses from participants completing the other survey. None of the participants met the exclusion criterion. We found the grand average within-survey correlation to be +.58, and the grand average between-survey correlation to be +.56. In general, the grand average for between-survey correlations compares favorably to the within-survey correlations, suggesting that it is not unwarranted to combine the results for both surveys in further data analysis.
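The within-survey versus between-survey comparison can be illustrated with a short Python sketch. It simplifies the procedure described above by averaging directly over pairs of respondents rather than first averaging per respondent; for the between-survey case the two approaches yield the same grand average. The function names and the toy response vectors are hypothetical.

```python
from itertools import combinations, product
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length response vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def within_mean(group):
    """Mean correlation over all pairs of respondents inside one survey."""
    rs = [pearson(a, b) for a, b in combinations(group, 2)]
    return sum(rs) / len(rs)


def between_mean(group_a, group_b):
    """Mean correlation over all cross-survey pairs of respondents."""
    rs = [pearson(a, b) for a, b in product(group_a, group_b)]
    return sum(rs) / len(rs)


# Hypothetical question-3 responses over four instruments (not study data).
survey_2a = [[1, 2, 3, 4], [2, 4, 6, 8]]
survey_2b = [[1, 2, 3, 4], [1, 3, 5, 7]]
```

If the between-survey mean is close to the within-survey means, as it is by construction in this toy example, the two groups of respondents appear to be answering the shared question in similar ways.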


Table 2 reports the average ratings for the five acoustical properties surveyed by instrument. As can be seen, the instruments rated as most able to play quietly included harp, voice, and 'cello; the soprano recorder, vibraphone, and piccolo were rated as least able to play quietly. The instruments rated as most able to play small intervals included voice, violin, and trombone. The snare drum, triangle, and cymbal were rated as least able to play small intervals. The double bass, contrabassoon, and bassoon were rated as most able to play in a "mumbling" manner; conversely, the soprano recorder, piccolo, and triangle were rated as unable to play in a mumbling manner. The instruments rated as most able to play with a dark timbre included voice, bass drum, and timpani, while the instruments rated as least able to play with a dark timbre included soprano recorder, piccolo, and triangle. The piano, bass guitar, and vibraphone were rated as the instruments most able to play slowly, while the trombone, piccolo, and 'cello were rated as the least able to do so.

Table 2: Estimated Acoustical Features by Instrument

Instrument Quietly Slowly Small Intervals Mumbling Dark Timbre Lowest pitch
Acoustic Guitar 5.10 6.40 5.91 4.30 4.30 E2
Alto Recorder 3.57 5.56 4.24 2.60 3.56 F3
Alto Saxophone 5.40 5.60 5.38 4.18 3.55 C#3
Banjo 4.20 5.50 5.37 2.00 2.50 C3
Bassoon 4.10 5.60 4.38 5.73 5.27 A#1
Bass Clarinet 5.70 6.10 4.52 4.46 3.91 D2
Bass Drum 5.67 6.40 1.81 5.64 6.09 G0 [25 Hz]†
B-flat Clarinet 4.90 6.00 4.86 5.00 4.00 E3
Bass Guitar 5.40 6.70 5.48 5.40 5.80 E1
Baritone Sax 4.40 5.70 4.95 5.46 5.09 C#2
Contrabassoon 3.90 5.70 4.14 6.09 5.36 A#0
Celeste 5.25 6.50 1.33 2.44 3.00 C3
Chimes 4.40 6.40 1.67 2.82 2.82 C4
Cymbal 3.60 5.40 0.62 2.82 2.55 B4 [500 Hz]†
Double Bass 5.44 6.40 6.19 6.55 5.82 E1
Electric Guitar 4.50 5.70 6.14 4.30 4.60 E2
English Horn 4.33 5.60 5.00 4.64 4.46 E3
Flute 4.50 5.50 4.95 4.00 3.91 C4
Flugelhorn 4.50 5.40 4.83 4.89 3.56 E3
Glockenspiel 5.25 6.00 1.77 2.25 2.78 G5
Gong 4.80 5.60 1.05 5.09 4.18 G2 [100 Hz]†
Horn 4.30 6.00 5.29 4.73 5.55 F2
Harp 6.60 5.90 2.95 3.73 3.64 C1
Marimba 5.20 6.50 1.86 4.64 4.55 C3
Oboe 4.44 5.20 5.00 4.00 4.00 A#3
Piano 5.20 6.70 1.76 4.82 4.18 A0
Piccolo 2.00 4.90 4.43 1.36 2.00 D5
Snare Drum 5.00 5.80 0.81 1.73 3.36 G2 [100 Hz]†
Sop. Recorder 3.29 5.56 4.19 1.60 2.20 C4
Soprano Sax 4.00 5.80 5.29 3.27 3.18 G#3
Tambourine 4.30 6.10 0.91 1.91 2.82 G2 [100 Hz]†
Timpani 5.56 6.30 5.14 5.09 5.91 F2
Triangle 5.56 6.20 0.67 0.73 1.82 B5 [1000 Hz]†
Trombone 4.40 5.10 6.24 4.64 5.64 F#5
Trumpet 3.80 5.70 4.19 3.18 4.64 E3
Tenor Sax 5.00 5.10 5.19 4.55 4.46 G#2
Tuba 3.90 6.30 4.71 4.73 5.73 E1
Voice 6.20 6.00 6.81 5.36 6.64 F2
Vibraphone 2.50 6.50 2.25 4.50 4.50 F3
Viola 4.33 6.30 6.19 5.18 4.73 C3
Violin 4.22 6.30 6.24 4.36 4.09 G3
Violoncello 5.75 4.56 6.21 5.00 5.30 C2
Wood Block 5.00 6.10 0.86 2.64 2.27 B4 [500 Hz]†


The motivation for this research was to determine whether judgments of instrument sadness correspond to an instrument's capacity to generate those acoustical properties known to contribute to sad speech prosody. In Study #1 we collected (1) estimates of the frequency with which an instrument is used to convey sadness, and (2) estimates of the capacity of instruments to convey sadness. In Study #2, we collected estimates of the capacity for instruments to produce various acoustical effects (such as quiet sound). If we assume that the conveyance of sadness in music makes use of the same acoustical properties known to convey sad affect in speech, and if we have reasonable estimates of the capacity of some instrument to generate these acoustical properties, then we should be able to predict the overall ability of that instrument to produce or convey a sad sound. Accordingly, we ought to be able to use the estimates of acoustical properties collected in Study #2 as predictor variables for the instrument judgments collected in Study #1. An appropriate method for testing this conjecture is through the use of multiple regression analysis.

Recall that the frequency judgments and capacity judgments collected in Study #1 are highly correlated. Performing independent tests to predict the frequency and capacity for sadness would reduce the statistical power due to multiple tests. Since the data collected in Study #2 relate to the acoustical capacities of the instruments, it would seem appropriate to predict an instrument's capacity for sadness, rather than its frequency of use for sadness.

Since our aim is to test the six-factor model, the Enter method was used. When predicting the capacity for sadness, the six-factor model proved significant, accounting for just over half of the variance (F(6,36)=8.361, p < .001, adjusted R²=.513). However, when examining the individual predictor variables, only one was found to be significant when shared variance was eliminated, namely "small interval" (beta=.689, p < .001). Table 3 reports the individual correlations for all six factors. Without correcting for multiple tests, notice that four of the six factors would reach significance at the .05 significance level. Moreover, the signs for five of the six variables are in the direction predicted by the model. The one exception is slowness, which is opposite to the predicted direction.
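As a rough consistency check on the regression summary, the reported adjusted R² and degrees of freedom can be related to the overall F statistic using the standard ordinary-least-squares identities. The sketch below assumes a model with an intercept and six predictors; the function names are hypothetical, and small discrepancies are expected because the inputs are rounded.

```python
def r2_from_adjusted(adj_r2, n, p):
    """Invert adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - adj_r2) * (n - p - 1) / (n - 1)


def f_statistic(r2, n, p):
    """Overall F for a model with p predictors plus an intercept."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))


p = 6                 # predictors in the six-factor model
df_error = 36         # denominator degrees of freedom as reported
n = df_error + p + 1  # number of cases implied by F(6, 36) under these formulas
r2 = r2_from_adjusted(0.513, n, p)
f_value = f_statistic(r2, n, p)
```

Under these assumptions the degrees of freedom imply 43 cases, an unadjusted R² of about .58, and an F value of about 8.37, which is close to the reported 8.361.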

Table 3: Individual Correlation with Sadness Capacity

Quietly r=.154 p=.162
Slowly r=-.108 p=.246
Small Intervals r=.718 p<.001**
Mumbling r=.516 p<.001**
Dark Timbre r=.445 p=.001**
Lowest Pitch r=.313 p=.020

Table 4 shows the corresponding correlation matrix. As can be seen, there is considerable shared variance among four factors: low pitch, mumbling, dark timbre, and small intervals. Notice that dark timbre is strongly correlated with both mumbling (r=.87) and low pitch (r=.73), and that low pitch (in turn) is strongly correlated with mumbling (r=.70).

Table 4: Correlation Matrix

[Matrix of pairwise correlations among Sadness Capacity, Small Interval, Dark Timbre, and Lowest Pitch; the cell values did not survive extraction.]


In general, the model accounts for roughly 51% of the variance, suggesting that there may be other factors contributing to the perceived capacity of an instrument to play in a sad fashion. Recall that in the first survey we asked respondents to estimate not only how capable an instrument is for conveying sadness, but also how frequently an instrument is used to convey sadness. When a composer selects an instrument to serve a particular function, several factors might be assumed to inform that decision. While the capacity of an instrument to convey sad affect would obviously be important, other factors may also influence the choice of instrument. For example, some instruments are more portable or readily available, or may provide a better blend with other instruments, or the instrument may be part of a conventional ensemble grouping, such as a string quartet. Hypothetically, an ocarina might be better able to convey sadness than a 'cello. However, if the composer is writing a work for orchestra, 'cellos are readily available, whereas the ocarina is not. It would follow, therefore, that the frequency of use of an instrument to convey sadness should be less correlated with the acoustical factors for conveying sadness in speech than the capacity of that instrument for conveying sadness. Accordingly, we can offer the post-hoc hypothesis that the six variables used in our multiple regression analysis will better predict the estimated capacity of an instrument for conveying sadness than the estimated frequency of use of that instrument for conveying sadness. This post-hoc conjecture was not part of the original research conception, and carrying out the appropriate multiple regression analysis invites a loss of power due to multiple tests. Therefore, for curious readers, we relegate this post-hoc analysis to Appendix 1.

As noted, low pitch, dark timbre, mumbling, and small interval are all strongly correlated with one another. When several variables are correlated, one should not necessarily assume that the variable that accounts for the greatest portion of the variance is the one true variable, of which the others are partial descriptions. It is more likely that an unidentified or hidden variable is at play, and that all of the measured variables correlate with this more fundamental factor. In physical terms, low pitch, dark timbre, "mumbling" and small interval appear to be related to low physical energy. An instrument can produce low energy in two ways: first, the performer may convey only a small amount of energy to the instrument, or second, the instrument itself may be inefficient—transducing only a small proportion of the input energy into the resulting sound. Inefficient vibrators typically exhibit high inertia, and high inertia is characteristic of large-mass vibrators. Said another way, an inefficient vibrator will behave in a more sluggish manner. It is plausible that mumbling, low pitch, small intervals, and dark timbre are manifestations of vibrating objects that are simply less efficient.

If low output energy is the underlying cause of these sadness cues, we might ask why quiet level and slow tempo are not also strongly correlated with the other variables. Surely, low amplitude and slow event rates are also characteristic of low energy. Recall our earlier discussion of the range between the highest and lowest ratings for each of the variables. We found that the range between the highest and lowest judgments for the ability to play quietly was roughly four times smaller than the comparable ranges for the other variables. One might conjecture that the poor correlation may be a result of a ceiling effect for the quietness and slowness judgments. However, a post-hoc analysis of the skews and variances offered none of the telltale signs of a ceiling effect. Evidently, quiet level and slow tempo appear to be independent of the mumbling/darkness/small-interval/low-pitch cluster. By contrast, recent work by Eerola et al. (2012) found that energy was quite successful in predicting valence or pleasantness judgments for a large variety of instrument timbres.

By way of summary, the overall results provide qualified evidence that those instruments deemed most able to convey sadness are also judged better able to generate the acoustical features implicated in the conveying of sad affect in speech. Although only one variable (small interval) proved statistically significant, three other variables were correlated with small interval. We have suggested that small interval per se may not be the best predictor, but that the cluster of variables implies a more fundamental underlying origin. We have speculated that low energy may be that common proximal cause.

Apart from low energy, the large amount of unexplained variance suggests that factors other than the six acoustical features explored in this study may play a role in the perception of sad sounds. For example, in many cultures, a premium is placed on instruments that produce voice-like timbres. It may be that acoustical attributes that convey a more voice-like sound are important, or even essential, for expressing or conveying sadness. In addition, the generally poor ability of percussion instruments to produce a sad sound suggests that percussive articulation and/or the inability to manipulate pitch may be significant factors. For example, pitch-change itself may prove to be important for expressing or conveying sadness.


Our thanks to Drs. Elizabeth Margulis and Olaf Post for assisting with data collection at the University of Arkansas and Harvard University, respectively.


  1. Correspondence can be addressed to: Prof. David Huron, School of Music, 1866 College Road, Ohio State University, Columbus, OH 43210 USA. E-mail: huron.1@osu.edu


  • Apple, W., Streeter, L.A., & Krauss, R.M. (1979). Effects of pitch and speech rate on personal attributions. Journal of Personality and Social Psychology, 37, 715-727.
  • Banse, R., & Scherer, K.R. (1996). Acoustic profiles in vocal emotion expression. Journal of Personality and Social Psychology, 70, 614-636.
  • Bauer, H.R. (1987). Frequency code: Orofacial correlates of fundamental frequency. Phonetica, 44, 173-191.
  • Benward, B., & Saker, M. (2003). Music: In Theory and Practice. Vol. I. New York: McGraw-Hill.
  • Bergmann, G., Goldbeck, T., & Scherer, K.R. (1988). Emotionale Eindruckswirkung von prosodischen Sprechmerkmalen. Zeitschrift für Experimentelle und Angewandte Psychologie, 35, 167-200.
  • Breitenstein, C., van Lancker, D., & Daum, I. (2001). The contribution of speech rate and pitch variation to the perception of vocal emotions in a German and an American sample. Cognition & Emotion, 15, 57-79.
  • Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., Votsis, G., Kollias, S., Fellenz, W., & Taylor, J. (2001). Emotion recognition in human-computer interaction. IEEE Signal Processing Magazine, 18, 32-80.
  • Dalla Bella, S., Peretz, I., Rousseau, L., & Gosselin, N. (2001). A developmental study of the affective value of tempo and mode in music. Cognition, 80 (3), B1-B10.
  • Davitz, J.R. (1964). Auditory correlates of vocal expressions of emotional meanings. In J.R. Davitz (Ed.), Communication of Emotional Meaning (pp. 101-112). New York: McGraw-Hill.
  • Eerola, T., Ferrer, R., & Alluri, V. (2012). Timbre and affect dimensions: Evidence from affect and similarity ratings and acoustic correlates of isolated instrument sounds. Music Perception, 30 (1), 49-70.
  • Eldred, S.H., & Price, D.B. (1958). A linguistic evaluation of feeling states in psychotherapy. Psychiatry, 21, 115-121.
  • Fairbanks, G., & Pronovost, W. (1939). An experimental study of the pitch characteristics of the voice during the expression of emotion. Speech Monographs, 6, 87-104.
  • Hailstone, J. C., Omar, R., Henley, S. M., Frost, C., Kenward, M. G., & Warren, J.D. (2009). It's not what you play, it's how you play it: timbre affects perception of emotion in music. Quarterly Journal of Experimental Psychology, 62, 2141-2155.
  • Huron, D. (2008). A comparison of average pitch height and interval size in major- and minor-key themes: Evidence consistent with affect-related pitch prosody. Empirical Musicology Review, 3 (2), 59-63.
  • Huron, D., & Berec, J. (2009). Characterizing idiomatic organization in music: A theory and case study of musical affordances. Empirical Musicology Review, 4 (3), 103-122.
  • Huron, D., Kinney, D., & Precoda, K. (2006). Influence of pitch height on the perception of submissiveness and threat in musical passages. Empirical Musicology Review, 1 (3), 170-177.
  • Huttar, G. (1968). Relations between prosodic variables and emotions in normal American English utterances. Journal of Speech and Hearing Research, 11, 467-480.
  • Johnson, R.B. (2010). Selected Topics in the Perception and Interpretation of Musical Tempo. PhD dissertation, School of Music, Ohio State University.
  • Kraepelin, E. (1899/1921). Psychiatrie. Ein Lehrbuch für Studierende und Ärzte, ed. 2. Klinische Psychiatrie. II. Leipzig: Johann Ambrosius Barth, 1899. Trans. by R.M. Barclay as Manic-depressive Insanity and Paranoia. Edinburgh: E. & S. Livingstone, 1921.
  • Lieberman, P., & Michaels, S.B. (1962). Some aspects of fundamental frequency and envelope amplitude as related to the emotional content of speech. Journal of the Acoustical Society of America, 34, 922-927.
  • Martin, S. "You Can't Play a Sad Song on a Banjo." http://www.goodreads.com/quotes/97570-the-banjo-is-such-a-happy-instrument--you-can-t-play-a (accessed September 23, 2012).
  • Murray, I.R., & Arnott, J.L. (1993). Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion. Journal of the Acoustical Society of America, 93 (2), 1097-1108.
  • Ohala, J. (1980). The acoustic origin of the smile. Journal of the Acoustical Society of America, 68, S33.
  • Ohala, J. (1994). The frequency code underlies the sound-symbolic use of voice pitch. In L. Hinton, J. Nichols, & J. Ohala (eds.), Sound Symbolism (pp. 325-347). Cambridge: Cambridge University Press.
  • Pavlov, I. (1901). Le travail des glandes digestives. Paris.
  • Post, O. & Huron, D. (2009). Music in minor modes is slower (Except in the Romantic Period). Empirical Musicology Review, 4 (1), 1-9.
  • Ross, D., Choi, J., & Purves, D. (2007). Musical intervals in speech. Proceedings of the National Academy of Sciences, 104 (23), 9852-9857.
  • Scherer, K.R., Johnstone, T., & Klasmeyer, G. (2003). Vocal expression of emotion. In R.J. Davidson, K.R. Scherer, & H. Goldsmith (Eds.). Handbook of the Affective Sciences (pp. 433-456). Oxford: Oxford University Press.
  • Scherer, K. (2003). Vocal communication of emotion: A review of research paradigms. Speech Communication, 40, 227-256.
  • Scherer, K., London, H., & Wolf, J.J. (1973). The voice of confidence: Paralinguistic cues and audience evaluation. Journal of Research in Personality, 7, 31-44.
  • Schutz, M., Huron, D., Keeton, K. & Loewer, G. (2008). The happy xylophone: Acoustic affordances restrict an emotional palate. Empirical Musicology Review, 3 (3), 126-135.
  • Schwartz, D., Howe, C., & Purves, D. (2003). The statistical structure of human speech sounds predicts musical universals. The Journal of Neuroscience, 23 (18), 7160-7168.
  • Siegman, A., & Boyle, S. (1993). Voices of fear and anxiety and sadness and depression: The effects of speech rate and loudness on fear and anxiety and sadness and depression. Journal of Abnormal Psychology, 102, 430-437.
  • Skinner, E.R. (1935). A calibrated recording and analysis of the pitch, force and quality of vocal tones expressing happiness and sadness. Speech Monographs, 2, 81-137.
  • Sobin, C., & Alpert, M. (1999). Emotion in speech: The attributes of fear, anger, sadness, and joy. Journal of Psycholinguistic Research, 28, 347-365.
  • Tartter, V.C. (1980). Happy talk: Perceptual and acoustic effects of smiling on speech. Perception & Psychophysics, 27 (1), 24-27.
  • Turner, B., & Huron, D. (2008). A comparison of dynamics in major- and minor-key works. Empirical Musicology Review, 3 (2), 64-68.
  • Williams, C.E., & Stevens, K.N. (1972). Emotions and speech: Some acoustical correlates. Journal of the Acoustical Society of America, 52 (4), 1238-1250.

APPENDIX I: Multiple Regression for the Frequency of Use for Sadness

In the main text, we reported a multiple regression analysis in which the capacity of the instrument for conveying sadness was the predicted variable. Since we also collected data on frequency of occurrence, one might be curious how well the same predictor variables account for the frequency data. Again using the Enter method, the six-factor model was found to be significant (F(6,36) = 6.626, p < .001, adjusted R² = .446). As before, only "small interval" proved to be a significant predictor (beta = .522, p = .001). Although these results show the predicted reduction in explained variance compared with the analysis for sadness capacity, this difference must not be considered statistically significant, in light of the issue of multiple tests.
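
As a consistency check on the statistics reported above, the adjusted R² can be recovered from the F statistic and its degrees of freedom alone. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes an ordinary least-squares model with an intercept, so that the residual degrees of freedom equal n - k - 1.

```python
def r_squared_from_f(f_stat: float, df_model: int, df_resid: int) -> float:
    """Recover R^2 from an overall F test, using F = (R^2 / k) / ((1 - R^2) / df_resid)."""
    return (f_stat * df_model) / (f_stat * df_model + df_resid)


def adjusted_r_squared(r2: float, df_model: int, df_resid: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1).

    Assumes a model with an intercept, so n - 1 = df_model + df_resid.
    """
    n_minus_1 = df_model + df_resid
    return 1.0 - (1.0 - r2) * n_minus_1 / df_resid


# Values reported in the text: F(6,36) = 6.626
r2 = r_squared_from_f(6.626, 6, 36)
adj_r2 = adjusted_r_squared(r2, 6, 36)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")
```

Under these assumptions the computation returns an adjusted R² of about .446, in agreement with the reported value, with an unadjusted R² of roughly .525.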

Table 5: Individual Correlation with Sadness Frequency

Small Interval
Dark Timbre
Lowest Pitch