ORCHESTRATION treatises often describe the characteristic timbres of musical instruments in consistent ways (Wallmark, 2019). In treatises as well as in everyday musical discourse, people can (and often do) talk about the timbre of specific instrument types and may comment on one instrumental timbre with respect to another, for example by comparing the timbre of the oboe to that of the trumpet. Yet, it is also widely recognized that musical instruments usually produce not a single timbre, but rather, as McAdams and Goodchild (2017) phrase it, "a constrained universe of timbres" (p. 129). Timbre varies not only across instruments, but also within instruments, in relation to many factors including (but not limited to) pitch, register, intensity, articulation, duration, and vibrato. Historically, most literature on timbre has not addressed this polysemy, though explicitly distinguishing across-instrument from within-instrument timbre has become increasingly important as timbre and orchestration research has gained traction in recent decades; Siedenburg and McAdams (2017) discuss these issues in relation to their observation that timbre functions on different scales of detail.

There are many sources of within-instrument timbral variation. For example, Traube (2004) examined the effects of elements including plucking position and instrumental gestures on the timbral variability of the classical guitar. Practical considerations, including fingerings, which string a note is played on, mouthpieces, and reeds all have timbral repercussions. In many contexts, professional musicians consider the resulting timbral effects from variation on such parameters to be of critical importance even when differences in tone color are subtle. Other timbral variations can result from the idiosyncratic contributions of an individual's tone quality, the particular instrument brand and specific instrument being played, and the acoustics of the room. In aiming to achieve a certain variation of timbre on an instrument, musicians may intentionally vary aspects of their playing; for example, Bernays and Traube (2014) found some commonalities in how pianists manipulated dynamics and articulation when performing the same excerpt with different timbral intentions, but they also found that pianists exhibited unique profiles for how they approached achieving various timbres. In addition to the plethora of sources of timbral flexibility offered by traditional modes of performance, there is yet another world of timbres offered by extended techniques.

With so many different means of timbral variation available from a single instrument, choosing a starting point for systematic investigation is challenging. In nearly all contexts, music varies in pitch, loudness, articulation, and/or duration in ways that seem to be salient for most listeners, and so these parameters offer a reasonable place to begin investigation into within-instrument timbral variability. The experiment reported in this paper focuses on timbral variability in pitch (register) and intensity (dynamics), with the goal of establishing a methodology by which future studies can extend the scope of such investigations to manipulations of articulation, duration, and other playing parameters.

Relatively few experiments have investigated interactions between pitch and timbre. Some results suggest that pitch differences over an octave make it difficult to extrapolate a common sound source (Handel and Erickson, 2001; Marozeau et al., 2003), though musicians seem to be more successful than nonmusicians at this task (Steele & Williams, 2006). Complications arising from the interaction of pitch and timbre have been reported in speeded classification (Krumhansl & Iverson, 1992). An experiment reported by Allen and Oxenham (2014) observed interference of pitch and timbre, with results suggesting that in particular, pitch height and brightness are tightly related. Several other experiments provide evidence in support of a general link between pitch height and brightness perception (e.g., Marozeau & de Cheveigné, 2007; Cousineau et al., 2014).

Timbral differences across an instrument's range may be observed among registers. Here, "register" is used to refer to the concept of a section of the range of an instrument or voice that often is considered to demonstrate characteristic timbral qualities. Pitch itself is a continuous variable; register is categorical, but registers are also more clearly defined for some instruments than others. Different registers of an instrument or voice are not determined by a fixed interval, but rather are related to the physics of the sound source (Drabkin, 2001). Consequently, as the term is used in this paper, register is defined relative to a single instrument type and is not constant across instruments (e.g., as with orchestral register)—the lowest register of the flute, for example, overlaps with the highest register of the bassoon. Anecdotally, for some instruments more so than others, timbre varies characteristically among registers. The B♭ clarinet provides a standard example of distinct conceptual registers, with its four registers often defined as: the dark, hollow "chalumeau" register from E3 to G4; the "throat" register from G#4 to B♭4; the brighter, sweeter "clarion" register from B4 to C6; and the shrill, "extreme" register above C6 (Page et al., 2001). This "extreme" register is often referred to as the "altissimo" register.

Even for instruments without specific vocabulary for registers, regions of the instruments often have distinct timbral characters. Colloquially, it is often common to refer generally to the low, middle, or high register of an instrument. For example, the Vienna Symphonic Library (n.d.) describes the oboe's registers as lower (B♭3–F4), middle (F#4–B♭5), and upper (B5–G6). Notes above G6 are possible but rare and are described as belonging to the altissimo register (Redgate, 2018). As circumscribed by the Vienna Symphonic Library, the registers of the French horn include the low register (B1–D3), middle register (E♭3–E♭4), and upper register (E4–F5). While the timbral characters of different registers of instruments are familiar to many musicians, and especially familiar for a player of a given instrument, no research has systematically considered timbre semantics in relation to instrument registers.

With natural instrument sounds, the relative contributions of pitch height and register to an instrument's timbre are difficult, if not impossible, to parse perceptually. One study presents evidence that register may contribute uniquely to timbre, apart from variation in pitch: a machine learning model using Linear Discriminant Analysis (LDA) reported in Weihs et al. (2005) demonstrates highly successful classification of register by spectrum only, that is, when all pitch information has been eliminated. However, in the current experiment, which uses unmanipulated recordings of natural instrument sounds, pitch and register remain conflated for the sake of ecological validity. Because of the way in which this factor is operationalized (see Materials), the term "register" as used here includes differences in both register and pitch height.

Musical observation confirms that greater playing effort and consequently louder dynamics affect timbre significantly. Increased playing effort amplifies the energy at all of the same frequencies present in the original sound, but also, as more and more modes of vibration of the instrument spring into action, the spectrum spreads toward higher frequencies. The results of such changes can include a higher spectral centroid, a greater spectral spread, and a lower spectral slope (McAdams, 2019). However, very few studies have focused on the interactions between timbre and intensity. Pratt and Bowsher (1978), investigating the qualities of trombone timbre, compared one high and one low sound as well as one loud and one soft sound. Most participants rated the higher sound as brighter and less pleasant than the low sound; the louder note was also rated as brighter and less pleasant than the quiet sound. Using synthesized sounds as stimuli, Melara and Marks (1990) observed Garner interference in support of interaction between timbre and intensity as well as between timbre and pitch, which they argue suggests that these three attributes of sound are processed jointly by the perceptual system. Fabiani and Friberg (2011) asked participants to judge dynamics of stimuli varying on pitch, loudness, and timbre; they found that instrumental timbre and loudness influenced judgments of dynamics across all instruments, while pitch generally influenced judgments of dynamics for three out of the five instrument timbres tested.

One likely reason that little research has been done on the semantics related to the timbral flexibility of individual instruments is that the possible combinations of pitch, loudness, articulation, duration, etc. are often multitudinous. Furthermore, due to differences in physical construction, it is probable that instrument sounds do not all vary uniformly, and there are likely interactions among factors affecting timbres—thus, a complete mapping of variations on one type of instrument will likely not be fully applicable to other instruments. The project of mapping out timbral interactions across many instruments is a formidable task and beyond the scope of any individual study. Though some trends may be generally or for the most part shared among certain groups of instruments—for example, a plausible tendency for brightness to increase systematically with pitch—the extent of such commonalities is unknown. Yet, the results of such endeavors to map conceptualizations of within-instrument timbral variations for a wide variety of instruments would be useful in both composition and music analysis. These findings could also be informative for future empirical research in timbre; for example, in providing a measure of semantic distances among stimuli. The primary goal of this exploratory paper is to determine which semantic dimensions may be most pertinent to register and to dynamics in order to make grounded future predictions.

This article reports an experiment that uses the 20-dimensional timbre qualia model reported in Reymore and Huron (2020) to investigate the effects of changes in register and dynamics on the timbres of two instruments, the oboe and the French horn. Reymore and Huron's model, reproduced in Table 1, was derived from a multi-stage study involving interviews and rating tasks and reflects Western musicians' cognitive conceptions of qualia associated with typical instrument sounds through linguistic descriptions.

Table 1. 20-dimensional model of musical instrument timbre qualia; reproduced from Reymore and Huron (2020). Dimension descriptions were derived from component loadings of the Principal Component Analyses and contain between one and seven adjectives; the authors also offer shorthand labels for these dimensions, reported in the righthand column, which are intended to facilitate discussion and are used throughout this paper.
Dimension description | Shorthand label
1. rumbling, booming, low, deep, thick, fat, heavy | rumbling/low
2. soft, smooth, singing, voice-like, sweet, gentle, calm | soft/singing
3. watery, fluid | watery/fluid
4. direct, projecting, loud, aggressive, commanding, assertive, powerful | direct/loud
5. nasal, reedy, buzzy, pinched, constrained | nasal/reedy
6. shrill, harsh, noisy | shrill/noisy
7. percussive | percussive
8. pure, clear, precise, clean | pure/clear
9. brassy, metallic | brassy/metallic
10. raspy, guttural, grainy, gravelly | raspy/grainy
11. ringing, long decay | ringing/long decay
12. sparkling, shimmering, brilliant, bright | sparkling/brilliant
13. airy, breathy | airy/breathy
14. resonant, vibrant | resonant/vibrant
15. hollow | hollow
16. woody | woody
17. muted, veiled | muted/veiled
18. sustained, even | sustained/even
19. open | open
20. focused, compact | focused/compact

Reymore (2021) mapped the semantic characteristics of 34 Western musical instruments using this 20-dimensional model, producing "Timbre Trait Profiles" (TTPs) which summarize how each instrument is conceived to be characterized in this semantic space. Like the study by Reymore and Huron (2020) which built the model in Table 1, these ratings were made using imagined timbres with the goal of understanding instrument timbre prototypes for orchestral instruments. In these previous articles, I made the case that these profiles can be used as tools in timbre semantics and orchestration research. However, without information about timbral variation with respect to commonly manipulated factors, such as register and dynamics, these tools remain limited by their generality.

As described earlier, the project of understanding the relationship between timbre semantics and changes in musical parameters across musical instruments is enormous given the number of available instruments and their timbral flexibility. While some common trends are likely to exist for certain semantic dimensions, instrument families, or some combinations thereof, the complete picture is probably a highly complex one. This complexity is likely reflected in the general dearth of research in orchestration practice and the historical sentiment, as expressed by Rimsky-Korsakov, that orchestration "cannot be taught" (1913). The more trends that can be identified, the easier it will be to understand the complex picture of timbre semantics among orchestral instruments. As reviewed above, some data exist which consider timbre in relation to other musical parameters, but these studies have been primarily grounded in an acoustic, rather than a semantic perspective. Data that explicitly describe how instrumental semantics vary with factors such as pitch and register are needed to construct hypotheses and develop models.

Thus, the in-depth, exploratory studies of two instruments presented here serve four principal purposes. First, the data are intended to serve as a foundation for the development of hypotheses that can be tested in future experiments. Second, the study design is offered as a template for easily replicating the methodology with other instruments and/or augmenting it in order to investigate other timbral variations relative to parameters including but not limited to articulation, duration, specific instruments or brands, and acoustic environments. This design could be used either in additional in-depth, exploratory studies, such as this one, or in wider studies intended to test hypotheses and/or models. Third, this study serves as an evaluation of Reymore and Huron's timbre qualia model: although Reymore (2021) offered evidence of the model's reliability across studies, both studies used imagined instrument sounds. Questions remain about whether ratings of recorded timbres are comparable to ratings of imagined timbres. The results of this study provide insight into these questions. Fourth, the detailed data reported in this paper for oboe and horn can be used for timbrally focused music theoretical analyses of solo pieces and may be informative for composition.



Participants (n = 47; 29 F, 18 M) were music majors recruited from the Ohio State University School of Music participant pool. The average age was 19.6 years (range 18–21, SD = 1.6). Participants reported an average of 11.3 years of regular musical practice (range 3–19, SD = 3.6).


Participants listened to two-second recordings of instruments playing sustained notes and were randomly assigned to rate either notes played on the oboe or notes played on the horn. Manipulated variables included pitch, register, and dynamic level; specifically, the stimuli for each instrument included four registers (Low, Middle 1, Middle 2, and High), each represented by three chromatically neighboring pitches, and each of these 12 pitches was presented at three different dynamic levels (pianissimo, "mezzo mezzo," and fortissimo), for a total of 36 stimuli per instrument. Participants listened to 18 of the 36 recorded notes in a random order and rated each stimulus on the 20 semantic scales from Reymore and Huron (2020; see Table 1). A priori, it was decided that for the purposes of analysis, values for each register would be represented by the mean ratings of the three pitches chosen to represent that register.


Stimuli were recorded by professional musicians for the purpose of the experiment. To keep the study tractable, stimuli were limited to a representative sample of pitches and dynamics. One approach would be to sample single pitches from different registers at varying dynamics. However, using a single pitch risks interference due to idiosyncratic timbral variations among pitches. For example, C5 on the oboe is a notoriously difficult-to-control note because it exhibits a timbre that is relatively more unfocused and nasal than any of the instrument's other pitches, a result of the physics of the instrument. While well-designed instruments minimize this timbral difference and professionals can further temper it, anecdotally, a timbral uniqueness to C5 is still almost always perceptible to the trained ear and may even aid in absolute pitch identification (Reymore & Hansen, 2020). Were C5 to be chosen as the single pitch representing the middle range of the oboe, it is likely that ratings of dimensions such as focused/compact and nasal/reedy would better characterize this single pitch than the middle range of the instrument, which is in general much more focused and less nasal than C5. While such note-to-note idiosyncrasies are quite familiar to the players of a given instrument, there is little empirical information about this source of variability, though a systematic report of idiosyncratic intonation tendencies can be found in Snow (2006). Intonation and timbral idiosyncrasies are likely related in many cases; plausibly, pitches that tend to be sharp are also brighter and/or more nasal. It is probable that timbral idiosyncrasies, like intonational idiosyncrasies, are more prominent in certain instruments than others, most likely as a function of various aspects of the physics and acoustical properties of each instrument.

To mitigate the potential idiosyncratic effects of individual notes, three neighboring chromatic pitches were used to represent each register. The selection of neighboring pitches minimizes the potentially confounding effect of pitch height on timbre but provides a method in which each register is represented by multiple pitches. Pitches were selected through consultation with the professional players who recorded the stimuli. First, we determined the lowest and highest sets of three pitches that are both comfortable to play and are reasonably common in the repertoire. The other two pitch sets for both instruments were derived by dividing up the space between the high and low sets equally. A priori it was determined that these sets could be shifted up or down by half or whole step in order to avoid particularly idiosyncratic notes (like C5 on the oboe); decisions about shifting were made in consultation with the professional musicians. As a result, four sets of three pitches were recorded to represent the high, low, and middle registers of the instruments. See Appendix A for more detail on stimuli selection. Table 2 lists the pitches recorded for the oboe and the horn; these pitch samples are visualized in Figure 1.

Table 2. Recorded pitches for oboe and horn.
Register | French horn | Oboe
Low | B♭1, B1, C2 | C4, C#4, D4
Middle 1 | B2, C3, C#3 | A4, B♭4, B4
Middle 2 | B3, C4, C#4 | F5, F#5, G5
High | C5, C#5, D5 | D6, E♭6, E6
[Figure 1 image: two piano-keyboard diagrams with keys C2, C3, C4, C5, and C6 labelled. The four groups of recorded pitches are highlighted in purple for the oboe and in orange for the French horn.]

Figure 1. Piano keyboards illustrating pitch stimuli for the oboe (purple, top) and French horn (orange, bottom).
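The equal-division step used to place the two middle pitch sets can be illustrated with a short sketch. It is not the procedure the musicians actually followed: MIDI note numbers (C4 = 60), rounding to the nearest semitone, and the function names are all assumptions introduced here for illustration, and the optional half- or whole-step shifts are not modeled. Under those assumptions, dividing the span between the centers of the lowest and highest sets into three equal steps yields the middle sets listed in Table 2.

```python
# Hypothetical sketch: place the middle pitch sets by equal division.
# MIDI numbering (C4 = 60) and rounding to the nearest semitone are
# assumptions for illustration, not details reported in the paper.

NOTE_NAMES = ["C", "C#", "D", "E♭", "E", "F", "F#", "G", "G#", "A", "B♭", "B"]


def midi_to_name(midi: int) -> str:
    """Convert a MIDI note number to a pitch name such as 'B♭4'."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def pitch_sets(low_center: int, high_center: int, n_sets: int = 4) -> list[list[str]]:
    """Return n_sets three-note chromatic sets with equally spaced centers."""
    step = (high_center - low_center) / (n_sets - 1)
    centers = [round(low_center + i * step) for i in range(n_sets)]
    # Each set is the center pitch plus its lower and upper chromatic neighbors.
    return [[midi_to_name(c + offset) for offset in (-1, 0, 1)] for c in centers]


# Centers of the lowest and highest sets in Table 2.
oboe_sets = pitch_sets(low_center=61, high_center=87)  # C#4 and E♭6
horn_sets = pitch_sets(low_center=35, high_center=73)  # B1 and C#5

for instrument, sets in (("Oboe", oboe_sets), ("French horn", horn_sets)):
    for register, notes in zip(("Low", "Middle 1", "Middle 2", "High"), sets):
        print(f"{instrument:12s} {register:9s} {', '.join(notes)}")
```

For both instruments the rounded centers fall on the middle pitches reported in Table 2 (B♭4 and F#5 for the oboe, C3 and C4 for the horn).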

The professional musicians recording the stimuli were instructed to play each of the pitches at three different dynamic levels:

  1. Pianissimo. Play as piano as possible without sacrificing tone quality or intonation.
  2. "Mezzo mezzo." Play at a comfortable middle dynamic level for your instrument, between mezzo piano and mezzo forte.
  3. Fortissimo. Play as forte as possible without sacrificing tone quality or intonation.

The musicians recorded 36 notes for each instrument (4 registers × 3 pitches × 3 dynamics). While neither the oboe nor the horn has specifically defined registers in the way that the clarinet does, general descriptions of high, middle, and low registers are applicable; we can consider the lowest set of pitches to represent the instruments' low registers and the highest set to represent their high registers. For the purposes of discussion, I refer to the four registers with relative names: Low, Middle 1, Middle 2, and High.
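The factorial structure of the stimulus set can be written out as a short sketch. The pitches come from Table 2 and the dynamics from the recording instructions; the way the 36 stimuli are split into two halves of 18 is not specified in the text, so the seeded shuffle below is only an assumed stand-in, and the variable names are hypothetical.

```python
import random
from itertools import product

# Oboe pitches by register, copied from Table 2.
OBOE_REGISTERS = {
    "Low": ["C4", "C#4", "D4"],
    "Middle 1": ["A4", "B♭4", "B4"],
    "Middle 2": ["F5", "F#5", "G5"],
    "High": ["D6", "E♭6", "E6"],
}
DYNAMICS = ["pianissimo", "mezzo mezzo", "fortissimo"]

# Full crossing: 4 registers x 3 pitches x 3 dynamics.
stimuli = [
    (register, pitch, dynamic)
    for register, pitches in OBOE_REGISTERS.items()
    for pitch, dynamic in product(pitches, DYNAMICS)
]

# Assumption: the two halves are formed by a seeded random shuffle.
# The paper does not describe how the 18-stimulus subsets were chosen.
rng = random.Random(2021)
shuffled = rng.sample(stimuli, k=len(stimuli))
half_a, half_b = shuffled[:18], shuffled[18:]

print(len(stimuli), len(half_a), len(half_b))  # 36 18 18
```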


The study was implemented as an Internet survey using the Qualtrics platform; participants were tested individually in an Industrial Acoustic Corporation sound attenuation room. Instructions are included in Appendix B. Participants listened to two-second recordings of either the horn or the oboe playing sustained notes. After listening to each stimulus, they rated how well each of the 20 timbre qualia dimensions applied to the note on a scale from 1 to 7, where 1 was labeled "does not describe at all" and 7 was labeled "describes extremely well." Note that while shorthand terms (Table 1, column 2) are used throughout this paper for ease of discussion, participants were presented with the full semantic descriptions during the experiment (Table 1, column 1). Participants were able to listen to each sound as many times as they felt was necessary. In piloting the experiment, it was determined that rating all 36 stimuli for a single instrument took about an hour. As the task is a particularly repetitive one and attention was a concern, the experiment length was reduced so that each participant would rate a subset of 18 stimuli for one of the two instruments, which took about a half hour.


With 47 participants each rating one half of the stimuli for a single instrument, 11–12 judgments were collected for each individual stimulus. Because each register is represented by three pitches, each register/dynamic combination is thus represented by 33–36 judgments.
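The judgment counts above follow from simple arithmetic, sketched here. The sketch assumes that the 47 participants were divided as evenly as possible between the two instruments (23 and 24) and that each instrument group was divided as evenly as possible between the two stimulus halves; the exact group sizes are not reported in this section, so these splits are illustrative only.

```python
N_PARTICIPANTS = 47
PITCHES_PER_REGISTER = 3

# Assumption: near-even split across the two instruments (23 and 24).
group_sizes = [N_PARTICIPANTS // 2, N_PARTICIPANTS - N_PARTICIPANTS // 2]

# Assumption: within each instrument group, a near-even split across the
# two halves of the stimulus set, so each stimulus is rated by one half.
per_stimulus = sorted(
    {size // 2 for size in group_sizes} | {size - size // 2 for size in group_sizes}
)

# Each register/dynamic combination pools three pitches.
per_combination = [PITCHES_PER_REGISTER * n for n in per_stimulus]

print(group_sizes)      # [23, 24]
print(per_stimulus)     # [11, 12]
print(per_combination)  # [33, 36]
```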

Examining the rating means across dimensions on a note-by-note basis—that is, for each pitch-dynamic combination—reveals some idiosyncratic notes. For example, at times, rating means on a given dimension vary more from pitch to pitch than between registers (see Discussion for more detail), suggesting that the practice of representing registers with multiple notes in order to help mitigate idiosyncrasies and better represent the register as a whole was useful. Individual note means are included in Appendix C.

Here, results are first summarized for the oboe and then the horn: the top descriptors for each register-dynamic combination are reported and trends across dynamics and registers are illustrated with graphs. The underlying structure of the rating data for each instrument is considered separately through Exploratory Factor Analysis (EFA). Next, the data are modeled using both partial proportional odds regression and random forest classifiers. Finally, to address the question of the comparability of semantic judgements of imagined prototypical instrument timbres and recorded tones, results from the current experiment are compared to ratings from Reymore (2021), which were based on imagined timbres.


The top-rated dimensions for all pitches at all dynamics on the oboe, which together make up its aggregate average profile, are nasal/reedy (4.57), sustained/even (4.42), resonant/vibrant (4.22), focused/compact (4.11), pure/clear (4.07), and open (3.95). Table 3 summarizes the top six descriptors for each register-dynamic combination. The dimensions that are not included in the top six descriptors in the instrument's aggregated average profile are bolded in order to highlight which features are relatively unique to certain registers, dynamics, or register-dynamic combinations. For example, oboe notes are perceived as relatively more strongly sparkling/brilliant in the Middle 2 and High registers, while woody appears as a top descriptor only in the Low and Middle 1 registers. The soft/singing quale is associated relatively more with softer and middle dynamic notes in middle registers.

Table 3. Top descriptors for each of 12 register-dynamic combinations. Dimensions not included in the top six descriptors in the instrument's aggregated average profile are bolded to highlight features that are relatively unique to certain registers, dynamics, or register-dynamic combinations.
Low (C4, C#4, D4)
Pianissimo | Mezzo mezzo | Fortissimo
Middle 1 (A4, B♭4, B4)
Pianissimo | Mezzo mezzo | Fortissimo
Middle 2 (F5, F#5, G5)
Pianissimo | Mezzo mezzo | Fortissimo
High (D6, E♭6, E6)
Pianissimo | Mezzo mezzo | Fortissimo
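The selection rule behind the table, taking the six highest-rated dimensions for a combination and marking those absent from the aggregate profile, can be sketched as follows. The aggregate top six are the oboe dimensions reported in the text; the per-combination means are hypothetical values invented for illustration, as are the function and variable names.

```python
def top_descriptors(means: dict[str, float], k: int = 6) -> list[str]:
    """Return the k dimensions with the highest mean rating, highest first."""
    return sorted(means, key=means.get, reverse=True)[:k]


# Top six dimensions of the oboe's aggregate average profile (from the text).
AGGREGATE_TOP = [
    "nasal/reedy", "sustained/even", "resonant/vibrant",
    "focused/compact", "pure/clear", "open",
]

# Hypothetical mean ratings for one register-dynamic combination.
combo_means = {
    "nasal/reedy": 4.3, "sustained/even": 4.5, "pure/clear": 4.4,
    "sparkling/brilliant": 4.2, "focused/compact": 4.1,
    "resonant/vibrant": 4.0, "open": 3.8, "shrill/noisy": 3.0, "woody": 2.5,
}

top = top_descriptors(combo_means)
# Descriptors that would be bolded: in this combination's top six
# but not in the aggregate top six.
highlighted = [d for d in top if d not in AGGREGATE_TOP]

for descriptor in top:
    marker = "*" if descriptor in highlighted else " "
    print(f"{marker} {descriptor} ({combo_means[descriptor]:.2f})")
```

With these invented values, only sparkling/brilliant would be marked.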

Differences in qualia profile for each register-dynamic combination are visualized in full in Appendix E, using the radar plot Timbre Trait Profile format from Reymore (2021).

Means on each dimension as grouped by register and dynamics for the oboe are visualized in Figures 2 and 3. In these figures, the reported mean for each register is the average of the means of the three notes in each register set. Error bars illustrate the 95% confidence intervals. Figure 2 reports register on the x-axis, with shape/color of the points indicating dynamics. Figure 3 presents the same data, but with dynamics on the x-axis and register indicated by shape/color. Numerical data related to these figures can be found in Appendix D.
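The register value described above is a mean of note means, which is not always identical to the mean of all pooled ratings when the three notes received different numbers of judgments. The sketch below illustrates the difference for one dimension at one dynamic level; the ratings are hypothetical and do not come from the study data.

```python
from statistics import mean

# Hypothetical 1-7 ratings on one dimension for the three Low-register
# oboe pitches at a single dynamic level (not data from the study).
ratings_by_pitch = {
    "C4": [5, 4, 6, 5],
    "C#4": [4, 4, 5],
    "D4": [6, 5, 5, 6],
}

# Step 1: mean rating for each note.
note_means = {pitch: mean(ratings) for pitch, ratings in ratings_by_pitch.items()}

# Step 2: register value = average of the three note means.
register_mean = mean(list(note_means.values()))

# For comparison: pooling every rating weights notes by their rating counts.
pooled_mean = mean([r for ratings in ratings_by_pitch.values() for r in ratings])

print(f"mean of note means: {register_mean:.3f}")
print(f"pooled mean:        {pooled_mean:.3f}")
```

Averaging note means gives each pitch equal weight in the register value regardless of how many participants happened to rate it.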

[Figure 2 image: 20 small panels, one per qualia dimension in Table 1. x-axis: register (Low, Middle 1, Middle 2, High); y-axis: rating (1 to 7). Points with vertical error bars are colored by dynamic (p, m, f).]

Figure 2. Qualia variation by register, oboe.

[Figure 3 image: 20 small panels, one per qualia dimension in Table 1. x-axis: dynamic (p, m, f); y-axis: rating. Points with vertical error bars are colored by register (Low, Middle 1, Middle 2, High).]

Figure 3. Qualia variation by dynamic, oboe.


The top-rated dimensions for all pitches at all dynamics on the horn (aggregate average profile) are brassy/metallic (5.21), direct/loud (3.95), rumbling/low (3.81), nasal/reedy (3.76), raspy/grainy (3.65), and sustained/even (3.60). Table 4 summarizes the top six descriptors for each register/dynamic combination. As before, dimensions that are not included in the top six descriptors in the instrument's aggregated average profile are bolded in order to highlight which features are relatively unique to certain registers, dynamics, or register/dynamic combinations. For example, horn notes were rated as relatively more strongly shrill/noisy at a forte dynamic across ranges; this descriptor does not appear in the top six for piano or mezzo dynamics. Pure/clear characterizes the Middle 2 and High registers at piano and mezzo dynamics.

Table 4. Top descriptors for each of 12 register-dynamic combinations. Dimensions not included in the top six descriptors in the instrument's aggregated average profile are bolded to highlight features that are relatively unique to certain registers, dynamics, or register-dynamic combinations.
Low (B♭1, B1, C2)
Pianissimo | Mezzo mezzo | Fortissimo
Middle 1 (B2, C3, C#3)
Pianissimo | Mezzo mezzo | Fortissimo
Middle 2 (B3, C4, C#4)
Pianissimo | Mezzo mezzo | Fortissimo
High (C5, C#5, D5)
Pianissimo | Mezzo mezzo | Fortissimo

Each register/dynamic combination is visualized via radar plot in Appendix E.

Figures 4 and 5 demonstrate the rating means on each dimension for the horn. As with the oboe data reported earlier, Figure 4 provides register on the x-axis, while Figure 5 presents the same data with dynamics on the x-axis. Numerical data can be found in Appendix D.

[Figure 4 image: 20 small panels, one per qualia dimension in Table 1. x-axis: register (Low, Middle 1, Middle 2, High); y-axis: rating (1 to 7). Points with vertical error bars are colored by dynamic (p, m, f).]

Figure 4. Qualia variation by register, French horn.

[Figure 5 image: 20 small panels, one per qualia dimension in Table 1. x-axis: dynamic (p, m, f); y-axis: rating (1 to 7). Points with vertical error bars are colored by register (Low, Middle 1, Middle 2, High).]

Figure 5. Qualia variation by dynamic, French horn.


First, the latent structure of the rating data is explored through Exploratory Factor Analyses (EFA). Second, two types of exploratory models are presented in order to determine which semantic dimensions may be significantly related to register and/or dynamics in the oboe and horn and to assess the relative importance of semantic dimensions. Finally, results are compared with ratings from Reymore (2021) to assess potential differences in the timbre qualia model's applications to imagined and recorded sounds.

Factor Analyses

Previous work engaging the 20-dimensional timbre qualia model has reported a number of analyses concerning the structure of rating data. Reymore and Huron (2020) originally derived the model using Principal Component Analysis (PCA) and other techniques from ratings on 77 categories. PCA explained 72% of the variance with 20 components, while smaller models with 3–5 components, comparable to the size of models proposed in previous literature, explained only 39–50% of the variance. The data in that study were thoroughly examined for optimal dimension reduction—details about the PCA and the rationale for choosing a 20-dimensional model are included in the paper and its Supplementary Materials. The dimensionality of the model was further assessed in Reymore (2021), using a different rating dataset, in three ways: with respect to the correlation matrix, the relative usefulness of descriptors, and Principal Component Analysis. No recommendations were made for dimension reduction on these bases. This paper also presented analyses suggesting the model to be reliable across studies for generating ratings of imagined musical instrument sounds.

Analyses reported in these papers, along with other unpublished analyses of additional datasets using the model, suggest that the underlying structure of rating data may vary with different sets of rated instruments. That is, while 20 dimensions are needed to capture reasonable variance for the entire set of Western orchestral instruments (as in Reymore, 2021), it may be the case that certain dimensions could be highly correlated among subsets of instruments. Were this the case, we might observe variation in latent structure among ratings for different instrument subsets. By extension, this may also be the case for within-instrument semantic variation, and techniques such as EFA thus may help us to better understand the semantic structure of individual instruments with respect, in this case, to register and dynamics. This possibility was addressed with the current dataset.

For both oboe and horn data, the Kaiser-Meyer-Olkin (KMO) statistic was > .75, and Bartlett's test of sphericity was significant, suggesting that EFA was appropriate. In choosing a number of factors, I considered various rules of thumb, including the eigenvalues, scree plot, and percent of variance explained (Jolliffe, 2002); as in Reymore and Huron (2020), I preferred to err on the side of "too many" rather than "too few" factors and considered the relative interpretability of the potential models. With both sets of data, while the eigenvalue > 1 guideline suggested a 4-factor model, the resulting scree plots did not demonstrate clear elbows, suggesting five, or possibly more, factors. In the case of the oboe ratings, four factors explained 69% of the variance, whereas five factors explained 76%. Within the horn ratings, four factors explained 75% of the variance, while five explained 81%. I determined from a comparative examination of 4- and 5-factor models for both instruments that 5-factor models were substantially more clearly interpretable. Taking into account both the interpretability and the 6–7% increase in variance explained with the addition of a fifth factor, I chose to report the 5-factor models here.
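The rules of thumb mentioned above can be illustrated with a minimal Python sketch. It is not the analysis code used in the study: the eigenvalue list is made up for illustration, and the function names are hypothetical. Only the percentages in `reported` are taken from the text.

```python
from itertools import accumulate

def kaiser_count(eigenvalues):
    """Number of factors retained under the eigenvalue > 1 guideline."""
    return sum(1 for ev in eigenvalues if ev > 1.0)

def cumulative_variance(eigenvalues):
    """Cumulative proportion of variance explained, in the order given."""
    total = sum(eigenvalues)
    return [running / total for running in accumulate(eigenvalues)]

# Made-up eigenvalues for a 6-variable correlation matrix (illustration only).
eigenvalues = [2.7, 1.4, 1.1, 0.4, 0.25, 0.15]
n_kaiser = kaiser_count(eigenvalues)
cum_var = cumulative_variance(eigenvalues)

# Percent of variance explained by 4 and 5 factors, as reported in the text.
reported = {"oboe": (69, 76), "horn": (75, 81)}
gains = {inst: five - four for inst, (four, five) in reported.items()}
```

Here `gains` reproduces the 6–7 percentage-point increase cited as part of the rationale for the 5-factor models.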

As summarized below in Table 5, factors in both models were named using up to three of the highest loading descriptors greater than .5; in cases where only two descriptors loaded greater than .5, only two descriptors were used to name the factor. The full models with complete loadings are reported in Appendix F.
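The naming rule can be sketched as a small Python function. This is an illustrative reading of the rule rather than the procedure actually used: the function name and the entries marked hypothetical are assumptions, and the sketch compares raw (not absolute) loadings.

```python
def name_factor(loadings, threshold=0.5, max_terms=3):
    """Return up to `max_terms` descriptors with loading > threshold, highest first."""
    strong = [(desc, value) for desc, value in loadings.items() if value > threshold]
    strong.sort(key=lambda pair: pair[1], reverse=True)
    return [desc for desc, _ in strong[:max_terms]]

# First three loadings are the oboe Factor 1 values from Table 5;
# the last two entries are hypothetical fillers for illustration.
oboe_factor1 = {
    "sustained/even": 0.75,
    "resonant/vibrant": 0.66,
    "pure/clear": 0.63,
    "descriptor_x": 0.52,   # hypothetical
    "descriptor_y": 0.20,   # hypothetical
}
name_three = name_factor(oboe_factor1)

# Fully hypothetical factor with only two loadings above .5.
two_strong = {"descriptor_a": 0.71, "descriptor_b": 0.58, "descriptor_c": 0.31}
name_two = name_factor(two_strong)
```

With these inputs, the first factor is named from three descriptors and the second from only two, mirroring the rule described above.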

Table 5. Summaries of 5-factor EFA models for oboe and horn; loadings of up to the top three descriptors are included.
            Oboe                           Horn
            Semantic Description  Loading  Semantic Description  Loading
Factor 1    sustained/even        .75      sustained/even        .82
            resonant/vibrant      .66      resonant/vibrant      .80
            pure/clear            .63      open                  .68
Factor 2    airy/breathy          .70      direct/loud           .68
Factor 3    soft/singing          .71      muted/veiled          .76
Factor 4    low/rumbling          .63      low/rumbling          .77
Factor 5    percussive            .82      woody                 .61

Although the two 5-factor models demonstrate some close similarities, it is evident that the latent structure differs: that is, between the two instruments, relationships among some of the dimensions differ. Factor 1 is highly similar between the two instruments. Factor 4, for both, is primarily driven by rumbling/low, but for the oboe is also influenced by woody, while the horn sees a secondary influence of raspy/grainy. Factor 5 in both models seems motivated by material; however, as one instrument is a woodwind and the other a brass instrument, the dimensions that load onto this factor vary. Factors 2 and 3 also reveal how the structures of the ratings differ from oboe to horn; for example, direct/loud and nasal/reedy load together for the horn, but not for the oboe, and airy/breathy and nasal/reedy load together for the oboe, but not for the horn, suggesting different relationships between these dimensions for each instrument. These differences are further illustrated in the exploratory models reported in the next section.

Exploratory Modeling

In order to understand which descriptors may be significant and most relevant to changes in register and dynamics—and to lay the groundwork for hypothesis testing in future experiments—two approaches were taken toward post hoc exploratory modeling of the data. Partial proportional odds (ordinal) regression models were used to suggest which descriptors may have significant relationships with dynamics and register. Additionally, random forest classifiers were used to obtain a measure of relative variable importance for the descriptors that can be compared across instruments and parameters.

Partial Proportional Odds Regression

Four partial proportional odds regression models (oboe/dynamics, horn/dynamics, oboe/register, and horn/register) were built using the ordinal package (Christensen, 2019) in R (R Core Team, 2020; version 4.0.0). The 20 semantic ratings were used to predict either the dynamic or the register of the stimuli. Note that while the causal relationship clearly runs in the opposite direction, such models allow us to identify which semantic ratings bear significant relations to dynamics/register. Because of the observable differences in results as seen in Figures 2–5 above, two separate models were constructed for oboe and two for horn, predicting register and dynamics from semantic ratings. Given the exploratory nature of the models and relatively small sample size, it was decided a priori not to make corrections for multiple tests.

All variance inflation factors (VIF) were less than 3, which did not suggest issues with multicollinearity. The Brant test (function brant, MASS package; Venables & Ripley, 2002) was used to test the parallel regression assumption. Violations of this assumption were observed in each of the four models among different subsets of descriptors. Consequently, I used the clmm2 function in the ordinal package to build partial proportional odds regression models in which variables satisfying the Brant test were specified as having ordinal effects, while those that did not were specified as having nominal effects. In the case of nominal effects, the function assessed significance between pairs of semantically adjacent levels (e.g., for dynamics, pairs included p-m and m-f). Because each participant rated multiple stimuli (repeated measures), the models also included a random intercept for participant.
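For readers less familiar with this model class, one common way to write a partial proportional odds model with a participant random intercept is sketched below. The notation is an assumption introduced here, not taken from the article, and sign conventions differ across software.

```latex
% Y_{ip}: ordinal outcome (dynamic level or register) of stimulus i rated by participant p
% \theta_j: threshold separating category j from category j+1
% \mathbf{x}_{ip}: ratings treated as ordinal (proportional odds) effects, one slope per rating
% \mathbf{z}_{ip}: ratings treated as nominal effects, one slope per rating and per threshold j
% u_p: random intercept for participant p
\operatorname{logit} P(Y_{ip} \le j)
  = \theta_j
  - \mathbf{x}_{ip}^{\top}\boldsymbol{\beta}
  - \mathbf{z}_{ip}^{\top}\boldsymbol{\gamma}_j
  - u_p,
\qquad u_p \sim \mathcal{N}(0, \sigma_u^2),
\qquad j = 1, \dots, J - 1 .
```

Under this reading, a descriptor that satisfies the parallel regression assumption contributes a single coefficient in \boldsymbol{\beta}, whereas a descriptor with a nominal effect contributes a separate coefficient at each threshold (e.g., p|m and m|f for dynamics).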

For the oboe, significant (p < .05) semantic descriptors predicting dynamics included airy/breathy, direct/loud, hollow, resonant/vibrant, soft/singing, sustained/even, and woody. For horn dynamics, significant descriptors included direct/loud, muted/veiled, shrill/noisy, soft/singing, and watery/fluid. For the oboe, significant descriptors related to register included muted/veiled, nasal/reedy, open, raspy/grainy, resonant/vibrant, rumbling/low, shrill/noisy, soft/singing, sparkling/brilliant, and woody. For horn register, significant descriptors included direct/loud, nasal/reedy, open, raspy/grainy, resonant/vibrant, rumbling/low, shrill/noisy, sparkling/brilliant, and watery/fluid. Tables 6–9 below summarize the results of the four models with respect to the significant semantic descriptors. The full models can be found in Appendix G.

Random Forest Classification

The data were also modeled via random forest classifiers using the caret package (Kuhn, 2020). This modeling process produces a measure of relative variable importance that is helpful in assessing the findings. Performance was measured from 5-fold cross-validation, where the data were split into five folds and each fold served in turn as the testing set for the model trained on the other 80% of the data. Thus, accuracy and kappa values reported here are averaged across all five folds of the cross-validation for each of the four types of models. The models were trained using the train function with method set to "rf," where the control parameter included 5-fold cross-validation.
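The splitting logic behind 5-fold cross-validation can be sketched in plain Python. This is only an illustration of the general procedure, not what caret does internally; the function name, the seed, and the item count of 100 are assumptions.

```python
import random

def k_fold_indices(n_items, k=5, seed=0):
    """Shuffle item indices and return k (train, test) splits."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        # Training set: every fold except the one held out for testing.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

# Hypothetical dataset of 100 rated stimuli.
splits = k_fold_indices(100, k=5)
fold_sizes = [(len(train), len(test)) for train, test in splits]
```

Each item appears in exactly one test fold, so a model evaluated this way is always scored on data it was not trained on; the reported accuracy and kappa are then averages over the five folds.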

For the dynamics models, where chance classification is at 33.3%, the oboe model's accuracy was 52.8%, κ = .29, while the horn's accuracy was 62.3%, κ = .44. For the register models, where chance classification is at 25%, the oboe model's accuracy was 62.2%, κ = .50, while the horn model's accuracy was 58.3%, κ = .44. The confusion matrices suggest that the middle registers of both instruments were most problematic to identify and were often confused; similarly, the mezzo mezzo dynamic was consistently misclassified more often than the fortissimo or pianissimo dynamics. Relative variable importance is illustrated in Figure 6 and reported quantitatively in Tables 6–9; relative variable importance measures are taken from each model trained on the entire data set using 5-fold cross-validation.
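To make the relationship between accuracy, chance level, and κ concrete, the following Python sketch computes both statistics from a confusion matrix. The matrix is hypothetical (it is not taken from the study), and the function name is an assumption.

```python
def accuracy_and_kappa(confusion):
    """Accuracy and Cohen's kappa; rows are true classes, columns are predictions."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance given the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical 3-class confusion matrix (e.g., three dynamic levels, 30 items each).
confusion = [
    [20, 8, 2],
    [9, 12, 9],
    [3, 7, 20],
]
accuracy, kappa = accuracy_and_kappa(confusion)
```

When every true class has the same number of items, the expected agreement equals 1/k (here 1/3), so κ expresses how far accuracy exceeds chance as a fraction of the maximum possible improvement over chance.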

[Figure 6 image: four panels titled Dynamics, Oboe; Dynamics, Horn; Register, Oboe; and Register, Horn. All panels share the same x-axis, labelled Importance, spanning 0 to slightly over 100. Each panel lists the descriptors on the y-axis in a different order; for example, the first panel lists Direct.Loud, Airy.Breathy, Soft.Singing, and 17 others.]

Figure 6. Relative variable importance from random forest classifiers predicting register and dynamics for oboe and horn from semantic rating data.

Tables 6–9. Relative variable importances from the random forest classifiers and significant (p < .05) descriptors in predicting register and dynamics for oboe and for horn from partial proportional odds regressions.

Table 6. Dynamics, Oboe
                     Random Forest        Partial Proportional Odds Regression
Semantic Descriptor  Variable Importance  Significance  Effect Type
direct/loud          100.0                              nominal: p|m, m|f
airy/breathy         75.1                               nominal: p|m, m|f

Note. Significance levels for the proportional odds regressions can be found in Appendix G along with the full models.

Table 7. Dynamics, Horn
                     Random Forest        Partial Proportional Odds Regression
Semantic Descriptor  Variable Importance  Significance  Effect Type
direct/loud          100.0                              nominal: p|m, m|f

Table 8. Register, Oboe
                     Random Forest        Partial Proportional Odds Regression
Semantic Descriptor  Variable Importance  Significance  Effect Type
soft/singing         49.8                               nominal: M2|H
woody                35.9                               nominal: M1|M2, M2|H
open                 31.8                               nominal: M2|H
resonant/vibrant     24.6                               nominal: M1|M2

Table 9. Register, Horn
                     Random Forest        Partial Proportional Odds Regression
Semantic Descriptor  Variable Importance  Significance  Effect Type

Comparing ratings of recorded stimuli and auditory imagery

Results of the current study can be compared to results from Reymore (2021) in order to explore how semantic characterizations of imagined "typical" instrument sounds compare to characterizations of recorded instrument sounds. To examine this relationship, I compared the aggregate average ratings in the current study (mean of all 36 pitches for a given instrument on each dimension) to values reported in Reymore (2021) for the oboe and horn.

For the majority of the dimensions, the means for the imagined stimuli tend to be slightly higher than those of the recorded stimuli. For the oboe, the average of the absolute values of the differences in mean ratings between the auditory imagery task and recorded stimuli task was 0.69. That is, on average, mean ratings for each of the dimensions using imagined stimuli and using recorded stimuli were less than one point apart relative to a seven-point scale. The maximum absolute difference (i.e., absolute value of the difference) between ratings of auditory imagery and recorded stimuli was 1.52 points and the minimum absolute difference was 0.06 points.

For the horn, the average absolute difference in mean ratings between the auditory imagery task and recorded stimuli task was 1.10, higher than the comparable value for the oboe. The maximum absolute difference was 2.47 points and the minimum was 0.02 points. Dimensions with the highest absolute differences between imagined and recorded stimuli means include soft/singing (2.47), resonant/vibrant (2.28), and sustained/even (1.97).
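The comparison statistics used in this section (mean, maximum, and minimum absolute difference between per-dimension means) can be sketched in Python as follows. The rating values below are hypothetical placeholders, not data from either study, and the function name is an assumption.

```python
def abs_diff_summary(imagined, recorded):
    """Summarize absolute differences between two sets of per-dimension means."""
    diffs = {dim: abs(imagined[dim] - recorded[dim]) for dim in imagined}
    values = list(diffs.values())
    return {
        "mean": sum(values) / len(values),
        "max": max(values),
        "min": min(values),
        "largest": max(diffs, key=diffs.get),  # dimension with the biggest gap
    }

# Hypothetical mean ratings on a seven-point scale (illustration only).
imagined = {"soft/singing": 5.0, "pure/clear": 4.5, "woody": 3.0, "percussive": 1.5}
recorded = {"soft/singing": 3.0, "pure/clear": 3.5, "woody": 3.25, "percussive": 1.25}
summary = abs_diff_summary(imagined, recorded)
```

Taking absolute values before averaging keeps positive and negative differences from cancelling, which is why the mean can be read as a typical distance between the two sets of ratings.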

Notably, the differences between imagined and recorded means for the French horn are on average greater than for the oboe. One factor that might be related to this difference is the variance in timbre-implicated parameters afforded by each instrument. The oboe has a small pitch range in comparison to most orchestral instruments, spanning only about two and a half octaves. The French horn has a relatively large pitch range, spanning more than four octaves. Accordingly, although stimuli were sampled from four nearly equally spaced locations in each instrument's range, the range of pitches sampled for the oboe was two octaves plus three semitones while the range of pitches sampled for the French horn was three octaves plus four semitones. Similarly, the dynamic range of the oboe is very small and is limited by the physics of the instrument and reed, with its incredibly narrow opening, while the dynamic range of the French horn is especially impressive—a horn may play a quiet, pure, golden note or an ear-shattering roar. In Reymore and Huron (2020), participants were asked to imagine a typical sound that an instrument would make rather than an unusual one. Values from the current study represent an average over 12 register/dynamic combinations, and some of the sounds presented to the participants in the current study likely were far from what participants in previous studies deemed to be a typical sound (for example, the lowest ff notes in the horn).

In general, rating differences between imagined and recorded stimuli for both instruments are likely also related to the differences between imagined typical sounds and real instantiations of sounds, which are shaped by the recording equipment, the recording space, and other factors including distance from the mic and the individual tone quality of the musician making the recordings. In particular, while both instruments were recorded with high-quality equipment, the recordings took place in a small room with little resonance, and the musicians were necessarily somewhat close to the mic, given the dimensions of the room. In considering the observed differences in means in light of the recording circumstances, it seems likely that most participants in the initial study imagined the sound of each instrument as they might have heard it live or recorded in a concert hall or other resonant space. Especially if the instrument was not their native instrument, they likely imagined the sound from a distance rather than in close proximity. The close mic in the recordings from this study picked up on slight imprecisions in the sounds that would be otherwise masked in a resonant space; this may have contributed to the lower ratings of pure/clear in the current study on both instruments: the differences in pure/clear between ratings from auditory imagery and recordings were among the highest for both instruments. For the horn, resonant/vibrant and ringing/long decay were both rated particularly low from the recordings as compared to the imagined tones of the previous study; both dimensions are ostensibly related to the acoustics of the recording space, particularly in conjunction with the big sound of the horn.

Interestingly, the data from imagined stimuli predicted much higher ratings on soft/singing for both instruments than were collected in the current study. Although participants in the initial experiment were asked to imagine a sustained single tone, the qualities included in the soft/singing dimension (soft, smooth, singing, voice-like, sweet, gentle, calm) may have been associated with the instrument not only by virtue of a particular tone that was imagined, but also with their melodic characters and with the types of sweet (oboe) or soaring (horn) melodies that are characteristic of their orchestral personalities. Some dimensions may be a part of an instrument's characteristic or stereotypical timbral qualia but may not manifest in two-second, single note clips. If this is the case, some dimensions (such as soft/singing) may have been apparent with the use of imagined prototype stimuli but not experienced as strongly by participants in the brief recordings. However, despite some of the differences in means between the two studies (the largest was 2.47 points), the overall agreement in the ratings between the imagined and recorded instruments is quite strong, especially for the oboe.


Comparative Observations

For the oboe, register appeared to consistently provide the most variance along the timbre qualia dimensions; relatively less variance occurred with dynamic level change on most dimensions. Relative to chance performance, the random forest classifiers were much more successful in predicting register than predicting dynamics, even more so for the oboe than for the horn. In general, dynamic level seems to play a more important role in timbre qualia for the French horn than for the oboe. The differences in average ratings as effects of dynamics are both more consistent and more pronounced with the French horn—that is, there are wider ranges of qualia ratings as products of dynamic change, and several clear trends are apparent across dynamics that are mostly consistent among registers.

Tables 10–11 summarize the results of the partial proportional odds regressions and random forest classifiers. Although the nuances of these results may be very important in some contexts, considering overall trends can help guide future predictions.

Tables 10–11. Overlapping and unique significant predictors for register and dynamics in the oboe and horn; results from exploratory modeling. The third and fourth columns indicate whether or not each significant descriptor from the partial proportional odds regression was among the top six most important variables in the corresponding classifier.

Table 10. Register.
                  Partial Proportional Odds  Random Forest, Top 6 RVI
Instrument        Significant Descriptors    Oboe  Horn
Both instruments  shrill/noisy               X     X
Oboe only         woody                      X
Horn only         direct/loud                      X

Table 11. Dynamics.
                  Partial Proportional Odds  Random Forest, Top 6 RVI
Instrument        Significant Descriptors    Oboe  Horn
Both instruments  direct/loud                X     X
Oboe only         airy/breathy               X
Horn only         muted/veiled                     X

Regarding register (Table 10), the two instruments share eight of the same significant predictors while retaining two unique predictors each. However, which descriptors are significant in the dynamics models (Table 11) varies much more in comparison to the register models: the two instruments share only two significant predictors while the horn is associated with an additional three unique predictors and the oboe with an additional five. It is notable that despite the relatively large number of predictors and small sample size, each model still contained 5–11 significant predictors, suggesting that changes in timbre qualia in response to both register and dynamic are somewhat complex: these parameters appear to motivate multidimensional semantic change in a variety of directions and strengths. Furthermore, because the models for register contained ratings of pitches at various dynamic levels, and vice versa, they provide an overall picture of each parameter individually and do not describe potential significant interactions (for example, the mean ratings for shrill/noisy in the horn as a function of register behave differently at each of the three dynamic levels).

While there are some differences in the results of the two modeling approaches, there is substantial overlap: the top 4–6 descriptors in relative variable importance for the random forest classifiers are also significant in the corresponding partial proportional odds regression (see Tables 6–9). These shared descriptors can guide future experiments. The predictive descriptors that are unique to each instrument suggest that it may be mainly the most obvious candidates (e.g., direct/loud and soft/singing for dynamics; rumbling/low, shrill/noisy, and sparkling/brilliant for register) that have similar effects across instruments. However, the unique descriptors are arguably the most interesting from acoustical, compositional, and music-analytical perspectives.

Future experiments can use these findings as the basis for hypotheses relating semantics to register and/or dynamics, since the findings indicate which semantic dimensions predictions should focus on. Examining a reduced set of descriptors based on these results makes rating tasks more tractable, either by shortening the experiment or by allowing additional stimuli to be rated. When testing hypotheses about individual descriptors, reducing the number of descriptors also reduces the number of tests that must be run on the same dataset.

Another consideration in future experimental design relates to the inclusion of words in the descriptors which are clearly semantically related to pitch height and loudness, including "shrill" in shrill, harsh, noisy; "low" in rumbling, booming, low, deep, thick, fat, heavy; "loud" in direct, projecting, loud, aggressive, commanding, assertive, powerful; and "soft" in soft, smooth, singing, voice-like, sweet, gentle, calm. One option would be to remove these four words in order to determine the extent to which they are explicitly driving associations.

Note-to-note Idiosyncrasies

Average ratings on individual notes from within the same register sometimes varied significantly (see Appendix C). This is likely due in part to timbral idiosyncrasies that are inherent to the instruments. It is possible that some variability could be attributed to the particular instantiation of the note that was played during the recording. The musicians were permitted to re-record each note as many times as they wanted and were encouraged to do so if they were not happy with the quality of a note. Even so, one particular recording of a note might be slightly louder than the other notes in its register set or vary on another factor, such as articulation or vibrato, which could affect timbre qualia judgments. In future studies, providing several versions of each stimulus note, with one version presented at random to each participant, may help reduce the variability introduced by using a single instantiation of a note; however, note-to-note variations will likely still be present. Given the idiosyncrasies observed in these results, the procedure of averaging three notes to represent a register, rather than one, was a productive one, illuminating several trends among the interactions of timbre qualia dimensions with register and dynamics, and is recommended in future research.

Within-instrument timbral variability

The level of detail to which the timbre qualia of an instrument might be mapped out is potentially enormous when considering all the parameters that can affect timbre. At some point, the resources and time invested in further refinement may yield diminishing returns, depending on the purpose of the research. Even at the level of detail under consideration here, we can see that while some dimensions can vary somewhat widely within a single instrument, other dimensions remain fairly consistent. For example, in the French horn, the largest range of mean ratings for each register-dynamic combination is 5.24, on the rumbling/low dimension. Specifically, of the 12 conditions (4 registers * 3 dynamics), ff in the low register was rated the highest on rumbling/low at 6.50 and pp in the high range was rated the lowest at 1.25. With this wide range of levels for rumbling/low, the nuanced representation of this dimension resulting from this study could add important information to musical analysis, both orchestrally and in solo or chamber settings. At the other extreme, the smallest range of mean ratings observed here is in the oboe on the dimension percussive, with a difference between the maximum rating (1.52 for mm in the low range) and the minimum rating (1.08 for pp in the low range) of only 0.44. Such a difference is unlikely to be especially useful or relevant to analysis involving the oboe. Further support for this interpretation comes from the observation that percussive was not significant in any of the regression models and was consistently ranked lowest in all random forest models in terms of relative variable importance. While the dimension of percussive might prove crucial in later assessments of timbral differences resulting from variation in articulation styles, it would not be germane to an analysis accounting for timbral interactions with register and/or dynamics.

These considerations lead to the question of what constitutes a meaningful difference in variability on a given dimension for a given instrument. Thus, it is informative to look at which dimensions exhibit the highest within-instrument variability for both instruments. The five dimensions for each instrument with the highest differences between maximum and minimum means are listed below in Tables 12–13. However, while these tables and the results from the exploratory modeling can provide some objective criteria for answering the question of meaningful variability in general, for music analysis, meaningful variability can only be assessed in the context of a particular piece of music. Consider a piece or orchestral solo for French horn that took place entirely in the middle register of the instrument; variation on rumbling/low would not prove as relevant here as it would for a piece that made use of all registers of the instrument.

Tables 12–13. Highest within-instrument variabilities for the oboe and horn.

Table 12. Oboe.
Dimension            Condition, maximum mean  Max mean  Condition, minimum mean  Min mean  Range
shrill/noisy         high, ff                 5.82      low, pp                  1.83      3.99
airy/breathy         high, pp                 5.99      low, ff                  2.54      3.45
direct/loud          low, ff                  4.35      high, pp                 1.92      3.28
sparkling/brilliant  high, ff                 5.12      low, pp                  2.20      2.92
rumbling/low         low, ff                  3.91      high, pp                 1.00      2.91

Table 13. Horn.
Dimension            Condition, maximum mean  Max mean  Condition, minimum mean  Min mean  Range
rumbling/low         low, ff                  6.50      high, pp                 1.25      5.25
direct/loud          high, ff                 6.52      middle 2, pp             1.65      4.87
shrill/noisy         high, ff                 6.17      middle 2, pp             1.52      4.65
raspy/grainy         low, ff                  6.04      high, mm                 1.97      4.07
soft/singing         low, ff                  4.42      high, pp                 1.09      3.32
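The range measure reported in these tables can be sketched in Python as follows. The maximum and minimum values for each dimension are those listed for the horn in Table 13; the third value in each dictionary is a hypothetical intermediate condition added only so that the search over conditions is not trivial. The function name is an assumption.

```python
def summarize_range(condition_means):
    """Return (max condition, max mean, min condition, min mean, range)."""
    max_cond = max(condition_means, key=condition_means.get)
    min_cond = min(condition_means, key=condition_means.get)
    hi, lo = condition_means[max_cond], condition_means[min_cond]
    return max_cond, hi, min_cond, lo, round(hi - lo, 2)

# Keys are (register, dynamic) conditions; values are mean ratings.
horn = {
    "shrill/noisy": {("high", "ff"): 6.17, ("middle 2", "pp"): 1.52,
                     ("low", "mm"): 3.00},       # last value is hypothetical
    "rumbling/low": {("low", "ff"): 6.50, ("high", "pp"): 1.25,
                     ("middle 1", "mm"): 3.40},  # last value is hypothetical
    "direct/loud": {("high", "ff"): 6.52, ("middle 2", "pp"): 1.65,
                    ("low", "mm"): 4.10},        # last value is hypothetical
}
summaries = {dim: summarize_range(means) for dim, means in horn.items()}
# Rank dimensions from widest to narrowest range of condition means.
ranked = sorted(summaries, key=lambda dim: summaries[dim][4], reverse=True)
```

Ranking by range in this way yields the ordering used in the tables, with the widest-ranging dimension first.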

Based on the ratings results reported here, we might frame the dimensions in Tables 12 and 13 as those for which each instrument has the most timbral flexibility, at least in relation to pitch and dynamics with stimuli played in a traditional manner (that is, excluding extended techniques). Both oboe and French horn shared three out of their five highest within-instrument variabilities: direct/loud, shrill/noisy, and rumbling/low. As these dimensions are semantically intertwined with dynamics and pitch, it would be reasonable to extrapolate these trends and predict that all instruments that can vary in intensity would show at least some variance on direct/loud while all instruments that vary in pitch would show at least some variance on shrill/noisy and/or rumbling/low. In general, the results suggest that the French horn has a higher capacity for timbral flexibility as generated through pitch and dynamics than does the oboe, a conclusion that accords with casual observation. The main source of timbral flexibility for the oboe may be generated through the manipulation of other variables not assessed here and their interactions. For example, a primary source of timbral flexibility for the oboe may involve vibrato, which is typically not a source of timbral variation for traditional horn sounds and was not manipulated for either instrument in this experiment.


This paper reported the results of an experiment in which participants rated two instruments on 12 combinations of register and dynamics using the 20-dimensional timbre qualia model from Reymore and Huron (2020). Although trends emerged across both dynamics and registers, the results illustrate the complexity of within-instrument timbral variability. While some trends were approximately linear, others were nonlinear, often where the highest and lowest registers stood apart from the middle registers. Some timbre qualia dimensions display interactions between register and dynamics—an example is that pp notes on the oboe often diverged from notes played at mm and ff, particularly in the high register. Finally, while some trends appeared similar between the oboe and horn, others seem to be unique to each instrument. Further research is needed to determine to what extent these individual trends may generalize to other instruments in the same family (woodwind or more specifically double reed for oboe, and brass for horn).

Summary, Register

Based on exploratory modeling, the most clearly relevant dimensions for register in both instruments include rumbling/low, shrill/noisy, and sparkling/brilliant, though several other descriptors are implicated through the regressions, the random forest classifiers, or both (see Tables 6–9). Uniquely for the oboe (that is, not for the horn), significant relationships between register and woody and muted/veiled were demonstrated. In the horn, the nominal effect for muted/veiled between the Middle 2 and High registers was close to significance (p = .054), suggesting that this may be a subtle effect that could be detected in a larger sample. Uniquely for the horn, direct/loud and watery/fluid were related to register.

Summary, Dynamics

The most clearly relevant dimensions for dynamics include direct/loud and soft/singing; intriguingly, the differences between the two instruments appear to be greater for the effects of dynamics on semantic dimensions than for the effects of register. That is, more dimensions were related uniquely to either the oboe or the horn rather than being shared. In the oboe, airy/breathy and resonant/vibrant were both significant and ranked highly in relative importance, while for the horn, this was true of muted/veiled and shrill/noisy. While watery/fluid was significant for the horn, it did not rank highly in variable importance. For the oboe, watery/fluid neither was significant nor ranked highly, though it approached significance (p = .08), suggesting again that a general relationship may hold in a larger sample, albeit a weak one.

Recommendations for Future Research

The results suggest several considerations for future timbre semantics research, particularly research that aims to understand timbre semantic variability in response to changing musical parameters. In sum:

  • In relation to timbre semantics, dynamics may interact with register. For example, for the oboe, airy/breathy increases from the Middle 2 to the High registers at a piano dynamic but decreases at a forte dynamic. Brant tests during exploratory analysis confirmed that while some descriptors can be treated as ordinal, others should be considered nominal.
  • Trends between musical parameters and semantic ratings may be non-linear. Future studies will need to account for this in relation to hypothesis testing and model building.
  • The range of semantic affordances may vary from instrument to instrument in relation to a given parameter—here, the horn demonstrated greater semantic flexibility for both register and dynamic as compared to the oboe. The oboe demonstrated relatively limited flexibility in response to dynamics.
  • Note-to-note idiosyncrasies in timbre may significantly impact semantic judgements. This should be considered when selecting stimuli, particularly when seeking to represent register.
  • The type of relationship between a given semantic descriptor and either register or dynamics may vary from instrument to instrument.
  • Similarly, the latent structure of rating data may vary from instrument to instrument.

These exploratory results mark an initial step forward in understanding within-instrument timbral flexibility, an issue that has not been previously addressed in a systematic way. This question is relevant to understanding how within-instrument timbres contribute to musical expression and meaning. For example, results of this study could be applied in analysis of unaccompanied works for oboe or horn. Sandell and Chronopoulos (1996) proposed the term "macrotimbre" to describe the formal unity of timbres associated with a single source, inclusive, for example, of variation in timbre across an instrument's range. The results illustrate the timbral possibilities afforded by these instruments as a result of manipulating dynamics and register, adding dimensionality to our understanding of their macrotimbres as presented in Reymore (2021).

Future research that maps additional instruments and manipulates different parameters would deepen our understanding of timbral similarities across instruments. Soden (2020) introduces the term "metatimbre" to describe any collection of related timbres. Metatimbres include, but are not limited to, collections of timbres produced by a single instrument, so that one could speak of, for example, the metatimbre of the clarinet, a concept that overlaps with Sandell and Chronopoulos's (1996) concept of macrotimbre. However, Soden's concept of metatimbre also includes collections of timbres across instruments with similar perceptual qualities or qualia, and thus seeks to qualify relationships among timbres of different instruments. For example, Rimsky-Korsakov noted timbral similarities among the viola, the middle register of the bassoon, and the low register of the clarinet, as well as between the bassoon and horn when played piano or mezzo-forte (as cited in Soden, 2020). The current work provides a method for quantifying timbral similarities such as these that are specific to register-dynamic combinations. For example, upon collecting data for the viola, bassoon, and clarinet, we would expect to see more similarity among the registers specified by Rimsky-Korsakov than among other registers of these instruments or of other instruments, such as the oboe. Techniques such as hierarchical clustering or rank-ordering Euclidean distances would be appropriate for mapping metatimbral spaces across instruments. This type of information may help predict perceptual effects of orchestration, such as blend between instruments (as described in McAdams, Goodchild, & Soden, 2022). Such results would also likely be useful in the pedagogy of orchestration and composition and may have applications in music-theoretical analysis.
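
As a minimal sketch of the rank-ordering approach mentioned above, the following code computes Euclidean distances between semantic profiles and sorts the pairs from most to least similar. All rating values here are hypothetical placeholders invented for illustration (no such data have been collected), the profiles use only three descriptors rather than the 20 of the full model, and the function name `ranked_distances` is an assumption.

```python
import math
from itertools import combinations

# HYPOTHETICAL mean ratings (1-7 scale) on three descriptors, in the order
# [woody, shrill/noisy, soft/singing]. These numbers are placeholders for
# illustration only and are not results from any study.
profiles = {
    "viola (middle)": [4.0, 3.0, 5.0],
    "bassoon (middle)": [4.2, 3.1, 4.8],
    "clarinet (low)": [4.1, 2.8, 5.1],
    "oboe (high)": [2.0, 5.5, 2.5],
}


def ranked_distances(profiles):
    """Return (distance, name_a, name_b) tuples sorted from closest to farthest."""
    pairs = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(profiles.items(), 2):
        pairs.append((math.dist(vec_a, vec_b), name_a, name_b))
    return sorted(pairs)


for distance, name_a, name_b in ranked_distances(profiles):
    print(f"{distance:.2f}  {name_a}  <->  {name_b}")
```

With real data, a prediction such as Rimsky-Korsakov's would be supported if the register-dynamic combinations he names appeared near the top of this ranking. The same distance matrix could also serve as input to a hierarchical clustering routine.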

Future research using these methodologies might also examine the effect of musical experience on the cognitive linguistics of timbre by collecting ratings from both musicians and non-musicians or from participants from non-Western cultures. The original model (Reymore & Huron, 2020) was derived from studies involving Western, classically trained musicians. By asking people with different types of musical experience to apply the terms of the model to instrument sounds, we can begin to develop a sense of which linguistic concepts are more or less common across groups or are culturally constrained.

The subtle timbral nuances, tendencies, and idiosyncrasies of a given musical instrument are critical to the experience of a musician who has expertise with that instrument and, as evidenced by the results of this study, are often apparent to listeners. Yet the timbral affordances of musical instruments, even in response to basic parameters such as dynamics and register, are rarely discussed. The study reported here demonstrates not only that within-instrument timbral variability based on dynamics and register is apparent to listeners, but also that such variability may be quite complex and that interactions among parameters may play an important role. Results can inform predictions for future work addressing which dimensions of timbre qualia are unique to specific instruments or groups of instruments and which are common across many instruments, as well as the types of relationships, both linear and non-linear, between timbre qualia and playing techniques.


Many thanks to musicians Leland Greene and Benjamin Hottensmith for recording the stimuli and to Marcel Montrey for his statistical advice.


  1. This work is part of the author’s published dissertation. The author is now at the School of Music, Dance and Theater, Arizona State University. Correspondence concerning this article should be addressed to School of Music, Dance and Theater, Arizona State University, 50 E Gammage Pkwy, Tempe, AZ 85251.
  2. "Mezzo mezzo," while not a standard dynamic term, is sometimes used colloquially by musicians to communicate that the dynamic level should be as middle-of-the-road as possible. This description, with its definition, should be accessible even to musicians who have not previously heard or used the term. Performers recording the stimuli were encouraged to ask for clarification about any of the instructions.


  • Allen, E. J., & Oxenham, A. J. (2014). Symmetric interactions and interference between pitch and timbre. The Journal of the Acoustical Society of America, 135(3), 1371–1379. https://doi.org/10.1121/1.4863269
  • Bernays, M., & Traube, C. (2014). Investigating pianists' individuality in the performance of five timbral nuances through patterns of articulation, touch, dynamics, and pedaling. Frontiers in Psychology, 5:157. https://doi.org/10.3389/fpsyg.2014.00157
  • Christensen, R. H. B. (2019). ordinal - Regression Models for Ordinal Data. R package version 2019.12-10. https://CRAN.R-project.org/package=ordinal.
  • Cousineau, M., Carcagno, S., Demany, L., & Pressnitzer, D. (2014). What is a melody? On the relationship between pitch and brightness of timbre. Frontiers in Systems Neuroscience, 7:127. https://doi.org/10.3389/fnsys.2013.00127
  • Drabkin, W. (2001). Register. In Grove Music Online. https://doi.org/10.1093/gmo/9781561592630.article.23072
  • Fabiani, M., & Friberg, A. (2011). Influence of pitch, loudness, and timbre on the perception of instrument dynamics. The Journal of the Acoustical Society of America, 130(4), EL193–EL199. https://doi.org/10.1121/1.3633687
  • Handel, S., & Erickson, M. L. (2001). A rule of thumb: The bandwidth for timbre invariance is one octave. Music Perception: An Interdisciplinary Journal, 19(1), 121–126. https://doi.org/10.1525/mp.2001.19.1.121
  • Jolliffe, I. T. (2002). Principal components analysis (2nd ed.). Springer.
  • Krumhansl, C. L., & Iverson, P. (1992). Perceptual interactions between musical pitch and timbre. Journal of Experimental Psychology: Human Perception and Performance, 18(3), 739–751. https://doi.org/10.1037/0096-1523.18.3.739
  • Kuhn, M. (2020). caret: Classification and Regression Training. R package version 6.0-86. https://CRAN.R-project.org/package=caret
  • Marozeau, J., & de Cheveigné, A. (2007). The effect of fundamental frequency on the brightness dimension of timbre. The Journal of the Acoustical Society of America, 121(1), 383–387. https://doi.org/10.1121/1.2384910
  • Marozeau, J., de Cheveigné, A., McAdams, S., & Winsberg, S. (2003). The dependency of timbre on fundamental frequency. The Journal of the Acoustical Society of America, 114, 2946. https://doi.org/10.1121/1.1618239
  • McAdams, S. (2019). Timbre as a Structuring Force in Music. In Siedenburg, K., Saitis, C., McAdams, S., Popper, A.N., Fay, R.R. (eds.), Timbre: Acoustics, Perception, Cognition (pp. 211–243). Springer International Publishing. https://doi.org/10.1007/978-3-030-14832-4_8
  • McAdams, S., & Goodchild, M. (2017). Musical Structure: Sound and Timbre. In Ashley, R. & Timmers, R. (eds.), The Routledge Companion to Music Cognition (pp. 129–139). Routledge. https://doi.org/10.4324/9781315194738-11
  • McAdams, S., Goodchild, M., & Soden, K. (2022). A taxonomy of orchestral grouping effects derived from principles of auditory perception. Music Theory Online, 28(3).
  • Melara, R. D., & Marks, L. E. (1990). Interaction among auditory dimensions: Timbre, pitch, and loudness. Perception & Psychophysics, 48(2), 169–178. https://doi.org/10.3758/BF03207084
  • Page, J.K., Gourlay, K.A., Blench, R. & Shackleton, N. (2001). Clarinet. Grove Music Online. https://doi.org/10.1093/gmo/9781561592630.article.52768
  • Pratt, R. L., & Bowsher, J. M. (1978). The subjective assessment of trombone quality. Journal of Sound and Vibration, 57(3), 425–435. https://doi.org/10.1016/0022-460X(78)90321-8
  • R Core Team. (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
  • Redgate, C. (2018). Composing Using the Altissimo Range of the Oboe. (accessed November 11, 2020)
  • Reymore, L. (2021). Characterizing prototypical musical instrument timbres with Timbre Trait Profiles. Musicae Scientiae, 26(3), 648–674. https://doi.org/10.1177/10298649211001523
  • Reymore, L., & Hansen, N.C. (2020). A theory of instrument-specific absolute pitch. Frontiers in Psychology, 11:560877. https://doi.org/10.3389/fpsyg.2020.560877
  • Reymore, L. & Huron, D. (2020). Using auditory imagery tasks to map the cognitive linguistic dimensions of musical instrument timbre qualia. Psychomusicology, 30(3), 124–144. https://doi.org/10.1037/pmu0000263
  • Rimsky-Korsakov, N. (1913). Principles of orchestration, with musical examples drawn from his own works. (M. Shteĭnberg, Ed., E. Agate, Trans.). New York, NY: Dover Publications.
  • Sandell, G. J., & Chronopoulos, M. (1996). Identifying musical instruments from multiple versus single notes. The Journal of the Acoustical Society of America, 100, 2752. https://doi.org/10.1121/1.416900
  • Siedenburg, K., & McAdams, S. (2017). Four distinctions for the auditory "wastebasket" of timbre. Frontiers in Psychology, 8:1747. https://doi.org/10.3389/fpsyg.2017.01747
  • Snow, D. B. (2006). A Conductor's Guide to Wind Instrument Deficiencies: A Practical Addendum to the Undergraduate Conducting Text. [DMA dissertation, The University of Southern Mississippi].
  • Soden, K. (2020). Orchestrational combinations and transformations in operatic and symphonic music. [Doctoral dissertation, McGill University].
  • Steele, K. M., & Williams, A. K. (2006). Is the bandwidth for timbre invariance only one octave? Music Perception, 23(3), 215–220. https://doi.org/10.1525/mp.2006.23.3.215
  • Traube, C. (2004). An interdisciplinary study of the timbre of the classical guitar. [Doctoral dissertation, McGill University].
  • Venables, W. N., & Ripley, B. D. (2002). Modern Applied Statistics with S (4th ed.). Springer. ISBN 0-387-95457-0. https://doi.org/10.1007/978-0-387-21706-2
  • Vienna Symphonic Library. (n.d.) "Oboe – Sound Characteristics." https://www.vsl.co.at/en/Oboe/Sound_Characteristics
  • Wallmark, Z. (2019). A corpus analysis of timbre semantics in orchestration treatises. Psychology of Music, 47(4), 585–605. https://doi.org/10.1177/0305735618768102
  • Weihs, C., Reuter, C., & Ligges, U. (2005). Register classification by timbre. In Classification—the Ubiquitous Challenge (pp. 624–631). Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-28084-7_74


Appendix A, Stimuli Selection

At the finest possible level of detail, participant judgments would be collected at every combination of pitch and loudness on an instrument. However, this would result in an intractable number of ratings for a given instrument. As an example, the range of the B♭ clarinet extends from concert D3 to B♭6, so the clarinet is able to play 45 different chromatic pitches. Dynamics lie on a continuous spectrum; however, it is conventional to conceptualize dynamic levels as categorical. The commonly used dynamic levels include pianissimo (pp), piano (p), mezzo piano (mp), mezzo forte (mf), forte (f), and fortissimo (ff). Additional p's and f's are occasionally added to the pianissimo and fortissimo, respectively, so that the composer can call for extreme dynamics. Movement between these categorical levels can be executed as an abrupt change or as a gradual increase (crescendo) or decrease (decrescendo) in loudness. Even using only standard categorical dynamic levels, a complete rating task would include six levels of loudness. Given the 45 notes of the clarinet and the six levels of loudness, a complete mapping would require 270 stimuli (45 notes at each of the six levels). For each stimulus, participants must make 20 ratings, so 270 stimuli would require 5,400 ratings. This would make for an extremely long study, and the results would map only a single instrument.
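
The arithmetic above can be reproduced with a short sketch. It assumes the common MIDI numbering convention in which C4 = 60 (an assumption introduced here for counting pitches, not something specified in the study); the helper `midi_number` is hypothetical.

```python
# Semitone offsets of the natural notes above C.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def midi_number(letter, octave, accidental=0):
    """MIDI note number with C4 = 60; accidental is -1 for flat, +1 for sharp."""
    return 12 * (octave + 1) + NOTE_OFFSETS[letter] + accidental


lowest = midi_number("D", 3)        # concert D3
highest = midi_number("B", 6, -1)   # concert B-flat 6
n_pitches = highest - lowest + 1    # inclusive chromatic count

dynamics = ["pp", "p", "mp", "mf", "f", "ff"]  # six standard levels
n_descriptors = 20                             # ratings made per stimulus

n_stimuli = n_pitches * len(dynamics)
n_ratings = n_stimuli * n_descriptors

print(n_pitches, n_stimuli, n_ratings)
```

Changing the range or the list of dynamic levels shows how quickly a full mapping grows for other instruments.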

Appendix B, Rating Task Instructions

"For each question, you will be given a sound clip. You will be asked to listen to the sound clip. Then, you will be asked to rate how well each of the descriptions listed below describes the sound you just heard. You may listen to the sound clip as many times as you'd like while rating the sound. A rating of 1 indicates that the word or group of words is not at all appropriate to describe the sound of the instrument, while a rating of 7 indicates that at least one of the presented words is very appropriate.

In this experiment, you will rate the sounds of various musical instruments according to several descriptive categories. Most of these categories contain multiple terms. When rating an instrument on a category containing multiple terms, base your rating on the term you think best describes that instrument. You may feel that other terms in the category do not apply strongly to the sound in question, but this should not be factored into your rating.

For example, one of the categories is "brassy, metallic." There may be an instrument that you consider to be exceptionally metallic, but not particularly brassy. This instrument should still receive a high rating in this category, based on your perception of how metallic-sounding it is. Do not lower your rating because you do not consider the instrument to be brassy. Similarly, an instrument you would consider to be very brassy but not metallic should also receive a high rating in the "brassy, metallic" category, based on its brassiness."

Appendix C, Mean Ratings by Stimulus (per note)

Table C1. Average ratings per stimulus, oboe.
Pitch Dynamic airy/breathy brassy/metallic direct/loud focused/compact hollow muted/veiled nasal/reedy open percussive pure/clear
Pitch Dynamic raspy/grainy resonant/vibrant ringing/long decay rumbling/low shrill/noisy soft/singing sparkling/brilliant sustained/even watery/fluid woody
Table C2. Average ratings per stimulus, French horn.
Pitch Dynamic airy/breathy brassy/metallic direct/loud focused/compact hollow muted/veiled nasal/reedy open percussive pure/clear
Pitch Dynamic raspy/grainy resonant/vibrant ringing/long decay rumbling/low shrill/noisy soft/singing sparkling/brilliant sustained/even watery/fluid woody

Appendix D, Rating means by register and dynamic

Table D1. Rating means by register and dynamic, oboe.
Register            Low   Middle 1  Middle 2  High  Low   Middle 1  Middle 2  High  Low   Middle 1  Middle 2  High
Dynamic             pp    pp        pp        pp    mm    mm        mm        mm    ff    ff        ff        ff
ringing/long decay  2.60  2.92      3.31      2.72  3.28  3.75      3.51      3.70  2.88  3.60      3.43      3.76
Table D2. Rating means by register and dynamic, French horn.
Register            Low   Middle 1  Middle 2  High  Low   Middle 1  Middle 2  High  Low   Middle 1  Middle 2  High
Dynamic             pp    pp        pp        pp    mm    mm        mm        mm    ff    ff        ff        ff
ringing/long decay  2.61  3.15      3.95      3.00  2.27  2.44      3.76      3.67  2.67  2.59      2.77      3.81

Appendix E, Radar plots by instrument and register-dynamic combination

Twelve radar plots by register-dynamic combination for the oboe. The radar plots are labelled as follows: High by piano, High by mezzo mezzo, High by forte, Middle 2 by piano, Middle 2 by mezzo mezzo, Middle 2 by forte, Middle 1 by piano, Middle 1 by mezzo mezzo, Middle 1 by forte, Low by piano, Low by mezzo mezzo, and Low by forte.
Twelve radar plots by register-dynamic combination for the French horn. The radar plots are labelled as follows: High by piano, High by mezzo mezzo, High by forte, Middle 2 by piano, Middle 2 by mezzo mezzo, Middle 2 by forte, Middle 1 by piano, Middle 1 by mezzo mezzo, Middle 1 by forte, Low by piano, Low by mezzo mezzo, and Low by forte.

Appendix F, Exploratory Factor Analyses

Table F1. Exploratory Factor Analysis with oboe rating data
Factor 1 Factor 2 Factor 3 Factor 4 Factor 5
Table F2. Exploratory Factor Analysis with horn rating data
Factor 1 Factor 2 Factor 3 Factor 4 Factor 5

Appendix G, Partial Proportional Odds Regression Models

Table G1. Partial proportional odds regression: Oboe, Dynamics. Airy/breathy, direct/loud, and raspy/grainy are treated as having nominal, rather than ordinal, effects.
Descriptor          Estimate  Std. error  z value  p value  Significance
ringing/long decay  0.
p|m (Intercept)     0.02      0.75        0.02     0.98
m|f (Intercept)     2.46      0.76        3.23     0.00     **
p|m direct/loud     -0.54     0.10        -5.51    0.00     ***
m|f direct/loud     -0.36     0.08        -4.41    0.00     ***
p|m airy/breathy    0.37      0.09        4.27     0.00     ***
m|f airy/breathy    0.20      0.08        2.48     0.01     *
p|m raspy/grainy    0.06      0.09        0.62     0.53
m|f raspy/grainy    -0.11     0.09        -1.26    0.21
Table G2. Partial proportional odds regression: Horn, Dynamics. Brassy/metallic and direct/loud are treated as having nominal, rather than ordinal, effects.
Descriptor           Estimate  Std. error  z value  p value  Significance
ringing/long decay   -0.004    0.085       -0.048   0.962
p|m (Intercept)      -0.666    0.727       -0.917   0.359
m|f (Intercept)      2.367     0.854       2.773    0.006    **
p|m direct/loud      -0.570    0.097       -5.860   0.000    ***
m|f direct/loud      -0.325    0.089       -3.653   0.000    ***
p|m brassy/metallic  0.071     0.101       0.706    0.480
m|f brassy/metallic  -0.180    0.115       -1.565   0.118
Table G3. Partial proportional odds regression: Oboe, Register. Open, resonant/vibrant, soft/singing, and woody are treated as having nominal, rather than ordinal, effects.
Descriptor                        Estimate  Std. error  z value  p value  Significance
ringing/long decay                0.079     0.081       0.972    0.331
low|middle1 (Intercept)           1.516     0.968       1.566    0.117
middle1|middle2 (Intercept)       -0.106    0.932       -0.113   0.910
middle2|high (Intercept)          -0.012    0.972       -0.013   0.990
low|middle1 soft/singing          -0.353    0.107       -3.312   0.001
middle1|middle2 soft/singing      0.039     0.095       0.411    0.681
middle2|high soft/singing         0.278     0.101       2.764    0.006    **
low|middle1 woody                 0.103     0.093       1.111    0.267
middle1|middle2 woody             0.329     0.085       3.852    0.000    ***
middle2|high woody                0.462     0.100       4.624    0.000    ***
low|middle1 resonant/vibrant      -0.205    0.137       -1.502   0.133
middle1|middle2 resonant/vibrant  0.265     0.121       2.190    0.029    *
middle2|high resonant/vibrant     0.153     0.126       1.212    0.225
low|middle1 open                  0.004     0.121       0.036    0.972
middle1|middle2 open              -0.061    0.107       -0.567   0.571
middle2|high open                 0.352     0.117       2.995    0.003    **
Table G4. Partial proportional odds regression: Horn, Register. Brassy/metallic and muted/veiled are treated as having nominal, rather than ordinal, effects.
Descriptor Estimate std.error z value p value Significance
ringing/long decay               0.068     0.084       0.816    0.414
low|middle1 (Intercept)          -2.167    0.974       -2.224   0.026    *
middle1|middle2 (Intercept)      0.011     0.885       0.012    0.990
middle2|high (Intercept)         3.177     0.991       3.204    0.001    **
low|middle1 brassy/metallic      -0.110    0.123       -0.894   0.371
middle1|middle2 brassy/metallic  -0.114    0.114       -0.997   0.319
middle2|high brassy/metallic     -0.059    0.127       -0.467   0.641
low|middle1 muted/veiled         -0.099    0.121       -0.817   0.414
middle1|middle2 muted/veiled     0.058     0.104       0.555    0.579
middle2|high muted/veiled        -0.220    0.114       -1.924   0.054
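
For readers who wish to check how the columns in Tables G1 to G4 relate to one another, the sketch below recomputes the z value and two-sided p value from an estimate and its standard error, using three rows of Table G4. It assumes that the z column is a Wald statistic (estimate divided by standard error) evaluated against a standard normal distribution and that the significance stars follow R's conventional cut-offs (0.001, 0.01, 0.05); both are assumptions consistent with the tabulated values rather than statements from the text. The function names are hypothetical.

```python
import math


def wald_test(estimate, std_error):
    """Return the Wald z statistic and its two-sided normal p value."""
    z = estimate / std_error
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


def stars(p):
    """Significance stars using R's conventional thresholds (assumed here)."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# (label, estimate, std. error) copied from Table G4.
rows = [
    ("low|middle1 (Intercept)", -2.167, 0.974),
    ("middle2|high (Intercept)", 3.177, 0.991),
    ("middle2|high muted/veiled", -0.220, 0.114),
]

for label, estimate, std_error in rows:
    z, p = wald_test(estimate, std_error)
    print(f"{label:28s} z = {z:6.3f}  p = {p:.3f}  {stars(p)}")
```

Because the tabulated estimates and standard errors are themselves rounded, the recomputed z values can differ from the printed ones in the second or third decimal place.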