This paper explores the concept of describing a musical repertoire by its characteristic sound. The repertoire in question comes from hip-hop's Golden Age, a historical era encompassing the late 1980s and early 1990s.1 Through the analysis of a large corpus of songs from this era, it furnishes a definition of the Golden Age hip-hop sound that unites two sonic aspects: musical parameters (such as harmony, tempo, texture, and form) and production parameters (such as special effects, stereo imaging, loudness, and compression). The joint analysis of musical and production parameters is necessary for hip-hop music, as the artistry of this genre results from a team of creative professionals: MC (writes and performs the lyrics), producer (composes the instrumental beats), recording engineer (mixes the tracks), and record producer (guides the overall aesthetic of the track or album).2 Understanding the sound of a hip-hop repertoire requires that the artistic contributions of all these actors be considered. A 100-song corpus was developed using critical "best-of" lists from sources such as Rolling Stone, LA Weekly, About.com, and VH1, and these 100 songs were analyzed in detail across a variety of musical and production parameters. Since the corpus analysis was done by ear, a rigorous methodology for transcription and critical listening had to be developed. In addition, as the end goal of analyzing 100 songs was to evaluate them using statistical methods, all analysis data had to be represented by a numerical or binary (yes/no) value.
First, however, we must consider what it means to define the sound of a musical repertoire. The 2016 release Views by Toronto-based hip-hop artist Drake prompted Maclean's magazine editor Adrian Lee (2016) to write about the Toronto [hip-hop] sound, which he suggests is exemplified by "mind-blowing snatches of drowned-out underwater R&B". Even the most devoted hip-hop aficionado might not fully understand what Lee means by this statement; the only tangible cues he gives are that the Toronto sound incorporates underwater R&B, and that Drake's music is a characteristic example of it. Lee's description relies on the logic that examples of a sound (e.g. Drake exemplifying the Toronto sound) coalesce around an abstract sonic idea (e.g. underwater R&B). But what about other Toronto hip-hop artists: are they part of the Toronto sound, even if they perhaps do not sound like Drake, or do not compose with "mind-blowing snatches of underwater R&B"? This dilemma raises the following question: how can sonic parameters—from tempo to harmony to dynamic range compression to vocal production effects—be used to describe the sound of a repertoire both at its global level (all the songs or artists) and the local level (individual songs or artists)?
To address that question, this study uses statistical methods such as correlation and hierarchical clustering to define the Golden Age sound in three trend categories: trends of change, prevalence, and similarity. Trends of change are uncovered by observing large-scale trends in the analysis data over time, speaking to how the sound of hip-hop music changed during the Golden Age. Trends of prevalence are those which describe specific sonic parameters that pervade the entire corpus, regardless of time. Trends of change and prevalence always reference the complete corpus, describing what it sounds like on the global level. Trends of similarity consider smaller groups of songs in the corpus, describing how they sound alike on the local level. The analyzed data for each song were run through an R-language computer application, which forms dissimilarity matrices covering each unique pair of songs; from these matrices, the overall similarity of any pair of songs can be measured. By exploring how trends of change, prevalence, and similarity each contribute to a definition of the Golden Age sound, this paper proposes a way of empirically defining a repertoire's sound by analyzing its sonic parameters.
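As an illustration of the pairwise comparison described above, the following minimal sketch computes a dissimilarity value for every unique pair of songs from mixed numerical and binary data. It is written in Python rather than R, and it does not reproduce the study's application: the Gower-style measure, the feature names, and the song records are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical feature records; names and values are illustrative only.
songs = {
    "song_a": {"tempo_bpm": 95.0, "has_singing": 1, "has_chorus": 1},
    "song_b": {"tempo_bpm": 100.0, "has_singing": 1, "has_chorus": 0},
    "song_c": {"tempo_bpm": 115.0, "has_singing": 0, "has_chorus": 0},
}
NUMERIC = {"tempo_bpm"}  # every other feature is treated as binary


def feature_ranges(records, numeric):
    """Return max - min for each numeric feature, used to scale distances."""
    ranges = {}
    for name in numeric:
        values = [record[name] for record in records.values()]
        ranges[name] = max(values) - min(values)
    return ranges


def dissimilarity(a, b, ranges):
    """Gower-style dissimilarity in [0, 1]: the mean per-feature distance."""
    total = 0.0
    for name in a:
        if name in ranges:
            span = ranges[name]
            # Numeric features: absolute difference scaled by the corpus range.
            total += abs(a[name] - b[name]) / span if span else 0.0
        else:
            # Binary features: 0 if the two songs agree, 1 if they differ.
            total += 0.0 if a[name] == b[name] else 1.0
    return total / len(a)


def dissimilarity_matrix(records, numeric):
    """Map each unique (song, song) pair to its dissimilarity."""
    ranges = feature_ranges(records, numeric)
    return {
        (x, y): dissimilarity(records[x], records[y], ranges)
        for x, y in combinations(sorted(records), 2)
    }


matrix = dissimilarity_matrix(songs, NUMERIC)
closest_pair = min(matrix, key=matrix.get)  # the most similar pair of songs
```

A matrix of this kind could then be passed to a hierarchical clustering routine; the choice of linkage method is not specified here and would be a further assumption.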
The Golden Age of hip-hop
Journalistic and academic discourses reference a "Golden Age" of creativity, diversity, and maturation in hip-hop history, one that contributed to the genre's emergence as a commercially profitable and artistically autonomous realm of popular music. The 1989 establishment of a hip-hop-specific Grammy award (Best Rap Performance) and Billboard chart (Hot Rap Songs) exemplifies the greater autonomy this music attained during the Golden Age. In addition, the hip-hop magazine The Source began publication in 1988, and the notorious, short-lived Source Awards commenced in 1991.3 The Golden Age also saw the emergence of regional hip-hop scenes throughout the United States (among them Los Angeles, Chicago, Houston, and Atlanta), meaning this music was no longer limited to the New York City area. Championed by their own artists, these local scenes fed two important narratives of the Golden Age: hip-hop was becoming ubiquitous across America, and its regional idiosyncrasies reflected its diversity as a musical genre. The drive for originality and creativity in hip-hop music assumed a more geographic tack, as East- and West-coast record labels battled for the lion's share of the market.
A significant (and perhaps unwelcome) side effect of hip-hop's mainstream emergence was a change in the practice of sampling. As record companies became aware of how lucrative hip-hop music had become, they demanded royalties for the sampled use of their copyrighted material. The landmark legal case of Grand Upright Music, Ltd. v. Warner Bros. Records Inc. (1991) was a watershed moment in its implications for sample usage. The case's ruling that sampling without permission qualifies as copyright infringement set a precedent: henceforth, producers would have to pay sampling royalties (if they could afford it), sample from more obscure records (whose owners might not be able to afford legal proceedings), sample from artists who were amenable to the hip-hop aesthetic (thus unlikely to sue), or use real instruments in place of samples. The aesthetic implications of this legal case, as documented by Demers (2006), were significant.
Allmusic.com (accessed 2016) describes the Golden Age of hip-hop as beginning with the commercial breakthrough of Run-D.M.C. (1986) and ending with the mainstream emergence of gangsta rap popularized by Dr. Dre's The Chronic (1992).4 The New York Times describes it as the late eighties to early nineties.5 Numerous other publications also cite years within this range.6 Keyes (2002) makes no explicit reference to the Golden Age, but writes a chapter chronicling rap music's explosion into the musical mainstream, focusing on the period from 1985 to 1989. This study takes the 11 years between and including 1986 and 1996 as chronological boundaries. 1986 represents a landmark year in hip-hop: seminal and hugely successful albums by Run-D.M.C. and the Beastie Boys were released.7 An end date of 1996 roughly coincides with the deaths of Tupac Shakur (2Pac) and the Notorious B.I.G. Regardless of which years or events define the Golden Age, the music produced during this time had a lasting impact on the hip-hop that followed. The constant flow of new, boundary-pushing Golden Age album releases exemplifies this era's unprecedented stylistic fluidity.
Genre and style
It would be overly simplistic to classify this paper as a genre study, because it concerns itself more with musical style. For many people, their closest encounter with the concept of musical genre occurs when they shop for music. Music retailers use genre categories to help customers locate what they are looking for, or to discover new music similar to what they already like.8 There is widespread academic interest in how retailers and online databases categorize genre, and this research typically involves machine learning and automatic classification tools.9 But the study of genre classification using computational methods sidesteps the notion that genre categories are flexible entities and perhaps not objectively definable. David Brackett (2016, pp. 19–20) contends that "genres are not static groupings of empirically verifiable musical characteristics, but rather associations of texts whose criteria of similarity may vary according to the uses to which the genre labels are put. 'Similar' elements include more than musical-style features, and groupings often hinge on elements of nation, class, race, gender, sexuality, and so on". He also warns against pinning genres to a set of infallible truths or constants: "doubts arise because inspection of an individual text in terms of style, form, or content inevitably raises doubts as to genre identity: the more that we examine a given grouping of texts, the more dissimilar texts begin to appear…Similarly, the more closely one describes a genre in terms of its stylistic components, the fewer examples actually seem to fit". Brackett's approach to genre classification clearly stems as much from the sociological and cultural discourse occurring in musical communities as it does from the music itself.
Several scholars have considered genre from both sociological and musicological perspectives. To Franco Fabbri (1999, p. 55), genre is "a kind of music, as it is acknowledged by a community for any reason or purpose or criteria, i.e., a set of musical events whose course is governed by rules (of any kind) accepted by a community". The rules Fabbri speaks of may be musical, but they need not be. Philip Tagg (2013, p. 267) situates musical genre as a "larger set of cultural codes that also includes musical rules" and musical style as emphasizing only musical codes. This notion of style stems from Leonard Meyer's (1973, p. 7) definition of style analysis, which serves to illuminate the normative principles of musical vocabulary running constant through a particular body of repertoire. Specifically, style analysis searches for "characteristic features" in a body of music. Adam Krims (2000) was the first music theorist to propose a genre system for hip-hop music, which includes the categories of mack rap, bohemian rap, party rap, and reality rap.10 Krims considers musical and sociological parameters in his genre system, but only in a generalized sense: he acknowledges that even such generalities are often incomplete or inaccurate.11
In a general sense, this study's trends of change and prevalence are style analyses: these trends reveal broad, consistent stylistic tendencies across a large body of repertoire. Trends of similarity are more difficult to situate in a style analysis, because these trends are much more localized within small groups of songs in the corpus. As Brackett has warned, closer examination of texts tends to reveal more differences than similarities, but when the number of texts examined together is sufficiently small, audible similarities do exist. Since these similarities between songs are not large scale, they cannot be considered part of a style analysis. However, as the definition of a repertoire's sound is partly predicated on them, they cannot be ignored either.
With advances in the fields of computer-aided analysis and machine learning, corpus studies encompassing large bodies of repertoire are becoming increasingly commonplace in the discipline of music theory. They are sometimes used to study repertoires that are not yet canonized in music scholarship. Though not unique to popular music, corpus research has often focused on this area. De Clercq and Temperley (2011, p. 50) produced what they called "one of the first, large-scale, systematic corpus analyses of popular music" with the aim of developing a foundation of stylistic norms in rock music upon which future research could be built. Under the auspices of the McGill Billboard Project, Burgoyne (2012) and Burgoyne, Wild, and Fujinaga (2013) explored the use of mathematical and computational methods to statistically analyze harmonic progressions. More recently, corpus studies have proliferated in hip-hop scholarship: Ohriner (2016) has worked toward goals similar to those of de Clercq and Temperley, aiming to provide a representative sample of stylistic norms (in this case concerning hip-hop flow) against which outlying musical examples can be situated. Condit-Schultz (2016) also searches for stylistic norms in hip-hop flow, and identifies several large-scale trends which concur with some of the findings presented in this paper.
This study encompasses a much shorter timespan than most other corpus-based projects. In order to uncover statistically significant trends across this short timespan, a large sample of songs per year was used.12 The size and distribution of a corpus determine its effectiveness in addressing analytical goals. As this study aims to describe the unifying and distinguishing characteristics of a short yet prolific era of hip-hop music, a balance of diversity and manageability was important in assembling the corpus. Eight critical "best-of" song lists from the websites listed in Appendix A were consulted for corpus development. Billboard's Hot Rap Songs chart was excluded: appearing first in 1989, it misses several years covered by the corpus.13 Billboard statistics are measured by the week, so a significant amount of interference would have been necessary to convert these numbers into an all-time ranking.14 In order to minimize subjectivity, songs were included in the corpus only if they appeared on at least two of the eight lists.15 Considering only those songs released between 1986 and 1996, the corpus was formed with 100 songs (shown in Appendix B). These 100 songs were released by 61 different artists, and were reasonably well distributed across the 11-year period.
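The inclusion rule described above (a song enters the corpus only if it appears on at least two lists and was released between 1986 and 1996) can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function name, the placeholder song titles, and the list contents are hypothetical.

```python
from collections import Counter


def build_corpus(lists, release_years, min_lists=2,
                 first_year=1986, last_year=1996):
    """Return songs that appear on at least `min_lists` lists and whose
    release year falls within the chosen boundaries (inclusive)."""
    counts = Counter()
    for song_list in lists:
        counts.update(set(song_list))  # a song counts once per list

    corpus = []
    for song, appearances in counts.items():
        year = release_years.get(song)
        if year is None:
            continue  # no release year available, so the song is skipped
        if appearances >= min_lists and first_year <= year <= last_year:
            corpus.append(song)
    return sorted(corpus)


# Hypothetical "best-of" lists using placeholder titles.
example_lists = [
    ["Song A", "Song B", "Song C"],
    ["Song A", "Song C", "Song D"],
    ["Song C", "Song E"],
]
example_years = {"Song A": 1988, "Song B": 1990, "Song C": 1999,
                 "Song D": 1992, "Song E": 1986}

corpus = build_corpus(example_lists, example_years)
```

In this example only "Song A" qualifies: "Song C" appears on three lists but falls outside the date range, and the remaining songs each appear on a single list.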
MUSICAL AND PRODUCTION ANALYSIS
Musical and production analysis proceeded through two stages: transcription/critical listening, and data conversion/database development.
From transcription to database
Music transcription forms a vital part of popular music research. Peter Winkler (1997, p. 170) wrote "rarely [does] the pop music literature address the music itself" (italics his).16 This might be partly due to the pitfalls of popular music transcription, such as how to accurately represent on paper the expressive microtimings of a drum groove or the subtle pitch inflections of a vocal line. But as Winkler (1997, p. 174) puts it, "transcriptions can be of varying degrees of detail and complexity, according to the uses for which they are intended". These uses, he suggests, fall into four categories: as a vivid portrayal of what is in the heard music, to support musicological arguments pertaining to the music, as a validation of the music, and as a prescriptive guide for its recreation. Winkler's first two categories describe the purpose of transcription for this study: to produce a recreation of the heard music, and to use this recreation to substantiate a definition of the Golden Age sound.
Hip-hop music was first documented through bootlegged live recordings and then through studio-produced albums; this music never began in score form (other than whatever system of notation an MC might use to sketch out their lyrics). Therefore, notating hip-hop music using orthography common to music theory necessitates its prior transcription. Some parameters analyzed in this study did not require transcription (such as tempo), while others resisted it (such as vocal timbre or dynamic range reduction). Transcribed data had to be rendered in a quantifiable format for purposes of statistical comparison. For each song, analysis of pitch collection, form, and texture/orchestration began via transcription and was converted to quantifiable or binary data. In the next sections of the paper, the transcription shown in Figure 1 (Audio Example 1) for "Nuthin' but a G Thang" (Dr. Dre ft. Snoop Dogg, 1992) will be used to document this conversion process. Some songs in the corpus were quite easy to transcribe (such as "Nuthin' but a G Thang"), and others were prohibitively difficult: songs like the Bomb Squad-produced "Fight The Power" (Public Enemy, 1989) and "Welcome to the Terrordome" (Public Enemy, 1989) each contain several dozen samples, each often lasting only a few seconds. Instead of deciphering these complex sound collages into detailed notated examples, transcriptions were done to represent the most aurally salient elements of each song.
Analysis according to aural salience
As the music of this corpus began in recorded form, the central goal of analyzing it involves examining how the sonic parameters of each song are salient to a listener, exploring how aurally salient parameters can be compared and contrasted between different songs. Thus, transcription and data collection are intermediary steps between recorded sound and its analysis. Any sonic parameters that are aurally salient to a general listener should be considered. Stephenson (2002) and Moore (2012) both suggest that the average listener, perhaps despite a lack of technical vocabulary, might possess the ability to hear and discern a wealth of sonic particularities without being able to explain them.17 For example, while listeners may not be able to accurately identify the tempo of a song without a metronome, they are usually able to determine whether it is faster or slower than another song. Similarly, they are able to make rudimentary judgments about how many instruments they hear in a recorded song, even if they are unable to articulate the functional roles these instruments perform. Analysis in this study reflects what is heard in the music. Sonic parameters (as detailed in Tables 1a and 1b) are divided into five musical categories: tempo, texture, form, pitch collection, and lyrics/vocals, and three production categories: component analysis/production effects/stereo imaging, vocal effects, and loudness/compression/phase coherence.
Tempo is one of the most aurally salient parameters of popular music. London (2004) amalgamates recent research on tempo perception and confirms humanity's ability to perceive a pulse between roughly 30 and 240 BPM, a preference for pulses between 86 and 120 BPM, and a maximal salience for tempi around 100 BPM. In order to be consistent and consider a salient tempo indicator across all songs, tempo was measured by choosing a pulse which always bore the same relation to the underlying audible drum beat. This was done by measuring tempo with respect to the alternating kick and snare of the basic backbeat pattern. For instance, Figure 2 details the drum part to "Nuthin' but a G Thang", showing how the tempo was determined. Tempo was measured using an online tempo-tap application, which records tempo by continuously averaging all previous taps.18
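The averaging behavior of a tempo-tap tool can be sketched as below. This is an assumption about how such a tool might work, not a description of the specific application used in the study: each new estimate is derived from the mean interval between all taps recorded so far, and the function name and timestamps are hypothetical.

```python
def running_bpm(tap_times):
    """Given tap timestamps in seconds, return the BPM estimate after each
    tap from the second onward, averaging over all previous taps."""
    estimates = []
    for i in range(1, len(tap_times)):
        if tap_times[i] <= tap_times[i - 1]:
            raise ValueError("tap times must be strictly increasing")
        # Mean interval so far = elapsed time / number of intervals.
        mean_interval = (tap_times[i] - tap_times[0]) / i
        estimates.append(60.0 / mean_interval)
    return estimates


# Hypothetical taps: one every 0.6 seconds, i.e. a steady pulse near 100 BPM.
steady = running_bpm([0.0, 0.6, 1.2, 1.8])

# A late third tap shifts the running estimate only partially,
# because earlier intervals still contribute to the average.
uneven = running_bpm([0.0, 0.6, 1.4])
```

Averaging over all taps makes the reading more stable as tapping continues, which suits songs with a steady programmed or looped beat.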
TEXTURE / ORCHESTRATION
Instrumental texture was analyzed with respect to functional layers, a concept outlined by Moore (2012, pp. 20–21). He advocates for the analysis of song texture with respect to the functions of each textural layer, rather than the instruments which comprise each layer. Moore's layers are four in number: explicit beat, functional bass, melodic, and harmonic filler. In hip-hop music, these four layers are normally present in some combination, but where the melodic layer in other popular genres might be occupied by a singing voice, in hip-hop this does not always occur. Therefore, the melodic layer was defined purely by melodic instruments, reserving the analysis of vocals for a separate parameter category. A fifth textural layer was added to account for any sound effects that do not fit into Moore's four categories. Thus, for each song, the transcription data was parsed into these five layers and represented as shown in Figure 3. Here we see how the aurally identified instruments of "Nuthin' but a G Thang" occupy the five textural layers. Texture was analyzed only in the verses of each song so it could be compared consistently across the corpus. Song texture varies greatly in hip-hop music, especially in songs from the Golden Age. For instance, "The Choice is Yours" (Black Sheep, 1991) has a comparatively minimal texture, as seen in Figure 4 (Audio Example 2). The verses of this song utilize only three textural layers: explicit beat, functional bass, and sound effect, forming a thinner, sparser texture.
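One way the five textural layers might be rendered as binary data is sketched below. The encoding scheme and function name are assumptions for illustration; only the layer names and the three layers reported for the verses of "The Choice is Yours" come from the discussion above.

```python
# The four layers from Moore (2012) plus the added sound-effect layer.
LAYERS = ("explicit beat", "functional bass", "melodic",
          "harmonic filler", "sound effect")


def encode_texture(present_layers):
    """Return a list of 0/1 flags, one per layer, in the order of LAYERS."""
    present = set(present_layers)
    unknown = present - set(LAYERS)
    if unknown:
        raise ValueError(f"unrecognized layers: {sorted(unknown)}")
    return [int(layer in present) for layer in LAYERS]


# Verse texture of "The Choice is Yours" as described in the text.
choice_is_yours = encode_texture(
    ["explicit beat", "functional bass", "sound effect"]
)
layer_count = sum(choice_is_yours)  # number of layers present in the verses
```

A vector of this form gives both the presence of each individual layer and, through its sum, a simple measure of textural density.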
The main goals for formal analysis concerned content (what types of formal sections were present in a song) and balance (what proportions existed between these sections). Hip-hop music tends to exhibit cyclical, repetitive song forms. The alternating sections common to rock and pop are also present in much hip-hop, but with certain notable differences. Unlike in rock or pop, singing in hip-hop occurs almost exclusively in the hook-type sections interposed between rapped verses: only three songs in the corpus had substantial amounts of singing in their verses. Moreover, hip-hop music does not heavily rely on textural or harmonic changes to mark the different sections of its songs: just under half the songs in the corpus (48%) exhibited noticeable textural changes between sections, and only 27% exhibited harmonic changes. Thus, lyrical content alone often suffices for determining where sectional boundaries occur, and what types of sections those boundaries separate. Verses are fairly easy to identify in hip-hop, but the hook sections interposed between them are less consistent in content. Take, for example, the differences between a refrain and chorus. For rock music, Stephenson (2002, p. 135) defines a refrain as consisting of "one or two textual lines that recur periodically" and a chorus as "a musical section that recurs numerous times with a fixed set of several lines". De Clercq (2012, p. 57) suggests that a refrain does not form a standalone section; rather, it "exist[s] within a section". Two dicta can be extracted from these points: a refrain is sub-sectional and much less lyrically substantial than a chorus. With these dicta in mind, we arrive at the following definitions of hip-hop hooks (illustrated in Figure 5 and Audio Examples 3, 4, and 5):
- Refrain: a brief, recurring lyrical entity that typically appears at the end of a verse, and/or is interspersed throughout the period before the next verse begins.
- Extended Refrain: largely similar to the refrain but with more substantial lyrics, which are still formally connected to the verse they follow.
- Chorus: the most substantially lyrical and autonomous hook type, characterized either by a significant amount of lyrics disconnected from each verse, or by sung vocals. Singing often suggests a chorus.
Yet, categorizing formal sections in hip-hop music might itself be viewed as problematic, in that contrasting sections characteristic of many other popular genres are not prerequisites for a hip-hop song: In an interview with producer Mr. Supreme, Schloss (2004, p. 154) notes that "there are no rules in hip-hop…you don't have to have a chorus, you don't have to have a bridge…". Furthermore, no consensus on sectional nomenclature exists in hip-hop discourse: for hook sections, Ohriner (2013) uses "hook", while Adams (2008) uses "chorus". Despite the lack of consensus in published scholarship, the definitions presented here are robust enough for the goal of capturing aurally salient formal parameters. The formal transcription of "Nuthin' but a G Thang" in Table 2 can be converted into quantifiable or binary data: measures are used to calculate the length of all sections, the above definitions are used to define the type of hook section, and the presence (or absence) of intros, outros, and fadeouts is noted.19 Intros and outros were defined as the sonic content on each track that preceded the entry of or followed the cessation of the beat supporting the song.20
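The conversion of a formal transcription into quantifiable and binary data might look like the sketch below. The section lengths shown are hypothetical and do not reproduce Table 2; the function name and the output field names are likewise assumptions.

```python
def summarize_form(sections, hook_type, has_fadeout):
    """Summarize a song's form.

    sections: ordered list of (label, length_in_measures) pairs.
    hook_type: "refrain", "extended refrain", or "chorus".
    has_fadeout: True if the track ends with a fadeout.
    """
    total = sum(length for _, length in sections)
    measures_by_type = {}
    for label, length in sections:
        measures_by_type[label] = measures_by_type.get(label, 0) + length

    return {
        "total_measures": total,
        # Balance: the share of the song occupied by each section type.
        "proportions": {label: length / total
                        for label, length in measures_by_type.items()},
        # Content: presence or absence coded as 1 or 0.
        "has_intro": int("intro" in measures_by_type),
        "has_outro": int("outro" in measures_by_type),
        "has_fadeout": int(has_fadeout),
        "hook_type": hook_type,
    }


# Hypothetical song form, with lengths counted in measures.
example_form = [("intro", 4), ("verse", 16), ("hook", 8),
                ("verse", 16), ("hook", 8), ("outro", 4)]
summary = summarize_form(example_form, hook_type="chorus", has_fadeout=True)
```

Recording proportions rather than raw lengths allows songs of different durations to be compared on the balance between their verses and hooks.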
The small body of literature studying hip-hop's musicality focuses primarily on texture, rhythm, and timbre, and rarely on pitch collection. Considering how much sampling occurs in hip-hop and the tonal ramifications of this practice, the tonality of hip-hop beats is a rich source for inquiry into the characteristics of this music's overall sound. The following data was gleaned for "Nuthin' but a G Thang", as shown in Table 3: tonic, pitch collection, and presence, length, and composition of harmonic progression. 25% of the songs in the corpus exhibited no discernible pitch collection, and 7% exhibited no perceivable tonic. Songs such as "Bonita Applebum" (A Tribe Called Quest, 1990) challenged the pitch collection scheme in that they were clearly based on a chromatic pitch collection (as shown in Figure 6 and Audio Example 6), but were not objectively classifiable in an established mode. They were thus classified simply as "chromatic". Even though labelling every non-modal song as "chromatic" ignores intricate aspects of these songs, separating songs with chromatic or diatonic pitch collections reflects a salient aspect of their sound.
LYRICS AND VOCALS
The lyrical content and sonic aspects of vocal delivery are prominent elements of hip-hop music; songs are mixed so vocals are easily heard over the beat. As an MC's contribution to a song consists of both lyrics and flow (the rhythmic and articulative delivery of these lyrics), the analysis considers these aspects separately. Adam Krims (2000, pp. 46–92) has written on the relation between lyric subjects and the hip-hop genres of "party rap", "mack rap", "jazz/bohemian", and "reality rap". These genres are associated with specific lyrical topics: party rap tends toward lighter topics dealing with celebration, pleasure, humor, romance, and sex, often laced with self-deprecating irony. Mack rap topics focus on success, braggadocio, money, and sexual prowess. Jazz/bohemian topics are much more serious and self-conscious, yet tend to assume a more positive tone. Reality rap includes gangsta rap, chronicling the harrowing aspects of street life, gangs, and drug use. Extrapolating from Krims' subgenres, 12 distinct topics were enumerated with which the lyrics of each song were coded.21 These topics comprise partying, sex, romance, humor, braggadocio, dissing, the generation gap, stories or messages, reflections on social issues and street life, reminiscence, autobiographical storytelling, and eclectic topics (lyrics which did not fit into any of the other 11 topic categories).
Sonic aspects of an MC's vocal delivery, such as timbre, articulation, tessitura, and rhythm, are all highly developed facets of flow, and can be used to measure the creativity or virtuosity of an MC.22 Flow supports the conveyance of meaning in lyrics: numerous studies investigating flow and its semantic and/or embodied potential abound, including Woods (2010), Adams (2009), Ohriner (2013), and Rollefson (2015). Despite this interest in flow and its timbral, rhythmic, articulative, and pitch components, objectivity remains elusive. Heidemann (2016) says of vocal timbre: "[it] is a facet of musical experience that cannot be denied any easier than it can be explained" [1.2]. While respecting the subjectivity of salient parameters of vocal delivery, the analysis here was made as broadly applicable as possible by only making very simple observations. For vocal timbre, smoothness or roughness of an MC's vocal delivery was evaluated. Roughness denotes a "gravelly" or guttural vocal quality, such as that typically heard in performances by 2Pac. The analysis of tessitura consisted of judging where in an MC's voice (how high, i.e. in what vocal register) he or she most commonly vocalized. 2Pac again exemplifies this well: in the song "Dear Mama" (1995, Audio Example 7), he adopts a low vocal register, while in "California Love" (1996, Audio Example 8) he raps high in his voice. Vocal articulation was measured according to the general level of enunciation or suppression of consonants, again approximated by ear. For example, Cypress Hill's B-Real raps with articulate, sharply enunciated consonants (Audio Example 9). 
Vocal rhythm was also generalized into two permeable categories: speech-like or metric, or a mix in between.23 As Adams (2008) points out, the lyrics of rap songs are often composed and recorded after the instrumental track, or beat, and MCs perform across a wide rhythmic spectrum from total synchronicity with the beat to an almost free-flowing poetic style that has very little rhythmic alignment with the beat. By reviewing an excerpt (Audio Example 10) from "Nuthin' but a G Thang", we hear Snoop Dogg rapping in a smooth timbre, in the middle of his vocal range, enunciating his consonants, and adopting a rhythmic style that mixes metric and speech-like rhythms.
Other vocal and lyrical aspects were easily converted to binary data, such as whether singing or profanity were present in a song.24 While this methodology for analyzing lyrics and vocals is undoubtedly subjective, prior scholarship has demonstrated that these musical aspects are resistant to objectivity: timbre and articulation are difficult to systematize, rhythm—with all of its microtemporal deviations—is labor-intensive to analyze, and the analysis of lyrics opens a Pandora's box of interpretive possibilities. That said, gathering a generalized representation of vocals and lyrics across a small but broad set of parameters allows for a general formulation of vocal and lyrical profiles for each song.
Production parameters: critical listening as a tool for analysis
In the audio engineering profession, a small set of tools exists with which engineers can measure and analyze production parameters such as loudness and compression. The measurements these tools provide are limited, and must be analyzed with the aid of critical listening. Audio engineers learn critical listening as a skill to assess, by ear, the typical audio processes (and their specific parameters) that occur during the production of sound. For example, an engineer can listen to the final master mix of a song and determine what type or brand of reverberation was used on the vocal tracks, and its specific parametric settings such as reverberation tail length, the relative timbre, the size of the emulated space, etc. Critical listening training typically takes place on the job or via an audio engineer ear-training program, such as those described by Quesnel (2009) and Martin and Massenburg (2015). In this study, a custom methodology based on critical listening practices was developed to systematically analyze the aurally salient production parameters. Despite the inherent subjectivity in analysis via critical listening, the consistency of the methodology allows for a degree of rigor to be maintained through this practice. The production parameters were mainly analyzed with respect to each audibly salient component in the mix. Composite parameters (those heard over the whole mix) were also analyzed, though in a more basic fashion.
COMPONENT ANALYSIS, STEREO IMAGING, AND PRODUCTION EFFECTS
The first and most important part of production parameter analysis was to identify every distinct audible element (henceforth called instrument) over the course of each song.25 Distinctions were made between instruments that were sampled and those that were recorded directly. If a sample featured multiple timbres (e.g. a brass section), it was considered one instrument. Conversely, if a drum loop was composed of individual samples (e.g. unique samples for kick drum, snare, and hi-hat), these were considered individual instruments. Direct-recorded instruments were also treated individually. Some songs had very limited lists of instruments (as few as three), while others reached well beyond ten (still very few by today's production standards). For organizational purposes, instruments for each track were named in a consistent manner; 121 unique instrument names were used. For example, the seven instruments identified in "Straight Outta Compton" (N.W.A., 1988) are shown in Table 4, and consist of sampled drum loop, hi-hat, guitar, orchestra hit, sampled saxophone, vocals, and turntables.
In Audio Example 11, we can hear the instruments listed in Table 4. The drums come from a single sample (limited in frequency bandwidth, mono point-source within the stereo image, and custom equalization has been used to add low end to the kick). A hi-hat has been added to this drumbeat, likely from a pre-recorded loop: it does not form part of the drum sample (full/unlimited frequency bandwidth and panned to a slightly different position in the stereo image than the drums). The guitar appears to be directly recorded: its timbre sounds clean and no background noise matches its stereo image. The orchestra hit comes from the classic sample heard on many electronic keyboards. Due to its unnatural timbre, the saxophone was determined to be a sample. Finally, vocals are present throughout (without any backing vocals) and turntables (scratching) make consistent appearances.
Each instrument was qualified with a stereo image position (panning position) and a list of any production effects used on it. To determine the panning position of each instrument, a mono source of static noise was generated in Pro Tools and panned around the stereo image until it was perceived to line up with the instrument in question. Numbers represent this panning position: -100 left, 0 center, +100 right. In "Straight Outta Compton" (Table 4), the vocals appear in the center of the stereo image, so they receive a 0. For stereo sounds that occupy a spread of space along the stereo image, two numbers are recorded: the left extremity and the right extremity. For example, the orchestra hit has been panned in stereo full left and right, which gets a value pairing of -100 and +100. Only certain instruments seem to be treated with audible production effects; 16 unique production effects were identified across the corpus. For example, in "Straight Outta Compton" the hi-hat has been treated with a short reverb and the vocals with a long reverb.
Additional attention was given to special production effects used on vocals, since they are typically such an aurally salient part of the production. The presence or absence of backing vocals and vocal doubling was documented. If either of these production techniques was used, the number of vocal layers was approximated. The distinction between vocal doubling and backing vocals was made using the following principles: a vocal double must be done by the same voice as the "lead vocal". The vocal double must also be at the same pitch level as the lead vocal. All harmonies and other voices were considered backing vocals.
TRACK LOUDNESS, COMPRESSION, AND PHASE COHERENCE
In addition to instrument-specific analysis, production parameters pertaining to the composite track mix were analyzed. These parameters included overall loudness, master bus compression, and overall stereo image width (or phase coherence). Loudness was measured in loudness units relative to full scale, or LUFS, which are expressed as negative numbers with zero being the highest possible value. Compression was measured with the DRC meter from PitchTech (2016) and reflects the overall loudness range between the weakest and strongest audio signals in the track.26 Phase coherence values reflect an overall measurement of how wide or narrow the stereo presentation of the program is (-1 extremely wide, 1 extremely narrow). Across these production parameters, critical listening and audio engineering analysis tools allowed for the development of a quantifiable and comparable set of data with which to complement the musical parameter data.
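The phase coherence scale described above behaves like a correlation between the left and right channels. The following is a minimal sketch, assuming a zero-lag normalized correlation; this is a common way to define such a reading, but the source does not say how the meter used in this study computes its value. The function name and the test signals are hypothetical.

```python
import math

def phase_correlation(left, right):
    """Zero-lag normalized correlation between two equal-length channels.

    Returns a value in [-1, 1]: +1 when the channels are identical (mono,
    narrow image), values near 0 for unrelated channels (wide image), and
    -1 when one channel is the polarity-inverted copy of the other.
    """
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    cross = sum(l * r for l, r in zip(left, right))
    energy = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return cross / energy if energy else 0.0

# Hypothetical test signal: one cycle of a sine wave sampled 64 times.
tone = [math.sin(2 * math.pi * n / 64) for n in range(64)]
inverted = [-s for s in tone]
quarter_shift = [math.cos(2 * math.pi * n / 64) for n in range(64)]

mono_value = phase_correlation(tone, tone)            # identical channels
anti_value = phase_correlation(tone, inverted)        # polarity flipped
wide_value = phase_correlation(tone, quarter_shift)   # 90 degrees apart
```

A single number per track, as used in the corpus, would correspond to evaluating such a measure over the whole program rather than over a short window.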
Armed with the musical and production parameter data in an analysis-friendly format, we proceed to the statistical analysis in order to elucidate trends of change, prevalence, and similarity. Trends of change and prevalence document the relationship of corpus songs to general sonic parameters, while trends of similarity document the relationships between individual songs.
Trends of change
Trends of change illustrate the aesthetic state of hip-hop music at the beginning and end of the Golden Age. In the case of some parameters, trends of change can be explained by specific events that had a lasting impact on the way hip-hop music was composed or recorded. The Grand Upright v. Warner Brothers lawsuit had significant ramifications on the sampling practices of DJs and producers. The emergence of gangsta rap influenced the lyrical tendencies of MCs. And the mainstream popularity of Dr. Dre's The Chronic (1992) ushered in a more laid-back, instrument-heavy production aesthetic known as G-funk. Such events altered the course of music making in hip-hop across many domains: tempo, form, texture, lyrics, pitch collection, loudness, and compression. The following paragraphs provide an overview of the trends of change observed in these domains.
Considering its mean value per year, song tempo trends downward over time. The highest mean tempo occurs in 1986 (112.71 bpm), while the lowest occurs in 1995 (87 bpm). Figure 7 shows the distribution of tempi in a scatter plot: each dot represents a song classified by its year and tempo. Though there are notable outliers in both high and low tempi, the majority of songs cluster around the line of best fit, which illustrates the downward trend. Condit-Schultz (2016) found a similar trend in a corpus that covered a much broader timespan. Hip-hop's crossover with rock, exemplified on albums such as Licensed to Ill (Beastie Boys, 1986) and Raising Hell (Run DMC, 1986), might account for some of the faster tempi in the early years of the corpus, as might some of the songs that Krims (2000) would classify as "party rap", such as those by MC Hammer, Sir Mix-a-Lot, and Heavy D. and the Boyz. Subsequently, the rise of G-funk-influenced West-coast artists (particularly Dr. Dre, Snoop Dogg, and 2Pac) may account for the slower tempi of the early nineties.
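The two summaries used in this paragraph, mean tempo per year and a line of best fit through the year/tempo scatter, can be sketched as follows. This is an illustration only: the (year, bpm) pairs are hypothetical rather than corpus data, the function names are invented for the example, and ordinary least squares is assumed for the fit, which the source does not specify.

```python
from collections import defaultdict

def mean_by_year(songs):
    """Map each release year to the mean tempo (bpm) of its songs."""
    tempi = defaultdict(list)
    for year, bpm in songs:
        tempi[year].append(bpm)
    return {year: sum(v) / len(v) for year, v in sorted(tempi.items())}

def best_fit_line(points):
    """Ordinary least-squares slope and intercept for (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical (year, bpm) pairs, NOT the corpus data.
songs = [(1986, 110), (1986, 116), (1988, 104), (1990, 100),
         (1992, 96), (1994, 92), (1996, 88), (1996, 90)]

yearly_means = mean_by_year(songs)
slope, intercept = best_fit_line(songs)  # a negative slope means tempo falls over time
```

The slope is expressed in bpm per year, so multiplying it by the number of years spanned gives the change implied by the fitted line, which can differ from the gap between the highest and lowest yearly means.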
Though nearly all hip-hop songs contain verses (where rapping typically occurs), a wide variety of section types—or none at all—can be interposed between these verses.27 The content of these interposing sections in each song was evaluated and assigned one of four labels: chorus, refrain, extended refrain, or none. The percentage of songs in each year receiving these labels is illustrated in Figure 8. The best-fit curves in this graph appear to neatly partition the time period in two: from 1986 to 1991, each section type encountered brief moments of dominance in the corpus (except "neither chorus nor refrain", which declines slowly throughout the whole time period). From 1991 onwards, the chorus rises to unchallenged prominence as the preferred hook type. In fact, in 1995 and 1996, all songs in the corpus had choruses. How can this trend towards total chorus prevalence be explained? One hypothesis suggests that when hip-hop gained more visibility and a broader audience base, it became subject to the same sort of commercial requirements as other mass-produced popular music genres: it is no accident that top-selling hits contain catchy hooks. Indeed, Krims (2000, pp. 85–86) notes that prior to the arrival of G-funk and "don rap" (which he describes as being heard on Notorious B.I.G.'s releases), R&B-style sung choruses were rare.
In 1991, Grand Upright, Ltd. v. Warner Brothers, Inc. named Biz Markie and his label Cold Chillin' as having unlawfully used a sample from Gilbert O'Sullivan's "Alone Again (Naturally)" in Markie's song "Alone Again". Markie et al. lost the case, and a new precedent was set, after which hip-hop artists and their labels were much more cautious in their use of sampled material. Demers (2006, p. 97) outlines the ramifications of this and subsequent sampling lawsuits: top-level artists who could afford to pay the clearance costs still used samples, and indie artists, who were essentially below the radar of the big industry, continued to use samples relatively undetected. Mid-level hip-hop artists resorted to shorter samples (which were more difficult to identify), real instruments, and samples with lower clearance costs. Figure 9 shows this trend by plotting sample usage over time, again with dots representing individual songs by year. Note that this graph only reflects the portion of the corpus where samples were audibly present (48 songs), and that these samples were tallied by ear. This method of sample taxonomy was chosen because it reflects how the samples would have been identified by legal parties pursuing compensation for unfair use: the less audible the samples were to those parties, the less audible they would be to the average listener. By the mid-nineties, the use of real instruments was notable on releases by Dr. Dre, Snoop Dogg, Notorious B.I.G., and others.
SINGING AND PROFANITY
The presence of singing in hip-hop consistently increases over the 11-year period considered in this study: only two of the earliest 30 songs contained singing, while 11 of the latest 20 songs did (as detailed in Figure 10a). This trend can be explained in part by the fact that singing almost always occurs during the chorus; a rise in chorus prevalence thus suggests a concomitant rise in singing. In many cases, guest artists do the singing; notable exceptions are "Nuthin' but a G Thang" (Dr. Dre ft. Snoop Dogg, 1992), "Tennessee" (Arrested Development, 1992), and "Tha Crossroads" (Bone Thugs N Harmony, 1995). (Contemporary hip-hop has seen an increase in MCs doing their own singing; Drake and Kendrick Lamar are notable examples of this.) Figure 10b shows the use of profanity over time in the corpus, which exhibits a notable upward trend from 1990, tapering off around 1993. During this time, the West-coast dominance of hip-hop led by Dr. Dre and Snoop Dogg was beginning to take hold. These rappers were steeped in the L.A.-bred gangsta rap tradition, pioneered by acts such as N.W.A. (and its members' solo releases), Ice-T, and Too $hort. Besttickets.com contributor Andrew Powell-Morse (2014) has undertaken a large-scale survey of profanity in hip-hop since 1985, and his findings suggest that before the 1994 East-coast releases Illmatic (Nas) and Ready to Die (Notorious B.I.G.), profanity in rap was almost wholly a West-coast phenomenon.28
LOUDNESS AND COMPRESSION
The term "loudness war" refers to a historic trend in popular music production where songs were produced with ever-increasing overall volume. Producers and psychoacousticians agree that if two songs are essentially equal in every way except for loudness, the louder song will sound better to the listener.29 Music producers were (and still are) "warring" to be louder than one another. Recording mediums have a limited amount of headroom, which the audio signal cannot exceed, so producers would apply additional dynamic range reduction so the entire signal could be made louder. This trend can be identified in the earliest years of the corpus with an increase in both overall loudness and dynamic range reduction (compression) shown in Figures 11a and 11b.
Pitch collection in hip-hop music resists simple analysis: in a beat, samples may not be exactly in tune with one another, or might be shifted from their original pitch. Determining the home note or key of a hip-hop track may not always be possible, but determining its pitch collection (the underlying scale or mode) can be a more productive pursuit. The proportion of songs with no discernible pitch collection differs drastically between the earlier and later halves of the corpus: of the 46 songs released before 1991, 17 (37%) had no discernible pitch collection, compared to just 5 (9%) of the 54 songs released in 1991 or later. This trend may also be linked to the decline in sample use: as producers experimented more with real instruments, beats were more often composed ex nihilo, or were interpolations of existing compositions, re-recorded in place of being sampled.30 However, there are still outlying examples that buck this trend: "Check the Rhime" and "Award Tour" (A Tribe Called Quest, 1991 & 1993) have aurally salient pitch collections in sample-heavy environments.
Fifty-four songs in the corpus exhibited a salient harmonic progression in their verses. Similar to the trends observed for pitch collection, the use of harmonic progressions increases over time. In addition to this trend, songs with four or more chords in their harmonic progressions outnumbered songs with two chords, but only after 1991. Notable are the years 1995 and 1996, in which 10 of the 11 songs exhibit harmonic progressions with four or more chords. Some of these progressions go as far as displaying directional tonality or parsimonious voice leading, as shown in Figures 12a-c and Audio Examples 12, 13, and 14. "Big Poppa" (Notorious B.I.G., 1995) uses harmonic devices such as applied chords and half-diminished ii chords, momentarily obfuscating the tonic by shifting between F major and A minor. "Gangsta's Paradise" (Coolio, 1995) follows a common minor mode progression: VI - iv - V - i. Finally, "I Wish" (Skee-Lo, 1995) chains together four major 7th chords, with parsimonious voice leading via multiple common tones between each chord. These are just a few examples of how hip-hop harmony became increasingly sophisticated toward the end of the Golden Age.
Trends of change paint a vivid picture of how hip-hop music and production entered and exited its Golden Age. During this time, hip-hop became louder, more compressed, more chorus-friendly, slower, and less reliant on crude sampling techniques; it featured more singing and profanity, and it became more tonally grounded, with intricate harmonic progressions. These observations are consistent with those made by Condit-Schultz, Powell-Morse, and Demers for hip-hop music over a longer time period, suggesting that the trends of change observed in the Golden Age continued well into subsequent eras of hip-hop's history. Trends of change show how the songs in the corpus relate to general sonic parameters, demonstrating ways in which this repertoire evolves and develops through those parameters. The Golden Age sound can thus be defined, in part, by its ancestry and legacy.
Trends of prevalence
Throughout the corpus, several sonic parameters have either remained constant, or have maintained a distribution that tends heavily towards its mean value. Some trends of prevalence essentially confirm trends of change: the mean tempo for the corpus, for example, lies almost exactly halfway between the highest and lowest mean tempi by year. Others suggest a preference for certain stylistic norms, such as for pitch collection. The parameters discussed here include mean tempo, mean tempo as a function of lyrical topic, pitch collection, aspects of vocal delivery (timbre, tessitura, articulation, and rhythm), and form.
Given London's (2004) findings, which suggest that listeners prefer tempi between 86 and 120 bpm, we might expect the tempi in this corpus to fall mainly between those two values, and Figure 13 shows that they do: 86% of the songs have tempi between London's boundary values, and 95% have tempi between 80 and 130 bpm. Furthermore, just as London reports maximal pulse salience at roughly 100 bpm, the mean tempo for the corpus is 100.42 bpm. These correspondences suggest that a listener's ability to maintain an embodied response to hip-hop music, be it through dance, toe tapping, or mindful entrainment, may be related to the music's critical or commercial success.
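The percentages quoted in this paragraph are shares of songs whose tempo falls inside a closed window. A small sketch of that tally is given below, using hypothetical tempi rather than the corpus values; the function name is invented for the example, and only the window boundaries are taken from the text.

```python
def share_within(tempi, low, high):
    """Fraction of tempi that fall inside the closed interval [low, high]."""
    inside = [t for t in tempi if low <= t <= high]
    return len(inside) / len(tempi)

# Hypothetical tempi in bpm, NOT the corpus data.
tempi = [78, 88, 92, 95, 99, 101, 104, 110, 118, 126]

in_london_window = share_within(tempi, 86, 120)  # boundaries cited from London (2004)
in_wider_window = share_within(tempi, 80, 130)
mean_tempo = sum(tempi) / len(tempi)
```

Whether the boundary values themselves count as inside the window is an assumption here; with tempi measured to two decimal places, the choice rarely changes the result.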
TEMPO V. LYRIC TOPIC
Correlating tempo with lyrical topic reveals some subtle trends of prevalence. While many groups of songs classified by lyrical topic had mean tempi close to the overall mean, several groups stood out. Songs with lyrical topics such as reminiscence or moral-based storytelling trended toward slower tempi, while songs with topics like partying and humor trended toward higher tempi. These correlations may not seem surprising: up-tempo songs are more suitable for dancing, and thus party-themed lyrics seem apropos. We recall that Krims's (2000, p. 56) "party rap" genre uses faster beats with the goal of "moving the crowd". Conversely, the somber, emotional character of songs about reminiscence suggests that slower tempi are appropriate. Furthermore, if moral-based storytelling aims to impart a clear, unambiguous message, less complicated and slower rapping atop a slower beat might strengthen the clarity and potency of this message.
We have already observed the trend of change toward greater clarity of key in the later years of the corpus. Taking stock of the 78 songs that displayed salient pitch collection reveals the data found in Table 5. Before dissecting this data, it should be noted that certain inferences had to be made with some songs. In some cases, an incomplete number of pitches or harmonies were noted and thus a scale or mode was assigned without 100% certainty. In these and all other situations, labels were decided upon by ear: if a specific pitch collection was perceived, even if it was not completely represented in the transcription, this was deemed a strong enough indicator. Often the missing pitch was scale degree 6, thus making differentiation between Dorian and Aeolian mode difficult.31 Table 5 partitions the pitch collection schemes at larger levels, which yields a more interesting observation: of the 77 songs that were classified as representing a mode or diatonic scale, roughly two-thirds (51 of 77) were in minor modes (those with a lowered scale degree 3): Dorian, Aeolian, Phrygian, or harmonic minor. 42 of these 51 were Dorian or Aeolian.
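The ambiguity between Dorian and Aeolian when scale degree 6 is absent can be made concrete with pitch-class sets. The sketch below is illustrative only: labels in this study were assigned by ear, not by an algorithm, and the function name, the set of candidate modes, and the example pitch content are assumptions made for the illustration.

```python
# Interval patterns in semitones above the tonic for four minor-type collections.
MODES = {
    "Dorian": {0, 2, 3, 5, 7, 9, 10},
    "Aeolian": {0, 2, 3, 5, 7, 8, 10},
    "Phrygian": {0, 1, 3, 5, 7, 8, 10},
    "harmonic minor": {0, 2, 3, 5, 7, 8, 11},
}

def candidate_modes(pitch_classes, tonic):
    """Return the modes whose pitch content includes every observed pitch class.

    pitch_classes: observed pitch classes as integers 0-11 (C = 0).
    tonic: pitch class heard as the home note.
    """
    intervals = {(pc - tonic) % 12 for pc in pitch_classes}
    return sorted(name for name, pattern in MODES.items() if intervals <= pattern)

# Material on A with no sixth scale degree (no F or F sharp): A B C D E G.
without_sixth = candidate_modes({9, 11, 0, 2, 4, 7}, tonic=9)
# Adding F sharp (pitch class 6) supplies the raised sixth degree.
with_raised_sixth = candidate_modes({9, 11, 0, 2, 4, 7, 6}, tonic=9)
```

With the sixth degree missing, both Aeolian and Dorian remain possible; once a raised sixth is heard, only Dorian fits. This is the situation in which a label had to be inferred without complete certainty.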
By investigating parameters of vocal delivery—timbre, tessitura, articulation, and rhythm—we can observe which types of these parameters occur more often (e.g. do MCs more frequently vocalize high or low in their voice), and whether specific combinations of these parameters occur more commonly than others (e.g. does rapping high in the voice typically exhibit a rougher timbre). The abbreviations shown in Figure 14 were used when analyzing vocal delivery parameters. Table 6 lists the most common "vocal profiles"; that is, specific combinations of the four parameters. For instance, the most popular vocal profile includes a gravelly timbre, high tessitura, dull consonant articulation, and rigid vocal rhythm, or in short form, GHDR. To get a sense of what this sounds like, Audio Example 15 contains an excerpt of "Hey Ladies" (Beastie Boys, 1989). Overall, certain aspects of vocal delivery are dominant: gravelly vocal timbre, high and medium tessitura, and rigid vocal rhythm.32 Woods' (2010) observations on connecting projections of masculinity to vocal timbre support this paper's findings: many of hip-hop's lyrical topics are connected with masculine imagery (e.g. braggadocio, financial success, the tough nature of surviving street life, and sexual prowess), and the use of a masculine-sounding, rough timbre might enhance the delivery of these topics.
Several formal characteristics were widely consistent across the corpus. First, of the 69 songs that contained hook sections of consistent length (i.e. each hook section in a song is the same metrical length), 52 (75%) contained eight-measure hooks, 9 (13%) contained four-measure hooks, and the remaining 8 songs had hooks of 6, 9, 10, 12, and 14 measures. The mean value for the number of distinct verses in each song from the corpus was 3.11. 77% of the songs had either three or four verses, while 92% of the songs had two, three, or four verses. As many of the songs in the corpus received significant radio play and were commercial hits, formal consistencies such as eight-measure choruses and three or four verses per song are perhaps not surprising.
To summarize, the trends of prevalence reveal several tendencies across the corpus. Lighter lyrical subject matter seems to correlate with faster tempi, and the converse is also true. Tempi trend toward a mean of 100 bpm, with a median of 99 bpm. Vocal delivery trends towards a prevalence of features such as gravelly timbre, higher-tessitura vocals, and metrically rigid vocal rhythm. Form and tonality trend towards three or four verses per song, usually in minor modes. Similar to trends of change, trends of prevalence reference general sonic parameters. However, instead of relating the Golden Age to its ancestry and legacy, trends of prevalence represent the era's defining characteristics.
Trends of similarity
We have observed trends of change and prevalence through various sonic parameters. These observations were usually represented by a mean value or a percentage: they told us something about the whole corpus, but nothing about any one song in the corpus. Brackett (2016) suggested that the closer one examines two texts, the greater the dissimilarities between them would seem. But if we focus only on the similarities between the two texts (in our case, songs), is that not sufficient for grouping them together in the same repertoire? Recall the case of Toronto hip-hop artists: they can still be considered part of the Toronto sound, even if they do not sound exactly like Drake or rely on "underwater R&B". As long as some convincing element of sonic similarity relates these artists to each other (and/or to Drake), they are all a part of the Toronto sound, even if that element of similarity is not consistent. Thus, the artists are connected through a chain or network of various similarities, rather than by a top-down, all-encompassing similarity.
Measuring similarity between songs in the corpus involves taking any song pair, comparing the musical and production parameter data, and determining, on a numerical scale, how closely similar the two songs are. To do this, we need first to establish whether each analyzed parameter should contribute equally to the similarity measurement. For instance, do two songs with the same tempo sound more similar than two songs with the same chorus type? Parameters that are more aurally salient were given more weight in the similarity computations. Salient parameters included tempo, hook type, lyric subject, instrument list, and vocal production, while parameters such as stereo image width and verse length were considered less salient. With a relative weighting scheme in place, similarity tests could be run on the 4950 unique pairings of the 100 songs, thereby generating a similarity rating for each pair.
To accomplish this task, the daisy function (hereafter Daisy) from the cluster package in the programming language R was used (R Core Team, 2015; Maechler et al., 2015). This function combines different types of data (including numerical and categorical types) into a single computation of similarity rating, and it permits the use of a custom weighting scheme. All 4950 pairwise ratings were combined into a single dissimilarity matrix, which was then used to build a divisive hierarchical clustering, shown as a dendrogram in Figure 15. Divisive hierarchical clustering starts from a single parent group containing every entry and splits it, in stages, into progressively smaller clusters. In this case, Figure 15 shows the similarity-based categorization scheme that the program determined at the level that yields eight groups, as denoted by color. Pairs of songs that lie very close to the left-hand edge of the dendrogram are statistically the most similar. For example, such ultra-similar pairings can be seen between "Hey Ladies" and "Shake Your Rump" (both Beastie Boys, 1989), and between "Shook Ones Part II" (Mobb Deep) and "Flava in Your Ear" (Craig Mack et al.).
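To show how numerical and categorical parameters can be folded into one weighted rating, the following sketch computes a Gower-style dissimilarity in Python. It is a stand-in written for illustration and is not the R code used in this study: the source does not state which metric was selected, and the records, field names, and weights below are hypothetical.

```python
from itertools import combinations

def gower_dissimilarity(a, b, kinds, ranges, weights):
    """Weighted Gower dissimilarity between two records with mixed data types.

    Numeric fields contribute |a - b| / range; categorical fields contribute
    0 for a match and 1 for a mismatch. The result lies in [0, 1].
    """
    total = 0.0
    for key, kind in kinds.items():
        if kind == "numeric":
            part = abs(a[key] - b[key]) / ranges[key]
        else:
            part = 0.0 if a[key] == b[key] else 1.0
        total += weights[key] * part
    return total / sum(weights[key] for key in kinds)

# Hypothetical records and weights, NOT the study's data or weighting scheme.
songs = {
    "song_a": {"tempo": 100, "hook": "chorus", "singing": True},
    "song_b": {"tempo": 104, "hook": "chorus", "singing": True},
    "song_c": {"tempo": 84, "hook": "refrain", "singing": False},
}
kinds = {"tempo": "numeric", "hook": "categorical", "singing": "categorical"}
weights = {"tempo": 2.0, "hook": 2.0, "singing": 1.0}
ranges = {"tempo": max(s["tempo"] for s in songs.values())
                   - min(s["tempo"] for s in songs.values())}

# One entry per unordered pair: n songs give n * (n - 1) / 2 pairs.
dissimilarity = {
    (x, y): gower_dissimilarity(songs[x], songs[y], kinds, ranges, weights)
    for x, y in combinations(sorted(songs), 2)
}
closest_pair = min(dissimilarity, key=dissimilarity.get)
```

With 100 songs the same enumeration yields 100 × 99 / 2 = 4950 pairs. A clustering step would then operate on the resulting table of pairwise values.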
Daisy was run again with the 100 songs, this time using genre data collected from www.acclaimedmusic.net (Franzen, 2016). (This website classifies each song and album according to a genre label system, the origin of which is unclear.) The same hierarchical clustering algorithm was used to form four song groups based on genre similarity, as shown in Table 7. In general, these groups resemble the four genre categories laid out by Krims: group 1 appears to be a mix of Krims' reality rap and jazz/bohemian genres; group 2 resembles his party rap genre; group 3 consists almost exclusively of gangsta rap and G-funk (aspects of Krims' mack and reality genres); while group 4 seems like a 'leftover' group, bearing no consistency with any Krimsian genre. The goal of running this categorization process on both the analyzed data and the Acclaimed Music genre data was to see if the results produced similar groups.
CAN TRENDS OF SIMILARITY BE HEARD?
In general, the eight groups produced by the analyzed data (Figure 15) and the four categories produced by Acclaimed Music's data (Table 7) showed very little correspondence. Figure 16 shows a dendrogram for the four Acclaimed Music groups, using the color of each song title to indicate the eight analyzed data groups. (If the groups were similar, the colors would be clustered together.) This discordance suggests either that the weighting scheme used in this study was erroneous for determining similarity, or that sonic parameters were not used to produce the genre labels found at Acclaimed Music. If the latter holds true, we are presented with a conundrum. If Acclaimed Music categorizes its music by genre, then it would be reasonable to believe that artists or albums in the same genre might sound somewhat similar. Why, then, are the grouping strategies so dissimilar? To be sure, we do not know how Acclaimed Music defines its genre labels, but with such stark differences between the two groupings, the ones generated by this study should be re-evaluated to determine if trends of similarity can, in fact, be "heard".
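The comparison above was made visually, by inspecting how colors cluster in Figure 16. One way such agreement between two groupings could be quantified is the Rand index, sketched below. This measure was not part of the study; the function name and the group labels are hypothetical and serve only to illustrate the idea.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Share of item pairs on which two groupings agree (1.0 = identical).

    A pair counts as an agreement when both groupings put the two items
    together, or both groupings keep them apart.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both groupings must label the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)

# Hypothetical group labels for six songs, NOT the study's groupings.
analysis_groups = ["yellow", "yellow", "black", "black", "red", "red"]
genre_groups = [1, 1, 2, 3, 3, 2]

same = rand_index(analysis_groups, analysis_groups)
mixed = rand_index(analysis_groups, genre_groups)
```

The raw index is not corrected for chance agreement, so values well above zero can arise even between unrelated groupings, particularly when the two schemes use different numbers of groups (eight and four in this case).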
Consider the yellow-colored group of nine songs from Figure 15. These include "How Ya Like Me Now" (Kool Moe Dee, 1987), "I Know You Got Soul" (Eric B. and Rakim, 1987), "Follow the Leader" (Eric B. and Rakim, 1988), "Mama Said Knock You Out" (L.L. Cool J, 1991), "Hey Ladies" and "Shake Your Rump" (Beastie Boys, 1989), "They Want EFX" (Das EFX, 1992), "Vapors" (Biz Markie, 1988), and "Wild Thing" (Tone Loc, 1989). Audio Example 16 chains together short clips of each of these songs. With perhaps the exception of "Wild Thing", each song in this group shares aurally salient similarities with another: six have very similar tempi, eight have eight-measure choruses, each has a vocal profile almost exactly the same as that of at least one other song in the group, four are in the Dorian mode, six have refrains, all have relatively low levels of dynamic compression, and they all mainly comprise complex combinations of sampled and machine-produced drum sounds. These features are not unique to the yellow group, but their specific combination in these songs caused Daisy to group them as maximally similar. Audio Example 17 assembles the black-colored songs from Figure 15. Again, we can hear convincing similarities between these songs, which were all released in 1995 and 1996. (This is particularly notable since the similarity ratings generated in this study did not consider release year.)
However, even in the small groups presented here, the elements of similarity between songs are not wholly consistent. If the groups were made larger, the similarities connecting songs would become even more diverse. Trends of similarity do not necessarily identify stylistic norms across a whole repertoire. (Of course, the fact that all songs in this corpus have vocals and beats is itself a trend of similarity, but a rather useless one to consider here.) While a song in a repertoire may not sound like all other songs in that repertoire, it will sound similar to one or several other songs. Thus, trends of similarity are an important component of the definition of a repertoire's sound because they represent the actual individual songs that make up that sound.
This paper presents an attempt to codify the sound of Golden Age hip-hop through corpus-driven analysis. Using eight "best-of" lists (five of which branded themselves as "best of all time"), a corpus of 100 songs was developed. The 100 songs were analyzed across an extensive set of aurally-salient musical and production parameters. The analysis data was then converted into numerical or binary values, a format that is more amenable to statistical analysis. Statistical analysis revealed three trend categories.
Trends of change reflect the wholesale sonic shifts that occurred during the Golden Age. The most notable trends of change occurred in the parameters of tempo, form, instrumentation, lyrics/vocals, and loudness/compression. In general, song tempi became increasingly slower, with the mean tempo per year falling from a high of 112.71 bpm in 1986 to a low of 87 bpm in 1995. Over the 11-year span covered by the corpus, the use of samples also dropped notably, which might be due to the increased legal attention given to hip-hop production during this time. Singing, choruses, and profanity all increased in use over time. Finally, overall loudness and compression increased as a result of the ongoing loudness war between record producers.
Trends of prevalence were observed where specific parameters held a consistent value or distribution across a significant portion of the corpus. These trends were observed in the parameters of tempo, tempo vs. lyrical topics, harmonic organization, and vocal profiles. In general, song tempi clustered around a mean of approximately 100 bpm, and over 85% of songs had tempi within London's (2004) preferred tempo window of 86–120 bpm. Songs with lighter lyrical topics tended to exhibit faster tempi than songs with serious topics. When measured according to timbre, articulation, rhythm, and tessitura, vocal profiles more frequently included a gravelly vocal timbre, high or medium tessitura, and dull consonant articulation. Finally, just over half of the songs in the corpus (52) contained eight-measure hooks, and more than three-quarters had three or four verses. Even though the aforementioned trends of change suggest that the Golden Age sound was a transitional one, these trends of prevalence show that several parameters remained quite constant throughout this era.
Trends of similarity were identified on various hierarchical levels, beginning with the entire corpus and extending down to pairs of songs. In contrast with trends of change and prevalence, trends of similarity are defined at the local, song-to-song level. This category relies on the logic that although not all songs in a repertoire may sound alike, a song's similarity with one or more other songs is perhaps a sufficient criterion for it to be considered part of that repertoire's sound.
This research is preliminary in nature. Hip-hop music has been given close academic attention only recently in the music-theory community; large-scale, corpus-driven studies currently advance the field in their effort to establish consensus on the stylistic norms of hip-hop music. The data generated and analyzed in this study could be used for further research in several meaningful ways. First, it can be used to demonstrate idiosyncrasies in the stylistic practices of certain artists active in the Golden Age. Second, machine learning could be utilized to further compare the analyzed data with genre labels generated by music databases and retailers. Using the analyzed data, a computer program could determine a weighting scheme that generates a grouping as similar as possible to Acclaimed Music's grouping. If successful, we might better learn what parameters influence how Acclaimed Music determines hip-hop genres. Third, this study did not address the role played by the sampled recordings themselves in defining the Golden Age sound; specifically, the breakbeat samples which were ubiquitous in this era. It also did not investigate technological advances in sampling methodology or drum machine construction. Due consideration of these hip-hop production aspects could provide an even more thorough description of the Golden Age sound.
The most intriguing future direction this research could take involves running listening tests with human subjects. Since the data for this study was collected manually and by ear while the similarity analysis was performed by computer, it could be telling to see how human listeners discern similarity in the corpus. Future research could include developing simple listening tests wherein subjects are exposed to short fragments of the songs in the corpus and asked to rate pairs of these fragments according to similarity. Their results could then be compared with the similarity ratings generated by Daisy from the manually entered analysis data. This test would provide insight into which parameters are most salient to listeners when evaluating similarity between songs.
The significance of this study can be expressed in terms of three broad categories. First, its contribution to the growing body of analytical work on hip-hop helps further our understanding of this music's stylistic roots and its evolution to the present day. Second, this study considers an often-overlooked aspect of hip-hop music: the in-studio contributions of the recording engineer and record producer. Some of the most salient characteristics of hip-hop music are not produced by its artists, but instead are the audio effects, loudness boosts, stereo imaging, and panning added to the track by the engineer and producer. Overlooking these aspects of hip-hop's sound does the stylistic analysis of this repertoire a great disservice. Third, this study presents a labor-intensive but logical method for describing the sound of a body of repertoire, be it of a genre, era, or locale. This method revolves around the analysis of general sonic characteristics in a musical corpus and the identification of low-level similarities among songs in that corpus. Considering either of these steps alone furnishes an incomplete definition of a repertoire's sound, just as considering style alone furnishes an incomplete definition of genre.
The Golden Age was a diverse and fluid period in hip-hop's history, and this study reveals some of the trends of change, prevalence, and similarity that were present in the hip-hop music of that era. Because these trends are drawn from sonic parameters that are aurally salient to a listener, they offer a way of defining the Golden Age hip-hop sound by the songs that constitute it.
The authors would like to thank the Centre for Interdisciplinary Research in Music Media and Technology (CIRMMT) at McGill University for funding this research.
- The origin of "Golden Age" as applied to hip-hop is unclear, but the term was used as far back as the mid nineties, when a Rolling Stone review of Behind Bars (Slick Rick, 1994) referred to "rap's '86–'89 Golden Age, when it seemed that every new single reinvented the genre" (Coker, 1995). That this review occurred during the timespan covered by this study suggests that any definition of the Golden Age is, at best, accurate insofar as it suits its own needs.
- Allan Moore's soundbox model (2012, pp. 29–44) addresses some of the considerations one must take into account when analyzing recorded music, including several, but not all, of the production parameters discussed in this study.
- Held in New York City, the 1995 Source Awards were the boiling point of the East-coast vs. West-coast hip-hop rivalry which led to the deaths of Tupac Shakur and the Notorious B.I.G. Cantor (2015) chronicles the 1995 award ceremony through an interview with Source Magazine's then-owners, David Mays and Ray "Benzino" Scott.
- The website notes, incorrectly, that The Chronic was released in 1993.
- See Caramanica (2005).
- McLeod and DiCola (2011, p. 5) refer to a "Golden Age of sampling", which extends from the late eighties to early nineties. Weinstein (2007, p. 341) suggests 1988–1993, and Steinberg et al. (2006, p. 361) suggest 1987–1993, citing this era as the period when the musical and lyrical elements of rap music began to overshadow other aspects of hip-hop culture such as breakdancing and graffiti art. Williams (2013, p. 47) suggests 1986–1993, noting gangsta rap's hegemony over other styles of hip-hop music as signaling the Golden Age's end.
- Raising Hell (Run–D.M.C., 1986) and Licensed to Ill (Beastie Boys, 1986) are universally lauded as pillars of early hip-hop music, instrumental in the expansion of hip-hop into the mainstream. According to complex.com, Licensed to Ill has sold 9 million copies and remains today the eighth best-selling hip-hop album of all time.
- Pachet and Cazaly (2000) describe and criticize the rigor of genre taxonomies used by physical and online music retailers. Perhaps reflecting the perils of detailed genre classification, the Library of Congress Subject Headings (2016) lists just six "narrower terms", or sub-genres, of rap music: bounce, Christian rap, crunk, gangsta rap, go-go, and ragga. Of these six, only gangsta rap and crunk appear with any frequency in the leading academic discourse on rap. Religion in rap receives some attention, but not as a separate musical genre per se.
- Machine-led genre classification studies aim to be objective in their efforts to taxonomize genre. Pachet and Cazaly (2000) focused on similarity in automatic genre classification. McKay and Fujinaga (2004) analyzed musical parameters drawn from MIDI data, stressing high-level features over low-level ones. Bergstra et al. (2006) used feature-extraction programs to analyze very short segments of audio. Tzanetakis and Cook (2002) categorized genre using timbre, rhythm, and pitch feature sets.
- This taxonomy is presented in chapter 2 of Krims (2000, pp. 46–92). Other publications (Keyes, 2002 and Hess, 2010) also discuss subgenres of hip-hop, but in a less systematic manner, only where required to support other arguments.
- Krims (2000, p. 89) confirms this by stating that "genres are constantly shifting entities, guidelines at best".
- De Clercq and Temperley (2011) used the 20 top-ranked songs from each decade to form their corpus. With a sample size of two songs per year on average, they were able to show large-scale trends of prevalence, and due to their long timespan, they were able to show trends of change. With a shorter timespan, more songs per year would be required to identify any meaningful trends.
- Admittedly, so do several of the "best-of" lists used to generate the corpus (see note 15). Though developed independently of Billboard statistics, the corpus does contain many tracks which made it onto the Billboard Hot Rap Songs charts between 1989 and 1996.
- Burgoyne (2012) stresses the importance of using proper sampling methodology when developing a representative song corpus from Billboard statistics.
- The three time-specific lists do not significantly skew the content of the corpus. The first list has 1985–1996 as its boundaries. The second list also ends at 1996. The third list considers releases from the nineties. Thus any songs released after 1996 were discarded. Had these lists not been included, and only the five "all time" lists been consulted with the same culling criteria, 81 of the 100 songs would have still been selected for the corpus.
- Stephenson (2002) and Moore (2012) have led the efforts to fill this void, the latter insisting on doing so only as a starting point for further consideration of recorded music's meaning to its listeners.
- See Stephenson (2002, p. xiii) and Moore (2012, p. 2).
- The application used can be found at www.tempotap.com.
- Measure lengths were all determined in the same way: each measure comprises four beats of the tactus (hence a 4/4 measure), the same tactus that was used to determine tempo.
- In addition, various aspects of textural and harmonic change between formal sections were analyzed: for instance, whether samples or instruments were added when a verse yielded to a hook section, and whether these additions bore any effect on the harmonic progression. In the end, these parameters, though interesting, were not deemed salient enough to be given any weight in the analysis of trends of similarity.
- Edwards (2009) also explores the categorization of subject matter in hip-hop lyrics, citing five topics (real-life, fictional, controversial, conscious, and club/party), and five forms (braggadocio/battling, conceptual, story, abstract, and humorous).
- The analyses of these aspects are admittedly basic but are based on aural salience. Kautny (2015) and Adams (2009, 2015) have investigated articulation, Ohriner (2016) and Condit-Schultz (2016) have done more detailed work on rhythm, and Condit-Schultz has also investigated pitch.
- The idea for these two descriptors of vocal rhythmic style stems from Krims' (2000, pp. 49–51) concepts of "percussion-effusive", "speech-effusive", and "sung" styles of rap flow.
- Profanity was measured by scanning all lyrics for the presence of four words: F*ck, sh*t, bi*ch, and ni**a.
- Instruments heard only once in the song were deemed unimportant for analysis and thus discarded. Adams (2015) agrees that these instruments do not form an important component of a hip-hop song's identity.
- For example, "Paid in Full" (Eric B. and Rakim, 1987) contains only one long verse, and thus has no space for interposing sections.
- Powell-Morse's (2014) study only surveys five albums per year. Notable early instances of profanity-laden albums not from the West coast are Grip it! On That Other Level (Ghetto Boys, 1989) and Live and Let Die (Kool G Rap, 1992) originating from Houston and New York, respectively.
- For a review of the "loudness war" as a psychoacoustic phenomenon, see Vickers (2010).
- The legal and cost difference between interpolation and sampling can be summarized as follows: when holding a mechanical license, a producer can re-record / reproduce the music specified under that license, which lies under the copyright of the songwriter(s). To physically sample the actual recording requires both the mechanical license and additional licensing from the copyright owner of the recording (typically the record label). Thus, the dual licensing required for sampling costs more than securing only the mechanical licensing to reproduce the music with instruments.
- Aeolian mode represents the commonly-named natural minor scale. Raised scale degree 6 differentiates Dorian from Aeolian mode. Therefore, if this scale degree was missing from a song, it would be difficult to accurately determine which mode better describes the song's pitch collection.
- Though no significant trend of prevalence with respect to enunciation was noted, Edwards (2009, pp. 244–246) states that many MCs stress the importance of clear enunciation to avoid perceived ambiguity in their lyrics.
- Adams, K. (2008). Aspects of the Text/Music Relationship in Rap. Music Theory Online, 14(2).
- Adams, K. (2009). On the Metrical Techniques of Flow in Rap Music. Music Theory Online, 15(5).
- Adams, K. (2015). The Musical Analysis of Hip-Hop. In J. Williams (Ed.), The Cambridge Companion to Hip-Hop (pp. 118–134). Cambridge, UK: Cambridge University Press. https://doi.org/10.1017/CCO9781139775298.012
- Bergstra, J., Casagrande, N., Erhan, D., Eck, D., & Kégl, B. (2006). Aggregate Features and AdaBoost for Music Classification. Machine Learning, 65(2), 473–484. https://doi.org/10.1007/s10994-006-9019-7
- Brackett, D. (2016). Categorizing Sound: Genre and Twentieth-Century Popular Music. Berkeley, CA: University of California Press.
- Burgoyne, J. A. (2012). Stochastic Processes and Database-Driven Musicology. Unpublished doctoral dissertation, McGill University, Canada.
- Burgoyne, J. A., Wild, J., & Fujinaga, I. (2013). Compositional Data Analysis of Harmonic Structure in Popular Music. In J. A. Burgoyne, J. Wild, & J. Yust (Eds.), Mathematics and Computation in Music: 4th International Conference, MCM 2013, Montreal, Canada, June 12–14, 2013, Proceedings. Berlin, DE: Springer. https://doi.org/10.1007/978-3-642-39357-0_4
- Cantor, P. (2015). How the 1995 Source Awards Changed Rap Forever. Retrieved from Complex.com: http://ca.complex.com/music/2015/08/how-the-1995-source-awards-changed-rap-forever.
- Caramanica, J. (2005). Hip-Hop's Raiders of the Lost Archives. Retrieved from The New York Times: http://www.nytimes.com/2005/06/26/arts/music/hiphops-raiders-of-the-lost-archives.html.
- Coker, C. H. (1995). Slick Rick: Behind Bars. Retrieved from Rolling Stone: https://web.archive.org/web/20100202153447/http://www.rollingstone.com/artists/slickrick/albums/album103326/review/5945316/behind_bars.
- Condit-Schultz, N. (2016). MCFlow: A Digital Corpus of Rap Transcriptions. Empirical Musicology Review, 11(2), 124–147. https://doi.org/10.18061/emr.v11i2.4961
- De Clercq, T. (2012). Sections and Successions in Successful Songs: A Prototype Approach to Form in Rock Music. Unpublished doctoral dissertation, University of Rochester, USA.
- De Clercq, T., & Temperley, D. (2011). A Corpus Analysis of Rock Harmony. Popular Music, 30(1), 47–70. https://doi.org/10.1017/S026114301000067X
- Demers, J. (2006). Steal This Music: How Intellectual Property Law Affects Musical Creativity. Athens, GA & London, UK: University of Georgia Press.
- Edwards, P. (2009). How to Rap: The Art and Science of the Hip-Hop MC. Chicago, IL: Chicago Review Press.
- Fabbri, F. (1999). Browsing Music Spaces: Categories and the Musical Mind. Retrieved from http://www.tagg.org/others/ffabbri9907.html.
- Franzen, H. (2016). Acclaimed Music Full Genre Tree. Retrieved from Acclaimed Music: www.acclaimedmusic.net.
- Golden Age (2016). Retrieved from AllMusic.com: http://www.allmusic.com/subgenre/golden-age-ma0000012011.
- Heidemann, K. (2016). A System for Describing Vocal Timbre in Popular Song. Music Theory Online, 22(1).
- Hess, M. (2010). Hip Hop in America: A Regional Guide. Santa Barbara, CA: Greenwood Press.
- Kautny, O. (2015). Lyrics and Flow in Rap Music. In J. Williams (Ed.), The Cambridge Companion to Hip-Hop (pp. 101–117). Cambridge, UK: Cambridge University Press. https://doi.org/10.1017/CCO9781139775298.011
- Keyes, C. (2002). Rap Music and Street Consciousness. Urbana, IL: University of Illinois Press.
- Krims, A. (2000). Rap Music and the Poetics of Identity. Cambridge, UK: Cambridge University Press.
- Lee, A. (2016). Drake's Incomplete Views from the Six. Retrieved from Maclean's Magazine: http://www.macleans.ca/culture/arts/drakes-incomplete-views-from-the-six.
- Library of Congress Subject Headings. (2016). Retrieved from The Library of Congress: https://www.loc.gov/aba/publications/FreeLCSH/R.pdf.
- London, J. (2004). Hearing in Time: Psychological Aspects of Musical Meter. Oxford, UK & New York, NY: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780195160819.001.0001
- Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., Hornik, K., Studer, M., Roudier, P., & Gonzalez, J. (2015). Cluster: Cluster analysis basics and extensions. R package version 2.0.3. Vienna, AT: The R Foundation for Statistical Computing.
- Martin, D., & Massenburg, G. (2015). Advanced Technical Ear Training: Development of an Innovative Set of Exercises for Audio Engineers. Paper presented at the 139th Convention of the Audio Engineering Society, New York, NY.
- McKay, C., & Fujinaga, I. (2004). Automatic Genre Classification as a Study of the Viability of High Level Features for Music Classification. In Proceedings of the International Computer Music Conference (pp. 367–370). Ann Arbor, MI: Michigan Publishing.
- McLeod, K., & DiCola, P. (2011). Creative License: The Law and Culture of Digital Sampling. Durham, NC & London, UK: Duke University Press. https://doi.org/10.1215/9780822393528
- Meyer, L. (1973). Explaining Music: Essays and Explorations. Berkeley, CA: University of California Press.
- Moore, A. (2012). Song Means: Analysing and Interpreting Recorded Popular Song. Burlington, VT: Ashgate.
- Ohriner, M. (2013). Groove, Variety, and Disjuncture in the Rap of Kanye West, Eminem, and Andre 3000. Paper presented at the Annual Meeting of the Society for Music Theory, Charlotte, NC.
- Ohriner, M. (2016). Metric Ambiguity and Flow in Rap Music: A Corpus-Assisted Study of Outkast's 'Mainstream' (1996). Empirical Musicology Review, 11(2), 153–179. https://doi.org/10.18061/emr.v11i2.4896
- Pachet, F., & Cazaly, D. (2000). A Taxonomy of Musical Genres. In Proceedings of the Content-Based Multimedia Information Access Conference (RIAO). Paris, FR: CID.
- Pitchtech. (2016). DRC-Meter [computer software]. Retrieved from www.pitchtech.ch/DRC-Meter/index.html.
- Powell-Morse, A. (2014). The Best F*cking Article You'll Read Today: Profanity in Rap Lyrics Since 1985. Retrieved from Best Tickets: http://www.besttickets.com/blog/rap-profanity.
- Quesnel, R. (2009). Timbral Ear Training: A Computer-Assisted Method for Training and Researching Timbre Memory and Evaluation Skills. Saarbrücken, DE: VDM Verlag Dr. Müller Aktiengesellschaft & Co. KG.
- R Core Team. (2015). R: A Language and Environment for Statistical Computing [computer software]. Vienna, AT: R Foundation for Statistical Computing.
- Rollefson, J. G. (2015). 'Got a Freaky, Freaky, Freaky, Freaky Flow': Theorizing 'Illness' in Hip Hop. Paper presented at the Annual Meeting of the American Musicological Society, Louisville, KY.
- Schloss, J. (2004). Making Beats: The Art of Sample-Based Hip-Hop. Middletown, CT: Wesleyan University Press.
- Steinberg, S., Parmar, P., & Richard, B. (2006). Contemporary Youth Culture. Westport, CT: Greenwood Publishing Group.
- Stephenson, K. (2002). What to Listen for in Rock: A Stylistic Analysis. New Haven, CT: Yale University Press. https://doi.org/10.12987/yale/9780300092394.001.0001
- Tagg, P. (2013). Music's Meanings: A Modern Musicology for Non-Musos. New York, NY & Huddersfield, UK: The Mass Media Music Scholars' Press, Inc.
- Tzanetakis, G., & Cook, P. (2002). Musical Genre Classification of Audio Signals. IEEE Transactions on Speech and Audio Processing, 10(5), 293–302. https://doi.org/10.1109/TSA.2002.800560
- Vickers, E. (2010). The Loudness War: Background, Speculation, and Recommendations. Paper presented at the 129th Convention of the Audio Engineering Society, San Francisco, CA.
- Weinstein, S. (2007). Nas. In M. Hess (Ed.), Icons of Hip Hop: An Encyclopedia of the Movement, Music, and Culture (pp. 341–363). Westport, CT: Greenwood Publishing Group.
- Williams, J. (2013). Rhymin' and Stealin'. Ann Arbor, MI: University of Michigan Press. https://doi.org/10.3998/mpub.3480627
- Winkler, P. (1997). Writing Ghost Notes: The Poetics and Politics of Transcription. In D. Schwarz, A. Kassabian, & L. Siegel (Eds.), Keeping Score (pp. 169–203). Charlottesville, VA: University of Virginia Press.
- Woods, A. (2010). Vocal Practices and Constructions of Identity in Rap: A Case Study of Young Jeezy's 'Soul Survivor'. In N. Biamonte (Ed.), Pop-Culture Pedagogy in the Music Classroom (pp. 265–280). New York, NY: Scarecrow Press.
Examples and Appendices available at: http://hdl.handle.net/1811/81126
- Appendices PDF: contains Appendix A and Appendix B referenced in the paper.
- Tables and Figures PDF: contains all tables and figures (with the below exceptions).
- Figure 15: provided separately (too large to fit in Tables and Figures PDF); caption found in Tables and Figures PDF.
- Figure 16: provided separately (too large to fit in Tables and Figures PDF); caption found in Tables and Figures PDF.
Audio Examples available at: http://hdl.handle.net/1811/81126
- Contains all audio examples (in .mp3 format) referenced in the paper. Many of these are also referenced in Tables and Figures PDF where relevant to a specific table or figure.