In "MCFlow: A Digital Corpus of Rap Transcriptions," Nathaniel Condit-Schultz provides researchers with a robust data set that will be invaluable for empirical research into rap music for years to come. In later portions of the article, Condit-Schultz shows how that research might proceed, documenting how emcees' flow changed as rap music rose to prominence in the 1980s and 1990s. Many of Condit-Schultz's findings (e.g., increasing rhyme density, decreasing tempo) mainly confirm what other scholars have already intuited (e.g., Krims, 2001; Adams, 2009; Kautny, 2015). But because of the wide range of features Condit-Schultz encodes, he is able to make several novel discoveries. There is much I admire in this article, and I'll highlight some of that below. But I also want to raise a few concerns for a rapidly growing area of study. These concerns focus on Condit-Schultz's sampling method and his representation of accent and pitch.

I begin with praise: Condit-Schultz's careful consideration of several issues should be a model for future work. He presents the fullest discussion to date of the phenomenological status of the first instance of a rhyme in a rhyme chain. Other researchers, myself included, discard this issue by assuming an out-of-time stance on the part of the listener. The historical change in the forms of rap songs is similarly a novel finding and invites comparison to other genres. And his painstaking phonetic description (where others, including myself, rely unthinkingly on digital phonetic dictionaries) grounds his discussion of rhyme in the sounds the listener hears, not the conventional pronunciation of the lyrics.

On Rhyme

This last technique should be adopted widely. Many scholars of rap music, especially in the lyric poetry tradition, wrestle with the status of "near rhyme" or "slant rhyme" in rap lyrics. Holtman (1996) provides equivalence classes that group consonants of similar manner of articulation, but different place of articulation. Hirjee and Brown (2010) mitigate the issue by constructing a probabilistic model of rhyme similarity. But a good portion of the confusion is that emcees routinely alter the pronunciations of words in order to create greater phonetic similarity. Eminem (Cooper, 2010) discusses this kind of coercion:

Eminem: People say that the word 'orange' doesn't rhyme with anything. That kinda pisses me off because I can think of a lot of things that rhyme with 'orange.'

Anderson Cooper: What rhymes with 'orange?'… I'm trying to think…

Em: If you're taking the word at face value, and you just say /ɔ́rndʒ/, nothing is gonna rhyme with it exactly. If you enunciate it and make it more than one syllable, like /ɔ́rɪndʒ/, uh, "I put my orange [/ɔ́rɪndʒ/] four inch [/fɔ́rɪntʃ/] door hinge [/dɔ́rhɪndʒ/] in storage [/stɔ́rɪdʒ/] and ate porridge [/pɔ́rɪdʒ/] with George [/dʒɔ́rɪdʒ/]." 1

In a phonetic dictionary, like the Carnegie Mellon University Pronouncing Dictionary (Lenzo & Rudnicky, 2007), many of these words do not rhyme; they won't rhyme in a probability model either. In Eminem's performance of them, they do rhyme, and exactly so, because of the divergence between how the words are pronounced "at face value" and how they are pronounced in his expressive performance. Because Condit-Schultz makes phonetic transcriptions of the lyrics himself, his data set might, with future work, allow for a better understanding of how emcees make rhymes through altered pronunciations.
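
The contrast between the two pronunciations can be made concrete by comparing the vowel sequence of each word as performed. What follows is a minimal sketch, not Condit-Schultz's encoding: it takes the IPA transcriptions quoted above, drops the stress mark, and keeps only the vowels. The inventory `VOWELS` and the helper name `vowel_skeleton` are assumptions made for illustration and cover only the symbols in this one example.

```python
import unicodedata

# Assumption: only the vowel symbols that occur in the quoted transcriptions.
VOWELS = {"ɔ", "ɪ"}

def vowel_skeleton(ipa: str) -> str:
    """Return the vowels of an IPA string, in order, without stress marks."""
    # NFD keeps the base letter separate from its combining acute accent,
    # so the accent is filtered out along with the consonants.
    decomposed = unicodedata.normalize("NFD", ipa)
    return "".join(ch for ch in decomposed if ch in VOWELS)

# Transcriptions as given in the Eminem quotation above.
performed = {
    "orange": "ɔ́rɪndʒ",
    "four inch": "fɔ́rɪntʃ",
    "door hinge": "dɔ́rhɪndʒ",
    "storage": "stɔ́rɪdʒ",
    "porridge": "pɔ́rɪdʒ",
    "George": "dʒɔ́rɪdʒ",
}
face_value_orange = "ɔ́rndʒ"

skeletons = {word: vowel_skeleton(ipa) for word, ipa in performed.items()}
# True when every performed form shares one vowel skeleton.
all_match = len(set(skeletons.values())) == 1
```

Under these assumptions every performed form reduces to the same two-vowel skeleton, while the face-value form of "orange" has only one vowel. The sketch ignores consonants (for example, /ndʒ/ versus /ntʃ/), so it captures the shared vowels only.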

At the same time, distinctions between existing work on rap (Ohriner, 2013b; Alim, 2003) and Condit-Schultz's call for a more careful consideration of rhyme structures in rap lyrics. Condit-Schultz annotates rhyme between "rake" and "rock," an example of "frame rhyme" or "pararhyme." 2 This is not a type of rhyme I annotate in rap verses. It would seem to be a personal preference that Condit-Schultz has included it and I have not. "Standard" rhymes (those repeating the vowel and final consonant) occur much more often in rap lyrics than in, say, prose writing. I wonder if the same can be said for "frame rhymes" or other kinds of rhymes. 3

On Sampling

Because Condit-Schultz is interested in documenting historical trends in flow, he has tried to create a widely representative collection of rapping. And following best practices from fields with more experience in corpus research (e.g., computational linguistics), Condit-Schultz is right to employ a sampling method. All too often musical corpus studies are population studies that present themselves as sample studies (Biber, 1990). For instance, when Quinn and Mavromatis (2011) or I (Ohriner, 2013a) study the Bach chorales, we imply that results from the population of Bach chorales are informative for the population of Eighteenth- and Nineteenth-Century European music. The chorales are by no means a random sample of that population.

Yet I worry that the Billboard Hot 100 is analogous to the Bach chorales in this respect: a small population imputed as a sample of a large population. While Condit-Schultz views the Billboard list as "the collective choices of millions of consumers," it might instead reflect the marketing success of a handful of influential record labels and the personal preferences of radio DJs. These preferences create narratives of rap history (for example, that the 1990s produced only "gangsta rap") that are counterproductive to a full accounting of the genre. And I fear the correlation between the Billboard chart and the listening preferences of rap consumers is especially tenuous at moments of technological shift. For example, the Billboard 200 albums chart has, since 2014, included downloads, streaming, and online purchases, technologies that influenced rap music for at least a decade prior (Billboard Staff, 2014). The broader population of rapping extends to the deep cuts of ignored albums, the brief features on contemporary pop songs, the mixtapes of new and established artists, and even the ephemeral freestyles of battling emcees. 4 The population also extends across many "stylistic registers," to borrow a term from Robert Hatten (1994). These include the flows of club tracks as well as displays of phonetic and poetic virtuosity.

Of course, structuring this population in a way that enables sampling is a very complex task, and one Condit-Schultz shouldn't be expected to complete. But there is low-hanging fruit that might give a more refined picture of flow. In particular, I would advocate the inclusion of "features," verses of rapping on tracks by artists of other genres. What aspects of an emcee's flow change when they're performing for a more mainstream audience? Do different genres encourage similar behavior among emcees? At this point, more than three decades into its history, rap is not a monolithic genre and our corpus studies might try to capture its variety. 5

Even taking Condit-Schultz on his own terms, that he is interested in constructing a corpus of the most commercially successful emcees, he still invites some strange quirks because he transcribes every verse on a track, not just those by the artist of interest. Thus his 30 "highly successful rappers" account for only 60% of the corpus. The other 40% are emcees of unknown reputation who happen to be on the successful emcees' tracks. And since tracks have different numbers of verses, he ends up with very different amounts of rapping for different artists, a disparity that, in my view, muddles some of his later comparisons. While it is interesting that Eminem exhibits higher entropy than The Black Eyed Peas, and this statement conforms to my understanding of their music, the corpus contains more than five times as many measures of Eminem as of The Black Eyed Peas. And since the corpus contains only 3–5% of the output of these artists (namely their most successful tracks), I feel a nagging skepticism regarding the claim.
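
The worry about unequal sample sizes can be illustrated with a short sketch. Plug-in entropy estimates tend to run lower for smaller samples, so one hedge is to compare artists on random subsamples of equal size. The code below is an illustration under stated assumptions, not the procedure used in the article: the function names, the seed, the number of trials, and the toy event sequences are all hypothetical stand-ins for whatever events the entropy is computed over.

```python
import math
import random
from collections import Counter

def shannon_entropy(events):
    """Plug-in Shannon entropy (in bits) of a non-empty sequence of events."""
    counts = Counter(events)
    total = len(events)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def equal_size_entropy(events, size, trials=200, seed=0):
    """Mean plug-in entropy over random subsamples of a fixed size."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = [shannon_entropy(rng.sample(events, size)) for _ in range(trials)]
    return sum(estimates) / trials

# Hypothetical toy data: one artist has five times as many events as the other.
large_artist = list("abcdabcdabcdabcdabcd" * 5)  # 100 events
small_artist = list("aabbaabbccaabbaabbcc")      # 20 events

# Compare both artists at the size of the smaller sample.
n = min(len(large_artist), len(small_artist))
fair_large = equal_size_entropy(large_artist, n)
fair_small = equal_size_entropy(small_artist, n)
```

Subsampling discards data from the better-represented artist, which is a cost; the benefit is that any remaining difference is less likely to be produced by sample size alone.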

On Feature Encoding

Setting aside which verses or songs should be part of a corpus of rap music, I feel some issues of representation need to be addressed before the corpus is widely used. Chief among these is the nature of accent (or stress) in rapping. In phonetics research, the status of stress in spoken English remains contested, both with regard to the number of levels of stress and their relation to phonetic features (Liberman & Prince, 1977; Hall & Kleinhenz, 1999; Selkirk, 1986; Hayes, 1995). Condit-Schultz appears to annotate syllabic stress intuitively by combining the stress some syllables obtain through their status as accented syllables of multisyllabic words with the stress some syllables obtain through their relatively higher pitch height. The problem is that a lot of rapping (though not the excerpt in his Figure 4) consists of long strings of monosyllables in an audio environment in which determining pitch height is not possible. In my view, a more formal method for annotating accent is needed; I describe my own algorithm in the article in this issue at the end of Section II and include its implementation in Appendix 1. 6

On Historical Trends and Individual Style

Setting aside issues of sampling and representation, I find many of Condit-Schultz's analyses convincing and useful, though I want to highlight two issues. The first is the kind of story Condit-Schultz wants to tell with this data. Consider Figure 7, which shows rhyme density over time. The linear regression shows rhyme density rising, but at first glance it would seem that a curve could describe the data as well as a line: rhyme density peaked around the turn of the century and has since declined. The linear regression conforms to the standard narrative in rap scholarship, that rhyme density and other features of complexity increased as the Old School emcees of the 1980s gave way to New School emcees in the 1990s (Krims, 2001; Alim, 2003), and indeed the increase from 1982 to 2002 is much sharper than that from 2002 to 2014. So are contemporary emcees less complex than those at the turn of the millennium? In my view, rap music has blossomed into a more diverse genre in the past 15 years. Some current emcees are clearly uninterested in complex flows; others are crafting the most sophisticated verses yet heard. A corpus composed only of Billboard Hot 100 songs won't capture this diversification of the genre. In this case, the narrowness of the sampling paints, in my view, an unrepresentative statistical picture.
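
The choice between a line and a curve for data like that in Figure 7 could be checked by fitting both and comparing them with a penalty for the extra parameter. The following sketch does this with ordinary least squares and the Akaike information criterion (AIC). It is not the analysis from the article: the data points are invented for illustration (years are written as offsets from 1998), and all function names are hypothetical.

```python
import math

def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first."""
    n = degree + 1
    # Normal equations (X^T X) c = X^T y, built from power sums.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs

def residual_sum_of_squares(xs, ys, coeffs):
    """Sum of squared differences between observed and fitted values."""
    def predict(x):
        return sum(c * x ** k for k, c in enumerate(coeffs))
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))

def aic(rss, n_points, n_params):
    """AIC for a least-squares fit, up to an additive constant."""
    return n_points * math.log(rss / n_points) + 2 * n_params

# Invented data: year offsets from 1998 and made-up rhyme densities.
years = [-16, -12, -8, -4, 0, 4, 8, 12, 16]
density = [0.8, 1.0, 1.2, 1.4, 1.5, 1.5, 1.4, 1.35, 1.3]

line = fit_polynomial(years, density, 1)
curve = fit_polynomial(years, density, 2)
rss_line = residual_sum_of_squares(years, density, line)
rss_curve = residual_sum_of_squares(years, density, curve)
aic_line = aic(rss_line, len(years), 2)
aic_curve = aic(rss_curve, len(years), 3)
```

A quadratic can never have a larger residual sum of squares than a line fit to the same points, which is why the comparison uses AIC rather than raw fit. A lower AIC for the curve would support the rise-then-decline reading; a lower AIC for the line would support the standard narrative.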

The second issue of note is Condit-Schultz's repeated finding of more variability among the songs of a single artist than between artists. I don't dispute this: rap music has always been a collection of genres rather than a monolithic one, as Adam Krims (2001) so convincingly demonstrated. Between these genres (e.g., the braggadocio, the slow jam) one should expect the flow of an artist to change. At the same time, it is a discouraging finding. Scholars and fans alike believe that something akin to an artist's DNA is encoded in their flow, whatever style they're rapping. How large would a corpus like MCFlow need to be in order to capture individual style across songs? Condit-Schultz has assembled an impressive collection of data, but for this task it nevertheless seems not big enough.

In conclusion, Condit-Schultz has embarked on an impressive research agenda, one that he and others will need to continue. The challenge for researchers is to expand and collate corpora of rap music even if consensus on issues of sampling and representation proves elusive. At the same time, Condit-Schultz has already detailed the many summary statistics of flow in rap music, and future work will need to engage more sophisticated measures of rap style. But Condit-Schultz has shown a path forward.


  1. Eminem has practically made a cottage industry of long rhyme classes featuring "orange," heard in both "Business" from the 2002 release The Eminem Show and "Brainless" from the 2013 release The Marshall Mathers LP 2.
  2. If syllables are conceptualized as consonant(s)-vowel-consonant(s), a frame rhyme repeats the consonants but changes the vowel. "Rhyme," without any other qualifier, is understood to mean repetition of the vowel and terminal consonant or consonant cluster (Brogan 1993, p. 1054).
  3. One of the features that I encode and Condit-Schultz does not is the possibility of a multisyllabic rhyme with medial intervening syllables (see also Alim 2003, p. 81).
  4. Using Billboard data is problematic methodologically and also practically. The data is proprietary, and the source from which Condit-Schultz scraped the data was shut down (due to a cease-and-desist letter) before the article was even published. This stresses the need for more robust open-source models of popularity, as well as greater use of those that currently exist (e.g., EchoNest).
  5. A greater focus on features would also change the claim that rap music is in decline. While Condit-Schultz shows a peak in rap music's representation on the Billboard charts in 2004, many more pop songs feature rap verses today.
  6. Similarly, without audio analysis of a cappella tracks, I feel that Condit-Schultz's **tone and **melody spines are unverifiable and therefore should be avoided. Condit-Schultz himself does not discuss these spines. Perhaps this is because they are not his focus, and perhaps it is because he was unable to find anything of interest concerning them. Either of these scenarios is fine, but for the sake of future research it would be good to know which is the case.
