INTRODUCTION

Just before Bach's Kunst der Fuge was published, a year after the composer's death, by his son Carl Philipp Emanuel, a press Avertissement (7 May 1751) said it was not too much to call it a 'perfect work' (David & Mendel, 1998, pp. 236-238), 2 mentioning the 'four-part church hymn' that had been added as an appendix, but omitting to mention that the final Contrapunctus (fugue) was actually incomplete. The preface to the publication itself noted that the work was unfinished because of Bach's illness, and several decades later C. P. E. Bach annotated the final bars of the composer's manuscript, 'NB While working on this fugue, in which the Name BACH appears in the countersubject, the author died' (David & Mendel, 1998, p. 260). 3 Sales proved to be rather poor, 'about 30 copies' in the first five years (p. 378), and Carl Philipp offered the plates to the publisher in September 1756, having decided to 'free' himself 'of any concern with it' (p. 378).

Like some other Bach works, it probably remained in the collective musical memory for some while thereafter, as a contrapuntal tour-de-force, and Bach's sons mentioned it to colleagues in letters in the 1760s and 70s (David & Mendel, 1998, pp. 387-389); Samuel Wesley was personally given a copy by Carl Philipp (David & Mendel, 1998, p. 493) and Charles Burney was aware of the work at least by the time he was contributing articles to Rees' Cyclopaedia at the very beginning of the 19th century. It is mentioned at length in Forkel's 1801 biography (David & Mendel, 1998, pp. 465-466), the description drawing heavily on Marpurg's 1752 Preface, and at about the same time complete editions were published by Vogt (Paris, 1801) and Nägeli (Zurich, 1802). However, the work remained in score, and with no real performance tradition having developed for this (apparently) most theoretical of works, 4 the necessity of completing the unfinished Contrapunctus was not evident.

Although the publication Preface seems clear, and derives from sources very close to Bach, modern scholars have continued to debate the status of Contrapunctus XIV, suggesting for example that it was completed but never copied out; that it does not belong to the sequence of fugues as originally planned; that it was never intended to conclude the Art of Fugue; that it was a triple- not a quadruple-fugue; or that it was deliberately left unfinished. 5 Numerous articles and books attest to the fascination with the problem Bach has bequeathed to us. 6

The first version which attempted to make Contrapunctus XIV 'performable' appears to be by Joseph Diettenhofer, in about 1800 (Diettenhofer, n.d.). The Preface to his Set of Ten Miscellaneous Fugues notes 'no Person ever attempting to finish it; however, a Clause has been added to it at the Place where Bach left of [sic]', and the supplied ending is a simple two-bar perfect cadence marked 'Adagio' (fig.1). The turning point came many years later in 1880, when Gustav Nottebohm (1817-82) found that the three existing subjects could be combined with a missing fourth (Nottebohm, 1880, 1881), and solving this particular contrapuntal puzzle became a challenge any educated musician might attempt. Further impetus to an evolving performing tradition was given by the orchestrations of Wolfgang Graeser (1926) and Roger Vuataz (1937), followed by Gustav Leonhardt's influential book (Leonhardt, 1952) asserting that the work was intended for keyboard (it all fits under two hands) and supported by his 1953 Bach Guild recording on harpsichord. 7

To date, about two dozen performers and musicologists have composed conjectural completions of Contrapunctus XIV, including Alexandre Boëly, Hugo Riemann, Donald Francis Tovey, Helmut Walcha, Lionel Rogg, Henrik Dyhr, Geir Øyvind Eskeland, Zoltán Göncz, David Schulenberg, Davitt Moroney and Kevin Korsyn. 8 Some of these were made for recordings, and remain unpublished. Interestingly, while most have now been recorded, none seem to have achieved universal acceptance among performers; the challenge of supplying a convincing missing final page or two of Bach appears to be considerable. Individual scholars often have specific contrapuntal combinations they wish to explore, but from a performer's or listener's point of view, it is clear that some of these reconstructions do sound more (or less) like the 239 bars of real Bach that precede them, with the join sometimes being obvious. The purpose of this study is to use elements of information theory and entropy to explore to what extent it is possible to 'measure' the similarity of reconstruction to original, and thus determine which are most 'Bach-like' in respect of their voice-leading.

Figure 1: Joseph Diettenhofer, Set of Ten Miscellaneous Fugues, p.18 (end)

INFORMATION THEORY FOR THE QUALITATIVE ANALYSIS OF MUSIC

Pearce (2007) presents an extensive review of early methods using information theory in quantitative analyses of music, starting from the mid-1950s. Even though early approaches have been criticized, they offer an interesting overview of the analytical possibilities. For example, Hiller and Bean (1966) found an increase of average entropy by analyzing four sonatas composed by Mozart, Beethoven, Berg and Hindemith between the 18th and 20th centuries. The sonatas were segmented analytically and, for each segment, monogram distributions of chromatic pitch classes were computed. Their results showed stylistic differences when comparing the entropy and redundancy figures for individual segments.

Critics of some early approaches have pointed out new possible paths for reflection that enrich the methods. For example, probabilities estimated from music samples do not take into account that a listener's perception changes over the course of listening (e.g., from the first to the last note in the sample), or that their state of knowledge and expectation changes dynamically as they experience each note in the music sample (Cohen, 1962). Dynamic measures of information have been proposed to address these issues; however, whether the results obtained through such measures can be extended beyond the analyzed corpus remains unclear (Pearce, 2007).

Another interesting criticism pointed out that many methods did not take into account the temporal structure of the music. Early approaches using Markov models were partially effective, but restricted by the computational power of computers. It was not until the mid-1990s that information-theoretic methods and statistical analyses were formally applied to music. However, deriving the long-term component requires a large corpus of music works. In cases where the dataset is small, it is still worth considering descriptors without temporal structure and trying to combine different dimensions to enrich the analysis.

ENTROPY FOR MUSICAL ANALYSIS

In communication systems, entropy measures the disorder and therefore the unpredictability of a stochastic process. If we analyze the messages produced by a stochastic source, entropy measures the amount of information that they contain. The more predictable a message is, the fewer symbols are needed to represent it and vice versa. Thus, the signs contained in a message characterize it. Music can be analyzed from this perspective. An extended discussion of the limitations and possibilities of applying information theory to music analysis can be found in Cohen (1962). In general, while entropy is inadequate to look at the semantic dimension of music, it is appropriate to analyze its syntactic dimension, and this is the approach here.

Intuitively, musicians all have an informal understanding of the amount of information contained in musical works. For instance, compare the works of Karlheinz Stockhausen or Gabriel Pareyón with those of Manuel Ponce or Maria Anna Mozart. While it has to be clarified that the human perception of complexity is biased by the level of familiarity that a person has with the musical works (Manzara et al., 1992), this informal understanding helps to grasp ideas such as predictability or redundancy in a message. In the syntactic dimension, entropy can be used to build numerical measurements of musical pieces which can, in turn, be used for tasks ranging from analysis of the musical form to stylistic comparisons.

For example, in Eck and Casagrande (2005), Shannon entropy is used to detect metrical structure. In Sakellariou et al. (2017), a maximum-entropy-based model is able to capture the statistics of a corpus of melodies; the extracted models can be used either for classification or generation of new pieces. In Pickens and Iliopoulos (2005), maximum entropy modeling is used to analyze, in the context of Music Information Retrieval, 'how similar are two given pieces of music'. The framework does not have to assume independence of the extracted features (pitch, rhythm) used for the comparisons. The authors evaluate the framework by analyzing polyphonic theme similarity over symbolic data (in MIDI format). In Perttu (2007), statistical descriptors built over symbolic data were used to analyze the increasing chromaticism of western music between 1600 and 1900.

Authorship is another task to which entropy measurements have been applied, although relatively few examples have been published. Indeed, if the syntactic dimension contains information about how a composer arranges the possible sounds following some sort of pattern, information theory could be used to compare similar patterns across works. For example in Kopiez et al. (2003), the compressibility of symbolic sequences is used to analyze authorship. According to the authors, 'The best compression rate of a data sequence is related to the self-similarity of the sequence and then to its complexity'.

In Kranenburg (2007), machine-learning algorithms are applied to identify disputed organ fugues in the catalogue of J. S. Bach (BWV 534/2, 536/2, 537/2, 555/2, 557/2, 560/2 and 565/2). The contenders are J. L. Krebs, J. P. Kellner and W. F. Bach, all of whom wrote in styles that were from time to time closely related to Bach. The study applied a nearest-neighbour classifier over several features extracted from the musical works. Among them, Kranenburg builds the following entropy measures computed according to Shannon's (1948) formula:

  1. Harmonic entropy, which considers the chord-types used. This feature works as a measure of the chord quality (without considering inversions). For instance, the F major and G major triads are considered to have the same sonority.
  2. Pitch entropy, which constructs a list of the occurrences of all pitches. Then, in order to consider the presence of each pitch within the musical work, the occurrences are weighted by their durations.
  3. Sonority entropy, where sonority is defined as the type of a chord. For example, all major triads have the same sonority, regardless of inversion, pitch, or doubling of tones. Each sonority is assigned a unique number. Then, for each sonority the total duration of all occurrences in the composition is calculated. Finally, the occurrence probabilities are estimated using this weighted frequency.
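To make the duration weighting in items 2 and 3 concrete, the following is a minimal Python sketch of a duration-weighted Shannon entropy. It is not Kranenburg's implementation: the function name `duration_weighted_entropy`, the (category, duration) input format, the use of base-2 logarithms and the three-note example are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def duration_weighted_entropy(events):
    """Shannon entropy (in bits) of categories weighted by total duration.

    `events` is a list of (category, duration) pairs. A category may be a
    pitch (item 2) or a sonority label (item 3); this format is hypothetical.
    """
    totals = defaultdict(float)
    for category, duration in events:
        totals[category] += duration  # accumulate sounding time per category
    grand_total = sum(totals.values())
    probabilities = [t / grand_total for t in totals.values()]
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical fragment: (MIDI pitch, duration in quarter notes).
notes = [(62, 2.0), (69, 1.0), (65, 1.0)]
entropy_bits = duration_weighted_entropy(notes)
```

Here the pitch 62 accounts for half of the sounding time and the other two pitches for a quarter each, so the weighted probabilities are 0.5, 0.25 and 0.25.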

Below we present the analytical method used here. The constructed descriptors use the signs present in the works and therefore allow comparisons that concern only that dimension. However, when they are contrasted with the opinion of musicological experts, they can support existing hypotheses or open new ones.

ANALYTICAL METHOD

The method applied here uses information theory to examine Bach's musical lines in the opening of Contrapunctus XIV, and then compare four completions, by Tovey (1931), Moroney (1989), Göncz (2006) and Korsyn (2016). 9 These were selected as representing an early and influential example (Tovey), the most recent version (Korsyn), and two others in between which are widely available from major publishers (Moroney and Göncz). The concept of 'relative entropy' is applied to this musical analytical question, to see to what extent and under what conditions it can be interpreted as representing 'distance', in the specific case of comparing melodic lines.

Frequency Distributions

In mathematical terms, series of musical notes can be conceived as the set of possible events that can occur, with each note as a category. The different categories are mutually exclusive. Within these considerations, the 'note distribution' of a single-voice melody is a count of the number of observations of each category (in other words, the number of times that each note appears in the melody). This is the frequency distribution (in the case of notes) associated with the melody. With such a distribution of notes, the probability mass function can be constructed, which gives the probability of occurrence of a value within a discrete set, like a set of notes. The sum of such probabilities equals one, and the probability mass function defines a discrete probability distribution.
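As an illustration of the step from note counts to a probability mass function, a short Python sketch follows. The helper name `note_pmf` and the eight-note melody are hypothetical; the analysis itself was carried out in SuperCollider, and this is not that code.

```python
from collections import Counter


def note_pmf(notes):
    """Return the probability mass function of a sequence of note labels."""
    counts = Counter(notes)          # frequency distribution: note -> count
    total = sum(counts.values())     # number of observations in the melody
    return {note: count / total for note, count in counts.items()}


# Hypothetical single-voice melody, written as note names.
melody = ["D", "A", "F", "D", "C#", "D", "E", "F"]
pmf = note_pmf(melody)
```

Each note name is one category; the values of `pmf` are the relative frequencies of those categories and sum to one.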

Relative Entropy

The relative entropy, or Kullback-Leibler divergence, is a measure of how one probability distribution differs from another probability distribution which has been taken as a reference. For discrete probability distributions P and Q (with n categories p_i and q_i), the Kullback-Leibler divergence (or information distance) from Q to P is defined as in the following equation:

$$R = \sum_{i=1}^{n} p_i \log \frac{p_i}{q_i}$$

The entropies were calculated using the four voices of Bach's Contrapunctus XIV, bars 1-239, as the a priori distribution (Q). The considered variables were the Note classes (the different notes) and the Duration classes (the different durations). The Note class distribution is the resulting distribution obtained by identifying and counting the different pitches contained in the score. The Duration class distribution results from the analogous process over the Duration of each note. For the calculation of the relative entropies the scores were separated into their different voices. Then the relative entropies for Soprano, Alto, Tenor and Bass were calculated for the four completions; the result is shown to five decimal places, 10 with the smallest numbers representing those closer to the Bach original.
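The calculation described above can be sketched in Python as follows. This is a hedged illustration rather than the bespoke analysis code: the function names, the natural logarithm, the treatment of categories that are absent from Q (returned here as infinity) and the four-note pitch lists are all assumptions, since the text does not specify them.

```python
import math
from collections import Counter


def pmf(symbols):
    """Probability mass function of a sequence of categories."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def relative_entropy(p, q):
    """Sum over categories of p_i * log(p_i / q_i), with Q as the reference.

    Assumption: if P uses a category that never occurs in Q, the divergence
    is reported as infinite.
    """
    total = 0.0
    for symbol, p_i in p.items():
        q_i = q.get(symbol, 0.0)
        if q_i == 0.0:
            return math.inf
        total += p_i * math.log(p_i / q_i)
    return total


# Hypothetical MIDI pitches for one voice of the reference and a completion.
reference_voice = [62, 62, 64, 65]
completion_voice = [62, 64, 64, 65]
distance = relative_entropy(pmf(completion_voice), pmf(reference_voice))
```

In the study the same computation would be run once per voice (Soprano, Alto, Tenor, Bass) and once per variable (Note class, Duration class), always with the Bach bars as the reference distribution.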

Data was prepared by typesetting the scores of Contrapunctus XIV and the four reconstructions in Sibelius, which were then exported as individual MIDI files to serve as numerical data input to bespoke analysis code written in the SuperCollider programming language. 11 For this repertoire, MIDI (which is effectively 12-note equal temperament) is suitable, although this would likely be a limitation for other repertoires (Medieval music, Indian classical music). Table 1 shows the results for the relative entropies between the Note classes probability distributions calculated for the independent voices and composers, while Table 2 shows the relative entropies calculated by using the probability distributions obtained for the Duration classes. In each Table, that version numerically closest to the Bach Contrapunctus XIV original is shown in bold.

Table 1. Relative entropies between note classes probability distributions calculated for the separate voices and composers

Version   Soprano   Alto      Tenor     Bass
Moroney   0.15337   0.15334   0.09640   0.13321
Korsyn    0.09035   0.05887   0.09605   0.07908
Tovey     0.01532   0.04651   0.03421   0.04358
Göncz     0.01602   0.01670   0.02828   0.01900

Table 2. Relative entropies between Duration classes probability distributions calculated for separate voices and composers

Version   Soprano   Alto      Tenor     Bass
Moroney   0.39463   0.34439   0.18304   0.18300
Korsyn    0.25006   0.22881   0.25006   0.30317
Tovey     0.24788   0.34369   0.24788   0.31910
Göncz     0.07337   0.16705   0.09531   0.26236

Tables 1 and 2 show the relative entropies of the Note classes versus the Duration classes for the separate voices, SATB. It can be seen that Göncz and Tovey consistently appear 'closest' to the a priori distribution (Bach), considering the dimension defined by the Note class distributions. However, in the dimension of the Duration class distributions this is no longer consistently the case: in the bass line, the durations used by Moroney (1989) most closely resemble those used by Bach. But examination of the scores raises an interesting question, as in some of the fugal codas the Duration class distribution calculations are affected by the length of the final bass notes; that is, long pedal notes. As there are no such pedal notes in Bach's opening, and duration weighting here is significant, it was necessary to discount these and re-run the data, in order to make the a priori comparisons valid.

Table 3 shows the results, with the relative entropies of the Duration classes calculated for the bass line after removing the pedal notes. It can now be seen that Tovey again appears as the version with the least divergence from the Duration class distribution of the Bach opening section.

Table 3. Relative entropies of the revised Duration classes in the Bass line

Version   Bass without pedal notes
Moroney   0.36772
Korsyn    0.15187
Tovey     0.06670
Göncz     0.18232

When interpreting the distance between probability distributions, the way in which they are constructed must be borne in mind; that is, from counting the number of note occurrences. Suppose we take a C major melodic fragment (for example, CCDECCD) and create a new one by re-sorting its notes according to their degree in the musical scale (e.g., CCCCDDE). When the relative entropy of this new melody is calculated with respect to the original, the two will be at a distance of zero (0.0), as they have the same probability distributions. However, it is clear that the melodies sound radically different. At the other extreme, considering the possible completions of a musical work, it is obvious that those in the same key (in other words using the same set of musical pitches) will be closer than any that are not. In other words, the selection of notes is not random, and further, the use of specific intervals, for example a third or a seventh, is a trace that characterizes musical style. In this sense, although the notion of 'distance' given by relative entropy is not conclusive, it provides us with information that, together with other dimensions of analysis, allows us to see how close the 'fingerprints' of two musical pieces are. The same caveat applies when the temporal dimension is included: two pieces are at zero distance whenever they contain the same notes in the same proportions.
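The CCDECCD example can be checked directly. The sketch below is a minimal Python illustration under the same assumptions as any count-based relative entropy (natural logarithm, hypothetical function names); it is not the analysis code used for the tables.

```python
import math
from collections import Counter


def pmf(symbols):
    """Probability mass function of a sequence of categories."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def relative_entropy(p, q):
    """Sum of p_i * log(p_i / q_i); assumes every category of P occurs in Q."""
    return sum(p_i * math.log(p_i / q[s]) for s, p_i in p.items())


original = list("CCDECCD")
resorted = sorted(original)  # same notes, ordered by scale degree

distance = relative_entropy(pmf(resorted), pmf(original))
```

Because only the counts enter the calculation, any reordering of the same notes gives a distance of zero, however different the two melodies sound.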

This methodology can be further extended to provide more information. For example, one extension of the probability mass function to compare notes in different keys could be to consider the harmonic function of each note, instead of the notes themselves. It is also possible to consider other distributions, such as note durations, or more complex ones, like the number of times an eighth-note follows a sixty-fourth-note.
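One way to build the last kind of distribution is to count pairs of consecutive durations. The following Python sketch is only an illustration of that idea: the function name `transition_pmf`, the representation of durations as fractions of a whole note and the five-note rhythm are assumptions, not part of the published method.

```python
from collections import Counter
from fractions import Fraction


def transition_pmf(durations):
    """Probability mass function over (previous, next) duration pairs."""
    pairs = Counter(zip(durations, durations[1:]))  # consecutive pairs
    total = sum(pairs.values())
    return {pair: count / total for pair, count in pairs.items()}


SIXTY_FOURTH = Fraction(1, 64)
EIGHTH = Fraction(1, 8)

# Hypothetical rhythm of a single voice.
rhythm = [SIXTY_FOURTH, EIGHTH, EIGHTH, SIXTY_FOURTH, EIGHTH]
transitions = transition_pmf(rhythm)
```

The resulting dictionary is again a discrete probability distribution, so it can be compared with a reference in the same way as the note and duration distributions.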

Comparison Examples

In order to understand the size of the 'distance' that the six-figure numbers in Tables 1-3 represent, four different fugues were then compared with the surviving Bach portion of Contrapunctus XIV (the same a priori data as before): Contrapunctus I from the Art of Fugue, Pachelbel's Fugue No.2 (c.1690), Mozart's G minor keyboard fugue (1782) and the final Fugue from Mendelssohn's Organ Sonata No.6 (1845). 12 All are normalized to D minor, 13 and just the soprano voice is used, to focus on the melodic component, to the same five decimal places.
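The text does not say how the normalization to D minor was carried out. One simple possibility, sketched here in Python as an assumption, is to transpose every MIDI pitch by the smallest interval that moves the source tonic onto D. The function name and the three-note fragment are hypothetical.

```python
D_PITCH_CLASS = 2  # pitch class of D in MIDI numbering (C = 0)


def normalize_to_d(midi_pitches, tonic_pitch_class):
    """Transpose pitches so that the given tonic pitch class becomes D.

    The shift is chosen in the range -5..+6 semitones, so the music moves
    by the smallest possible interval.
    """
    shift = (D_PITCH_CLASS - tonic_pitch_class) % 12
    if shift > 6:
        shift -= 12
    return [pitch + shift for pitch in midi_pitches]


# Hypothetical G minor fragment (G4, B flat 4, D5); the pitch class of G is 7.
g_minor_fragment = [67, 70, 74]
in_d_minor = normalize_to_d(g_minor_fragment, 7)
```

With this choice the G minor fragment moves down a perfect fourth, and music already in D is left unchanged.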

Table 4. Soprano line comparison with fugues by Bach, Pachelbel, Mozart and Mendelssohn

Composer      Note duration measure   Note distribution measure
Bach, I       0.26236                 0.03047
Pachelbel     0.36772                 0.10093
Mozart        0.48551                 0.09928
Mendelssohn   0.77319                 0.04916

The results show that (as expected) Bach's Contrapunctus I is closest to the master-data from Contrapunctus XIV, followed by Pachelbel, Mozart and Mendelssohn (in chronological order) for the duration measure; the results for the distribution measure are different, where Mendelssohn's Bach-style Fugue now actually sits closer to Contrapunctus XIV than do the Pachelbel or Mozart works. The average of the numbers for the non-Bach composers in Table 4 is greater than the average of those in Table 1 above for the Contrapunctus XIV reconstructions, showing - as expected - that all these reconstructions are generally close to Bach's original.
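The comparison of averages can be reproduced from the published figures. The Python sketch below takes one reading of that comparison, namely the note distribution column of Table 4 for the three non-Bach composers against all sixteen Note class values in Table 1; that pairing is an assumption, as the text does not state which columns were averaged.

```python
from statistics import mean

# Note class relative entropies from Table 1 (Soprano, Alto, Tenor, Bass).
table1_note = {
    "Moroney": [0.15337, 0.15334, 0.09640, 0.13321],
    "Korsyn": [0.09035, 0.05887, 0.09605, 0.07908],
    "Tovey": [0.01532, 0.04651, 0.03421, 0.04358],
    "Göncz": [0.01602, 0.01670, 0.02828, 0.01900],
}

# Note distribution measure from Table 4, non-Bach composers only.
table4_note_other = {
    "Pachelbel": 0.10093,
    "Mozart": 0.09928,
    "Mendelssohn": 0.04916,
}

mean_completions = mean(
    value for voices in table1_note.values() for value in voices
)
mean_other = mean(table4_note_other.values())
```

On this reading the mean for the completions is about 0.068 and the mean for the other composers about 0.083.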

CONCLUSION

This information-theoretical method for comparing the content of melodic lines with Bach's original is not designed as any measure of musical or other 'success' for the various reconstructions considered here. Many other components would have to come into play in order for such a statement to be meaningful, including faithfulness to Bach's contrapuntal practice, overall structure, organic development and so on. However, it does enable some comparison of the way in which each of the versions has combined the existing material with newly-created melodic lines, and makes it possible to see which are statistically closest to Bach's surviving original. From the data above, the Göncz version appears the strongest, but Tovey's 1931 completion makes a very good showing, a measure of his deep engagement with Bach as a teacher and scholar.

ACKNOWLEDGEMENTS

Pablo Padilla would like to thank PASPA at UNAM for financial support for a research visit to Clare Hall, University of Cambridge, during the summer of 2021. This article has been copyedited by Gabriele Cecchetti and layout edited by Jonathan Tang.

NOTES

  1. Correspondence can be addressed to: Francis Knights. Fitzwilliam College, Cambridge CB3 0DG, UK, fk240@cam.ac.uk.
  2. For Marpurg's Preface to the 1752 edition, see David & Mendel (1998), pp. 375-377.
  3. Illustrated in David and Mendel (1998), p. 275.
  4. Mozart did arrange one for string quartet in the early 1780s, the fruit of his friendship with Baron van Swieten (K404a/4ii, Contrapunctus 8); see David and Mendel (1998), p. 488. The work may of course be performed incomplete, and has often been recorded as such; 'a dramatic gesture but one that invites a sentimental response', in the perceptive words of David Schulenberg (Schulenberg, 2006, p. 424).
  5. For examples of some of these views, which vary considerably in plausibility, see Tovey (1931), Wolff (1975), Butler (1983), Dirksen (1994), Schulenberg (1995), Göncz (1997), Hughes (2006), Göncz (2013), Wilson (2014) and Korsyn (2016). The last of these gives an excellent historical overview, with praise for Tovey's now-neglected insights. A view of the work from a performance theory perspective can be found in Demeyere (2013).
  6. There are over 220 entries on the Art of Fugue in the Bach Bibliography http://swb.bsz-bw.de, dating back 150 years.
  7. Reissued on Vanguard Classics ATM-CD-1652 (2006); Contrapunctus XIV is performed unfinished.
  8. For a description of a number of these, see Overduin (2001). One might also recall Busoni's Bach elaboration in his Fantasia Contrappuntistica (1910). Tovey's description of his own completion process can be found in Tovey (1931), pp. 45-49, and Korsyn (2016), Göncz (1997) and Göncz (2006) also explain the structuring of their versions.
  9. It should be remembered that a great deal of the musical content of any reconstruction or completion of Contrapunctus XIV will consist of combinations of existing Bachian material. The analytical system applied allows for comparison of differing lengths of completion, with Moroney being the shortest and Tovey the longest here.
  10. They are generated to 14 decimal places, and rounded to five here.
  11. https://supercollider.github.io.
  12. Pachelbel's Fugue on the Magnificat, Set 1 No.2, Mozart's Fugue in G minor K401 (composed at the same time he was arranging Bach's Contrapunctus 8 for string quartet) and Mendelssohn's Organ Sonata No.6 in D minor, Op.65/6.
  13. For comparisons of this kind to be valid, keys must always be normalized. See Knights et al. (2017).

REFERENCES

  • Butler, G. (1983). Ordering Problems in J. S. Bach's 'Art of Fugue' Resolved. The Musical Quarterly 69, 44-61. https://doi.org/10.1093/mq/LXIX.1.44
  • Cohen, J. E. (1962). Information theory and music. Behavioural Science, 7(2), 137-163. https://doi.org/10.1002/bs.3830070202
  • David, H. T. & Mendel, A. eds, rev Wolff, C. (1998). The New Bach Reader (New York).
  • Demeyere, E. (2013). Johann Sebastian Bach's Art of Fugue: Performance Practice Based on German Eighteenth-Century Theory (Leiden). https://doi.org/10.2307/j.ctt9qdwm9
  • Diettenhofer, J. (n.d.). A Set of Ten Miscellaneous Fugues … for the Organ or the Piano Forte (London).
  • Dirksen, P. (1994). Studien zur Kunst der Fuge von J. S. Bach (Wilhelmshaven).
  • Eck, D. and Casagrande, N. (2005). Finding Meter in Music Using an Autocorrelation Phase Matrix and Shannon Entropy. Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR 2005), London, 11-15 September 2005, 504-509.
  • Göncz, Z. (1997). Reconstruction of the final contrapunctus of The Art of Fugue, International Journal of Musicology 6, 103-119.
  • Göncz, Z. (2006). Joh. Sebastian Bach: Contrapunctus 14 (Stuttgart).
  • Göncz, Z. (2013). Bach's Testament: on the philosophical and theological background of The Art of Fugue (New York).
  • Hughes, I. (2006). Accident or design?: New theories on the unfinished Contrapunctus 14 in J. S. Bach's The Art of Fugue BWV 1080, dissertation, Auckland University.
  • Knights, F., Padilla, P. & Tidhar, D. (2017). Chambonnières versus Louis Couperin: attributing the F major Chaconne. Harpsichord and Fortepiano 20(1), 28-32.
  • Kopiez, R., Lehmann, A. C., Wolther I. & Wolf, C. (2003). Musical style and authorship categorization by informative compressors. Proceedings of the 5th Triennial ESCOM Conference, 8-13 September 2003.
  • Korsyn, K. (2016). At the Margins of Music Theory, History, and Composition: Completing the Unfinished Fugue in Die Kunst der Fuge by J. S. Bach. Music Theory & Analysis 3(2), 115-143. https://doi.org/10.11116/MTA.3.2.1
  • Kranenburg, P. van (2007). On Measuring Musical Style - The Case of Some Disputed Organ Fugues in the J.S. Bach (BWV) Catalogue. Computing in Musicology 15.
  • Leonhardt, G. (1952). The Art of Fugue - Bach's Last Harpsichord Work: An Argument (The Hague).
  • Manzara, L. C., Witten, I. H. & James, M. (1992). On the Entropy of Music: An Experiment with Bach Chorale Melodies. Leonardo Music Journal, 2, 81-88. https://doi.org/10.2307/1513213
  • Moroney, D. (ed). (1989). J. S. Bach, The Art of Fugue (Munich).
  • Nottebohm, M. G. (1880). J. S. Bach's letzte Fuge. Musik-Welt 20 (1880), 232-236 and 21 (1881), 224-44.
  • Overduin, J. (2001). Nine Published Completions for Keyboard of BWV 1080, 19, from The Art of Fugue by J. S. Bach. The American Organist 35, 78-82.
  • Perttu, D. (2007). A Quantitative Study of Chromaticism. Empirical Musicology Review 2(2), 47-54. https://doi.org/10.18061/1811/24822
  • Pickens, J. & Iliopoulos, C. S. (2005). Markov Random Fields and Maximum Entropy Modeling for Music Information Retrieval. Proceedings of the 6th International Conference on Music Information Retrieval.
  • Schulenberg, D. (1995). J. S. Bach's The art of fugue, the work and its interpretation; The Riddle of Bach's last fugue. Notes, 51, 1317-1321. https://doi.org/10.2307/899122
  • Schulenberg, D. (2006). The Keyboard Music of J. S. Bach (2nd ed, New York).
  • Shannon, C. E. (1948). A Mathematical Theory of Communication, Bell System Technical Journal, 27(3), 379–423. https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
  • Tovey, D. F. (ed) (1931a). Die Kunst der Fuge (The Art of Fugue) by Johann Sebastian Bach (Oxford).
  • Tovey, D. F. (1931b). A Companion to 'The Art of Fugue' (Oxford).
  • Wilson, G. (2014). Bach's Art of Fugue: suggestions for the last gap. Early Music, 42, 249-257. https://doi.org/10.1093/em/cau039
  • Wolff, C. (1975). The Last Fugue: Unfinished?: Seminar Report: Bach's 'Art of Fugue' - An Examination of the Sources. Current Musicology, 19, 71-77.