MUSIC 674: History of Music Theory

Estimated study time: 2 hr

These notes draw on Thomas Christensen (ed.) The Cambridge History of Western Music Theory (2002), Leo Treitler (ed.) Strunk’s Source Readings in Music History (rev. ed., 1998), Joel Lester’s Between Modes and Keys: German Theory 1592–1802 (1989), Brian Hyer and Alexander Rehding (eds.) The Oxford Handbook of Neo-Riemannian Music Theories (2011), and supplementary materials from Yale University MUSI 720–721 graduate seminars and Indiana University T623–T624 doctoral music theory sequences.


Chapter 1: Ancient Greek Music Theory

The history of Western music theory does not begin with scales, chords, or notation. It begins with a string. The monochord — a single string stretched over a resonating box with a movable bridge — was the experimental apparatus through which the ancient Greeks discovered that musical intervals correspond to precise numerical ratios. This discovery, attributed in antiquity to Pythagoras of Samos (c. 570–495 BCE) and elaborated by generations of followers, established the foundational claim of one entire tradition in Western music theory: that music is, at its root, a branch of mathematics. That claim has never gone unchallenged, and the alternation between mathematical rationalism and empirical perceptionism defines much of music theory’s history from antiquity to the present.

The ancient Greek contribution to music theory is not a single unified doctrine but a complex and contested set of overlapping traditions. The Pythagorean tradition, grounded in ratio mathematics and cosmological speculation, provided music theory with its first systematic vocabulary and its first claim to scientific rigor. The Aristoxenian tradition, grounded in perceptual observation and trained musicianship, provided the first systematic challenge to that rigor and the first defense of the ear as a legitimate theoretical authority. The Platonic tradition embedded music theory within a broader philosophical and political program, making the question of what music does to the soul a matter of civic importance. And the Alexandrian tradition of Ptolemy, Nicomachus, and Aristides Quintilianus synthesized, compiled, and transmitted these earlier traditions in forms that would shape medieval and Renaissance music theory for more than a millennium. To understand any subsequent chapter in the history of music theory, one must understand how these ancient debates were framed and why they proved so durable.

1.1 Pythagoras and the Monochord

The practical operation of the monochord is straightforward. A string of a fixed length and tension produces a reference pitch. Placing the movable bridge at the midpoint divides the string into two equal segments, each of which vibrates at exactly twice the frequency of the whole. The ratio 2:1 corresponds to what we call the octave — the interval that ancient Greeks called the diapason (“through all”). Moving the bridge to divide the string in a 3:2 ratio produces the perfect fifth (diapente). A 4:3 division produces the perfect fourth (diatessaron). And the ratio 9:8, obtained by compounding a fifth upward with a fourth downward — that is, \(3/2 \div 4/3 = 9/8\) — yields the whole tone (tonus).
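The arithmetic of these monochord divisions can be checked with a short sketch. It assumes the usual modern reading of the ratios as frequency ratios (with sounding string length inversely proportional to frequency); the helper name `cents` and the constant names are illustrative only.

```python
from fractions import Fraction
import math

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents = one octave)."""
    return 1200 * math.log2(ratio)

# Frequency ratios named in the text (higher pitch : lower pitch).
OCTAVE = Fraction(2, 1)
FIFTH = Fraction(3, 2)
FOURTH = Fraction(4, 3)

# A fifth upward followed by a fourth downward leaves a whole tone.
WHOLE_TONE = FIFTH / FOURTH

# A fifth and a fourth together fill the octave exactly.
assert FIFTH * FOURTH == OCTAVE

for name, ratio in [("octave", OCTAVE), ("fifth", FIFTH),
                    ("fourth", FOURTH), ("whole tone", WHOLE_TONE)]:
    # The sounding fraction of the string is the inverse of the frequency ratio.
    print(f"{name:10s} ratio {ratio}  string fraction {1 / ratio}  "
          f"{cents(ratio):7.2f} cents")
```

Exact fractions are used so that the identity 3/2 ÷ 4/3 = 9/8 holds without rounding; cents appear only for display.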

The Pythagorean Consonances. The ratios 2:1, 3:2, and 4:3 are the foundational consonances of Pythagorean tuning, and 9:8, the difference between the fifth and the fourth, is its basic melodic step. All intervals in the Pythagorean system are generated by stacking perfect fifths (3:2) upward or downward and octave-reducing the result. Mathematically, any Pythagorean interval has the form \(\left(\tfrac{3}{2}\right)^m \cdot 2^n\) for integers \(m\) and \(n\), either of which may be negative; the fourth, for example, is \(\left(\tfrac{3}{2}\right)^{-1} \cdot 2 = \tfrac{4}{3}\). Consonance is defined by ratio simplicity: intervals whose ratios involve only the smallest integers are the most consonant.

The tetraktys, the triangular arrangement of the integers 1, 2, 3, 4, held quasi-mystical significance for the Pythagorean brotherhood. The sum 1 + 2 + 3 + 4 = 10 was the sacred number of completion. More importantly, these four integers contain within their pairwise ratios the fundamental musical consonances: the octave (2:1), the fifth (3:2), and the fourth (4:3), together with the twelfth (3:1) and the double octave (4:1). The whole tone is not itself one of these ratios but follows from them as the difference between the fifth and the fourth (9:8 = (3/2)/(4/3)). For the Pythagoreans this was not a coincidence but a revelation: the cosmos is ordered by number, and music is a sensible manifestation of that cosmic order. The tetraktys was reportedly sworn upon as an oath in the Pythagorean brotherhood, reflecting the degree to which mathematical and musical insight were fused in their philosophical worldview.
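A quick enumeration shows which intervals the tetraktys yields directly. This is only an illustrative sketch; the interval labels in `NAMES` are the standard modern names and are supplied here as an assumption.

```python
from fractions import Fraction
from itertools import combinations

TETRAKTYS = (1, 2, 3, 4)

# Every ratio larger:smaller that two distinct tetraktys numbers can form.
pairwise = sorted({Fraction(b, a) for a, b in combinations(TETRAKTYS, 2)})

NAMES = {
    Fraction(4, 3): "fourth",
    Fraction(3, 2): "fifth",
    Fraction(2, 1): "octave",
    Fraction(3, 1): "twelfth (octave plus fifth)",
    Fraction(4, 1): "double octave",
}
for ratio in pairwise:
    print(f"{ratio.numerator}:{ratio.denominator}  {NAMES[ratio]}")

# The whole tone is derived from two of these ratios, not read off directly.
whole_tone = Fraction(3, 2) / Fraction(4, 3)
print("whole tone", whole_tone, "among pairwise ratios:", whole_tone in pairwise)
```

The set comprehension removes the duplicate 2:1 that arises from both 2:1 and 4:2.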

Pythagorean music theory was developed in systematic form not by Pythagoras himself, who left no writings, but by later thinkers including Philolaus of Croton (c. 470–385 BCE), Archytas of Tarentum (fl. 400–350 BCE), and Nicomachus of Gerasa (fl. c. 100 CE). Archytas is especially important for music theory because he provided mathematical derivations of the three tetrachord genera that rely wherever possible on superparticular ratios, that is, ratios of the form \((n+1):n\). The preference for superparticular ratios gives Pythagorean interval theory a certain mathematical elegance but also a rigidity that would later provoke Aristoxenus’s empiricist reaction.

1.2 The Pythagorean Comma

One of the most consequential results of Pythagorean tuning is a small but irresolvable discrepancy that has driven the entire subsequent history of temperament. If one stacks twelve perfect fifths — moving up by a ratio of 3:2 twelve times — the resulting pitch should, after reduction by seven octaves, return to the starting pitch. But it does not quite. The mathematical statement of this discrepancy is:

\[ \left(\frac{3}{2}\right)^{12} = \frac{531441}{4096} \approx 129.746 \quad \text{while} \quad 2^7 = 128. \]

The ratio between these two quantities is the Pythagorean comma:

\[ \frac{531441}{524288} = \frac{3^{12}}{2^{19}} \approx 1.01364, \]

corresponding to approximately 23.46 cents (where 100 cents = one equal-tempered semitone). This gap is small enough to be nearly imperceptible in a brief melodic context, but catastrophic for a keyboard instrument that must play consonantly in all keys. The Pythagorean comma is not a defect of music or of mathematics; it is a consequence of the incommensurability of the logarithms of 2 and 3. No positive integer power of 2 ever exactly equals a positive integer power of 3, as follows from the uniqueness of prime factorizations. Every system of tuning in Western history — meantone temperament, various irregular temperaments, and eventually equal temperament — represents a different strategy for distributing or concealing this comma across the twelve intervals of a closed pitch cycle.
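The figures above can be reproduced exactly with rational arithmetic. The sketch below is illustrative; the variable names are chosen here and are not part of any standard library.

```python
from fractions import Fraction
import math

twelve_fifths = Fraction(3, 2) ** 12   # 531441/4096
seven_octaves = Fraction(2) ** 7       # 128

# The Pythagorean comma is the ratio by which twelve fifths exceed seven octaves.
comma = twelve_fifths / seven_octaves
comma_cents = 1200 * math.log2(comma)

print("twelve fifths :", twelve_fifths, "=", round(float(twelve_fifths), 3))
print("seven octaves :", seven_octaves)
print("comma         :", comma, "=", round(float(comma), 5))
print("in cents      :", round(comma_cents, 2))
```

Because `Fraction` keeps numerator and denominator as integers, the result 531441/524288 is exact: its numerator is a power of 3 and its denominator a power of 2, which is why the ratio can never reduce to 1.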

The Pythagorean comma is the original source of the temperament problem. Ancient Greek theorists were not concerned with keyboard tuning (which did not exist), but their mathematical framework made the comma's existence inevitable. Subsequent theorists from Zarlino to Kirnberger to Helmholtz each confronted this comma anew. Their different solutions reflect different philosophical commitments about the relative authority of mathematical purity versus practical musicality. The comma thus functions as a kind of recurring exam question that each generation of theorists must answer differently.

In practical terms, a Pythagorean keyboard tuning places all the comma into a single “wolf fifth”: an interval narrower than a pure fifth by a full Pythagorean comma, typically placed at the edge of the chromatic system (often between G♯ and E♭) where it would rarely be used in common repertoire. The wolf fifth “howls” (hence its name) and represents the price paid for pure fifths everywhere else. Later meantone temperaments narrowed each usable fifth by a fraction of the syntonic comma, which improved the thirds but left a different wolf, this time wider than a pure fifth and further out of tune. Equal temperament eliminates wolves entirely by spreading the Pythagorean comma uniformly across all twelve fifths, narrowing each by approximately 1.96 cents.
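The sizes involved can be worked out in cents. The following is a minimal sketch under the assumption that the circle of twelve fifths must close on exactly seven octaves (8400 cents); the names are illustrative.

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents."""
    return 1200 * math.log2(ratio)

pure_fifth = cents(3 / 2)                 # about 701.96 cents
pythagorean_comma = cents(3**12 / 2**19)  # about 23.46 cents

# Pythagorean tuning: eleven pure fifths, and one fifth takes whatever is left.
wolf_fifth = 7 * 1200 - 11 * pure_fifth

# Equal temperament: all twelve fifths are the same size.
equal_fifth = 7 * 1200 / 12
narrowing_per_fifth = pure_fifth - equal_fifth

print(f"pure fifth          {pure_fifth:8.3f}")
print(f"Pythagorean wolf    {wolf_fifth:8.3f}")
print(f"equal-tempered 5th  {equal_fifth:8.3f}")
print(f"narrowing per fifth {narrowing_per_fifth:8.3f}")
```

The wolf comes out a full comma short of pure, while the equal-tempered fifth is short by one twelfth of the comma.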

1.3 Aristoxenus and the Empiricist Alternative

Aristoxenus of Tarentum (born c. 375 BCE, fl. c. 335 BCE), a student of Aristotle, mounted the most systematic ancient challenge to Pythagorean ratio theory. His Harmonika Stoicheia (Elementa harmonica) proposes a radically different foundation: intervals are not ratios between string lengths or vibration frequencies but magnitudes in a continuous pitch-space. The proper unit of measurement is not the ratio but the tone, and intervals are measured as multiples and fractions of tones.

Aristoxenian Interval Measurement. For Aristoxenus, the octave contains exactly six whole tones. This entails that the semitone is exactly half a tone, and that twelve semitones fill the octave. This is mathematically equivalent to equal temperament: each semitone corresponds to the ratio \(2^{1/12} \approx 1.05946\). The Pythagorean objection — that there is no rational number \(r\) such that \(r^2 = 9/8\) exactly — is irrelevant to Aristoxenus, who does not work in the domain of ratios at all. For Aristoxenus, the tone is a perceptual magnitude, not a mathematical ratio.
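A worked comparison may make the disagreement concrete. If the tone is read as the ratio 9:8 (the Pythagorean reading, not Aristoxenus's own), six tones overshoot the octave:

\[ \left(\frac{9}{8}\right)^{6} = \frac{531441}{262144} \approx 2.0273 > 2, \qquad \frac{(9/8)^{6}}{2} = \frac{531441}{524288}, \]

and the excess is exactly the Pythagorean comma. Dividing the octave into six equal tones instead gives, in modern terms,

\[ 2^{1/6} \approx 1.12246 < \frac{9}{8} = 1.125, \]

that is, a tone of 200 cents against the Pythagorean tone of about 203.91 cents. The decimal values are modern conveniences; neither party would have expressed the matter this way.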

The distinction between Pythagorean and Aristoxenian approaches is not merely technical; it represents a deep epistemological split. For Pythagoreans, the authority of musical theory rests on mathematical demonstration. For Aristoxenus, it rests on trained musical perception: the mousikos (cultivated musician) whose ear has been educated to discriminate intervals reliably is the final arbiter. This tension between mathematical rationalism and empirical perception recurs throughout the history of music theory, resurfacing in Vincenzo Galilei’s challenge to Zarlino in the sixteenth century, in Helmholtz’s psychoacoustics, and in the empirical music cognition movement of the twentieth century.

Aristoxenus also provides the fullest surviving ancient discussion of the three tetrachord genera, specifying for each genus not a single fixed tuning but a range of acceptable inner-note positions. The enharmonic genus, he notes, is the most challenging for the ear and requires the most training to sing accurately. His interest is always in what the trained ear can discriminate, not in what mathematical ratios can specify. The Harmonika Stoicheia survives incomplete, and of Aristoxenus’s Rhythmika Stoicheia, which applies the same magnitude-based approach to rhythm, only a fragment remains; together they make Aristoxenus the first systematic theorist of both pitch and rhythm in the Western tradition.

1.4 The Tetrachord Genera and the Greater Perfect System

Ancient Greek melodic theory was organized around the tetrachord, a four-note span filling the interval of a perfect fourth. The two outer notes of the tetrachord were fixed; the two inner notes were movable, yielding three distinct melodic characters called the genera (singular: genos). The diatonic genus places the inner notes so that the tetrachord contains two whole tones and one semitone, with the semitone at the bottom: reading downward, tone, tone, semitone, roughly the modern span A–G–F–E. The chromatic genus uses a minor third at the top and two smaller intervals (approximately semitones) below. The enharmonic genus uses a major third at the top and two quarter-tones (dieses) below: intervals smaller than any semitone, perceptible only to a highly trained ear and now entirely outside the practical vocabulary of Western music.

The Greater Perfect System (GPS, systema teleion meizon) assembled four tetrachords plus an added lower tone into a two-octave pitch collection with fifteen distinct pitches. This system served as the universal reference framework for ancient Greek theoretical descriptions of melody, much as the modern major scale serves as the default reference for tonal theory pedagogy. A companion system, the Lesser Perfect System (LPS, systema teleion elasson), assembled three conjunct tetrachords plus an added tone into a different configuration, spanning an octave plus a fourth rather than two full octaves.

The GPS spans two octaves from proslambanomenos (the "added" lowest note) through nete hyperbolaion. Reading from low to high, the system consists of: a low added tone, then the tetrachord hypaton, then the tetrachord meson (conjunct with the hypaton, so that its lowest note is the hypaton's top note, and ending on mese, the "middle note"), then a disjunctive whole tone, then the tetrachord diezeugmenon, then the tetrachord hyperbolaion (conjunct with the diezeugmenon). The note mese is the conceptual center of the system and functions something like a tonic in later theory, though the analogy is imprecise. Later theorists combined the GPS and LPS into the immutable system (systema ametabolon), the most comprehensive ancient pitch framework.
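The layout just described can be sketched as a small construction. The interval sizes below are nominal values in whole tones of the kind Aristoxenus uses (half-tones and quarter-tones); they are an assumption for illustration, since the text notes that he allowed a range of inner-note positions within each genus. The function name is hypothetical.

```python
from fractions import Fraction as F

# Ascending interval sizes of one tetrachord, in whole tones (nominal values).
GENERA = {
    "diatonic":   (F(1, 2), F(1), F(1)),
    "chromatic":  (F(1, 2), F(1, 2), F(3, 2)),
    "enharmonic": (F(1, 4), F(1, 4), F(2)),
}
TONE = F(1)

def greater_perfect_system(genus="diatonic"):
    """Return the GPS pitches as distances (in tones) above proslambanomenos."""
    steps = GENERA[genus]
    pitches = [F(0)]                    # proslambanomenos, the added note
    pitches.append(pitches[-1] + TONE)  # tone up to the bottom of the hypaton
    for _ in ("hypaton", "meson"):      # conjunct pair, ending on mese
        for step in steps:
            pitches.append(pitches[-1] + step)
    pitches.append(pitches[-1] + TONE)  # tone of disjunction above mese
    for _ in ("diezeugmenon", "hyperbolaion"):  # second conjunct pair
        for step in steps:
            pitches.append(pitches[-1] + step)
    return pitches

gps = greater_perfect_system("diatonic")
print(len(gps), "pitches; mese at", gps[7], "tones; top at", gps[-1], "tones")
```

In every genus the tetrachord sums to two and a half tones, so mese always falls six tones (an octave) above the lowest note and the top note twelve tones above it; only the inner notes move.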

Medieval theorists encountered the GPS through Boethius’s Latin presentation, but they stripped away most of the contextual details — particularly the role of the genera and the performance practice associated with each modal species — leaving only the abstract pitch framework. This impoverishment of the ancient heritage through the process of Latin transmission is a recurring theme in the history of music theory’s reception of antiquity.

1.5 The Harmoniai, Musical Ethos, and Plato

Ancient theorists distinguished among several harmoniai (characteristic melodic species or modal patterns) and assigned each a powerful ethos (character, moral quality). Plato’s Republic (Books III and IV) offers the most culturally influential ancient account of modal ethos. Plato’s Socrates argues that only the Dorian harmonia (characterized as austere, martial, and self-controlled) and the Phrygian (fervent and courageous but not intemperate) should be permitted in the ideal city-state. The Ionian and Lydian harmoniai, which he calls “slack” and associates with drinking parties, are to be rejected because they produce softness and effeminacy of character. The Mixolydian and related “tense” Lydian harmoniai, associated with lamentation, are equally undesirable.

These judgments reflect the broader Platonic doctrine that music does not merely express emotional states but actively shapes character. Since the goal of education is the formation of virtuous citizens, music education must be carefully controlled. The Timaeus extends musical mathematics into cosmology: the Demiurge constructs the World-Soul from mathematical ratios including those of musical consonance — the proportions 1, 2, 3, 4, 9, 8, 27 — inscribing mathematical harmony into the very fabric of the cosmos. The world, for Plato, is literally musical in its deep structure: the regular motions of the planets reflect the same ratios that appear in the perfect musical consonances.

Aristotle’s account of musical ethos in the Politics (Book VIII) is more nuanced than Plato’s. Aristotle distinguishes between music used for education, for relaxation, and for intellectual cultivation (diagoge), arguing that different harmoniai are appropriate for different social functions. Unlike Plato, Aristotle allows the Lydian mode in education because it is suited to children. He also discusses catharsis — the emotional purgation that tragic drama produces — and extends this concept to music, suggesting that the Phrygian harmonia has cathartic power for those prone to religious excitement.

Aristides Quintilianus (second or third century CE) provides the most comprehensive surviving ancient synthesis of music theory. His De musica weaves together Pythagorean number theory, Aristoxenian interval theory, doctrines of ethos, and Platonic cosmology into a single discursive treatise. He is the primary ancient source for descriptions of the enharmonic genus in performance practice and for extended discussion of modal ethos as it relates to education and rhetoric. The sheer breadth of his synthesis makes Aristides Quintilianus a useful endpoint for the ancient tradition: virtually every thread of ancient Greek music-theoretical discourse appears somewhere in his work.

The ancient Greek harmoniai should not be conflated with the medieval church modes despite superficial terminological similarity. Medieval theorists borrowed Greek modal names (Dorian, Phrygian, Lydian, etc.) but applied them to a different pitch collection and in some cases to different scale species. The confusion of ancient and medieval modal theory was a recurring problem in Renaissance scholarship and persists in popular music literature to this day. The clearest sign of this confusion is the common description of the medieval Dorian mode as "the D mode" — which is correct for medieval Dorian but has no reliable correspondence to what ancient Greeks meant by the Dorian harmonia.

1.6 Claudius Ptolemy and the Synthesis of Pythagorean and Empiricist Approaches

Claudius Ptolemy (c. 100–170 CE), the great Alexandrian astronomer and geographer, also wrote the most mathematically sophisticated ancient music-theoretical treatise: the Harmonika (three books). Ptolemy’s ambition was to synthesize the mathematical rigor of Pythagoreanism with the perceptual sensitivity of Aristoxenianism. He rejected both extremes: pure Pythagorean ratio theory was too removed from perceptual reality (the ear cannot verify all the proposed intervals as consonant), while pure Aristoxenian magnitude theory was too imprecise (the ear alone cannot reliably measure intervals to the accuracy that theory requires).

For Ptolemy, the standard of truth in harmonic science is a collaboration between reason and perception: reason proposes hypotheses in the form of ratios, perception confirms or refutes them through listening. Ptolemy’s procedure in the Harmonika is accordingly empirical in a specific sense: he derives systems of tuning ratios from mathematical first principles, then tests them against the trained ear’s judgment, adjusting where necessary.

Ptolemy's Diatonic Tunings. Ptolemy catalogued several different diatonic tunings, distinguished by the specific ratios assigned to the two movable inner notes of the diatonic tetrachord. His preferred tuning, sometimes called the "syntonic diatonic" or "Ptolemy's intense diatonic", uses the ratios 10:9, 9:8, and 16:15 for the three intervals of the tetrachord from top to bottom. This tuning produces the just major scale that Zarlino would later justify through the senario: its thirds are the pure 5:4 and 6:5 that Renaissance theorists would classify as consonant.
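The way these three ratios combine can be verified with exact fractions. This is an illustrative sketch; "major tone" and "minor tone" are modern labels for 9:8 and 10:9, introduced here for convenience.

```python
from fractions import Fraction as F

MAJOR_TONE = F(9, 8)
MINOR_TONE = F(10, 9)
SEMITONE = F(16, 15)

# The three intervals together fill a perfect fourth.
fourth = MAJOR_TONE * MINOR_TONE * SEMITONE

# The two tones together make a pure major third;
# a major tone plus the semitone makes a pure minor third.
major_third = MAJOR_TONE * MINOR_TONE
minor_third = MAJOR_TONE * SEMITONE

# By contrast, two Pythagorean 9:8 tones give the wider ditone 81:64.
pythagorean_ditone = F(9, 8) ** 2
syntonic_comma = pythagorean_ditone / major_third

print("fourth       ", fourth)
print("major third  ", major_third)
print("minor third  ", minor_third)
print("ditone       ", pythagorean_ditone)
print("syntonic comma", syntonic_comma)
```

The last line shows the gap between the Pythagorean ditone and the pure third, which is the syntonic comma that meantone temperaments later set out to absorb.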

Ptolemy’s Harmonika exercised significant influence through Boethius, who drew on it extensively in De institutione musica. The syntonic diatonic tuning, transmitted through Boethius and rediscovered in the Renaissance, provided the theoretical foundation for just intonation and the senario — making Ptolemy’s ancient treatise an indirect but real ancestor of Zarlino’s Renaissance harmonics. The continuity between ancient mathematical theory and Renaissance practice is thus tighter than the usual narrative of “rediscovery” suggests.

Nicomachus of Gerasa (fl. c. 100 CE) wrote an Enchiridion harmonikon (Manual of Harmonics) that is somewhat less rigorous than Ptolemy’s Harmonika but more accessible, and it was this text that Boethius drew upon most heavily for the numerical tables in De institutione musica. Nicomachus presents the Pythagorean system with an emphasis on its cosmological significance: the same ratios that govern musical intervals govern the motion of the planets and the structure of the human soul. His account of the legendary discovery of musical ratios by Pythagoras, involving the sounds of smiths’ hammers of different weights, has been transmitted through Boethius to virtually every subsequent music-theoretical summary of Pythagoreanism, even though the story is almost certainly apocryphal. The physics does not work: the pitch of a struck anvil does not vary in simple proportion to the weight of the hammer, and in the variant of the story that hangs those weights from strings, frequency depends on the square root of the tension rather than on the tension itself, so weights in the ratio 2:1 would not produce an octave.

The hammer-and-anvil story in Nicomachus/Boethius is one of the founding myths of Western music theory — a narrative that crystallizes a complex intellectual tradition into a single dramatic moment of discovery. Like many foundational myths, it is more revealing of the tradition's values than of its actual historical origins. The story presents mathematical ratio as the essence of music, and discovery as a matter of measuring physical quantities. Whether or not Pythagoras stood in a smithy, the story tells us that the Pythagorean tradition believed music was about number and discovery was about measurement. That belief, transmitted through Boethius, shaped Western music theory for more than a millennium.

Chapter 2: Medieval Theory — Boethius, Guido, and Modal Theory

The transmission of ancient Greek music theory to the medieval Latin West was not direct but mediated through a small number of encyclopedic texts, the most influential of which was produced not by a practicing musician but by a late Roman philosopher and statesman. Anicius Manlius Severinus Boethius (c. 480–524 CE) wrote his De institutione musica as part of a larger educational project aimed at preserving the Greek intellectual heritage for Latin readers at a moment when direct access to Greek texts was rapidly diminishing. That this text — and not any surviving Greek original — became the standard reference for music theory in medieval universities is one of the central ironies of Western intellectual history.

2.1 Boethius and the Threefold Division of Music

Boethius’s treatise derives its material primarily from Nicomachus of Gerasa and Claudius Ptolemy, both of whom treat musical intervals as numerical ratios. Boethius presents a threefold classification of music that became canonical in medieval thought for more than eight centuries:

Boethius's Three Kinds of Music.
  • Musica mundana ("cosmic music"): the inaudible harmony of the celestial spheres, the proportional relationships governing the motions of planets and the cycles of the seasons. This music is not literally heard but known through mathematical reason alone.
  • Musica humana ("human music"): the harmony of the human body and soul — the proportional relationship between the rational and irrational parts of the soul and between soul and body. Again, not literally heard but known through introspection and philosophical analysis.
  • Musica instrumentalis ("instrumental music"): the only kind that is actually heard — the music of voices and instruments. This is the lowest and least important kind for Boethius, because it is perceived by the senses rather than understood by the intellect alone.

This hierarchy reflects a Neoplatonic value system that privileges intellectual over sensory knowledge. For Boethius, the true musician (musicus) is the one who understands the mathematical ratios underlying musical intervals — the philosopher of music — not the performer who merely produces sounds by habit and training. The performer is compared to a builder who constructs what the architect has designed; the musicus is the architect. This distinction between theoretical and practical knowledge, elevated by Boethian authority, shaped medieval university curricula for centuries, determining that music’s place in the university was as a discipline of number rather than a performing art.

Music’s place in the quadrivium — alongside arithmetic, geometry, and astronomy — reflects its status as a mathematical discipline. The quadrivium, together with the trivium (grammar, rhetoric, and dialectic), constituted the seven liberal arts forming the basis of medieval university education. Music theory in the medieval university was not primarily concerned with how to sing or compose but with understanding the numerical ratios that constitute musical intervals. The practical arts of chant and polyphony were transmitted through a separate, largely oral and guild-based pedagogical tradition that only partially intersected with the university theoretical tradition.

Boethius’s execution in 524 CE (he was imprisoned and killed by the Ostrogothic king Theodoric on charges of treason) gives his intellectual legacy a particular poignancy. He wrote the De institutione musica as part of a projected encyclopedic transmission of all four quadrivial disciplines (of which only the arithmetic and the incomplete music survive intact; the geometry is known at best from fragments of uncertain authenticity, and the astronomy is lost), and the Consolation of Philosophy during his imprisonment. The Consolation, with its treatment of Fortune, necessity, and the providential order of the cosmos, resonated deeply with medieval readers and contributed to Boethius’s near-canonization as a philosophical martyr. His authority in both music theory and philosophy was thus reinforced by biographical circumstances that made him a figure of both intellectual and moral exemplarity.

The story of Boethius illustrates a broader pattern in the history of music theory: theoretical authority is not purely intellectual but is shaped by the social, institutional, and biographical circumstances of the theorist. Boethius's authority derived not only from his access to Greek sources but from his social position as a Roman senator and court official, his reputation as a Christian philosopher, and the dramatic circumstances of his death. Similarly, Rameau's authority as a theorist was enhanced by his simultaneous fame as an opera composer; Schenker's authority was reinforced by his personal connections to Brahms and his editorial work on Beethoven's manuscripts. Theory and biography are not separable.

2.2 Why Medieval Musicians Read Boethius

The paradox of Boethian authority is that practicing musicians and liturgical cantors needed practical guidance that Boethius’s abstract ratio theory could not provide. Yet Boethius remained the authoritative theoretical reference throughout the medieval period because he represented the link to ancient Greek mathematical wisdom. The result was a two-track tradition: a theoretical tradition citing Boethius for philosophical legitimacy, and a practical tradition developing independent pedagogical tools for singers. The two tracks occasionally intersect — when a practical writer like Guido of Arezzo invokes Boethian categories, or when a theoretical writer acknowledges that his ratio mathematics bears on actual musical practice — but they are never fully integrated before the Renaissance.

Medieval writers frequently cite Boethius not because his ratio mathematics was practically useful but because citation of ancient authority was itself a marker of scholarly legitimacy. The actual practical guidance for singers came from other sources — Guidonian pedagogy, tonaries, and the musica plana tradition — while Boethius provided the theoretical superstructure that justified music's place in the university curriculum. This disjunction between theoretical authority and practical pedagogy is a structuring feature of medieval music-theoretical culture.

Boethius’s text itself breaks off before completing its fifth book — apparently the manuscript tradition preserves only an unfinished work — and so the medieval reader encountered Boethian theory without the systematic treatment of the modes that would have appeared in the projected completion. This incompleteness contributed to the distance between Boethian ratio theory and the practical modal theory that medieval musicians actually needed, creating a gap that practical writers like Guido of Arezzo rushed to fill.

2.3 Guido of Arezzo and Hexachord Solmization

Guido of Arezzo (c. 991–1033) made the most consequential practical contribution to music pedagogy in the medieval period. His several treatises — including the Micrologus, the Epistola de ignoto cantu (addressed to a monk named Michael), and associated texts on the hexachord method — introduced or systematized the hexachord solmization system that formed the basis of Western sight-singing pedagogy for more than five centuries. Guido’s pedagogical innovation was practical in its origins and consequences: he reportedly claimed that through his method, a singer could learn a new chant in a day that would formerly have taken a week.

The hexachord is a six-note segment of the diatonic scale: ut–re–mi–fa–sol–la. The syllables derive from the opening syllables of successive phrases of the Ut queant laxis hymn to St. John the Baptist, each phrase of which begins one step higher than the last. Within the hexachord, the unique position of the semitone between mi and fa served as the singer’s orientation point. Once a singer internalized the sound of mi moving to fa, they could navigate any diatonic melody by identifying where that semitone fell relative to each note.

The Three Hexachord Types.
  • The hexachordum naturale (natural hexachord): ut = C, containing no flat or sharp, spanning C–D–E–F–G–A.
  • The hexachordum durum (hard hexachord): ut = G, containing B-natural, spanning G–A–B–C–D–E. The "hard" designation refers to the square shape of the letter b used for B-natural, the ancestor of the modern natural sign.
  • The hexachordum molle (soft hexachord): ut = F, containing B-flat, spanning F–G–A–B♭–C–D. The "soft" designation refers to the rounded shape of the letter b used for B-flat, the direct ancestor of the modern flat sign.

Hexachord mutation — the practice of switching syllable assignments from one hexachord to another on a pitch shared between them — allowed singers to extend the solmization system across the full compass of medieval monophonic and polyphonic repertory. For example, when ascending past the top of the natural hexachord (A = la), a singer could “mutate” on G (sol in the natural hexachord, ut in the hard hexachord) and continue upward with re, mi, fa, sol, la on A, B, C, D, E.
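The mutation in this example can be sketched in a few lines. The model is deliberately simplified and is an assumption of this sketch: pitches are bare letter names with octave ignored, and the singer mutates once, on the first occurrence of the pivot pitch. The function names are hypothetical.

```python
SYLLABLES = ("ut", "re", "mi", "fa", "sol", "la")

HEXACHORDS = {
    "naturale": ("C", "D", "E", "F", "G", "A"),
    "durum":    ("G", "A", "B", "C", "D", "E"),
    "molle":    ("F", "G", "A", "Bb", "C", "D"),
}

def syllable(hexachord, pitch):
    """Solmization syllable of a pitch within the named hexachord."""
    return SYLLABLES[HEXACHORDS[hexachord].index(pitch)]

def solmize_with_mutation(melody, start, target, pivot):
    """Sing in `start` until the first `pivot`, then continue in `target`."""
    current = start
    sung = []
    for pitch in melody:
        if current == start and pitch == pivot:
            current = target          # mutate on the shared pitch
        sung.append(syllable(current, pitch))
    return sung

ascent = ["C", "D", "E", "F", "G", "A", "B", "C", "D", "E"]
sung = solmize_with_mutation(ascent, "naturale", "durum", "G")
print(list(zip(ascent, sung)))
```

On the pivot G the singer thinks "sol" but sings "ut", so that B to C becomes the mi–fa semitone of the hard hexachord.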

The Guidonian hand provided a mnemonic spatial encoding of the entire pitch system: each joint and fingertip of the left hand corresponded to a specific pitch, and the teacher could silently indicate pitches to singers by pointing. The hand encompasses the entire theoretical gamut from Gamma-ut (the lowest note, named after the Greek letter gamma for G below the staff, combined with the hexachord syllable ut) to e-la (the highest note in the standard gamut). This spatial-kinesthetic mnemonic is one of the most ingenious pedagogical inventions in the history of music education.

Musica ficta (“feigned music”) referred to pitches outside the standard hexachord system — inflections like raising a leading-tone to approach the octave more smoothly, or flattening a pitch to avoid the tritone — that skilled performers were expected to supply even when not notated. The conventions governing musica ficta are reconstructed from theoretical treatises and are a central issue in the performance of medieval and early Renaissance polyphony today.

2.4 Mensural Notation and the Notre-Dame School

A revolution in the notation of rhythm accompanied the development of elaborate polyphonic composition at Notre-Dame Cathedral in Paris during the late twelfth and early thirteenth centuries. The composers Leonin and Perotin, named in the Anonymous IV treatise, are credited with the great polyphonic collections — the Magnus Liber Organi — that required new notational resources to specify rhythmic differentiation among multiple simultaneous voices.

Johannes de Garlandia (fl. c. 1270) codified the rhythmic modes — six patterns of long and short values based on the metrical feet of classical Latin poetry (trochee, iamb, dactyl, anapest, spondee, tribrachys) — that governed the rhythmic organization of Notre-Dame polyphony. These modes were indicated not by individual note shapes but by patterns of note groupings in ligature notation, requiring performers to parse the notation contextually.

In the first rhythmic mode, the pattern long–breve–long–breve repeats continuously, creating a triple-meter feel since the long equals two breves and each long–breve group fills three breve durations. This ternary patterning reflects the medieval theological preference for the number three as representing the Trinity. All six rhythmic modes are fundamentally ternary in organization, making binary rhythmic grouping a later development requiring explicit notation as "imperfect" mensuration.
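The ternary organization of all six modes can be checked by writing each pattern in breve units. The durations below are the values usually given in modern textbook accounts and are supplied here as an assumption; in modes 3 to 5 the long lasts three breves, and in modes 3 and 4 the second breve is doubled.

```python
# One cycle of each rhythmic mode, in units of one (short) breve.
RHYTHMIC_MODES = {
    1: (2, 1),      # long, breve
    2: (1, 2),      # breve, long
    3: (3, 1, 2),   # three-breve long, breve, doubled breve
    4: (1, 2, 3),   # breve, doubled breve, three-breve long
    5: (3, 3),      # two three-breve longs
    6: (1, 1, 1),   # three breves
}

for mode, pattern in RHYTHMIC_MODES.items():
    total = sum(pattern)
    # Every cycle fills a whole number of three-breve units.
    print(f"mode {mode}: pattern {pattern} fills {total} breves "
          f"= {total // 3} ternary unit(s)")

all_ternary = all(sum(p) % 3 == 0 for p in RHYTHMIC_MODES.values())
print("all modes ternary:", all_ternary)
```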

Franco of Cologne’s Ars cantus mensurabilis (c. 1280) established the first systematic and explicit notation of rhythmic duration independent of modal pattern. Franco’s system assigned each note shape a definite default duration: the long (longa), the breve (brevis), and the semibreve (semibrevis), with the long equaling three breves in perfect (ternary) mensuration or two breves in imperfect (binary) mensuration. This was a conceptual breakthrough: for the first time in Western history, a note’s shape alone could indicate its duration. The subsequent development of mensural notation by Philippe de Vitry in his Ars nova (c. 1322) added the minim and formalized the distinction between perfect and imperfect mensuration at multiple hierarchical levels — the mensuration system that would govern musical notation through the fifteenth century.

2.5 The Eight-Mode System and the Tonary Tradition

The eight-mode system (Latin: octoechos, from Greek) organized the repertory of Gregorian chant into eight modal categories. Each mode was defined by three properties: its final (the pitch on which melodies in that mode characteristically end), its reciting tone or tenor (a secondary pitch on which psalm tones dwell through much of a recitation), and its ambitus (the range of pitches typically used). Modes 1 and 2 share the final D; modes 3 and 4 share E; modes 5 and 6 share F; modes 7 and 8 share G. Within each pair, the odd-numbered mode is authentic and typically ranges from the final up an octave, while the even-numbered mode is plagal and ranges from a fourth below the final to a fifth above it.

Mode 1 (Dorian authentic): final D, range approximately D to D an octave higher, reciting tone A. Mode 2 (Hypodorian plagal): final D, range approximately A below to A above, reciting tone F. The plagal modes are distinguished by the prefix "Hypo-" and by their lower ambitus relative to the authentic modes sharing the same final. In practice many chants mix authentic and plagal characteristics, and the theoretical neatness of the modal system is a retrospective classification imposed on an existing repertory rather than a compositional rule that generated that repertory.

The tonary tradition, in which chants are collected and organized by mode, often with intonation formulas to introduce singers to each modal category, served as both a practical reference for liturgical performers and a pedagogical introduction to modal classification. The theoretical frameworks for understanding mode were elaborated in treatises such as the Dialogus de musica (c. 1000, formerly attributed to Odo of Cluny) and the anonymous Musica enchiriadis and Scolica enchiriadis (ninth century). The Musica enchiriadis is the earliest source to describe organum (early polyphony based on parallel or oblique-motion voice pairing) in systematic terms, providing the foundational vocabulary for all subsequent polyphonic theory.

The eight-mode system derives ultimately from the Byzantine octoechos, adapted and partially transformed as it moved westward. The specific pitch content of the Western modes, and their correspondence (or lack thereof) to any ancient Greek model, was debated by medieval theorists and remains a topic of scholarly discussion. The late medieval discovery that the modal finals did not correspond to what Boethius described as the ancient Greek tonoi sparked efforts — culminating in Glarean's Dodecachordon — to reconstruct the ancient system and reconcile it with medieval practice.

2.6 Johannes Tinctoris and the Late Medieval Transition

Johannes Tinctoris (c. 1435–1511) is one of the most prolific and analytically sophisticated theorists of the late medieval and early Renaissance period. His twelve surviving treatises address counterpoint, modes, notation, proportion, and the effects of music — a breadth that places him alongside Zarlino as a representative of comprehensive theoretical ambition. His Liber de arte contrapuncti (1477) is particularly significant: it is the first counterpoint treatise to codify the consonance status of the major third and sixth, explicitly acknowledging what composers had been doing in practice for several generations.

Tinctoris also authored the Terminorum musicae diffinitorium (c. 1472–73), the first printed music dictionary in Western history, defining over three hundred terms with precision and cross-referencing that anticipates modern lexicographical practice. His theoretical system retains the eight-mode framework but applies it to polyphonic music in ways that require substantially more flexibility than purely monophonic applications allowed. The tension in Tinctoris between conservative modal taxonomy and progressive harmonic practice is a microcosm of the larger sixteenth-century tension that Glarean and Zarlino would eventually resolve.

Tinctoris's famous declaration in the preface to Liber de arte contrapuncti that no music composed more than forty years ago is worth hearing — meaning nothing before c. 1437 — is both audacious and revealing. It places the English composers Dunstaple and Power at the origin of the "modern" consonant style, identifying their use of thirds and sixths as the foundational innovation of Renaissance polyphony. This declaration is also theoretically pointed: Tinctoris implicitly argues that the Pythagorean framework — which classified thirds as dissonant — was unable to account for the music that mattered most to contemporary listeners.

2.7 Marchetto da Padova and the Chromatic Question

Marchetto da Padova (fl. 1305–1326) is one of the most original and controversial theorists of the late medieval period. His Lucidarium (1318) and Pomerium (1319) address tuning and mensural notation respectively, and together they represent the most ambitious Italian contribution to medieval music theory before the fifteenth century.

Marchetto’s most provocative theoretical claim concerns the division of the whole tone. Where Pythagorean theory divided the whole tone (ratio 9:8) into two unequal semitones, the limma (ratio 256:243) and the apotome (ratio 2187:2048), Marchetto proposed an alternative division into five parts, which he called dieses. From these he built three semitones: the enharmonic (two dieses, 2/5 of a tone), the diatonic (three dieses, 3/5 of a tone), and the chromatic (four dieses, 4/5 of a tone). A chromatic semitone of four dieses leaves only a single diesis between a raised leading tone and its goal, a step far narrower than the Pythagorean limma, and how literally these fifths of a tone should be taken has puzzled theorists ever since.

The interpretation of Marchetto's dieses has been debated intensively in late-twentieth-century medieval music scholarship. Some theorists (Jan Herlinger) argue that Marchetto was proposing a genuine alternative tuning system for chromatic notes, producing sharper-than-Pythagorean leading tones in cadential contexts. Others (Margaret Bent) argue that Marchetto's unusual semitone division was primarily a notational convention for specifying the direction of chromatic alteration rather than a precise tuning prescription. The debate illuminates the general difficulty of reconstructing tuning practice from theoretical texts that were written in a tradition where the relationship between mathematical specification and acoustic realization was never fully explicit.
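
The sizes at issue can be compared in cents. The sketch below computes the Pythagorean limma and apotome from the ratios given above and then, as a labeled assumption, treats Marchetto's five parts as equal steps on the logarithmic scale. Whether he intended anything so exact is precisely what the debate is about, so the figures are a point of reference rather than a reconstruction of his tuning.

```python
import math


def cents(ratio):
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


tone = cents(9 / 8)
limma = cents(256 / 243)
apotome = cents(2187 / 2048)

print(f"whole tone: {tone:.2f} cents")
print(f"limma:      {limma:.2f} cents")
print(f"apotome:    {apotome:.2f} cents")

# Assumption: the five parts of the tone are equal in cents.
fifth_of_tone = tone / 5
for parts in range(1, 5):
    print(f"{parts}/5 of a tone: {parts * fifth_of_tone:.2f} cents")
```

On this reading, three fifths of a tone is wider than the apotome and two fifths is narrower than the limma, so neither of Marchetto's smaller semitones coincides with a Pythagorean one.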

Marchetto’s Pomerium is also notable for its treatment of mensural notation, which extends Franco of Cologne’s system and introduces Italian-style notation that would diverge significantly from the French Ars nova tradition of Philippe de Vitry. The Italian and French notational systems of the fourteenth century — each with its own conventions for representing rhythmic values and their subdivisions — represent two distinct theoretical responses to the same compositional demands of the Trecento and Ars nova styles. Their eventual synthesis in the fifteenth century, producing the unified mensural notation used by Dufay and Ockeghem, is itself a kind of theoretical resolution of competing medieval approaches.


Chapter 3: Renaissance Theory — Zarlino and the Senario

The sixteenth century witnessed a transformation in music-theoretical thinking whose consequences remain visible in every introductory music theory textbook. The shift from modal to tonal thinking (from an organization of melody into modal species to an organization of harmony around consonant triads and their progressions) did not happen abruptly or completely in any single theorist’s work. But two texts, published about a decade apart, mark the pivotal moment: Heinrich Glarean’s Dodecachordon (1547) and Gioseffo Zarlino’s Le istitutioni harmoniche (1558). These two works together reframe the entire theoretical inheritance from Greece and the Middle Ages in light of the compositional practice of the mid-sixteenth century.

3.1 Glarean and the Twelve-Mode System

The standard eight-mode system, inherited from Boethius and the Carolingian tonary tradition, recognized four modal finals (D, E, F, G), each in an authentic and a plagal form. Glarean, a Swiss humanist and friend of Erasmus, observed that the same theoretical logic that generated those eight modes could be extended to recognize modes on A and C as well. His Dodecachordon (“twelve strings,” 1547) proposed a twelve-mode system adding four new modes to the traditional eight: the Aeolian (authentic, final A) and its plagal Hypoaeolian, and the Ionian (authentic, final C) and its plagal Hypoionian.

The Ionian and Aeolian modes correspond precisely to what later theorists would call the major scale and the natural minor scale, respectively. Glarean's theoretical innovation thus provided modal legitimacy for the two tonal scales that would dominate Western music for the next three centuries, even though Glarean himself did not frame his addition in terms of "major" and "minor" tonality — those concepts were not yet articulated. The historical irony is that Glarean's most enduring legacy is precisely the two modes that would eventually displace the entire modal system he was trying to codify.

Glarean also identified the Locrian mode (authentic, final B) and Hypolocrian (plagal, final B) but rejected them both as practically unusable because the fifth above B is diminished rather than perfect, violating a foundational requirement for a stable modal final. This exclusion of B as a modal final is itself theoretically significant, foreshadowing the later treatment of the diminished triad on the seventh scale degree as an unstable, non-foundational harmony in tonal theory. Glarean’s reasoning about modal viability anticipates the criterion of “stable fifth” that underlies both the modal and tonal frameworks.

3.2 Zarlino and Le istitutioni harmoniche

Gioseffo Zarlino (1517–1590), maestro di cappella at St. Mark’s Basilica in Venice, the most prestigious church music position in Italy, produced in Le istitutioni harmoniche a synthesis that would dominate Italian and European music theory for more than a century. Zarlino’s treatise is structured in four parts: the first two treat number, ratio, and the tuning of the tonal system; the third treats counterpoint; and the fourth treats the modes together with text-setting and the relationship between music and words. This comprehensive scope (physical, mathematical, practical, and aesthetic) signals Zarlino’s ambition to produce a summa of music theory on the model of Boethius’s educational summa.

Zarlino’s central theoretical innovation was the senario: the claim that musical consonance is explained by the ratios within the first six integers.

The Senario. The six integers 1, 2, 3, 4, 5, 6 and their pairwise ratios generate all consonant intervals:
  • Octave: 2:1
  • Fifth: 3:2
  • Fourth: 4:3
  • Major third: 5:4
  • Minor third: 6:5
  • Major sixth: 5:3 (derived by compounding a major third and a fourth)
The crucial additions over the Pythagorean system are the ratios 5:4 (major third) and 6:5 (minor third). In Pythagorean tuning, the major third is 81:64 — a complex ratio falling outside the senario — and is classified as a dissonance. In just intonation based on the senario, the major third 5:4 is fully consonant.
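
The list above can be checked mechanically. The following sketch enumerates every pairwise ratio among the integers 1 to 6 and folds each into a single octave. The helper within_octave and the variable names are illustrative choices, not Zarlino's terminology.

```python
from fractions import Fraction
from itertools import combinations


def within_octave(ratio):
    """Halve a ratio until it is no larger than the octave 2:1."""
    while ratio > 2:
        ratio = ratio / 2
    return ratio


# Every ratio b:a with 1 <= a < b <= 6, reduced to lie within one octave.
senario_intervals = sorted(
    {within_octave(Fraction(b, a)) for a, b in combinations(range(1, 7), 2)}
)

for ratio in senario_intervals:
    print(f"{ratio.numerator}:{ratio.denominator}")

# The Pythagorean major third is not among them.
print(Fraction(81, 64) in senario_intervals)
```

The enumeration yields exactly six distinct intervals (6:5, 5:4, 4:3, 3:2, 5:3, 2:1), and 81:64 is absent from the set.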

Zarlino’s senario provided a mathematical justification for what composers had been doing in practice since the mid-fifteenth century: treating thirds and sixths as consonances and building harmonies from stacked thirds (triads). The shift from a two-voice, intervallic conception of harmony to a three-voice, triadic conception — evident in the music of Dufay, Josquin, and Willaert — received its theoretical rationalization in Zarlino’s senario. Willaert, notably, was Zarlino’s own teacher at St. Mark’s, making the Istitutioni partly an exercise in theorizing a living compositional tradition the author had inherited directly.

3.3 Just Intonation and the Syntonic Comma

The adoption of the major third as a just consonance at the ratio 5:4 introduces a second comma, distinct from the Pythagorean comma. The Pythagorean major third of 81:64 and the just major third of 5:4 differ by the syntonic comma:

\[ \frac{81/64}{5/4} = \frac{81}{64} \cdot \frac{4}{5} = \frac{81}{80}. \]

This ratio of 81:80 corresponds to approximately 21.51 cents. Whenever a scale is constructed using just perfect fifths (3:2) and just major thirds (5:4), some intervals will be narrowed or widened by a syntonic comma relative to their Pythagorean counterparts. The interaction of the syntonic comma with the Pythagorean comma defines the entire landscape of Renaissance and Baroque tuning theory, creating a family of related problems that no single solution fully resolves.
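
As a quick numerical check, the sketch below derives both commas from exact fractions and converts them to cents. It assumes the usual definition of the Pythagorean comma as the excess of twelve pure fifths over seven octaves; the function name cents is illustrative.

```python
import math
from fractions import Fraction


def cents(ratio):
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200 * math.log2(float(ratio))


# Syntonic comma: Pythagorean major third divided by the just major third.
syntonic_comma = Fraction(81, 64) / Fraction(5, 4)

# Pythagorean comma: twelve pure fifths divided by seven octaves.
pythagorean_comma = Fraction(3, 2) ** 12 / Fraction(2) ** 7

print(syntonic_comma, round(cents(syntonic_comma), 2))
print(pythagorean_comma, round(cents(pythagorean_comma), 2))
```

The two commas are close in size (about 21.5 and 23.5 cents) but not equal, which is one reason no single adjustment disposes of both.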

Consider the just major scale built on C: C (1/1), D (9/8), E (5/4), F (4/3), G (3/2), A (5/3), B (15/8), C' (2/1). The major third C–E is 5:4 (consonant by the senario). The major third G–B is also 5:4. But the fifth D–A is \(5/3 \div 9/8 = 40/27\) — a narrow fifth, smaller than 3:2 by a syntonic comma. This illustrates why pure just intonation is unworkable on a fixed-pitch keyboard: some instances of what appear to be the same interval differ in size by a comma, and a keyboard with twelve fixed pitches per octave cannot accommodate both tunings simultaneously.
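
The narrow fifth can be located by brute force. This sketch uses the scale ratios given above and measures every fifth between scale degrees that both lie in the scale; the dictionary JUST_C_MAJOR and the helper interval are illustrative names.

```python
from fractions import Fraction as F

# Ratios of the just major scale on C, as listed in the text.
JUST_C_MAJOR = {
    "C": F(1), "D": F(9, 8), "E": F(5, 4), "F": F(4, 3),
    "G": F(3, 2), "A": F(5, 3), "B": F(15, 8),
}


def interval(lower, upper):
    """Ratio from lower to upper, raising upper by an octave when needed."""
    ratio = JUST_C_MAJOR[upper] / JUST_C_MAJOR[lower]
    return ratio if ratio >= 1 else ratio * 2


fifth_pairs = [("C", "G"), ("D", "A"), ("E", "B"), ("F", "C"), ("G", "D"), ("A", "E")]
fifths = {f"{lo}-{hi}": interval(lo, hi) for lo, hi in fifth_pairs}

for name, ratio in fifths.items():
    shortfall = F(3, 2) / ratio
    print(name, ratio, "short of 3:2 by", shortfall)
```

Five of the six fifths come out at exactly 3:2; only D–A is 40:27, short of pure by 81:80.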

The practical impossibility of just intonation on keyboard instruments drove the development of meantone temperament, in which the syntonic comma is distributed across four fifths (each narrowed by 1/4 syntonic comma), producing a pure major third of exactly 5:4 from C to E while slightly mistuning the fifths. Quarter-comma meantone was the standard keyboard tuning from roughly 1500 to 1700, used throughout the period of Palestrina, Monteverdi, Frescobaldi, and Purcell. The meantone fifth measures approximately 696.6 cents, compared to the pure fifth of 702.0 cents and the equal-tempered fifth of 700.0 cents.
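
The cent values quoted above follow directly from the definition. The sketch below narrows a pure fifth by a quarter of the syntonic comma and confirms that four such fifths, less two octaves, give a pure 5:4 third. Variable names are illustrative.

```python
import math


def cents(ratio):
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


pure_fifth = 3 / 2
syntonic_comma = 81 / 80

# Quarter-comma meantone: each fifth is narrowed by one quarter of the comma.
meantone_fifth = pure_fifth / syntonic_comma ** 0.25
equal_fifth = 2 ** (7 / 12)

# Four fifths up and two octaves down should land on the major third.
meantone_third = meantone_fifth ** 4 / 4

print(round(cents(meantone_fifth), 1))  # quarter-comma meantone fifth
print(round(cents(pure_fifth), 1))      # pure fifth
print(round(cents(equal_fifth), 1))     # equal-tempered fifth
print(meantone_third)                   # 1.25 up to floating-point rounding
```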

The transition from meantone to well-tempered and eventually equal-tempered tuning is closely connected to the expansion of harmonic vocabulary in the seventeenth and eighteenth centuries. Meantone temperament produces extremely pure major thirds in the most common keys (C, G, D, F, B♭) but progressively worse thirds and a catastrophic wolf fifth in distant keys. As composers began writing in keys farther from C — particularly in the more chromatic works of late Baroque composers — the wolf became increasingly obtrusive. The solution was well temperament (German: Wohltemperatur): a family of unequal temperaments in which all keys are playable but in which the more common keys have slightly purer intervals than the less common keys. J.S. Bach’s Wohltemperirtes Clavier (1722 and 1742) was composed to demonstrate and celebrate the practical usability of all 24 major and minor keys under such a temperament — not equal temperament, which would not become standard until the nineteenth century, but one of several circulating well temperaments available in his time. The exact tuning Bach intended remains disputed among organologists and performance-practice scholars.
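
The size of the wolf can be estimated with a short calculation. The sketch assumes a twelve-note quarter-comma meantone keyboard tuned as a chain of eleven tempered fifths, so that the twelfth fifth must absorb whatever remains of seven octaves. Which pair of notes carries the wolf depends on where the chain is broken, a detail the calculation does not need.

```python
import math


def cents(ratio):
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


# Four quarter-comma fifths span two octaves plus a pure third (ratio 5),
# so a single fifth is the fourth root of 5.
meantone_fifth = cents(5 ** 0.25)
pure_fifth = cents(3 / 2)

# Twelve fifths must close on seven octaves; eleven are tempered alike.
wolf_fifth = 7 * 1200 - 11 * meantone_fifth

print(round(wolf_fifth, 1), "cents")
print(round(wolf_fifth - pure_fifth, 1), "cents wider than a pure fifth")
```

Under these assumptions the remaining fifth is roughly 36 cents wider than pure, well over a comma and a half, which is why it could not be used as a fifth.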

3.4 Vincenzo Galilei and the Florentine Camerata

The most direct challenge to Zarlino’s authority came from his own former student. Vincenzo Galilei (c. 1520–1591), father of the astronomer Galileo, participated in the intellectual circle known as the Florentine Camerata, a group of humanists, musicians, and noblemen who met at the home of Count Giovanni de’ Bardi in the 1570s and 1580s. The Camerata’s project was the revival of ancient Greek practice — or what they believed ancient Greek practice to have been — particularly the claim that Greek tragedy and epic poetry were sung throughout, and that the music heightened the natural speech inflections of the text to produce overwhelming emotional effect.

In his Dialogo della musica antica et della moderna (1581), Galilei attacked Zarlino on two fronts. First, he challenged on empirical grounds the senario’s claim to explain consonance: Galilei argued, based on physical experiments, that the ratio determining consonance depends not only on string length (as Zarlino assumed from the monochord tradition) but also on string tension and cross-section, with different consonance ratios resulting from variation in each physical parameter. Second, Galilei argued that the ear, not mathematical ratio, is the proper judge of musical intervals, and that lute fretting and vocal practice in fact approximate equal temperament rather than just intonation.

The Camerata's cultural project had consequences far beyond music theory. Their attempt to revive Greek text-expression through a new kind of vocal writing (solo voice with simple harmonic accompaniment, following the natural inflections of speech) produced the experiments that, by 1600, had generated opera as an art form. Dafne (Peri and Corsi, c. 1598) and Euridice (Peri, 1600) are direct products of this music-theoretical argument. The theory of opera is inseparable from the Renaissance debate about ratio versus perception, mathematical authority versus expressive truth.

3.5 Nicola Vicentino and the Archicembalo

Nicola Vicentino (1511–1575 or 1576) represents the most radical Renaissance application of the ancient Greek tetrachord genera. In L’antica musica ridotta alla moderna prattica (1555), Vicentino argued that the full range of ancient Greek expression, in the diatonic, chromatic, and enharmonic genera, should be available to contemporary composers. To make this possible, he designed and built the archicembalo (and later an organ counterpart, the arciorgano), a keyboard instrument whose additional keys divided the octave into 31 parts approximating 31-tone equal temperament, providing intervals corresponding to the ancient enharmonic genus, including microtonal dieses of roughly a fifth of a tone.

Vicentino famously participated in a public debate with the Portuguese theorist Vicente Lusitano in Rome in 1551 over whether a particular polyphonic motet was written in the diatonic, chromatic, or combined genera. Lusitano won the debate as judged by the Roman music community, but Vicentino’s theoretical proposals remained a provocative challenge to the mainstream diatonic orientation of Renaissance polyphonic practice. The archicembalo is one of the most extreme artifacts of the Renaissance impulse to recover and surpass ancient musical practice through technology, and it anticipates much later developments in microtonality and alternative tuning systems.

The 31-tone equal temperament that Vicentino's archicembalo approximates is remarkable in retrospect: it provides nearly pure major thirds (of approximately 387 cents, compared to the just 386 cents), making it in some respects a better temperament for Renaissance triadic music than standard 12-tone equal temperament. Twentieth-century theorists including Adriaan Daniel Fokker revived interest in 31-tone equal temperament precisely for this reason, and the system has attracted renewed attention from microtonal composers.
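
The comparison is easy to reproduce. The sketch below takes the major third as 10 steps of 31-tone equal temperament and 4 steps of 12-tone equal temperament and measures each against the just 5:4. The helper edo_interval is an illustrative name.

```python
import math


def cents(ratio):
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


def edo_interval(steps, divisions):
    """Size in cents of a number of steps in an equal division of the octave."""
    return 1200 * steps / divisions


just_third = cents(5 / 4)
third_31 = edo_interval(10, 31)
third_12 = edo_interval(4, 12)

print(f"just 5:4 : {just_third:.1f} cents")
print(f"31-tone  : {third_31:.1f} cents (error {third_31 - just_third:+.1f})")
print(f"12-tone  : {third_12:.1f} cents (error {third_12 - just_third:+.1f})")
```

The 31-tone third is sharp by less than one cent, while the 12-tone third is sharp by nearly fourteen.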

3.6 From Mode to Key: The Late Renaissance Transition

The shift from modal to tonal organization in Western music is one of the most discussed and least agreed-upon transitions in music history. Theorists and historians have variously located it in the early seventeenth century (the generation of Monteverdi), in the mid-seventeenth century (the generation of Lully and Schütz), or as late as the 1680s–1700s (the generation of Corelli and Purcell). What is clear is that the transition involved several interlocking changes: the regularization of bass motion by fifth, the treatment of the dominant seventh chord as a directed, resolution-seeking dissonance, the emergence of the major–minor distinction as the primary tonal polarity (replacing the distinction between authentic and plagal modal variants), and the development of modulation as a formal and expressive resource.

Johannes Lippius (1585–1612), a Lutheran theologian and music theorist, is credited with introducing the term trias harmonica (harmonic triad) in his Synopsis musicae novae (1612), providing the first explicit definition of the triad as a root-position three-note chord built from a fifth with an added third. Lippius’s triad concept, which preceded Rameau’s by more than a century, identified the root-position triad as the normative harmonic unit. Other configurations were derived from it by what he called remissio (a reduction or simplification), a concept closely related to what Rameau would later call inversion.

The Trias Harmonica. Lippius defined the trias harmonica as the root-position triad — a sonority built by stacking a major or minor third and a perfect fifth above a single bass note. The three members of the triad were called the radix (root), the medius (middle note, i.e., the third), and the summus (top note, i.e., the fifth). Lippius's identification of these three roles — root, third, fifth — as defining the triad's identity regardless of which member appears in the bass anticipates Rameau's inversion theory, though Lippius does not develop the concept of inversion as systematically.

Michael Praetorius (1571–1621) and Heinrich Baryphonus (1581–1655) extended Lippius’s triad concept into practical compositional theory, while the figured-bass tradition developing simultaneously in Italy provided composers with a new practical notation for harmonic progressions that increasingly oriented itself around triadic root positions and their inversions. The convergence of these theoretical and practical developments — the triad as theoretical unit, figured bass as practical notation, fifth-progression as harmonic syntax — constitutes the emergence of tonal harmonic theory as a system, even before Rameau gave it its definitive conceptual formulation.


Chapter 4: Rameau and the Foundations of Tonal Theory

When Jean-Philippe Rameau (1683–1764) published his Traité de l’harmonie in 1722, he was a relatively obscure provincial organist approaching forty, not yet the celebrated opera composer he would become. The Traité was received with a mixture of admiration and puzzlement: admiration for its systematic ambition, puzzlement at its sometimes obscure argumentation and its occasionally inconsistent application of its own principles. Yet no single text has more deeply shaped the conceptual vocabulary of Western harmonic theory. The term “tonic,” the concept of chord inversion, the identification of root progressions as the fundamental syntax of tonal harmony — all derive ultimately from Rameau’s sustained effort to found harmonic theory on rational principles.

4.1 The Fundamental Bass

Rameau’s central innovation was the basse fondamentale (fundamental bass): a conceptual, not necessarily sounded, bass line representing the succession of chord roots underlying any harmonic progression. The fundamental bass moves by intervals of a perfect fifth, fourth, or third; progressions by fifth (up or down) are the strongest, by fourth slightly weaker, by third the weakest. Any actual bass note that is not the root of its chord is understood as an elaboration of an underlying fundamental bass movement that has not been placed in the actual bass.

The Fundamental Bass. For Rameau, every chord is defined by its root — the note that would appear in the bass if the chord were in root position. The basse fondamentale is the imaginary bass line formed by connecting successive roots. When the actual bass coincides with the root, the chord is in root position. When the actual bass is the third of the chord, it is in first inversion; when it is the fifth, it is in second inversion. The fundamental bass moves the same way regardless of which inversion is sounding in the actual music.
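
A small sketch can illustrate how a fundamental bass is read off a progression. It assumes major and minor triads on natural notes only and identifies the root as the note that has a third and a perfect fifth above it. The function triad_root and the sample progression are hypothetical illustrations, not Rameau's own procedure or examples.

```python
# Pitch classes of the natural notes, in semitones above C.
NOTE_TO_PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def triad_root(notes):
    """Return the note with a major or minor third and a perfect fifth above it."""
    for candidate in notes:
        above = {
            (NOTE_TO_PC[n] - NOTE_TO_PC[candidate]) % 12
            for n in notes
            if n != candidate
        }
        if above in ({4, 7}, {3, 7}):
            return candidate
    return None  # not a major or minor triad


# Each chord is written with its sounding bass note first.
progression = [["C", "E", "G"], ["A", "C", "F"], ["B", "D", "G"], ["C", "E", "G"]]

actual_bass = [chord[0] for chord in progression]
fundamental_bass = [triad_root(chord) for chord in progression]

print("actual bass:     ", actual_bass)
print("fundamental bass:", fundamental_bass)
```

The sounding bass moves C, A, B, C by step and third, while the roots move C, F, G, C, which is the kind of fifth-related succession the fundamental bass is meant to expose.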

This conceptual move had enormous consequences. Before Rameau, each configuration of intervals above a bass note was treated as a distinct entity in the figured-bass tradition: the \(\binom{6}{3}\) chord was different from the \(\binom{5}{3}\) chord, and the \(\binom{6}{4}\) was different again. There was no concept linking these configurations as versions of “the same chord.” Rameau’s inversion theory provided exactly that link: the \(\binom{6}{3}\) and the \(\binom{5}{3}\) on different bass notes are both representations of the same root-position triad, related by what he called renversement (inversion).

The deeper implication is that chord identity is defined by root, not by bass position. Root movement — fifth up, fifth down, third up, third down — becomes the primary syntactic structure of tonal harmony, with voice-leading in the upper parts serving to connect one root-defined chord to the next. Every subsequent chord taxonomy, from Weber’s Roman numerals through contemporary neo-Riemannian analysis, rests on this Rameauian foundation.

4.2 Chord Inversion

The concept of chord inversion is so thoroughly embedded in modern music theory pedagogy that it is difficult to appreciate how revolutionary it was in 1722. To understand the innovation, one must understand the prior framework of the figured-bass tradition that Rameau inherited.

In the figured-bass tradition, every chord was defined by the intervals it placed above its actual bass note. A chord with a third and fifth above the bass was a “five-three” chord; one with a third and sixth was a “six-three” chord; one with a fourth and sixth was a “six-four” chord. These were three distinct entities governed by different rules about their resolution and treatment. The theoretical question of how they were related had no standard answer.

Consider the notes C–E–G. In root position (C in the bass), this is a "five-three" chord. With E in the bass, the intervals above the bass are a third (G above E) and a sixth (C above E): a "six-three" chord. With G in the bass, the intervals above the bass are a fourth (C above G) and a sixth (E above G): a "six-four" chord. Rameau's claim: all three are the same C major chord, differentiated only by which member appears in the bass. This is now universally accepted in harmonic theory but was genuinely new in 1722 and required considerable argumentative effort to establish against the entrenched figured-bass tradition.
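
The three figurings can be generated from the same three notes. The sketch counts generic (letter-name) intervals above whichever note is in the bass; the function figures is an illustrative helper and handles natural-note letter names only.

```python
LETTERS = "CDEFGAB"


def figures(bass, upper_notes):
    """Generic interval numbers above the bass, largest first."""
    bass_index = LETTERS.index(bass)
    sizes = [(LETTERS.index(n) - bass_index) % 7 + 1 for n in upper_notes]
    return tuple(sorted(sizes, reverse=True))


chord = ["C", "E", "G"]
for bass in chord:
    upper = [n for n in chord if n != bass]
    print(bass, "in the bass:", figures(bass, upper))
```

The same collection of notes yields (5, 3), (6, 3), and (6, 4) as the bass changes, which is the figured-bass view; Rameau's claim is that all three share the single root C.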

Rameau tested his inversion theory against a wide range of repertory, arguing that the fundamental bass he derived from any passage — regardless of the inversions appearing in the actual music — always moved by the expected harmonic progressions (primarily fifth motions). This empirical testing gave the theory inductive support, and Rameau was a careful analyst of the repertory he discussed, even if his theoretical prose could be dense and sometimes circular.

A further implication of the inversion theory, developed more fully in Rameau’s later work, is the concept of harmonic function. If all inversions of a chord share the same root and the same fundamental bass position, then they also share the same harmonic function — the same role in a progression. A six-four chord on the dominant, in Rameau’s framework, is not a chord of a completely different harmonic character from a five-three chord on the dominant; it is the same dominant function in a different voice-leading configuration. This functional equivalence of root-related chords paves the way directly for Riemann’s Funktionslehre, in which the concept of harmonic function is extended and systematized into the three-function T/D/S framework. The intellectual lineage from Rameau’s basse fondamentale to Riemann’s function theory, running through the German reception of French harmonic theory in the work of Vogler, Weber, and Hauptmann, is one of the clearest examples of theoretical cumulation in the history of music theory.

4.3 The Corps Sonore

Rameau was not content with a purely practical theory of chord classification. He sought a physical basis for harmonic consonance in the corps sonore — the resonating body, which Rameau identified with the overtone series. When a resonating body vibrates, it produces not only its fundamental pitch but a series of higher partials:

\[ f, \; 2f, \; 3f, \; 4f, \; 5f, \; 6f, \ldots \]

The pitches \(f, 2f, 4f\) form the octave series; \(2f, 3f\) form a perfect fifth; \(4f, 5f\) a major third; and \(4f, 5f, 6f\) taken together form a major triad in root position. Rameau argued that nature herself, through the corps sonore, generates the major triad: it is not a human construction but a physical given inscribed in the very behavior of vibrating matter. This physical grounding was meant to give harmonic theory a certainty comparable to that of physics or mathematics.
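
The interval content of the first six partials can be tabulated directly. In this sketch a partial is represented by its integer multiple of the fundamental, so the ratio between two partials is simply the quotient of their numbers; the helper ratio is an illustrative name.

```python
from fractions import Fraction


def ratio(lower_partial, upper_partial):
    """Frequency ratio between two partials of the same fundamental."""
    return Fraction(upper_partial, lower_partial)


# Adjacent partials from 1 through 6.
for n in range(1, 6):
    print(f"partials {n} and {n + 1}: {ratio(n, n + 1)}")

# Partials 4, 5 and 6 taken together: third, third, and outer fifth.
lower_third = ratio(4, 5)
upper_third = ratio(5, 6)
outer_fifth = ratio(4, 6)
print(lower_third, upper_third, outer_fifth)
```

Partials 4, 5, and 6 give a 5:4 third below a 6:5 third inside a 3:2 fifth, the just major triad in root position.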

The derivation of the minor triad from the corps sonore was the persistent weakness in Rameau's physical theory. The overtone series generates a major triad, but the minor triad does not appear in the same natural way. Rameau offered inconsistent solutions to this problem across his career: in the Traité he derived the minor chord by adding a sixth below the fundamental (anticipating Riemann's later "undertone" theory); in Génération harmonique (1737) he offered a different and equally problematic derivation. The minor triad remained the theoretical Achilles' heel of every physical harmonic theory, from Rameau through Helmholtz, Riemann, and beyond.

The corps sonore argument also raised a deeper epistemological question that Rameau never fully resolved: is the overtone series a fact about nature that music theory must accommodate, or is it a fact about music that nature is fortunate to provide? If the former, then harmonic theory is a branch of natural philosophy and changes as our physical understanding improves. If the latter, then the physics is merely illustrative and the theory has an independent logical status. Rameau tended toward the former interpretation, which made his theory vulnerable to revision as physics advanced — as indeed occurred when Helmholtz’s more rigorous psychoacoustics superseded Rameau’s more speculative corps sonore argument in the mid-nineteenth century.

The Overtone Series and Just Intonation. The connection between Rameau's corps sonore and just intonation theory is direct: if the consonant intervals are those generated by the overtone series, and the overtone series generates frequencies at \(f, 2f, 3f, 4f, 5f, 6f\), then the consonant intervals are precisely those of just intonation — octave (2:1), fifth (3:2), fourth (4:3), major third (5:4), minor third (6:5). The corps sonore thus gives a physical derivation of the very ratios that Zarlino had derived mathematically from the senario. The convergence of physics and number theory on the same set of ratios was, for Rameau, powerful confirmation that harmonic theory was grounded in something real — not merely in custom or convention.

4.4 Rameau’s Later Theoretical Career

Rameau’s theoretical work did not stop with the Traité. The Nouveau système de musique théorique (1726) refined his treatment of the fundamental bass and gave the subdominant (sous-dominante) its name. The concept of the double emploi (double employment), worked out in the Génération harmonique (1737), addresses the chord on the fourth scale degree with an added sixth (F–A–C–D in C major). Depending on its harmonic context, Rameau reads this chord either with the fourth degree as its root, as a subdominant, or with the second degree as its root, as a seventh chord lying a fifth above the dominant. The double emploi allows Rameau to account for the chord’s dual role in cadential progressions without abandoning the fundamental-bass framework, and it anticipates the concept of functional ambiguity that would later be central to Riemann’s function theory.

The Génération harmonique (1737) represents Rameau’s most ambitious attempt to ground harmonic theory in physics and mathematics. He draws on the experiments of Joseph Sauveur with string vibrations and overtones to argue that the corps sonore is the generative principle of all harmonic practice. In Démonstration du principe de l’harmonie (1750), he attempts a more philosophical synthesis, situating music theory as a branch of natural philosophy governed by the same rational principles as Cartesian science. These later works reveal Rameau increasingly drawn toward a priori rationalism at the expense of the empirical flexibility that had made the Traité analytically persuasive.

Jean le Rond d’Alembert’s Éléments de musique théorique et pratique suivant les principes de M. Rameau (1752) popularized and clarified Rameau’s system for the readers of the Encyclopédie. It was primarily through d’Alembert’s simplified presentation — which stripped away Rameau’s more speculative derivations and presented the practical theory in clear Enlightenment prose — that Rameau’s ideas circulated among educated European readers of the period.

4.5 The Legacy of Rameau

Rameau’s legacy permeates every aspect of modern tonal theory. The I–IV–V–I harmonic paradigm as the definition of tonal closure; the identification of chord roots as the primary carriers of harmonic syntax; the concept that inversions are secondary to roots; the idea that tonic, subdominant, and dominant are the three primary harmonic functions — these propositions, developed by Rameau and refined by his successors, are so deeply embedded in standard music theory pedagogy that most students encounter them without knowing they have an author. When an undergraduate theory student labels a chord as I\(^6\) (first inversion tonic) rather than writing a figured-bass number, they are performing an act of Rameauian analysis, whether or not they know it. The entire tradition of harmonic analysis as a discipline distinct from counterpoint and figured-bass pedagogy is Rameau’s invention.

4.6 Rameau’s Contemporaries: Sorge and Tartini

Rameau was not the only eighteenth-century theorist attempting to derive harmonic consonance from physical and mathematical principles. Georg Andreas Sorge (1703–1778), a German organist and theorist, independently took the overtone series as the basis of consonance and published his own harmonic theory in Vorgemach der musicalischen Composition (1745–47), which appeared after Rameau’s Génération harmonique (1737) but parallels it in several respects. Sorge’s contribution has been largely overlooked because of Rameau’s greater fame and more comprehensive systematization, but the independent development in Germany testifies to how broadly the question of harmonic foundations was pressing on European theoretical consciousness in the first half of the eighteenth century.

Giuseppe Tartini (1692–1770), the Italian violinist and composer, proposed a distinctive theory of harmony in his Trattato di musica secondo la vera scienza dell’armonia (1754). Tartini’s theory centered on the phenomenon of combination tones — the faint lower pitch audible when two higher pitches are sounded simultaneously — which he called the terzo suono (“third sound”). The combination tone of two pitches \(f_1\) and \(f_2\) (with \(f_1 > f_2\)) is the difference tone \(f_1 - f_2\). Tartini argued that the most consonant intervals are those whose combination tone reinforces rather than contradicts the harmonic structure of the interval.

For the interval of a perfect fifth (frequencies in ratio 3:2, say 300 Hz and 200 Hz), the difference tone is 300 − 200 = 100 Hz — the octave below the lower pitch, which reinforces the fundamental. For the major third (5:4, say 500 Hz and 400 Hz), the difference tone is 500 − 400 = 100 Hz — again the fundamental, two octaves below 400 Hz. For the minor sixth (8:5, say 800 Hz and 500 Hz), the difference tone is 300 Hz — a pitch not in the interval's harmonic structure, slightly disrupting the consonance. Tartini's combination-tone theory thus provides an alternative physical grounding for consonance hierarchy that partially corroborates and partially diverges from Rameau's overtone-series theory.
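The three cases just described can be reproduced directly. This is a small sketch under the same simplification the text uses, namely that the terzo suono is the first-order difference tone \(f_1 - f_2\); the helper names are hypothetical, and the frequencies are the round numbers from the paragraph above.

```python
from fractions import Fraction


def difference_tone(upper_hz, lower_hz):
    """First-order difference tone f1 - f2, with f1 the higher frequency."""
    if upper_hz <= lower_hz:
        raise ValueError("upper_hz must be greater than lower_hz")
    return upper_hz - lower_hz


def octaves_below(lower_hz, tone_hz):
    """Whole octaves by which tone_hz lies below lower_hz, or None if the
    two are not related by octaves (ratio is not a power of two)."""
    r = Fraction(lower_hz, tone_hz)
    if r.denominator != 1:
        return None
    n = r.numerator
    if n & (n - 1):  # non-zero when n is not a power of two
        return None
    return n.bit_length() - 1


EXAMPLES = {
    "perfect fifth (3:2)": (300, 200),
    "major third (5:4)": (500, 400),
    "minor sixth (8:5)": (800, 500),
}

if __name__ == "__main__":
    for name, (hi, lo) in EXAMPLES.items():
        d = difference_tone(hi, lo)
        print(f"{name}: {hi} - {lo} = {d} Hz, octaves below lower pitch: {octaves_below(lo, d)}")
```

For the fifth and the major third the difference tone is an octave transposition of the lower pitch; for the minor sixth `octaves_below` returns `None`, which is the arithmetic behind the claim that the third sound introduces a new pitch there.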

Chapter 5: German Theory from Kirnberger to Riemann

The reception and transformation of Rameauian harmonic theory in the German-speaking lands produced a distinct tradition characterized by greater emphasis on counterpoint, modal retention, and eventually a systematic functional vocabulary. Where French theory centered on the fundamental bass and the corps sonore, German theory centered on figured-bass pedagogy, the authority of Bach’s practice, and — in its culminating figure, Hugo Riemann — a grand philosophical synthesis of harmony, logic, and musical perception. The German tradition also engaged more directly with philosophy — with Kant, Hegel, and the tradition of German idealism — giving its music theory a more explicitly metaphysical flavor than its French counterpart.

5.1 Kirnberger and the Bach Legacy

Johann Philipp Kirnberger (1721–1783) studied with J.S. Bach in Leipzig and devoted much of his theoretical career to codifying and justifying Bach’s harmonic and contrapuntal practice. His Die Kunst des reinen Satzes in der Musik (The Art of Strict Musical Composition, 1771–79) is a comprehensive figured-bass and counterpoint treatise that treats Bach’s chorales and two-part inventions as normative models, making the implicit claim that the greatest theoretical authority is not ancient precedent or mathematical derivation but the practice of the greatest master of the recent past (Bach had died in 1750, two decades before the treatise appeared).

Kirnberger's idealization of Bach as the supreme authority for contrapuntal practice established a paradigm of historical legitimation that would recur throughout German music theory. Riemann later appealed to Beethoven in a similar way. The identification of a great master's practice as the empirical norm from which theory should be derived — rather than deriving practice from a priori theoretical principles — reflects a distinctively German empirico-historical orientation in music theory, one that stands in sharp contrast to Rameau's rationalist derivation of harmony from the corps sonore.

Kirnberger also made contributions to tuning theory, proposing Kirnberger III temperament — a compromise between pure just intonation and equal temperament, preserving pure fifths and thirds in frequently used keys while distributing impurities into less common ones. This temperament is favored by some period-instrument performers for Bach’s keyboard music. Kirnberger’s theoretical conservatism led him to resist more adventurous harmonic practices of his contemporaries and to advocate for relatively strict voice-leading in the Bachian mold.

5.1b Johann Mattheson and the German Figured-Bass Tradition

Johann Mattheson (1681–1764), a Hamburg musician and prolific writer, represents the other face of early German music theory: not the speculative mathematical tradition descending from Boethius but the practical rhetorical tradition that treated music composition as a branch of rhetoric. His Das neu-eröffnete Orchestre (1713), Critica musica (1722–25), and the massive Der vollkommene Capellmeister (The Complete Music Director, 1739) collectively constitute the most comprehensive treatment of the practical aspects of music composition in the German Baroque.

For Mattheson, musical composition is governed by the same principles as persuasive oration: it must have an inventio (invention of material), a dispositio (formal arrangement), and an elaboration of that material through the musical equivalent of rhetorical figures (Figuren). The doctrine of musical figures (Figurenlehre), a systematic catalog of melodic and harmonic devices analogous to rhetorical figures like anaphora, antithesis, and climax, was developed by a series of German theorists from Joachim Burmeister (1564–1629) through Johann David Heinichen (1683–1729) and Mattheson himself.

Musical Figures and Affective Theory. In the German Baroque Figurenlehre, specific melodic and harmonic patterns correspond to specific emotional or rhetorical effects. The suspiratio (a short rest followed by a short note) represents sighing; the catachresis (unexpected dissonance) expresses violence or grief; the anabasis (a rising figure) expresses joy or ascent; the catabasis (a falling figure) expresses sadness or descent. These figures connect music theory to the broader Baroque doctrine of the Affektenlehre (theory of affects), which held that music could represent and arouse specific emotional states through systematic means.

The Affektenlehre and Figurenlehre traditions represent a different strand of music-theoretical thinking from the ratio-based and function-based traditions charted in this course. Where Rameau grounds harmonic theory in physics and Schenker grounds it in deep voice-leading structure, the German rhetorical tradition grounds compositional practice in emotional expressiveness and communicative effectiveness. The rhetorical strand of music theory has been revived in late-twentieth-century scholarship by theorists including Leonard Ratner (Classic Music, 1980), who developed the concept of musical topics (characteristic styles and genres used as expressive signifiers within Classical-era compositions), and Wye Allanbrook (Rhythmic Gesture in Mozart, 1983), who applied topic theory systematically to Mozart’s operas.

5.2 Gottfried Weber and Roman Numeral Analysis

Gottfried Weber (1779–1839) produced in his Versuch einer geordneten Theorie der Tonsetzkunst (Attempt at a Systematic Theory of Musical Composition, 1817–21) the most comprehensive harmonic theory of the early nineteenth century and the direct ancestor of the Roman numeral analytical notation that is now standard in American undergraduate pedagogy. Weber’s approach was deliberately empirical and inductive: he documented harmonic practice as he found it in the common-practice repertory rather than deriving it from a single generating principle.

Weber systematically labeled chords by the scale degree of their root, using Roman numerals (I through VII for the seven diatonic scale degrees). He extended this system to secondary dominants — chords that function as dominants of scale degrees other than the tonic. The notation V/V (read “five of five”) designates the dominant of the dominant; V/IV is the dominant of the subdominant, and so on.

Secondary Dominants in Weber's System. A secondary dominant is a chord that functions as the dominant of a chord other than the tonic. In C major, the secondary dominant of V (V/V) is a major triad or dominant seventh chord built on D, functioning momentarily as if G were a local tonic. Weber's notation extends the Roman numeral system to represent temporary tonicization without committing to a full modulation. This hierarchical slash notation has proven far more durable than Rameau's fundamental bass for practical analytical purposes and remains standard in American pedagogy.
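The slash label can be read as a simple rule: find the root of the target scale degree, then build a major triad a perfect fifth above it. The sketch below illustrates that rule only. It assumes major keys, pitch classes numbered 0 to 11, and sharps-only spelling, so it does not attempt correct enharmonic spelling in flat keys; all names in it are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]  # semitones above the tonic


def degree_root(tonic, degree):
    """Pitch class (0-11) of scale degree 1..7 in the major key on `tonic`."""
    return (NOTE_NAMES.index(tonic) + MAJOR_SCALE_STEPS[degree - 1]) % 12


def secondary_dominant(tonic, degree):
    """Major triad a perfect fifth above the root of `degree`, i.e. V/degree.

    Spelling uses sharps only, which is a simplification of this sketch.
    """
    root = (degree_root(tonic, degree) + 7) % 12
    return [NOTE_NAMES[(root + step) % 12] for step in (0, 4, 7)]


if __name__ == "__main__":
    print("V/V  in C major:", secondary_dominant("C", 5))
    print("V/IV in C major:", secondary_dominant("C", 4))
```

In C major the first call yields the D major triad mentioned above, with F# as the chromatic tone that points toward G. The second yields C–E–G, which is why V/IV is normally written as a seventh chord in practice: the plain triad is indistinguishable from the tonic.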

Weber’s system is more flexible than Rameau’s fundamental bass in its ability to represent chromatic harmonies through the secondary dominant notation, but it is less theoretically motivated. Weber is content to document and label what he finds in the repertory; explaining why these progressions work the way they do is a task he largely leaves to his successors. This descriptive flexibility has pedagogical advantages but theoretical costs that later theorists, particularly Riemann, would attempt to address through a more principled functional vocabulary.

5.3 A.B. Marx and the Sonata Principle

Adolf Bernhard Marx (1795–1866) shifted German music theory from harmonic syntax toward formal architecture. His four-volume Die Lehre von der musikalischen Komposition (The Theory of Musical Composition, 1837–47) defined the sonata form as the central organizational principle of the Viennese Classical style, and it introduced the terminology that American music theory textbooks still use today.

Marx coined the terms Hauptsatz (main theme), Seitensatz (secondary theme), and Schlussgruppe (closing group), and used the term Durchführung (development) for the developmental section. He conceived sonata form not as a key-scheme or formal template but as a dramatic process: the Hauptsatz establishes a character or energy; the Seitensatz provides contrast and opposition; the development creates conflict and instability through fragmentation and tonal destabilization; the recapitulation achieves resolution and synthesis.

The Beethoven symphony — particularly the Eroica (No. 3) and the Fifth — served for Marx as the paradigm of organic form, a term derived from Goethe’s nature philosophy and German Romantic aesthetics. An organism does not have its parts assembled externally but generates them from within through a single generative principle; Marx argued that the greatest musical works similarly develop all their material from a single motivic seed through an organic process. This organicist aesthetic was enormously influential on subsequent German music theory and criticism, most notably on the tradition of motivic-thematic analysis that runs from Marx through Schoenberg’s concept of Grundgestalt to Rudolph Réti’s thematic analysis in the twentieth century.

Marx’s impact was also felt outside Germany, particularly through Eduard Hanslick (1825–1904), whose Vom Musikalisch-Schönen (On the Musically Beautiful, 1854) is the foundational text of musical formalism — the claim that music has no meaning beyond its own structural processes and that the “content” of music is “tonally moving forms.” Hanslick’s formalism is a response to the program music aesthetics of Liszt and Wagner, who claimed that music could express specific extra-musical meanings (narratives, images, ideas). For Hanslick, such claims confuse music with poetry or painting; genuine musical meaning is purely structural, and the form-based tradition of German music theory from Marx onward provides the conceptual vocabulary for describing that structural meaning. The debate between Hanslick’s formalism and Wagner’s expressive aesthetics is a music-theoretical debate as much as an aesthetic one: it concerns the proper object of music theory and the appropriate concepts for describing what music does.

5.4 Hauptmann and Hegelian Dialectics

Moritz Hauptmann (1792–1868), a student of Spohr and a colleague of Mendelssohn at the Leipzig Conservatory, brought Hegelian philosophy explicitly into music theory. His Die Natur der Harmonik und der Metrik (The Nature of Harmony and Meter, 1853) argued that harmonic relationships are structured by the same dialectical logic that Hegel used to analyze historical and philosophical processes: the movement from unity through opposition to synthesis.

For Hauptmann, the tonic (I) is the thesis — the initial, undifferentiated unity. The dominant (V) is the antithesis — difference, the fifth relationship pulling away from the tonic. The subdominant (IV) is the synthesis — a new unity incorporating the opposition of the first two, since IV contains the same fifth relationship as V but directed downward from the tonic rather than upward. Together, tonic, dominant, and subdominant form a dialectical triad whose resolution in the cadence (S–D–T or IV–V–I) mirrors the Hegelian movement of thought.

Hauptmann's Hegelian framework exercised significant influence on Hugo Riemann's systematic thinking, even though Riemann ultimately moved away from Hauptmann's explicitly dialectical language toward the more functionalist terminology of T, D, and S. The idea that tonic, dominant, and subdominant are not merely three scale degrees among seven but the three foundational categories from which all harmonic meaning derives is shared by Hauptmann and Riemann, and both trace it ultimately to the structural asymmetry of the fifth relationship as the primary generator of tonal space.

5.5 Hugo Riemann: Function Theory and Harmonic Dualism

Hugo Riemann (1849–1919) is the most encyclopedic music theorist in the Western tradition since Gioseffo Zarlino. His writings span harmonic theory, counterpoint pedagogy, music history, acoustics, music psychology, and musicology. His most enduring theoretical contribution is the system of Funktionslehre (function theory), developed across several treatises, including the Handbuch der Harmonielehre (1887), and culminating in the Vereinfachte Harmonielehre (Simplified Harmony, 1893).

Riemann’s function theory reduces the entire harmonic vocabulary of tonal music to three fundamental functions: Tonic (T), Dominant (D), and Subdominant (S). Every chord in a key, no matter how chromatic, can be analyzed as one of these three functions or as a chromatic variant indicated by letter superscripts and subscripts. The analytical power of the system is that it focuses attention on harmonic function — what a chord does in a progression — rather than on scale-degree identity alone.

Riemann's Function Labels. In Riemann's notation, the following labels are standard for C major:
  • T: tonic triad (I = C–E–G)
  • Tp: tonic parallel — the relative minor, vi (A–C–E)
  • Tg: tonic "Gegenklang" (counter-sound) — the mediant, iii (E–G–B)
  • D: dominant triad (V = G–B–D)
  • Dp: dominant parallel (the relative minor of the dominant), iii (E–G–B); the same triad as Tg, read in a dominant context
  • S: subdominant triad (IV = F–A–C)
  • Sp: subdominant parallel — the supertonic, ii (D–F–A)
Seventh chords and chromatic alterations are indicated by further superscript and subscript notation, allowing the system to represent a very wide range of chromatic harmony within the T/D/S framework.
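The secondary labels in this list follow from two operations on a major triad. The sketch below assumes the usual definitions for major-mode primaries: the Parallele is the minor triad a minor third below, and the Gegenklang is the minor triad a major third above. It covers only the diatonic labels listed here, uses sharps-only spelling, and its function names are illustrative rather than Riemann's.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def triad(root, quality):
    """Root-position triad on pitch class `root`; quality is 'major' or 'minor'."""
    third = 4 if quality == "major" else 3
    return [NOTE_NAMES[(root + step) % 12] for step in (0, third, 7)]


def parallel_of_major(root):
    """German 'Parallele' of a major triad: the minor triad a minor third below."""
    return triad((root - 3) % 12, "minor")


def gegenklang_of_major(root):
    """'Gegenklang' of a major triad: the minor triad a major third above."""
    return triad((root + 4) % 12, "minor")


def function_table(tonic):
    """Map function labels to triads for the major key on `tonic`."""
    t = NOTE_NAMES.index(tonic)
    s, d = (t + 5) % 12, (t + 7) % 12
    return {
        "T": triad(t, "major"),
        "S": triad(s, "major"),
        "D": triad(d, "major"),
        "Tp": parallel_of_major(t),
        "Sp": parallel_of_major(s),
        "Dp": parallel_of_major(d),
        "Tg": gegenklang_of_major(t),
    }


if __name__ == "__main__":
    for label, notes in function_table("C").items():
        print(f"{label:<3} {'-'.join(notes)}")
```

Running this for C major gives the same pitches for Dp and Tg (E–G–B). That overlap is the point of a function label: one triad can carry two different readings depending on the progression around it, which a scale-degree label such as iii does not distinguish.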

Underlying Riemann’s function theory is a deeper and more controversial commitment: harmonic dualism. Riemann argued that the major and minor triads are not merely scale-based formations but represent two symmetrically opposite manifestations of the same acoustic principle. The major triad is generated “upward” from a root; the minor triad, Riemann claimed, is generated “downward” from a conceptual “root” (the Klangvertreter, “tone representative”) at the top. In C minor, for example, the minor triad C–E♭–G is understood as built downward from G, making G its structural generator.

Riemann's harmonic dualism was controversial from the start and is generally rejected in contemporary music theory. The "undertone series" he invoked to justify downward generation (a mirror image of the overtone series) has no physical basis: unlike overtones, undertones do not spontaneously occur in vibrating bodies. Nevertheless, Riemann's dualism generated the analytical apparatus of Neo-Riemannian theory (Chapter 7), which exploits the symmetrical properties of the tone-net (Tonnetz) without committing to dualism's physical or psychological claims. The Tonnetz is a direct inheritance from Riemann's dualistic imagination.

Riemann’s notational system remains in use in German-language university music theory curricula today, coexisting with the Roman numeral tradition descended from Weber. The two systems are not easily reconciled: Roman numeral analysis is primarily scale-degree based, while Riemann’s function labels are primarily function based and abstract away from specific scale degrees and inversions. Both traditions remain active and generative in contemporary research and pedagogy, and the tension between them is itself theoretically productive.

5.6 Hermann von Helmholtz and the Science of Consonance

Hermann von Helmholtz (1821–1894) was a physicist and physiologist rather than a music theorist in the traditional sense, but his Die Lehre von den Tonempfindungen als physiologische Grundlage für die Theorie der Musik (On the Sensations of Tone as a Physiological Basis for the Theory of Music, 1863) is one of the most important single contributions to music-theoretical thought in the nineteenth century. Helmholtz’s central thesis is that consonance and dissonance are physiological phenomena determined by the relationship between the overtone structures of simultaneously sounding tones.

When two tones sound simultaneously, their upper partials may or may not coincide. When partials of the two tones are close but not identical in frequency, they produce beats: periodic amplitude fluctuations that, once they become rapid, are perceived as roughness. Consonance corresponds to minimal roughness (partials coincide or are far apart in frequency); dissonance corresponds to maximal roughness (neighboring partials beat rapidly, on Helmholtz’s account most harshly at something on the order of thirty beats per second, whereas very slow beats are heard as a gentle wavering rather than as roughness).

Helmholtz's Theory of Consonance. Two tones are consonant to the extent that their overtone series share common partials or have no partials near enough to produce perceptible beats. The perfect fifth (3:2) is highly consonant because its lower partials coincide at the third harmonic of the lower note and the second harmonic of the upper note (i.e., \(3 f_{\text{lower}} = 2 f_{\text{upper}}\)). The major third (5:4) is moderately consonant because coinciding partials first appear at the fifth and fourth harmonics respectively. Intervals outside the senario, like the Pythagorean major third (81:64), have low partials that nearly coincide and therefore beat against one another, and are heard as dissonant by comparison.
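The contrast between the just and Pythagorean major thirds can be made concrete by listing which low partials come close enough to beat. The sketch below is a deliberately crude illustration and not Helmholtz's roughness calculation: it treats all partials as equally loud, looks only at the first six, and uses an arbitrary 50 Hz window (`max_beat_hz`) to decide what counts as "near". The 264 Hz lower tone and all function names are assumptions made for the example.

```python
from fractions import Fraction


def first_coincidence(ratio):
    """For an interval upper:lower = p:q in lowest terms, return (p, q):
    harmonic p of the lower tone coincides with harmonic q of the upper."""
    r = Fraction(ratio)
    return r.numerator, r.denominator


def close_partial_pairs(f_low, ratio, n_partials=6, max_beat_hz=50.0):
    """List (m, n, beat_hz) where harmonic m of the lower tone and harmonic n
    of the upper tone differ by more than 0 and less than max_beat_hz."""
    f_low = Fraction(f_low)
    f_high = f_low * Fraction(ratio)
    pairs = []
    for m in range(1, n_partials + 1):
        for n in range(1, n_partials + 1):
            beat = abs(m * f_low - n * f_high)
            if 0 < beat < max_beat_hz:
                pairs.append((m, n, float(beat)))
    return pairs


if __name__ == "__main__":
    print("fifth coincides at harmonics:", first_coincidence(Fraction(3, 2)))
    print("just third 5:4     :", close_partial_pairs(264, Fraction(5, 4)))
    print("Pythagorean 81:64  :", close_partial_pairs(264, Fraction(81, 64)))
```

With these settings the just third has no beating pairs, because its fifth and fourth harmonics land exactly together at 1320 Hz, while the Pythagorean third places the corresponding partials at 1320 Hz and 1336.5 Hz, a 16.5 Hz beat. Exact fractions are used so that true coincidences are not mistaken for tiny floating-point differences.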

Helmholtz’s theory provided a new and more rigorous physical basis for what Zarlino had derived from the senario: the consonance of the major third 5:4 is now explained not by the simplicity of the ratio but by a physiological fact, namely that its partials produce little beating in the auditory system. This shift from number theory to physiology as the grounding of musical consonance is one of the major transitions in music theory’s intellectual history, opening the discipline to the methods of experimental science.

Helmholtz also investigated the acoustic basis of timbre (which he analyzed in terms of overtone structure), the construction of musical scales across cultures, and the history of tuning systems. His Tonempfindungen is an encyclopedic work that influenced subsequent work in psychoacoustics, ethnomusicology, and music cognition, making Helmholtz — alongside Rameau and Riemann — one of the principal architects of the scientific study of music.


Chapter 6: Schenker and the Organicist Tradition

Heinrich Schenker (1868–1935) developed the most influential and most contested theory of tonal music produced in the twentieth century. His work reshaped graduate music theory pedagogy, particularly in the United States, and the analytical method named after him — Schenkerian analysis — is now simultaneously a standard tool in academic music theory and a focal point of sharp debate about the politics of the classical canon, the boundaries of analytical applicability, and the discipline’s responsibility to its historical origins.

6.1 Biography and Intellectual Context

Schenker was born in Wisniowczyk, Galicia (then part of the Austro-Hungarian Empire, now Ukraine), trained in law at the University of Vienna and in music at the Vienna Conservatory, and spent his adult life in Vienna as a pianist, teacher, editor, and theorist. His intellectual formation was shaped by the conservative Viennese cultural milieu of the fin de siècle: the world of Brahms (whom he knew personally and whose encouragement he prized), of German idealism, and of a profound suspicion toward musical modernism.

Schenker’s relationship to Brahms, for whom he played his own piano compositions, deeply informed his sense of the German classical tradition as a living standard against which all subsequent music must be measured. His antagonism toward Reger, Strauss, and especially Schoenberg was intense and personal. He edited critical editions of Bach keyboard works and Beethoven piano sonatas, adding extensive analytical commentary, and these editorial projects were as important to his theoretical project as his systematic treatises. The Erläuterungsausgaben (annotated editions) of four of the last five Beethoven piano sonatas (Opp. 101, 109, 110, and 111) remain important analytical documents in their own right.

6.2 Harmonielehre (1906)

Schenker’s first major theoretical work, the Harmonielehre (published as Part I of a projected Neue musikalische Theorien und Phantasien), is not primarily a theory of chord progressions in the Roman numeral sense. Its central concept is the Stufe (scale step or degree): a harmonic entity representing a complete tonal region within the key, not merely a momentary vertical chord. Where Roman numeral analysis treats each chord as an instantaneous event, the Stufe is a prolonged area of harmonic space that may persist across many surface chords before yielding to the next Stufe.

The Stufe. A Stufe in Schenker's theory is a scale degree that functions as a prolonged harmonic area — a region of tonal space — rather than a momentary vertical sonority. When Schenker analyzes a passage and identifies its harmonic structure as "I – V – I," he means that the music prolongs these three scale-degree areas through a succession of surface-level chords and voice-leading elaborations. Multiple surface chords can be subordinate to and elaborating a single Stufe. This is the beginning of the hierarchical thinking that will become the Ursatz in Schenker's later work.

The Harmonielehre also contains Schenker’s analysis of the scale as a mental (ursächlich) representation — not a physical given but a conceptual construct that the trained musical mind brings to the experience of tonal music. The diatonic major scale is not a collection of equally weighted notes but a hierarchically organized system in which some degrees are structurally primary and others ornamental, a principle systematically developed in his later work. The text is famously demanding: Schenker’s prose is dense and polemical, his examples are extensive, and his implicit assumptions often exceed what he makes explicit.

Schenker’s concept of the Stufe has an important polemical dimension: it was directed against the prevailing chord-by-chord harmonic theory of his time, particularly as represented in the harmony textbooks of Simon Sechter (1788–1867), who taught Bruckner and, briefly, the late Schubert, and as continued in Bruckner’s own theory teaching. Sechter’s system, based on root-progression theory derived from Rameau, treated every chord as a distinct harmonic entity with a determinate root, and analyzed complex passages by assigning a root to every chord. Schenker objected that this approach produced analyses of absurd complexity for simple passages: a short Beethoven phrase might receive a dozen or more Roman numeral labels where Schenker would identify only two or three Stufen, with the intermediate chords serving as passing or neighboring elaborations of a single sustained harmonic area.

The debate between Schenker's Stufen-based harmonic analysis and Sechter's root-progression analysis is a debate about the appropriate level of harmonic abstraction. Sechter asks: what is the root of each chord? Schenker asks: what harmonic area does this passage prolong? Both are legitimate questions, but they produce different analytical descriptions of the same music. The choice between them depends on what aspects of harmonic structure one takes to be most musically significant — a question that remains genuinely open in contemporary music theory pedagogy.

6.3 Kontrapunkt (1910–22)

Schenker’s two-volume Kontrapunkt undertakes a rereading of strict species counterpoint as the foundation of free tonal composition. Unlike most counterpoint pedagogy, which treats strict counterpoint as a pedagogical exercise largely distinct from free composition, Schenker argues that the voice-leading relationships codified by Fux (and behind Fux by Palestrina) in strict counterpoint are literally operative — in a conceptually elaborated or “prolonged” form — in every tonal masterwork. This is one of Schenker’s most heterodox and difficult claims.

The key concept is prolongation: a structural interval or voice-leading motion can be extended over a much longer span of musical time through the interpolation of elaborating figures. Passing tones, neighbor tones, and arpeggiations that in strict counterpoint are local ornaments become, in free composition, the mechanisms by which short structural voice-leading motions are expanded into passages of indefinite length. The simplest two-voice strict counterpoint, prolonged through these mechanisms, is for Schenker the deep structural reality of a multi-movement symphony.

6.4 Der freie Satz (1935)

The culminating synthesis of Schenker’s theoretical project, Der freie Satz (Free Composition, posthumously published 1935 and translated into English by Ernst Oster in 1979), presents the complete theory of musical structure in terms of three hierarchical levels and two fundamental voice-leading archetypes.

The Ursatz and Structural Levels.
  • Ursatz (fundamental structure): the deepest structural level of any tonal work, consisting of two components: the Urlinie (fundamental line) — a stepwise melodic descent from scale degree \(\hat{3}\), \(\hat{5}\), or \(\hat{8}\) down to \(\hat{1}\) in the soprano voice — and the Bassbrechung (bass arpeggiation) — a motion from I up to V and back to I in the bass. Every tonal masterwork has exactly one Ursatz.
  • Hintergrund (background): the level at which the Ursatz operates. Background structure is universal: all tonal works share the same background, differing only in which Urlinie descent form obtains.
  • Mittelgrund (middleground): intermediate structural levels at which the Ursatz is elaborated by prolongations — arpeggiations, linear progressions, voice-exchanges — expanding the time-span over which background events unfold.
  • Vordergrund (foreground): the surface of the music as actually heard, with all its ornamental figuration, motivic detail, and rhythmic articulation.

The analytical procedure of Schenkerian analysis is essentially a process of “reduction”: progressively removing foreground elaborations to reveal the middleground prolongations, then removing those to reveal the background Ursatz. Graphical representation uses open noteheads for structural tones, filled noteheads for subordinate tones, and slurs and beams to indicate prolongational spans and linear progressions. The reductive graph is the primary medium of Schenkerian analysis, and learning to read and produce such graphs is the central task of Schenkerian pedagogy.

A Schenkerian analysis of the opening of a Bach chorale might identify the surface as a series of four-voice chords. A foreground reduction eliminates passing tones and neighbor notes within individual voices. A middleground reduction shows a linear progression from \(\hat{3}\) descending through \(\hat{2}\) to \(\hat{1}\), supported by an I–V–I harmonic motion. The background is the Ursatz itself: Urlinie \(\hat{3}\)–\(\hat{2}\)–\(\hat{1}\) over the Bassbrechung I–V–I. The entire analytical project demonstrates that the surface diversity of the music flows from this single, simple background structure through systematic prolongation.

6.5 Schenker’s Cultural Politics

Any serious engagement with Schenker’s work must confront his cultural politics. Schenker, who was himself Jewish, was an ardent German nationalist who held explicitly racist views. His diaries, now published and translated through the Schenker Documents Online project, contain numerous expressions of racial and national contempt, extending to major composers and performers. His published writings repeatedly claim that the capacity for deep tonal thinking (the capacity to generate organic masterworks from a background Ursatz) is uniquely Germanic, exemplified by Bach, Haydn, Mozart, Beethoven, and Brahms.

Schenker's racism is not incidental to his theory but structurally related to it. The claim that only certain composers produce genuine masterworks, and that only the German cultural tradition can generate those composers, underwrites the analytical claim that Schenkerian analysis applies to a determinate and bounded canon. When post-Schenkerian theorists extend the method to non-German repertory, they implicitly reject this ethnocentric scaffolding while retaining the analytical machinery. Whether the analytical machinery is truly separable from the ideology that generated it remains an open and genuinely difficult question in current scholarship.

6.6 American Reception and Post-Schenkerian Critiques

Felix Salzer’s Structural Hearing (1952) was the first comprehensive English-language presentation of Schenkerian analysis, and it deliberately extended the method to medieval polyphony and contemporary music. Allen Forte and Steven Gilbert’s Introduction to Schenkerian Analysis (1982) codified an orthodox version of the method for graduate pedagogy. The textbook’s terminology, including “interruption” (Schenker’s Unterbrechung) for the I–V || I–V–I background pattern, became standard in American doctoral programs.

Critical responses have been wide-ranging and substantive. Joseph Straus’s “The Problem of Prolongation in Post-Tonal Music” (1987) argued that the concept of prolongation, as Schenker defined it, requires conditions of triadic tonality that do not obtain in atonal music; attempts to apply Schenkerian analysis to Schoenberg or Bartók are therefore conceptually incoherent unless the concept of prolongation is fundamentally reconceived. Feminist critiques have examined the gendering of musical form in Schenker’s theory: Susan McClary and others have pointed out that the discourse of structural “penetration” and “masculine” structural descent versus “feminine” ornamental elaboration in Schenker’s writing reflects and reinforces gender ideologies that have nothing to do with the musical structures being described. Philip Ewell’s “Music Theory’s White Racial Frame,” delivered as a plenary address to the Society for Music Theory in 2019 and published in 2020, argued that American music theory’s privileging of Schenkerian analysis is inseparable from the discipline’s historical exclusion of non-white composers, theorists, and repertories, and that addressing this exclusion requires more than simply diversifying the analytical canon while leaving the methods unchanged. The subsequent exchange, published as a multi-article symposium in the Journal of Schenkerian Studies, drew responses ranging from careful scholarly engagement to statements that many readers found indefensible in their dismissiveness; the editorial handling of those responses became itself a subject of controversy and led to significant institutional consequences within the discipline.

6.7 Expanding the Schenkerian Method: Prolongation in Extended Tonal Music

Felix Salzer’s decision to apply Schenkerian concepts to medieval and Renaissance music was not merely an act of scholarly imperialism; it was a genuine theoretical argument. Salzer contended that structural hearing — the ability to perceive hierarchical levels of musical structure — is a property of trained musical perception in general, not exclusively of Viennese Classical tonal perception. Medieval polyphony, he argued, exhibits prolongational structures: a voice-leading motion in one part can be prolonged through elaboration in a passage of several measures, just as in Bach or Beethoven.

This claim has provoked substantial methodological debate. William Rothstein and others have argued that Salzer’s approach mistakes ornamental elaboration for structural prolongation, applying the formal categories of Schenkerian analysis to repertory that lacks the triadic-tonal harmonic framework those categories presuppose. The debate illuminates a fundamental ambiguity in Schenker’s own concept of prolongation: it is never entirely clear whether prolongation is a property of the tonal system (requiring triadic tonality as a precondition) or a property of voice-leading in general (potentially operating across any pitch system that has registral and directional preferences).

The debate over the extension of prolongation theory beyond common-practice tonality connects to broader questions in music theory about the relationship between theory and its object. Is Schenkerian analysis a theory of a specific historical style (Viennese tonality, roughly 1700–1900), or a general theory of hierarchical voice-leading? Schenker himself held the former view; Salzer and subsequent theorists have explored the latter. The answer matters practically: if the former, then Schenkerian analysis is a historical discipline; if the latter, it aspires to the status of a general music theory. Neither position has achieved consensus.

Carl Schachter’s analytical writings, collected in Unfoldings: Essays in Schenkerian Theory and Analysis (1999), represent perhaps the most nuanced application of Schenkerian concepts to a wide range of tonal repertory, including music by Chopin, Brahms, and Schubert that Schenker himself rarely analyzed in depth. Schachter’s work demonstrates how prolongational concepts can illuminate music at the edges of common-practice tonality without either abandoning Schenkerian principles or applying them mechanically. His analyses consistently show how rhythmic and formal processes interact with prolongational structure — a dimension of musical organization that the purely pitch-based Schenkerian graphic notation tends to underrepresent.

6.8 Form Theory: Caplin, Hepokoski, and Darcy

Schenker’s influence on music theory generated not only a tradition of voice-leading analysis but also a renewed interest in musical form — in the large-scale architecture of tonal works and the relationship between formal units and harmonic structure. William Caplin’s Classical Form: A Theory of Formal Functions for the Instrumental Music of Haydn, Mozart, and Beethoven (1998) represents the most systematic recent treatment of form in the Viennese Classical style.

Caplin’s theory is organized around the concept of formal function: the role that a passage of music plays within a larger formal design. His primary unit is the theme, which itself is articulated into smaller functional units: presentation phrases (introducing the basic idea and its repetition), continuation phrases (developing and fragmenting the idea), and cadential phrases (providing harmonic and formal closure). These smaller units combine into complete formal types (the sentence, the period, the hybrid) that in turn function as large-scale sections within movements.

The Sentence and the Period. Caplin's theory identifies two fundamental theme types:
  • The sentence: a two-part structure consisting of a presentation (which states a basic idea and repeats it, either exactly or at a different harmonic level) followed by a continuation (which develops the idea through fragmentation and harmonic acceleration) that closes with a cadence.
  • The period: a two-part structure consisting of an antecedent phrase (which ends with a weak, typically half cadence) followed by a consequent phrase (which repeats or varies the antecedent's opening and ends with a strong, typically authentic cadence). The period creates a question-and-answer structure whose rhetorical logic Caplin traces to the antecedent-consequent phrase pairs of Baroque rhetoric.

James Hepokoski and Warren Darcy’s Elements of Sonata Theory: Norms, Types, and Deformations in the Late-Eighteenth-Century Sonata (2006) offers a complementary but substantially different approach to Classical form. Where Caplin emphasizes local formal function, Hepokoski and Darcy emphasize genre conventions and their systematic deformation. Their theory is explicitly dialogic: a sonata movement is not simply a realization of a formal template but a negotiation between the norms of the genre and the specific choices a composer makes in any given instance.

The concept of deformation — a departure from genre norms that acquires expressive meaning precisely by violating what the genre leads listeners to expect — is central to their approach. A passage that ends in the wrong key, that fails to produce an expected cadence, or that returns to the recapitulation in an unexpected way is not a mistake but a communicative act, exploiting the listener’s internalized knowledge of the sonata norm to create surprise, tension, or irony. This dialogic model of form connects Hepokoski and Darcy’s work to the broader tradition of rhetorical music theory (Mattheson, the Figurenlehre tradition) and to the reception-theoretic turn in musicology associated with Lawrence Kramer and Susan McClary.


Chapter 7: Twentieth-Century Theory — Serialism, Set Theory, and Transformational Theory

The collapse of functional tonality in the early twentieth century was not merely a compositional event but a theoretical crisis. If the harmonic syntax that Rameau had articulated, Riemann had systematized, and Schenker had claimed to find operating at the deepest structural level of all great music was no longer operative in new composition, what principles governed musical organization? Three major theoretical responses emerged across the twentieth century: the extension of serial thinking to a general theory of pitch-class organization, the development of pitch-class set theory as an analytical tool for atonal repertory, and the emergence of transformational theory as a general mathematical framework for musical relationships of all kinds.

7.1 Schoenberg’s Twelve-Tone Method

Arnold Schoenberg (1874–1951) developed the twelve-tone method between approximately 1920 and 1923 as a “method of composing with twelve tones related only to one another” — a phrase that signals both the method’s structural principle and its implicit rejection of the tonal hierarchy in which all twelve tones are related to a single privileged tonic. The method is built around the tone row (German: Reihe), an ordering of all twelve pitch classes in a fixed sequence serving as the generative material for an entire composition.

The Twelve-Tone Row and Its Transformations. Given a prime row \(P_0 = (p_0, p_1, \ldots, p_{11})\), the four canonical forms are:
  • Prime (P): the row in its original order. \(P_n\) is P transposed by \(n\) semitones.
  • Inversion (I): the row inverted — each ascending interval is replaced by a descending interval of the same size. \(I_n\) is I transposed by \(n\) semitones.
  • Retrograde (R): the row reversed in order. \(R_n\) is P reversed and transposed.
  • Retrograde-Inversion (RI): the inversion reversed. \(RI_n\) is I reversed and transposed.
This yields up to 48 row forms, often displayed in the twelve-by-twelve row table from which all compositional pitch material is derived.
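
The four canonical forms can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a standard library routine: the function name row_forms and the example row are hypothetical, and the labeling convention (P0 is the row as given, I0 is its inversion around the row's first pitch class) is only one of several in use.

```python
def row_forms(prime_row):
    """Map labels such as 'P3' or 'RI7' to row forms (tuples of pitch classes 0-11)."""
    if sorted(prime_row) != list(range(12)):
        raise ValueError("a row must contain each pitch class 0-11 exactly once")
    first = prime_row[0]
    forms = {}
    for n in range(12):
        # P_n: the given row transposed up n semitones.
        p = tuple((pc + n) % 12 for pc in prime_row)
        # I_n: every interval mirrored around the first pitch class, then transposed by n.
        i = tuple((2 * first - pc + n) % 12 for pc in prime_row)
        forms[f"P{n}"] = p
        forms[f"I{n}"] = i
        forms[f"R{n}"] = p[::-1]    # P_n read backwards
        forms[f"RI{n}"] = i[::-1]   # I_n read backwards
    return forms


# Hypothetical row chosen only for illustration; it is not taken from any work discussed here.
EXAMPLE_ROW = (0, 1, 3, 7, 2, 5, 9, 8, 11, 4, 6, 10)
FORMS = row_forms(EXAMPLE_ROW)

print(len(FORMS), "labels,", len(set(FORMS.values())), "distinct orderings")
print("P0:", FORMS["P0"])
print("I0:", FORMS["I0"])
```

For a row with internal symmetries, some labels would name the same ordering, which is why the count is "up to" 48 rather than exactly 48.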

Schoenberg insisted that the twelve-tone method did not abolish musical sense but reorganized it: the row replaces the scale as the source of pitch relationships, counterpoint and register still create texture and hierarchy, and large-scale form still articulates musical time. Combinatoriality — the property of certain row hexachords that, combined with a transformed version, yield an aggregate of all twelve pitch classes — was an important structural resource for Schoenberg and became central to Babbitt’s subsequent theorization.

Schoenberg’s use of the method across his late works demonstrates remarkable flexibility. The String Quartet No. 4 (1936) uses the twelve-tone row to generate music with many surface similarities to late Romantic harmonic language; the Piano Concerto (1942) achieves a large-scale formal clarity that Schoenberg explicitly linked to Classical models. These works complicate any simple narrative treating the twelve-tone method as the negation of all prior musical values.

Schoenberg’s students Alban Berg (1885–1935) and Anton Webern (1883–1945) adopted the twelve-tone method but used it in markedly different ways, producing music whose divergence demonstrates the method’s compositional flexibility. Berg’s twelve-tone works — the Violin Concerto (1935), Lulu (1935) — retain obvious surface connections to tonal harmony, using rows that generate tonal triads and using large-scale tonal planning to articulate formal structure. Webern’s twelve-tone works — the Symphony Op. 21 (1928), the Variations Op. 27 (1936) — are characterized by extreme brevity, pointillistic texture, and rigorous application of the row’s symmetrical properties, producing music of great structural economy. The contrast between Berg’s and Webern’s twelve-tone aesthetics generated two distinct lineages in post-war new music: a Bergian line emphasizing expressive continuity with the Romantic tradition, and a Webernian line emphasizing structural rigor and the emancipation of new music from its historical past.

7.2 Babbitt and the Formalization of Twelve-Tone Theory

Milton Babbitt (1916–2011) transformed Schoenberg’s compositional method into a rigorous theoretical system, drawing on set theory, group theory, and combinatorics. His Princeton doctoral dissertation, The Function of Set Structure in the Twelve-Tone System (written 1946, accepted 1992 — the long delay reflects the music department’s initial resistance to mathematical music theory as a legitimate dissertation topic), established the foundations of what would become pitch-class set theory.

Babbitt introduced the term pitch class to refer to the equivalence class of all pitches related by octave transposition: C4, C5, and C3 are all instances of pitch class C. This abstraction allowed him to treat twelve-tone rows as mathematical objects in modular arithmetic (arithmetic mod 12) and to analyze their structural properties precisely.

In mod-12 arithmetic, the twelve pitch classes are represented by integers 0–11 (C = 0, C♯ = 1, D = 2, … , B = 11). A transposition by interval \(n\) is addition mod 12: \(T_n(x) = (x + n) \bmod 12\). An inversion is \(I_n(x) = (n - x) \bmod 12\). These operations form a group, the T/I group, with 24 elements (12 transpositions and 12 inversions), and row analysis becomes a study of the orbits of this group acting on the set of all row orderings.
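
The group claim can be checked by brute force. The sketch below is an illustration only; the helper names (transpose, invert, as_table, compose) are hypothetical. It represents each operation by its table of images on the twelve pitch classes and confirms that there are 24 distinct operations and that composing any two of them yields another member of the set.

```python
def transpose(n):
    """T_n(x) = (x + n) mod 12."""
    return lambda x: (x + n) % 12


def invert(n):
    """I_n(x) = (n - x) mod 12."""
    return lambda x: (n - x) % 12


def as_table(operation):
    """Represent an operation by the tuple of its images of pitch classes 0..11."""
    return tuple(operation(x) for x in range(12))


def compose(f_table, g_table):
    """Table of the operation 'apply g first, then f'."""
    return tuple(f_table[g_table[x]] for x in range(12))


TI_GROUP = {as_table(transpose(n)) for n in range(12)} | {
    as_table(invert(n)) for n in range(12)
}

# Closure: the composite of any two operations is again in the set.
CLOSED = all(compose(f, g) in TI_GROUP for f in TI_GROUP for g in TI_GROUP)

print(len(TI_GROUP), "operations; closed under composition:", CLOSED)
```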

Babbitt extended serial organization beyond pitch to rhythm through the time-point system: rhythmic positions within a measure are mapped onto the same mod-12 arithmetic, so that a twelve-tone row can simultaneously determine both pitch and rhythmic structure. This “total serialism” — extending serial organization to dynamics, articulation, register, and timbre — was pursued in parallel by Pierre Boulez and Karlheinz Stockhausen in Europe, though with somewhat different theoretical frameworks. Babbitt’s own compositions from the 1950s onward — All Set (1957), the Partitions (1957), and the string quartets — demonstrate the compositional possibilities of this expanded serialist framework.

7.3 Forte and Pitch-Class Set Theory

Allen Forte (1926–2014) published The Structure of Atonal Music in 1973, providing a comprehensive analytical system for the atonal repertory of Schoenberg, Webern, and Berg from roughly 1908 to 1923. Forte’s system — pitch-class set theory — treats any collection of pitch classes as a set subject to analysis through its prime form and interval vector. The system provides a common vocabulary for comparing harmonic collections across different atonal works, replacing the scale-degree framework of tonal analysis with a more abstract combinatorial framework.

Prime Form and the Forte Catalog. Given any set of pitch classes, its prime form is the most compact ordering, transposed to begin on 0, found among all transpositions and inversions of the set. Forte catalogued the 208 distinct set classes with cardinality 3 through 9 (224 if every cardinality from 0 through 12 is counted), naming each by cardinality and index: set class 3-11 contains all major and minor triads (prime form [0,3,7]; the major triad [0,4,7] belongs to the same class under inversional equivalence), while set class 4-28 is the diminished seventh chord (prime form [0,3,6,9]).

The interval vector of a set is a six-element array \([ic_1, ic_2, ic_3, ic_4, ic_5, ic_6]\) counting how many times each interval class (1 through 6) appears between pairs of elements. For the major/minor triad (3-11): interval vector [001110], indicating one instance each of interval classes 3, 4, and 5.
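
Both computations are mechanical enough to sketch in Python. The code below is a minimal illustration, with hypothetical function names. One assumption matters: ties between equally compact orderings are broken by packing from the right (the convention usually associated with John Rahn), which agrees with Forte's published prime forms for most but not all set classes.

```python
from itertools import combinations


def interval_vector(pcs):
    """Count interval classes 1..6 over all unordered pairs of pitch classes."""
    pcs = sorted(set(p % 12 for p in pcs))
    vector = [0] * 6
    for a, b in combinations(pcs, 2):
        distance = (b - a) % 12
        interval_class = min(distance, 12 - distance)
        vector[interval_class - 1] += 1
    return vector


def prime_form(pcs):
    """Most compact ordering (starting on 0) among all transpositions and inversions."""
    pcs = sorted(set(p % 12 for p in pcs))
    if not pcs:
        return ()
    inverted = sorted((-p) % 12 for p in pcs)
    candidates = []
    for ordering in (pcs, inverted):
        for k in range(len(ordering)):
            # Rotate the ordering, lifting wrapped-around notes by an octave.
            rotated = ordering[k:] + [p + 12 for p in ordering[:k]]
            candidates.append(tuple(p - rotated[0] for p in rotated))
    # Assumption: compare the outer span first, then inner elements from the right.
    return min(candidates, key=lambda c: c[::-1])


print(prime_form([0, 4, 7]), interval_vector([0, 4, 7]))        # C major triad
print(prime_form([0, 3, 6, 9]), interval_vector([0, 3, 6, 9]))  # diminished seventh
```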

Forte’s system attracted both enthusiastic adoption and sustained critique. George Perle argued that Forte’s prime forms did not adequately capture the way composers actually worked with twelve-tone rows, since Forte’s equivalences conflate transpositionally and inversionally related sets that function very differently in compositional practice. David Lewin pointed out that treating transposition and inversion as equivalences — so that a major triad and a minor triad are the same set class (3-11) — conflates entities that function very differently in musical context, and proposed a richer transformational framework maintaining their distinctness as musically significant objects.

Robert Morris’s Composition with Pitch Classes: A Theory of Compositional Design (1987) extended Forte’s framework in the direction of compositional theory, developing the concepts of pitch-class space (a geometric representation of pitch-class relationships) and voice-leading in pitch-class set terms. Morris’s work represents an attempt to bridge the gap between analytical set theory and compositional practice — to show that the same mathematical framework that explains the structure of atonal analysis also provides tools for compositional planning.

The influence of Forte’s Structure of Atonal Music on graduate music theory curricula in North America was enormous. For roughly two decades (the late 1970s through the early 1990s), set theory was the dominant analytical paradigm for twentieth-century music in American doctoral programs. Its decline relative to transformational and neo-Riemannian theory in the 1990s and 2000s was not because set theory was shown to be wrong but because alternative frameworks — particularly Lewin’s transformation theory — proved capable of asking more musically interesting questions about the same repertory. Set theory remains a fundamental tool, but it is now more often used as one component of a broader analytical toolkit than as a comprehensive method in itself.

A set-theoretic analysis of the opening of Webern's Variations for Piano, Op. 27 (1936) would identify the pitch-class sets used in each phrase, determine their prime forms and interval vectors, and demonstrate how recurring set classes create cohesion across the movement. A transformational analysis of the same passage would instead map the operations (transpositions and inversions) connecting successive sets, showing how the entire movement can be understood as a network of transformations radiating from a small set of generating operations. The set-theoretic analysis emphasizes harmonic vocabulary; the transformational analysis emphasizes compositional process. Both are illuminating; neither is complete.

7.4 Lewin and Transformational Theory

David Lewin (1933–2003) is widely regarded as one of the most mathematically sophisticated and philosophically ambitious music theorists of the twentieth century. His Generalized Musical Intervals and Transformations (1987) proposed a fundamental reconception of what music theory is about.

Generalized Interval System (GIS). A GIS consists of a set \(S\) of musical objects, a mathematical group \(G\) of intervals, and a function \(\text{int}: S \times S \to G\) assigning an interval to every ordered pair of objects. Two conditions must hold: \(\text{int}(s, u) = \text{int}(s, t) \cdot \text{int}(t, u)\) for all \(s, t, u \in S\), and for every \(s \in S\) and every \(i \in G\) there is exactly one \(t \in S\) with \(\text{int}(s, t) = i\). The standard pitch-class GIS has \(S = G = \mathbb{Z}_{12}\) and \(\text{int}(s, t) = t - s \pmod{12}\). Any collection of musical objects with a group-valued "distance" function satisfying these conditions constitutes a GIS, making the framework applicable to pitch, rhythm, timbre, and other musical dimensions.
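
For a finite space the definition can be verified exhaustively. The sketch below is an illustration with hypothetical names (is_gis, pc_interval, add_mod12). It checks the composition rule above together with the uniqueness requirement in Lewin's definition: for each object s and each interval i, exactly one object t satisfies int(s, t) = i.

```python
from itertools import product


def is_gis(space, intervals, int_fn, combine):
    """Brute-force check of the two GIS conditions on a finite space."""
    space = list(space)
    intervals = list(intervals)
    # Condition 1: int(s, u) equals int(s, t) combined with int(t, u).
    composition_ok = all(
        int_fn(s, u) == combine(int_fn(s, t), int_fn(t, u))
        for s, t, u in product(space, repeat=3)
    )
    # Condition 2: from any s, each interval leads to exactly one object.
    unique_target = all(
        sum(1 for t in space if int_fn(s, t) == i) == 1
        for s in space
        for i in intervals
    )
    return composition_ok and unique_target


def pc_interval(s, t):
    """Directed pitch-class interval from s to t."""
    return (t - s) % 12


def add_mod12(i, j):
    """Group operation of Z_12, written additively."""
    return (i + j) % 12


PITCH_CLASS_GIS_OK = is_gis(range(12), range(12), pc_interval, add_mod12)
print("Z_12 with int(s, t) = t - s is a GIS:", PITCH_CLASS_GIS_OK)
```

A function such as int(s, t) = 2(t - s) mod 12 satisfies the composition rule but fails the uniqueness condition, so the check rejects it.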

Lewin’s deeper proposal was philosophical. He argued that the GIS framework, while mathematically natural, embeds a problematic spatial metaphysics treating musical objects as fixed positions in a space and intervals as distances between them — a metaphysics derived from our experience of moving through physical space. An alternative framework treats music as a network of transformations — operations that carry one musical object to another. The analyst’s question shifts from “what is this note?” to “what transformation connects this note to that one?” — from spatial to processual thinking about musical experience.

Transformation networks — directed graphs whose nodes are labeled with musical objects and whose arrows are labeled with transformations — represent this processual conception analytically. A passage of Brahms harmony can be represented as a network of transpositions and inversions; a Schenkerian reduction can be recast as a transformation network; a twelve-tone row table is a transformation network. Lewin’s framework is a meta-theory subsuming many existing analytical approaches as special cases, and it has generated an enormous body of subsequent research in mathematical music theory.

7.5 Neo-Riemannian Theory and the Tonnetz

Neo-Riemannian theory emerged in the 1990s from the convergence of Lewin’s transformational framework with renewed analytical interest in the chromatic harmonic language of late Romantic music — Schubert, Brahms, Wagner, Liszt. Its central figures are Brian Hyer and Richard Cohn. The name refers to Hugo Riemann’s harmonic dualism, from which neo-Riemannian theory borrows the symmetrical treatment of major and minor triads while abandoning Riemann’s physical and psychological claims about undertones.

The PLR Transformations and the Tonnetz. The three basic neo-Riemannian transformations act on the 24 major and minor triads:
  • P (Parallel): maps a triad to its parallel major or minor. C major ↔ C minor. One voice (the third) moves by semitone; root and fifth are held as common tones.
  • L (Leading-tone exchange): maps a major triad to the minor triad a major third above, and vice versa. C major ↔ E minor. The root moves down by semitone to become the fifth of the minor triad; third and fifth are held as common tones.
  • R (Relative): maps a triad to its relative major or minor. C major ↔ A minor. The fifth moves up by whole tone to become the root of the minor triad; root and third are held as common tones.
The Tonnetz (tone-net) is a two-dimensional toroidal graph in which pitches are arranged in a lattice with perfect fifths along one axis and major thirds along another. Every major and minor triad occupies a triangle, and the PLR transformations correspond to reflections across the three edges of that triangle.
The "hexatonic systems" identified by Richard Cohn in Schubert's instrumental music involve cycles of alternating P and L transformations: C major → C minor (P) → A♭ major (L) → A♭ minor (P) → E major (L) → E minor (P) → C major (L). This cycle visits six triads, covers three major thirds that together span an octave (C, E, G♯/A♭), and returns to the starting triad after six transformations. The voice-leading in each step is maximally smooth: one voice moves by semitone while two voices are held as common tones. Cohn argues that Schubert systematically exploits this voice-leading parsimony, and that neo-Riemannian analysis reveals a structural logic that Roman numeral analysis — which would label many of these progressions as remote modulations — cannot capture.
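
The three transformations and the hexatonic cycle can be traced in a short Python sketch. The encoding is an assumption made for illustration: a triad is a (root, quality) pair with the root as a pitch-class integer, and the function names are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]


def P(triad):
    """Parallel: same root, opposite quality."""
    root, quality = triad
    return (root, "min" if quality == "maj" else "maj")


def L(triad):
    """Leading-tone exchange: C major <-> E minor."""
    root, quality = triad
    if quality == "maj":
        return ((root + 4) % 12, "min")
    return ((root + 8) % 12, "maj")


def R(triad):
    """Relative: C major <-> A minor."""
    root, quality = triad
    if quality == "maj":
        return ((root + 9) % 12, "min")
    return ((root + 3) % 12, "maj")


def pitch_classes(triad):
    """The three pitch classes of a major or minor triad."""
    root, quality = triad
    third = 4 if quality == "maj" else 3
    return {root, (root + third) % 12, (root + 7) % 12}


def hexatonic_cycle(start):
    """Apply P, L, P, L, P, L in turn and collect every triad visited."""
    cycle = [start]
    for operation in (P, L, P, L, P, L):
        cycle.append(operation(cycle[-1]))
    return cycle


for root, quality in hexatonic_cycle((0, "maj")):
    print(NOTE_NAMES[root], quality)
```

Comparing pitch_classes for each adjacent pair in the cycle shows two shared tones at every step, which is the parsimony the cycle is known for.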

The Tonnetz as a geometric representation has antecedents in Euler’s Tentamen novae theoriae musicae (1739) and in Riemann’s own theoretical writings, but its modern neo-Riemannian form is primarily the work of Hyer and Cohn. The geometric representation has been extended computationally to explore voice-leading spaces for seventh chords, ninth chords, and other chord types, generating a rich mathematical theory of harmonic proximity and voice-leading efficiency. This work has produced important analytical results for Wagnerian harmony, Liszt’s late piano music, and the chromatic style of Schubert’s late instrumental works.

7.5b Spectral Music and the Theorization of Timbre

While neo-Riemannian theory was reviving and transforming aspects of the Riemannian tradition, a parallel theoretical development was emerging from the spectral music movement in France. Spectral music — associated with composers including Gérard Grisey (1946–1998) and Tristan Murail (b. 1947), and theorized at the Institut de Recherche et Coordination Acoustique/Musique (IRCAM) in Paris — takes the overtone series as its primary compositional material. Rather than building musical structure from scales, modes, or twelve-tone rows, spectral composers build it directly from the acoustic properties of sounds: the specific frequency ratios, durations, and amplitudes of individual partials.

Theorizing spectral music requires a conceptual vocabulary quite different from either Schenkerian or set-theoretic analysis. The fundamental concepts include spectral analysis (the decomposition of a complex sound into its component frequencies), inharmonicity (the deviation of upper partials from whole-number multiples of the fundamental — characteristic of bells, metallophones, and certain wind instruments), and temporal envelope (the amplitude contour of a sound from attack through decay, sustain, and release).

Spectral Composition and the Overtone Series. In spectral composition, a harmonic spectrum on a fundamental pitch \(f\) generates pitches at \(f, 2f, 3f, 4f, 5f, \ldots\) which, when translated into musical notation (rounded to the nearest quarter-tone if necessary), provide the pitch and interval material for an entire composition. Grisey's Partiels (1975), for eighteen musicians, is structured around the spectrum of an E2 played on trombone: the opening chords of the work literally spell out the first sixteen partials of this spectrum, with the upper partials progressively "out of tune" relative to equal temperament because they fall between equal-tempered semitones.
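
The translation from partials to notation is simple arithmetic, sketched below. The assumptions are illustrative rather than drawn from Grisey's score: A4 = 440 Hz, MIDI note numbers with C4 = 60 (so E2 = 40), and an ideal harmonic spectrum. Function names are hypothetical, and a fractional value of .5 denotes a quarter-tone.

```python
import math


def midi_to_hz(midi, a4=440.0):
    """Equal-tempered frequency of a (possibly fractional) MIDI note number."""
    return a4 * 2 ** ((midi - 69) / 12)


def hz_to_midi(freq, a4=440.0):
    """Fractional MIDI note number of a frequency."""
    return 69 + 12 * math.log2(freq / a4)


def quarter_tone_partials(fundamental_midi, count):
    """Round the first `count` harmonic partials to the nearest quarter-tone."""
    fundamental_hz = midi_to_hz(fundamental_midi)
    rounded = []
    for k in range(1, count + 1):
        exact = hz_to_midi(k * fundamental_hz)  # partial k sounds at k times f
        rounded.append(round(exact * 2) / 2)    # steps of 0.5 are quarter-tones
    return rounded


# Assumption: E2 is MIDI note 40.
print(quarter_tone_partials(40, 16))
```

Under these assumptions the seventh partial lands on 73.5, a quarter-tone below the equal-tempered D5, while partials 1, 2, 4, and 8 coincide with equal-tempered E in successive octaves.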

Spectral music theory connects to the broader history of music theory in revealing ways. Rameau’s appeal to the corps sonore — the claim that the overtone series is the physical foundation of harmonic consonance — is both vindicated and complicated by spectral music: vindicated because spectral composers take the overtone series seriously as a compositional resource, complicated because they find in it not just the major triad but an entire world of microtonal, inharmonic, and spectrally complex sounds that Rameau never considered.

Spectral music also raises the question of whether conventional music-theoretical notation — staff notation with twelve-tone equal temperament as the default — is adequate for the music it is meant to describe. Spectral scores routinely use quarter-tone notation, graphic approximations, or extended playing instructions to specify pitches that lie between the semitones of standard equal temperament. The notation problem is simultaneously a music-theoretical problem: the theoretical concepts adequate to spectral music may require analytical tools that have not yet been fully developed.

Murail’s theoretical writings, collected in Modèles et artifices (2004), provide the most systematic account of spectral compositional technique from the inside. His concept of the modulant — a pitch or sound whose spectrum serves as the harmonic raw material for a passage of music — is the spectral analogue of the tonal “tonic,” functioning as the generative center around which musical events are organized. The theoretical vocabulary of spectral music — spectra, modulants, temporal envelopes, and their interactions — represents a genuinely new contribution to music theory, not merely an application of existing tools to new music.

7.6 Lerdahl, Jackendoff, and Generative Music Theory

Fred Lerdahl and Ray Jackendoff’s A Generative Theory of Tonal Music (1983) brought the methods of generative linguistics — particularly Noam Chomsky’s distinction between competence (implicit grammatical knowledge) and performance (actual linguistic behavior) — to bear on tonal music. Their theory aims to describe the implicit musical knowledge of an “experienced listener” through a set of well-formedness rules (specifying which structural descriptions are logically possible) and preference rules (specifying which possible description is most preferred given the musical input).

The theory posits four hierarchical components: grouping structure (how musical events are segmented into motives, phrases, and sections), metrical structure (the hierarchy of strong and weak beats at multiple metric levels), time-span reduction (a hierarchy of structural importance across successive time spans, analogous to Schenkerian reduction but derived from explicit preference rules rather than analytical intuition), and prolongational reduction (a hierarchy of tonal tension and relaxation connecting temporally distant but structurally related events).

Lerdahl and Jackendoff explicitly model their theory on Chomsky's generative grammar, but the analogy is imperfect in ways they acknowledge. Language has a combinatorial syntax generating unlimited sentences from a finite vocabulary; tonal music's "grammar" generates a more constrained set of structures. The preference rules — specifying what an experienced listener prefers rather than what is logically required — make the theory empirical rather than purely formal, opening it to experimental testing. Lerdahl's subsequent Tonal Pitch Space (2001) extends the theory into a quantitative account of tonal tension and attraction, assigning numerical tension values to every chord in a tonal context based on hierarchical distance in the pitch-class space.

7.7 Current Debates and Future Directions

The early twenty-first century finds music theory in a state of productive self-examination touching simultaneously on its methods, its materials, and its politics. Empirical music cognition — the experimental study of how listeners actually perceive and process musical structures — provides a scientific check on theoretical claims previously argued from introspection or analytical authority alone. Experimental work by researchers including David Huron (Sweet Anticipation, 2006) and Carol Krumhansl (Cognitive Foundations of Musical Pitch, 1990) has tested claims about key-finding, expectation, and tonal hierarchy against behavioral and neuroimaging data, sometimes confirming and sometimes revising the claims of traditional theory.

Computational tools including the Humdrum toolkit (David Huron) and the music21 Python library (Michael Cuthbert) enable corpus analysis of large repertories, testing claims about harmonic frequency and progression patterns against statistical evidence from thousands of pieces. This work has produced important revisions of received wisdom: corpus studies have revealed that the “common practice” harmonic conventions described in textbooks are not uniformly distributed across the repertory but vary significantly by composer, genre, and historical period.

The politics of the canon — the question of whose music gets theorized, and by whose analytical standards — has become central to the discipline’s self-understanding. The historical dominance of a German-Austro-Hungarian repertory in music theory pedagogy reflects historical exclusions whose intellectual costs are increasingly recognized. Research in the theory of jazz harmony (Steve Larson, Henry Martin, Kent Williams), popular music theory (Richard Middleton, John Covach, Walter Everett), and non-Western music theory (building on ethnomusicological foundations) is expanding the discipline’s analytical toolkit and its sense of what counts as musically interesting structure worth theorizing.

Philip Ewell's 2019 plenary address to the Society for Music Theory, "Music Theory's White Racial Frame," published in expanded form in 2020, prompted the most intense public debate in the discipline's history, centering on the relationship between music theory's intellectual methods and its demographic and canonical exclusions. The responses, which include defenses of the Schenkerian tradition as separable from Schenker's personal ideology and critiques arguing that the two are structurally inseparable, reflect genuine and unresolved tensions in the field. Whatever one's position, the debate has permanently altered the discipline's self-understanding and opened new directions for research in the history and politics of music theory as an intellectual institution.

The broadening of music theory’s analytical scope beyond the European common-practice canon has required not only political will but genuine theoretical invention. The harmonic language of jazz, for example, shares some features with common-practice tonality (functional progressions, tonal centers, voice-leading norms) but also differs in fundamental ways that existing theory did not adequately capture.

Steve Larson’s Analyzing Jazz: A Schenkerian Approach (2009) argued that jazz improvisation exhibits prolongational structures analogous to Schenkerian middleground motions, with improvisers navigating between structural chord tones through elaborating passing and neighbor motions. Henry Martin’s work on Charlie Parker and bebop identified motivic-harmonic relationships that persist across improvised variations, showing how jazz improvisation generates new melodic material through systematic transformation of underlying patterns. Kent Williams and Mark Levine (the latter through practical instructional texts like The Jazz Theory Book, 1995) developed the pedagogical theory of jazz harmony, codifying concepts like the tritone substitution — replacing a dominant seventh chord with the dominant seventh chord whose root is a tritone away, since the two chords share the same guide tones (the third and seventh, which merely exchange roles) — and modal interchange (borrowing chords from parallel modes).

Tritone Substitution. In a V–I resolution in C major, the dominant chord G7 (G–B–D–F) can be replaced by D♭7 (D♭–F–A♭–C♭, where C♭ is enharmonically equivalent to B). The guide tones of G7 (B = third, F = seventh) reappear as the guide tones of D♭7 with their roles exchanged (F = third, C♭/B = seventh). Both chords resolve to C major (I) with strong voice-leading: F moves down by semitone to E (the third of C major), and B (or C♭) moves up by semitone to C (the root). In the substituted chord, the root D♭ also descends by semitone to C. The tritone substitution is thus a smooth voice-leading operation, a special case of the kind of parsimonious voice-leading that neo-Riemannian theory studies in a different harmonic context.
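The pitch-class arithmetic behind the substitution can be checked directly. The following is a minimal sketch, assuming standard integer notation (C = 0, D♭ = 1, and so on up to B = 11) and treating enharmonic spellings such as C♭ and B as the same pitch class. The helper names `dominant_seventh` and `guide_tones` are illustrative and not drawn from any library.

```python
# Pitch classes in integer notation: C = 0, D-flat = 1, ..., B = 11.

def dominant_seventh(root):
    """Return the pitch classes (root, third, fifth, seventh) of a dominant seventh chord."""
    return [(root + step) % 12 for step in (0, 4, 7, 10)]

def guide_tones(chord):
    """Return the third and seventh of a four-note chord as a set."""
    return {chord[1], chord[3]}

G, D_FLAT = 7, 1
g7 = dominant_seventh(G)          # G, B, D, F
db7 = dominant_seventh(D_FLAT)    # D-flat, F, A-flat, C-flat (= B)

# The two chords share the same pair of guide tones (F and B) ...
shared = guide_tones(g7) & guide_tones(db7)

# ... but the third of one chord is the seventh of the other, and vice versa.
roles_exchanged = g7[1] == db7[3] and g7[3] == db7[1]

# The roots lie six semitones (a tritone) apart.
roots_tritone_apart = (D_FLAT - G) % 12 == 6

print(g7, db7, shared, roles_exchanged, roots_tritone_apart)
```

Because a tritone divides the octave in half, transposing a dominant seventh chord by six semitones maps its own tritone (third to seventh) onto itself, which is why the exchange works for any root and not only for G.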

Walter Everett’s The Foundations of Rock: From “Blue Suede Shoes” to “Suite: Judy Blue Eyes” (2008) applied Schenkerian and set-theoretic tools to the analysis of rock harmony, identifying the characteristic features of rock’s harmonic language — modal mixture, pentatonic scale relations, power-chord progressions — and showing how they differ from common-practice harmonic norms. John Covach and Andrew Flory’s What’s That Sound? An Introduction to Rock and Its History provides a model for integrating rock harmonic theory into undergraduate pedagogy.

The theoretical analysis of non-Western music presents different challenges. Victor Kofi Agawu’s African Rhythm: A Northern Ewe Perspective (1995) and Representing African Music (2003) both engage with the theoretical frameworks that Western ethnomusicology has applied to African music and argue that many such frameworks distort the music through inappropriate imposition of Western analytical categories. Agawu advocates instead for analytical approaches that emerge from within the musical traditions being studied, drawing on indigenous theoretical concepts where they exist. This methodological argument, over whether analysis should use externally derived or internally derived theoretical frameworks, is a version of the broader epistemological debate that has run through the history of music theory since the dispute between the Pythagoreans and Aristoxenus.

The expansion of music theory's analytical scope to include jazz, rock, and non-Western music is not simply an additive process — a matter of applying existing tools to new repertory. It requires genuine theoretical revision, because the tools developed for common-practice European music often presuppose features of that repertory (triadic harmony, metric regularity, staff notation, tonal hierarchy) that may not be present in or appropriate to other musical traditions. The challenge of "expanding the canon" is therefore also a challenge of expanding the theoretical toolkit, and this expansion is itself a form of music-theoretical progress.

7.9 The History of Music Theory as Intellectual History

The history of music theory is not merely a chronicle of analytical methods and their development; it is an intellectual history in the full sense — a history of ideas about what music is, how it works, and why it matters. Each of the frameworks surveyed in these chapters embodies a specific set of philosophical commitments: about the relationship between mathematics and perception, between structure and expression, between the authority of tradition and the demands of new compositional practice.

The Pythagorean tradition committed music theory to mathematics as its grounding discipline and to cosmology as its ultimate context. Aristoxenus committed it to perception and trained musicianship. Boethius committed medieval theory to the authority of ancient learning and the priority of the intellect over the senses. Zarlino committed Renaissance theory to the reconciliation of mathematical justification with compositional practice. Rameau committed tonal theory to the derivation of harmonic syntax from physical law. Schenker committed organicist theory to the identification of deep structural unity as the defining property of musical greatness. Babbitt and Forte committed post-tonal theory to mathematical rigor and the methods of formal science. Lewin committed transformational theory to the processual, relational character of musical experience.

Consider how each theoretical framework handles the diminished seventh chord, the sonority built from four notes each a minor third apart: C–E♭–G♭–B♭♭ (or enharmonically C–E♭–F♯–A). For Pythagorean theory, this chord is an accumulation of dissonance: its ratios are complex and its interval content includes no pure consonances. For Zarlino's senario, its adjacent minor thirds (6:5) are admissible, but its diminished fifths and diminished seventh fall outside the ratios that the integers 1 through 6 generate. For Rameau, it is in effect a dominant ninth chord with the root omitted (with the spelling C–E♭–G♭–B♭♭, the implied root is A♭ and B♭♭ is its minor ninth). For Riemann, it expresses dominant function (D) regardless of inversion or enharmonic spelling. For Schenker, it is typically a passing chord within a prolongation, not a structurally independent Stufe. For Forte, it is set class 4-28, interval vector [004002], recognizable by its symmetry (a four-fold division of the octave by minor thirds). For neo-Riemannian theory, its four-fold symmetry makes it a pivotal chord in the octatonic system, connecting triads in ways that PLR transformations on the tonnetz can map precisely. Each theory reveals something different and genuinely illuminating about the same sonority.
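Forte's interval vector for this chord can be verified by counting interval classes over every pair of notes. The following is a minimal sketch, assuming integer pitch-class notation with C = 0; the function name `interval_vector` is illustrative and not taken from any library.

```python
from itertools import combinations

def interval_vector(pitch_classes):
    """Count interval classes 1 through 6 over every unordered pair of pitch classes."""
    vector = [0] * 6
    for a, b in combinations(sorted(set(pitch_classes)), 2):
        distance = (b - a) % 12
        # An interval and its inversion belong to the same interval class.
        interval_class = min(distance, 12 - distance)
        vector[interval_class - 1] += 1
    return vector

diminished_seventh = [0, 3, 6, 9]   # C, E-flat, G-flat, B-double-flat (= A)
vector = interval_vector(diminished_seventh)

# Transposing by a minor third (3 semitones) maps the chord onto itself,
# which is the four-fold symmetry described above.
transposed = sorted((pc + 3) % 12 for pc in diminished_seventh)

print(vector, transposed == diminished_seventh)
```

The result contains only interval classes 3 and 6 (four minor thirds and two tritones), matching the vector [004002] and confirming that the chord holds no seconds, major thirds, or perfect fourths or fifths in any voicing.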

The multiplicity of perspectives is not a sign of theoretical failure but of theoretical richness. Music theory is at its most productive when it maintains awareness of its own perspectival character — when it knows that the categories it uses are choices made against a background of alternatives, and that different choices illuminate different aspects of a complex phenomenon. The recurring debates in music theory’s history — ratio vs. perception, mathematical structure vs. expressive function, depth vs. surface, canonical repertory vs. expanded canon — are not merely professional controversies but genuine philosophical problems about the nature of musical experience and the conditions of musical understanding.

7.10 Methodological Pluralism and the Future of the Discipline

Contemporary music theory is characterized by a healthy, if sometimes contentious, methodological pluralism. Schenkerian analysis, pitch-class set theory, transformational theory, neo-Riemannian theory, topic theory, form-functional analysis, empirical corpus analysis, and psychoacoustic music cognition coexist within a single discipline, each illuminating different aspects of musical experience and each making different philosophical assumptions about what counts as musical structure, musical meaning, and musical knowledge.

This pluralism has not always been comfortable. In the 1980s and 1990s, debates between Schenkerian analysts and set theorists over the appropriate methodology for post-tonal music could be acrimonious. In the 2000s and 2010s, debates between speculative theorists and empirical music cognition researchers over the empirical testability of theoretical claims raised fundamental methodological questions. In the 2020s, debates about whose music gets analyzed and by whose standards have raised political and ethical questions that cut to the foundations of the discipline’s self-understanding.

The most productive response to methodological plurality in music theory is not to choose a single method and apply it exclusively but to cultivate fluency across multiple analytical frameworks, using each where it is most illuminating and maintaining awareness of what each method presupposes and what it leaves invisible. An analyst who knows only Schenkerian theory will miss aspects of harmonic color and timbral texture that spectral analysis reveals; an analyst who knows only set theory will miss aspects of voice-leading continuity and tonal hierarchy that Schenkerian analysis reveals; an analyst who knows only empirical corpus statistics will miss aspects of expressive particularity that close analytical reading of individual works reveals. Each tool has its strengths, and the discipline as a whole is more powerful than any single method within it.

Looking forward, several developments seem likely to shape music theory’s next generation. Machine learning and corpus analysis will continue to expand the scale at which harmonic and melodic patterns can be studied, potentially revealing regularities in large repertories (popular music, world music, music of the distant past) that have been invisible to analysis of individual works. Cross-cultural music theory — the comparative study of music-theoretical systems from non-Western traditions (Indian raga theory, maqam theory from the Arab world and Turkey, Chinese and Japanese modal systems) alongside Western theory — will require new conceptual frameworks that are neither purely Western nor purely relativistic. Computational composition and artificial intelligence raise new theoretical questions: if a machine can generate music indistinguishable (to listeners) from human composition, what does this tell us about what music theory has been modeling? Does it model the cognitive processes of composers and listeners, or abstract structural properties of musical surfaces, or something else entirely?

The theoretical study of algorithmic and AI-generated music forces a return to the foundational questions of music theory: What is a musical rule? Is a "rule" a description of observed regularities, a prescription for correct practice, a model of cognitive processes, or a constraint on possible musical objects? These questions have been present since antiquity — the Pythagorean system is simultaneously descriptive (these are the ratios of consonant intervals) and prescriptive (consonant intervals ought to be built from these ratios) — but computational music production makes them newly urgent, because a machine must be given explicit rules, and the formulation of those rules requires the theorist to decide precisely what kind of claim a rule is.

The questions that motivated the earliest Greek music theorists — what is the relationship between number and sound, between mathematical structure and expressive power, between reason and perception — remain at the center of music theory’s intellectual project. They have been answered differently by every generation, and every answer has revealed something true and something partial. The history of music theory is, in this sense, not a linear progress toward final answers but a continuing conversation — interrupted by discoveries, distorted by ideologies, expanded by new voices and new repertories — about what it means to understand music.

The history of Western music theory, from the Pythagorean monochord to the neo-Riemannian tonnetz, from Boethian musica mundana to transformational networks and computational corpus analysis, is a history of recurring questions: What is the relationship between mathematical structure and sonic experience? What authority should mathematical derivation have over the trained ear? Whose musical practices deserve theoretical articulation, and by whose standards? What counts as structural depth versus surface ornament, and who decides? These questions do not admit of final answers, but their history — the sequence of frameworks, polemics, revisions, and expansions charted in these pages — constitutes one of the richest intellectual traditions in the history of Western thought about the arts. The discipline’s current self-examination, however uncomfortable, is itself a continuation of this tradition: music theory has always been, among other things, an argument about what matters in music and why.

What unites Pythagoras’s experiment with the monochord, Guido’s hexachord hand, Zarlino’s senario, Rameau’s basse fondamentale, Schenker’s Ursatz, and Lewin’s transformation network is not any single answer to these questions but a common conviction: that music is not just sound but structure, and that structure can be understood, described, and argued about with rigor and precision. The specific forms that rigor and precision take have changed dramatically across two and a half millennia of music-theoretical thought. They will continue to change. But the conviction that musical understanding is possible — that the trained mind can grasp something real and transmissible about how music works — is the continuous thread that runs from the Pythagorean Brotherhood through every tradition, debate, and revision chronicled in these pages. Graduate study in the history of music theory is, ultimately, an initiation into that conviction and into the responsibility it carries: to understand music deeply, to analyze it rigorously, and to remain open to the possibility that the categories we use today will one day require revision by thinkers we have not yet imagined.
