MUSIC 674: History of Music Theory
Estimated study time: 2 hr
These notes draw on Thomas Christensen (ed.), The Cambridge History of Western Music Theory (2002); Leo Treitler (ed.), Strunk’s Source Readings in Music History (rev. ed., 1998); Joel Lester, Between Modes and Keys: German Theory 1592–1802 (1989); Edward Gollin and Alexander Rehding (eds.), The Oxford Handbook of Neo-Riemannian Music Theories (2011); and supplementary materials from Yale University MUSI 720–721 graduate seminars and Indiana University T623–T624 doctoral music theory sequences.
Chapter 1: Ancient Greek Music Theory
The history of Western music theory does not begin with scales, chords, or notation. It begins with a string. The monochord — a single string stretched over a resonating box with a movable bridge — was the experimental apparatus through which the ancient Greeks discovered that musical intervals correspond to precise numerical ratios. This discovery, attributed in antiquity to Pythagoras of Samos (c. 570–495 BCE) and elaborated by generations of followers, established the foundational claim of one entire tradition in Western music theory: that music is, at its root, a branch of mathematics. That claim has never gone unchallenged, and the alternation between mathematical rationalism and empirical perceptionism defines much of music theory’s history from antiquity to the present.
The ancient Greek contribution to music theory is not a single unified doctrine but a complex and contested set of overlapping traditions. The Pythagorean tradition, grounded in ratio mathematics and cosmological speculation, provided music theory with its first systematic vocabulary and its first claim to scientific rigor. The Aristoxenian tradition, grounded in perceptual observation and trained musicianship, provided the first systematic challenge to that rigor and the first defense of the ear as a legitimate theoretical authority. The Platonic tradition embedded music theory within a broader philosophical and political program, making the question of what music does to the soul a matter of civic importance. And the Alexandrian tradition of Ptolemy, Nicomachus, and Aristides Quintilianus synthesized, compiled, and transmitted these earlier traditions in forms that would shape medieval and Renaissance music theory for more than a millennium. To understand any subsequent chapter in the history of music theory, one must understand how these ancient debates were framed and why they proved so durable.
1.1 Pythagoras and the Monochord
The practical operation of the monochord is straightforward. A string of fixed length and tension produces a reference pitch. Placing the movable bridge at the midpoint divides the string into two equal segments, each of which vibrates at exactly twice the frequency of the whole. The ratio 2:1 corresponds to what we call the octave, which the ancient Greeks called the diapason (“through all”). Placing the bridge so that two-thirds of the string sounds gives a whole-to-part length ratio of 3:2 and produces the perfect fifth (diapente). Sounding three-quarters of the string gives 4:3 and produces the perfect fourth (diatessaron). And the ratio 9:8, obtained by compounding a fifth upward with a fourth downward (that is, \(3/2 \div 4/3 = 9/8\)), yields the whole tone (tonus).
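The arithmetic of the monochord divisions can be checked with exact fractions. The following is a minimal sketch; the helper name `cents` and the use of the modern cent (1200 per octave) are conveniences introduced here, not part of the ancient account.

```python
from fractions import Fraction
from math import log2

def cents(ratio):
    """Size of a frequency ratio in modern cents (1200 cents per octave)."""
    return 1200 * log2(ratio)

octave = Fraction(2, 1)
fifth = Fraction(3, 2)
fourth = Fraction(4, 3)

# Compounding intervals multiplies ratios; moving downward divides.
whole_tone = fifth / fourth          # up a fifth, down a fourth
fifth_plus_fourth = fifth * fourth   # a fifth and a fourth fill the octave

for name, r in [("octave", octave), ("fifth", fifth),
                ("fourth", fourth), ("whole tone", whole_tone)]:
    print(f"{name:10s} {r.numerator}:{r.denominator}  {cents(r):7.2f} cents")
```

Working with `Fraction` keeps every ratio exact, which matters once intervals are stacked many times over.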
The tetraktys, the triangular arrangement of the integers 1, 2, 3, 4, held quasi-mystical significance for the Pythagorean brotherhood. Its sum, 1 + 2 + 3 + 4 = 10, was the sacred number of completion. More importantly, the pairwise ratios of these four integers contain the three fundamental musical consonances: the octave (2:1), the fifth (3:2), and the fourth (4:3), together with their octave compounds, the twelfth (3:1) and the double octave (4:1). The whole tone (9:8) is neither a consonance nor a pairwise ratio of the tetraktys; it arises as the difference between the fifth and the fourth, (3/2)/(4/3). For the Pythagoreans this was not a coincidence but a revelation: the cosmos is ordered by number, and music is a sensible manifestation of that cosmic order. The tetraktys was reportedly sworn upon as an oath in the Pythagorean brotherhood, reflecting the degree to which mathematical and musical insight were fused in their philosophical worldview.
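The ratios contained in the tetraktys can be enumerated directly. This short sketch (variable names are illustrative) lists every distinct ratio of a larger to a smaller integer among 1, 2, 3, 4 and confirms that 9:8 appears only as a quotient of two of them.

```python
from fractions import Fraction
from itertools import combinations

tetraktys = [1, 2, 3, 4]

# Every distinct ratio larger:smaller among the four integers.
pairwise = sorted({Fraction(b, a) for a, b in combinations(tetraktys, 2)})

names = {
    Fraction(4, 3): "fourth",
    Fraction(3, 2): "fifth",
    Fraction(2, 1): "octave",
    Fraction(3, 1): "twelfth (octave plus fifth)",
    Fraction(4, 1): "double octave",
}
for r in pairwise:
    print(f"{r.numerator}:{r.denominator}  {names[r]}")

# The whole tone is the difference between two of these ratios.
whole_tone = Fraction(3, 2) / Fraction(4, 3)
print("whole tone:", whole_tone, "| in tetraktys ratios:", whole_tone in pairwise)
```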
Pythagorean music theory was developed in systematic form not by Pythagoras himself, who left no writings, but by later thinkers including Philolaus of Croton (c. 470–385 BCE), Archytas of Tarentum (fl. 400–350 BCE), and Nicomachus of Gerasa (fl. c. 100 CE). Archytas is especially important for music theory because he provided mathematical derivations of the three tetrachord genera within the Pythagorean framework of superparticular ratios, that is, ratios of the form \((n+1):n\). The superparticular constraint gives Pythagorean interval theory a certain mathematical elegance but also a rigidity that would later provoke Aristoxenus’s empiricist reaction.
1.2 The Pythagorean Comma
One of the most consequential results of Pythagorean tuning is a small but irresolvable discrepancy that has driven the entire subsequent history of temperament. If one stacks twelve perfect fifths — moving up by a ratio of 3:2 twelve times — the resulting pitch should, after reduction by seven octaves, return to the starting pitch. But it does not quite. The mathematical statement of this discrepancy is:
\[ \left(\frac{3}{2}\right)^{12} = \frac{531441}{4096} \approx 129.746 \quad \text{while} \quad 2^7 = 128. \]
The ratio between these two quantities is the Pythagorean comma:
\[ \frac{531441}{524288} = \frac{3^{12}}{2^{19}} \approx 1.01364, \]
corresponding to approximately 23.46 cents (where 100 cents = one equal-tempered semitone). This gap is small enough to be nearly imperceptible in a brief melodic context, but catastrophic for a keyboard instrument that must play consonantly in all keys. The Pythagorean comma is not a defect of music or of mathematics; it is a consequence of the incommensurability of the logarithms of 2 and 3. No positive integer power of 2 ever exactly equals a positive integer power of 3, as follows from the uniqueness of prime factorizations. Every system of tuning in Western history (meantone temperament, various irregular temperaments, and eventually equal temperament) represents a different strategy for distributing or concealing this comma across the twelve intervals of a closed pitch cycle.
In practical terms, a Pythagorean keyboard tuning places the entire comma in a single “wolf fifth”: an interval narrower than a pure fifth by a full Pythagorean comma (about 678.5 cents rather than 702), typically placed at the edge of the chromatic system (often between G♯ and E♭) where it would rarely be used in common repertoire. The wolf fifth “howls” (hence its name) and represents the price paid for pure fifths everywhere else. Later meantone temperaments narrowed each usable fifth by a fraction of the syntonic comma (81:80) in order to improve the thirds, but this left a different wolf fifth, one wider than pure and harsher still. Equal temperament eliminates wolves entirely by spreading the Pythagorean comma uniformly across all twelve fifths, narrowing each by approximately 1.96 cents.
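The comma and the two strategies for handling it can be verified numerically. The sketch below uses exact fractions for the ratios and converts to cents only at the end; the variable names are illustrative.

```python
from fractions import Fraction
from math import log2

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * log2(ratio)

twelve_fifths = Fraction(3, 2) ** 12
seven_octaves = Fraction(2) ** 7
comma = twelve_fifths / seven_octaves      # 3**12 / 2**19

pure_fifth = cents(Fraction(3, 2))

# Pythagorean keyboard: eleven pure fifths, one fifth absorbs the whole comma.
wolf_fifth = pure_fifth - cents(comma)

# Equal temperament: every fifth gives up one twelfth of the comma.
narrowing_per_fifth = cents(comma) / 12
tempered_fifth = pure_fifth - narrowing_per_fifth

print(f"comma          {comma}  = {cents(comma):.2f} cents")
print(f"wolf fifth     {wolf_fifth:.2f} cents")
print(f"tempered fifth {tempered_fifth:.2f} cents (narrowed by {narrowing_per_fifth:.3f})")
```

In both cases the twelve fifths sum to exactly seven octaves (8400 cents); the tunings differ only in where the 23.46 cents are removed.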
1.3 Aristoxenus and the Empiricist Alternative
Aristoxenus of Tarentum (born c. 375 BCE, fl. c. 335 BCE), a student of Aristotle, mounted the most systematic ancient challenge to Pythagorean ratio theory. His Harmonika Stoicheia (Elementa harmonica) proposes a radically different foundation: intervals are not ratios between string lengths or vibration frequencies but magnitudes in a continuous pitch-space. The proper unit of measurement is not the ratio but the tone, and intervals are measured as multiples and fractions of tones.
The distinction between Pythagorean and Aristoxenian approaches is not merely technical; it represents a deep epistemological split. For Pythagoreans, the authority of musical theory rests on mathematical demonstration. For Aristoxenus, it rests on trained musical perception: the mousikos (cultivated musician) whose ear has been educated to discriminate intervals reliably is the final arbiter. This tension between mathematical rationalism and empirical perception recurs throughout the history of music theory, resurfacing in Vincenzo Galilei’s challenge to Zarlino in the sixteenth century, in Helmholtz’s psychoacoustics, and in the empirical music cognition movement of the twentieth century.
Aristoxenus also provides the fullest surviving ancient discussion of the three tetrachord genera, specifying for each genus not a single fixed tuning but a range of acceptable inner-note positions. The enharmonic genus, he notes, is the most challenging for the ear and requires the most training to sing accurately. His interest is always in what the trained ear can discriminate, not in what mathematical ratios can specify. The Harmonika Stoicheia survives in fragmentary form but is supplemented by Aristoxenus’s Rhythmika Stoicheia, which applies the same magnitude-based approach to rhythm — making Aristoxenus the first systematic theorist of both pitch and rhythm in the Western tradition.
1.4 The Tetrachord Genera and the Greater Perfect System
Ancient Greek melodic theory was organized around the tetrachord — a four-note span filling the interval of a perfect fourth. The two outer notes of the tetrachord were fixed; the two inner notes were movable, yielding three distinct melodic characters called the genera (singular: genos). The diatonic genus places the inner notes so that the tetrachord contains two whole tones and one semitone, roughly equivalent to the top four notes of a modern major or natural minor scale. The chromatic genus uses a minor third at the top and two smaller intervals (approximately semitones) below. The enharmonic genus uses a major third at the top and two quarter-tones (dieses) below — intervals smaller than any semitone, perceptible only to a highly trained ear and now entirely outside the practical vocabulary of Western music.
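The three genera can be compared by listing their intervals from the top of the tetrachord downward. This is a sketch in Aristoxenian terms, measuring in fractions of a tone and assuming his reckoning of the fourth as two and a half tones. The values are one idealized shade of each genus, not a reconstruction of any particular ancient tuning, and the names `GENERA` and `inner_notes` are introduced here for illustration.

```python
from fractions import Fraction

FOURTH = Fraction(5, 2)  # two and a half tones, in Aristoxenian reckoning

# Intervals from the fixed top note downward, in tones.
GENERA = {
    "diatonic":   [Fraction(1), Fraction(1), Fraction(1, 2)],        # tone, tone, semitone
    "chromatic":  [Fraction(3, 2), Fraction(1, 2), Fraction(1, 2)],  # minor third, two semitones
    "enharmonic": [Fraction(2), Fraction(1, 4), Fraction(1, 4)],     # major third, two quarter-tones
}

def inner_notes(steps):
    """Distances of the two movable notes below the fixed top note, in tones."""
    return steps[0], steps[0] + steps[1]

for name, steps in GENERA.items():
    # The outer notes are fixed, so every genus must fill exactly a fourth.
    assert sum(steps) == FOURTH
    upper, lower = inner_notes(steps)
    print(f"{name:10s} movable notes {upper} and {lower} tones below the top")
```

Only the two inner positions change from genus to genus; the outer fourth stays fixed.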
The Greater Perfect System (GPS, systema teleion meizon) assembled four tetrachords plus an added lower tone into a two-octave pitch collection with fifteen distinct pitches. This system served as the universal reference framework for ancient Greek theoretical descriptions of melody, much as the modern major scale serves as the default reference for tonal theory pedagogy. A companion system, the Lesser Perfect System (systema teleion elasson), assembled three conjunct tetrachords plus an added tone into a different configuration, spanning an octave plus a fourth rather than two full octaves.
Medieval theorists encountered the GPS through Boethius’s Latin presentation, but they stripped away most of the contextual details — particularly the role of the genera and the performance practice associated with each modal species — leaving only the abstract pitch framework. This impoverishment of the ancient heritage through the process of Latin transmission is a recurring theme in the history of music theory’s reception of antiquity.
1.5 The Harmoniai, Musical Ethos, and Plato
Ancient theorists identified distinct harmoniai (characteristic melodic species or modal patterns) and assigned each a powerful ethos (character, moral quality). Plato’s Republic (Books III and IV) offers the most culturally influential ancient account of modal ethos. Plato’s Socrates argues that only the Dorian harmonia (characterized as austere, martial, and self-controlled) and the Phrygian (fervent and courageous but not intemperate) should be permitted in the ideal city-state. The Lydian harmoniai, including the “slack” or “convivial” varieties, are to be rejected because they produce softness and effeminacy of character. The Mixolydian (“mixed Lydian”) harmoniai associated with lamentation are equally undesirable.
These judgments reflect the broader Platonic doctrine that music does not merely express emotional states but actively shapes character. Since the goal of education is the formation of virtuous citizens, music education must be carefully controlled. The Timaeus extends musical mathematics into cosmology: the Demiurge constructs the World-Soul from mathematical ratios including those of musical consonance — the proportions 1, 2, 3, 4, 9, 8, 27 — inscribing mathematical harmony into the very fabric of the cosmos. The world, for Plato, is literally musical in its deep structure: the regular motions of the planets reflect the same ratios that appear in the perfect musical consonances.
Aristotle’s account of musical ethos in the Politics (Book VIII) is more nuanced than Plato’s. Aristotle distinguishes between music used for education, for relaxation, and for intellectual cultivation (diagoge), arguing that different harmoniai are appropriate for different social functions. Unlike Plato, Aristotle allows the Lydian mode in education because it is suited to children. He also discusses catharsis — the emotional purgation that tragic drama produces — and extends this concept to music, suggesting that the Phrygian harmonia has cathartic power for those prone to religious excitement.
Aristides Quintilianus (second or third century CE) provides the most comprehensive surviving ancient synthesis of music theory. His De musica weaves together Pythagorean number theory, Aristoxenian interval theory, doctrines of ethos, and Platonic cosmology into a single discursive treatise. He is the primary ancient source for descriptions of the enharmonic genus in performance practice and for extended discussion of modal ethos as it relates to education and rhetoric. The sheer breadth of his synthesis makes Aristides Quintilianus a useful endpoint for the ancient tradition: virtually every thread of ancient Greek music-theoretical discourse appears somewhere in his work.
1.6 Claudius Ptolemy and the Synthesis of Pythagorean and Empiricist Approaches
Claudius Ptolemy (c. 100–170 CE), the great Alexandrian astronomer and geographer, also wrote the most mathematically sophisticated ancient music-theoretical treatise: the Harmonika (three books). Ptolemy’s ambition was to synthesize the mathematical rigor of Pythagoreanism with the perceptual sensitivity of Aristoxenianism. He rejected both extremes: pure Pythagorean ratio theory was too removed from perceptual reality (the ear cannot verify all the proposed intervals as consonant), while pure Aristoxenian magnitude theory was too imprecise (the ear alone cannot reliably measure intervals to the accuracy that theory requires).
For Ptolemy, the standard of truth in harmonic science is a collaboration between reason and perception: reason proposes hypotheses in the form of ratios, perception confirms or refutes them through listening. Ptolemy’s procedure in the Harmonika is accordingly empirical in a specific sense: he derives systems of tuning ratios from mathematical first principles, then tests them against the trained ear’s judgment, adjusting where necessary.
Ptolemy’s Harmonika exercised significant influence through Boethius, who drew on it extensively in De institutione musica. The syntonic diatonic tuning, transmitted through Boethius and rediscovered in the Renaissance, provided the theoretical foundation for just intonation and the senario — making Ptolemy’s ancient treatise an indirect but real ancestor of Zarlino’s Renaissance harmonics. The continuity between ancient mathematical theory and Renaissance practice is thus tighter than the usual narrative of “rediscovery” suggests.
Nicomachus of Gerasa (fl. c. 100 CE) wrote an Enchiridion harmonikon (Manual of Harmonics) that is somewhat less rigorous than Ptolemy’s Harmonika but more accessible, and it was this text that Boethius drew upon most heavily for the numerical tables in De institutione musica. Nicomachus presents the Pythagorean system with an emphasis on its cosmological significance: the same ratios that govern musical intervals govern the motion of the planets and the structure of the human soul. His account of the legendary discovery of musical ratios by Pythagoras, involving the sounds of smiths’ hammers of different weights, has been transmitted through Boethius to virtually every subsequent music-theoretical summary of Pythagoreanism, even though the story is almost certainly apocryphal. The pitch of a struck anvil does not vary in simple proportion to the weight of the hammer, and where the legend transfers the weights to strings, pitch rises with the square root of the tension rather than with the weight itself, so the story does not actually demonstrate what it claims to demonstrate.
Chapter 2: Medieval Theory — Boethius, Guido, and Modal Theory
The transmission of ancient Greek music theory to the medieval Latin West was not direct but mediated through a small number of encyclopedic texts, the most influential of which was produced not by a practicing musician but by a late Roman philosopher and statesman. Anicius Manlius Severinus Boethius (c. 480–524 CE) wrote his De institutione musica as part of a larger educational project aimed at preserving the Greek intellectual heritage for Latin readers at a moment when direct access to Greek texts was rapidly diminishing. That this text — and not any surviving Greek original — became the standard reference for music theory in medieval universities is one of the central ironies of Western intellectual history.
2.1 Boethius and the Threefold Division of Music
Boethius’s treatise derives its material primarily from Nicomachus of Gerasa and Claudius Ptolemy, both of whom treat intervals as numerical ratios. Boethius presents a threefold classification of music that became canonical in medieval thought for more than eight centuries:
- Musica mundana ("cosmic music"): the inaudible harmony of the celestial spheres, the proportional relationships governing the motions of planets and the cycles of the seasons. This music is not literally heard but known through mathematical reason alone.
- Musica humana ("human music"): the harmony of the human body and soul — the proportional relationship between the rational and irrational parts of the soul and between soul and body. Again, not literally heard but known through introspection and philosophical analysis.
- Musica instrumentalis ("instrumental music"): the only kind that is actually heard — the music of voices and instruments. This is the lowest and least important kind for Boethius, because it is perceived by the senses rather than understood by the intellect alone.
This hierarchy reflects a Neoplatonic value system that privileges intellectual over sensory knowledge. For Boethius, the true musician (musicus) is the one who understands the mathematical ratios underlying musical intervals — the philosopher of music — not the performer who merely produces sounds by habit and training. The performer is compared to a builder who constructs what the architect has designed; the musicus is the architect. This distinction between theoretical and practical knowledge, elevated by Boethian authority, shaped medieval university curricula for centuries, determining that music’s place in the university was as a discipline of number rather than a performing art.
Music’s place in the quadrivium — alongside arithmetic, geometry, and astronomy — reflects its status as a mathematical discipline. The quadrivium, together with the trivium (grammar, rhetoric, and dialectic), constituted the seven liberal arts forming the basis of medieval university education. Music theory in the medieval university was not primarily concerned with how to sing or compose but with understanding the numerical ratios that constitute musical intervals. The practical arts of chant and polyphony were transmitted through a separate, largely oral and guild-based pedagogical tradition that only partially intersected with the university theoretical tradition.
Boethius’s execution in 524 CE (he was imprisoned and killed by the Ostrogothic king Theodoric on charges of treason) gives his intellectual legacy a particular poignancy. He wrote the De institutione musica as part of a projected encyclopedic transmission of all four quadrivial disciplines, of which only the arithmetic and the incomplete music survive securely, and he wrote the Consolation of Philosophy during his imprisonment. The Consolation, with its treatment of Fortune, necessity, and the providential order of the cosmos, resonated deeply with medieval readers and contributed to Boethius’s near-canonization as a philosophical martyr. His authority in both music theory and philosophy was thus reinforced by biographical circumstances that made him a figure of both intellectual and moral exemplarity.
2.2 Why Medieval Musicians Read Boethius
The paradox of Boethian authority is that practicing musicians and liturgical cantors needed practical guidance that Boethius’s abstract ratio theory could not provide. Yet Boethius remained the authoritative theoretical reference throughout the medieval period because he represented the link to ancient Greek mathematical wisdom. The result was a two-track tradition: a theoretical tradition citing Boethius for philosophical legitimacy, and a practical tradition developing independent pedagogical tools for singers. The two tracks occasionally intersect — when a practical writer like Guido of Arezzo invokes Boethian categories, or when a theoretical writer acknowledges that his ratio mathematics bears on actual musical practice — but they are never fully integrated before the Renaissance.
Boethius’s text itself breaks off before completing its fifth book — apparently the manuscript tradition preserves only an unfinished work — and so the medieval reader encountered Boethian theory without the systematic treatment of the modes that would have appeared in the projected completion. This incompleteness contributed to the distance between Boethian ratio theory and the practical modal theory that medieval musicians actually needed, creating a gap that practical writers like Guido of Arezzo rushed to fill.
2.3 Guido of Arezzo and Hexachord Solmization
Guido of Arezzo (c. 991–1033) made the most consequential practical contribution to music pedagogy in the medieval period. His several treatises — including the Micrologus, the Epistola de ignoto cantu (addressed to a monk named Michael), and associated texts on the hexachord method — introduced or systematized the hexachord solmization system that formed the basis of Western sight-singing pedagogy for more than five centuries. Guido’s pedagogical innovation was practical in its origins and consequences: he reportedly claimed that through his method, a singer could learn a new chant in a day that would formerly have taken a week.
The hexachord is a six-note segment of the diatonic scale: ut–re–mi–fa–sol–la. The syllables derive from the opening syllables of successive phrases of the Ut queant laxis hymn to St. John the Baptist, each phrase of which begins one step higher than the last. Within the hexachord, the unique position of the semitone between mi and fa served as the singer’s orientation point. Once singers had internalized the sound of mi moving to fa, they could navigate any diatonic melody by identifying where that semitone fell relative to each note. The system used three overlapping hexachords, distinguished by the pitch on which ut falls:
- The hexachordum naturale (natural hexachord): ut = C, containing no flat or sharp, spanning C–D–E–F–G–A.
- The hexachordum durum (hard hexachord): ut = G, containing B-natural, spanning G–A–B–C–D–E. The "hard" designation refers to the square shape of the letter b used for B-natural, the ancestor of the modern natural sign.
- The hexachordum molle (soft hexachord): ut = F, containing B-flat, spanning F–G–A–B♭–C–D. The "soft" designation refers to the rounded shape of the letter b used for B-flat, the direct ancestor of the modern flat sign.
Hexachord mutation — the practice of switching syllable assignments from one hexachord to another on a pitch shared between them — allowed singers to extend the solmization system across the full compass of medieval monophonic and polyphonic repertory. For example, when ascending past the top of the natural hexachord (A = la), a singer could “mutate” on G (sol in the natural hexachord, ut in the hard hexachord) and continue upward with re, mi, fa, sol, la on A, B, C, D, E.
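The mutation described above can be sketched as a lookup across the three hexachords. This is a simplified illustration: pitches are bare letter names with no octave register, and the function names `solmize` and `sing_with_mutation` are hypothetical helpers rather than historical terminology.

```python
SYLLABLES = ["ut", "re", "mi", "fa", "sol", "la"]

HEXACHORDS = {
    "naturale": ["C", "D", "E", "F", "G", "A"],
    "durum":    ["G", "A", "B", "C", "D", "E"],
    "molle":    ["F", "G", "A", "Bb", "C", "D"],
}

def solmize(pitch, hexachord):
    """Syllable for a pitch letter within one hexachord, or None if absent."""
    notes = HEXACHORDS[hexachord]
    return SYLLABLES[notes.index(pitch)] if pitch in notes else None

def sing_with_mutation(melody, start, target, pivot):
    """Solmize a melody, switching hexachords the first time the pivot occurs."""
    current, sung = start, []
    for pitch in melody:
        if current == start and pitch == pivot:
            # The pivot pitch carries both syllables at the moment of mutation.
            sung.append(f"{solmize(pitch, start)}/{solmize(pitch, target)}")
            current = target
        else:
            sung.append(solmize(pitch, current))
    return sung

ascent = ["C", "D", "E", "F", "G", "A", "B", "C", "D", "E"]
result = sing_with_mutation(ascent, "naturale", "durum", "G")
print(" ".join(result))
```

The pivot must be a pitch that has a syllable in both hexachords; G qualifies as sol in the natural hexachord and ut in the hard one.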
The Guidonian hand provided a mnemonic spatial encoding of the entire pitch system: each joint and fingertip of the left hand corresponded to a specific pitch, and the teacher could silently indicate pitches to singers by pointing. The hand encompasses the entire theoretical gamut from Gamma-ut (the lowest note, named after the Greek letter gamma for G below the staff, combined with the hexachord syllable ut) to e-la (the highest note in the standard gamut). This spatial-kinesthetic mnemonic is one of the most ingenious pedagogical inventions in the history of music education.
Musica ficta (“feigned music”) referred to pitches outside the standard hexachord system — inflections like raising a leading-tone to approach the octave more smoothly, or flattening a pitch to avoid the tritone — that skilled performers were expected to supply even when not notated. The conventions governing musica ficta are reconstructed from theoretical treatises and are a central issue in the performance of medieval and early Renaissance polyphony today.
2.4 Mensural Notation and the Notre-Dame School
A revolution in the notation of rhythm accompanied the development of elaborate polyphonic composition at Notre-Dame Cathedral in Paris during the late twelfth and early thirteenth centuries. The composers Leonin and Perotin, named in the Anonymous IV treatise, are credited with the great polyphonic collections — the Magnus Liber Organi — that required new notational resources to specify rhythmic differentiation among multiple simultaneous voices.
Johannes de Garlandia (fl. c. 1270) codified the rhythmic modes — six patterns of long and short values based on the metrical feet of classical Latin poetry (trochee, iamb, dactyl, anapest, spondee, tribrachys) — that governed the rhythmic organization of Notre-Dame polyphony. These modes were indicated not by individual note shapes but by patterns of note groupings in ligature notation, requiring performers to parse the notation contextually.
Franco of Cologne’s Ars cantus mensurabilis (c. 1280) established the first systematic and explicit notation of rhythmic duration independent of modal pattern. Franco’s system assigned each note shape a definite default duration: the long (longa), the breve (brevis), and the semibreve (semibrevis), with the long equaling three breves in perfect (ternary) mensuration or two breves in imperfect (binary) mensuration. This was a conceptual breakthrough: for the first time in Western history, a note’s shape alone could indicate its duration. The subsequent development of mensural notation by Philippe de Vitry in his Ars nova (c. 1322) added the minim and formalized the distinction between perfect and imperfect mensuration at multiple hierarchical levels — the mensuration system that would govern musical notation through the fifteenth century.
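The hierarchical mensuration system lends itself to a small calculation. In the sketch below, each level of division is either perfect (3) or imperfect (2). The level names modus (long to breve), tempus (breve to semibreve), and prolation (semibreve to minim) are the conventional terms and are supplied here rather than taken from the passage. Only default values are modeled; contextual rules such as imperfection and alteration are ignored.

```python
PERFECT, IMPERFECT = 3, 2

def default_values(modus, tempus, prolation):
    """Default duration of each note shape, counted in minims.

    modus:     breves per long
    tempus:    semibreves per breve
    prolation: minims per semibreve
    Each argument is PERFECT (3) or IMPERFECT (2).
    """
    semibreve = prolation
    breve = tempus * semibreve
    long_ = modus * breve
    return {"minim": 1, "semibreve": semibreve, "breve": breve, "long": long_}

all_perfect = default_values(PERFECT, PERFECT, PERFECT)
all_imperfect = default_values(IMPERFECT, IMPERFECT, IMPERFECT)
print("all perfect:  ", all_perfect)
print("all imperfect:", all_imperfect)
```

With every level perfect a long contains 27 minims; with every level imperfect it contains 8, which is why the mensuration had to be known before a note shape could be read.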
2.5 The Eight-Mode System and the Tonary Tradition
The eight-mode system (the octoechos, a term borrowed from Greek) organized the repertory of Gregorian chant into eight modal categories. Each mode was defined by three properties: its final (the pitch on which melodies in that mode characteristically end), its reciting tone or tenor (a secondary pitch on which psalm tones dwell through much of a recitation), and its ambitus (the range of pitches typically used). Modes 1 and 2 share the final D; modes 3 and 4 share E; modes 5 and 6 share F; modes 7 and 8 share G. Within each pair, the odd-numbered (authentic) mode typically ranges from the final up an octave, while the even-numbered (plagal) mode ranges from a fourth below the final to a fifth above it.
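The pairing of finals and ranges can be generated mechanically from the mode number. The following sketch derives each mode's final and nominal ambitus using letter names only; reciting tones are omitted because the passage does not list them, and the helper names are illustrative.

```python
LETTERS = ["A", "B", "C", "D", "E", "F", "G"]
FINALS = ["D", "E", "F", "G"]

def step(letter, n):
    """Letter name n diatonic steps above (n > 0) or below (n < 0) a letter."""
    return LETTERS[(LETTERS.index(letter) + n) % 7]

def mode_info(number):
    """Final, type, and nominal ambitus (lowest, highest) for modes 1 to 8."""
    final = FINALS[(number - 1) // 2]
    if number % 2 == 1:
        kind = "authentic"
        low, high = final, step(final, 7)             # final up an octave
    else:
        kind = "plagal"
        low, high = step(final, -3), step(final, 4)   # fourth below to fifth above
    return {"mode": number, "final": final, "type": kind, "ambitus": (low, high)}

for n in range(1, 9):
    info = mode_info(n)
    low, high = info["ambitus"]
    print(f"mode {n}: final {info['final']}, {info['type']:9s} range {low} to {high}")
```

Modes 1 and 8 both span D to D, which shows why the final, and not the range alone, was needed to classify a chant.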
The tonary tradition — collections of chants organized by mode, often with intonation formulas to introduce singers to each modal category — served as both a practical reference for liturgical performers and a pedagogical introduction to modal classification. The theoretical frameworks for understanding mode were elaborated in treatises such as the Dialogus de musica (c. 1000, formerly attributed to Odo of Cluny) and the anonymous Musica enchiriadis and Scolica enchiriadis (ninth century). The Musica enchiriadis is the earliest source to describe organum (early polyphony based on parallel or contrary-motion voice pairing) in systematic terms, providing the foundational vocabulary for all subsequent polyphonic theory.
2.6 Johannes Tinctoris and the Late Medieval Transition
Johannes Tinctoris (c. 1435–1511) is one of the most prolific and analytically sophisticated theorists of the late medieval and early Renaissance period. His twelve surviving treatises address counterpoint, modes, notation, proportion, and the effects of music — a breadth that places him alongside Zarlino as a representative of comprehensive theoretical ambition. His Liber de arte contrapuncti (1477) is particularly significant: it is the first counterpoint treatise to codify the consonance status of the major third and sixth, explicitly acknowledging what composers had been doing in practice for several generations.
Tinctoris also authored the Terminorum musicae diffinitorium (compiled c. 1472–73 and printed some two decades later), the first printed music dictionary in Western history, defining roughly three hundred terms with precision and cross-referencing that anticipates modern lexicographical practice. His theoretical system retains the eight-mode framework but applies it to polyphonic music in ways that require substantially more flexibility than purely monophonic applications allowed. The tension in Tinctoris between conservative modal taxonomy and progressive harmonic practice is a microcosm of the larger sixteenth-century tension that Glarean and Zarlino would eventually resolve.
2.7 Marchetto da Padova and the Chromatic Question
Marchetto da Padova (fl. 1305–1326) is one of the most original and controversial theorists of the late medieval period. His Lucidarium (1318) and Pomerium (1319) address tuning and mensural notation respectively, and together they represent the most ambitious Italian contribution to medieval music theory before the fifteenth century.
Marchetto’s most provocative theoretical claim concerns the division of the whole tone. Where Pythagorean theory divided the whole tone (ratio 9:8) into two unequal semitones — the limma (ratio 256:243) and the apotome (ratio 2187:2048) — Marchetto proposed an alternative division into five equal parts called diaschismata. This fivefold division allowed Marchetto to define a chromatic semitone (two diaschismata = 2/5 of a tone) and a diatonic semitone (three diaschismata = 3/5 of a tone). The chromatic semitone in Marchetto’s system is actually smaller than the diatonic semitone, which is the opposite of the Pythagorean relationship — a mathematically anomalous result that has puzzled theorists ever since.
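The two divisions of the tone can be compared in cents. The sketch below checks that the limma and apotome compound exactly to 9:8, then models Marchetto's fivefold division by treating the five parts as equal on a logarithmic (cents) scale. That reading is an assumption made here for illustration; how Marchetto intended the parts to be measured is a matter of interpretation.

```python
from fractions import Fraction
from math import log2

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * log2(ratio)

tone = Fraction(9, 8)
limma = Fraction(256, 243)       # Pythagorean diatonic (smaller) semitone
apotome = Fraction(2187, 2048)   # Pythagorean chromatic (larger) semitone

# The two unequal semitones compound exactly to the whole tone.
assert limma * apotome == tone

# Assumption: Marchetto's five parts taken as equal logarithmic fifths of 9:8.
part = cents(tone) / 5
marchetto_chromatic = 2 * part   # two fifths of a tone, per the text above
marchetto_diatonic = 3 * part    # three fifths of a tone, per the text above

print(f"Pythagorean: diatonic {cents(limma):.2f}, chromatic {cents(apotome):.2f} cents")
print(f"Marchetto:   diatonic {marchetto_diatonic:.2f}, chromatic {marchetto_chromatic:.2f} cents")
```

The output makes the reversal concrete: in the Pythagorean division the chromatic semitone is the larger of the two, whereas in the fivefold division as described here it is the smaller.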
Marchetto’s Pomerium is also notable for its treatment of mensural notation, which extends Franco of Cologne’s system and introduces Italian-style notation that would diverge significantly from the French Ars nova tradition of Philippe de Vitry. The Italian and French notational systems of the fourteenth century — each with its own conventions for representing rhythmic values and their subdivisions — represent two distinct theoretical responses to the same compositional demands of the Trecento and Ars nova styles. Their eventual synthesis in the fifteenth century, producing the unified mensural notation used by Dufay and Ockeghem, is itself a kind of theoretical resolution of competing medieval approaches.
Chapter 3: Renaissance Theory — Zarlino and the Senario
The sixteenth century witnessed a transformation in music-theoretical thinking whose consequences remain visible in every introductory music theory textbook. The shift from modal to tonal thinking (from an organization of melody into modal species to an organization of harmony around consonant triads and their progressions) did not happen abruptly or completely in any single theorist’s work. But two texts, published eleven years apart, mark the pivotal moment: Heinrich Glarean’s Dodecachordon (1547) and Gioseffo Zarlino’s Le istitutioni harmoniche (1558). These two works together reframe the entire theoretical inheritance from Greece and the Middle Ages in light of the compositional practice of the mid-sixteenth century.
3.1 Glarean and the Twelve-Mode System
The standard eight-mode system, inherited from Boethius and the Carolingian tonary tradition, recognized four modal finals (D, E, F, G), each in authentic and plagal forms. Glarean, a Swiss humanist and friend of Erasmus, observed that the same theoretical logic that generated those eight modes could be extended to recognize modes on A and C as well. His Dodecachordon (“twelve strings,” 1547) proposed a twelve-mode system adding four new modes to the traditional eight: the Aeolian (authentic, final A) and its plagal Hypoaeolian, and the Ionian (authentic, final C) and its plagal Hypoionian.
Glarean also identified the Locrian mode (authentic, final B) and Hypolocrian (plagal, final B) but rejected them both as practically unusable because the fifth above B is diminished rather than perfect, violating a foundational requirement for a stable modal final. This exclusion of B as a modal final is itself theoretically significant, foreshadowing the later treatment of the diminished triad on the seventh scale degree as an unstable, non-foundational harmony in tonal theory. Glarean’s reasoning about modal viability anticipates the criterion of “stable fifth” that underlies both the modal and tonal frameworks.
3.2 Zarlino and Le istitutioni harmoniche
Gioseffo Zarlino (1517–1590), maestro di cappella at St. Mark’s Basilica in Venice, the most prestigious church music position in Italy, produced in Le istitutioni harmoniche a synthesis that would dominate Italian and European music theory for more than a century. Zarlino’s treatise is structured in four books: the first treats number and ratio theory; the second treats intervals and tuning systems; the third treats counterpoint; and the fourth treats the modes, closing with advice on text-setting and the relationship between music and words. This comprehensive scope (physical, mathematical, practical, and aesthetic) signals Zarlino’s ambition to produce a summa of music theory on the model of Boethius’s educational summa.
Zarlino’s central theoretical innovation was the senario: the claim that musical consonance is explained by the ratios within the first six integers.
- Octave: 2:1
- Fifth: 3:2
- Fourth: 4:3
- Major third: 5:4
- Minor third: 6:5
- Major sixth: 5:3 (derived by compounding a major third and a fourth)
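The list can be checked mechanically. The following is a minimal Python sketch, not Zarlino’s own procedure: it forms every ratio between two integers from 1 to 6, keeps those no larger than an octave, and reports each in cents. The names `cents` and `senario_intervals` are illustrative assumptions.

```python
from fractions import Fraction
from math import log2


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * log2(ratio)


def senario_intervals(limit=6):
    """Ratios p:q with 1 <= q < p <= limit that do not exceed an octave."""
    ratios = {Fraction(p, q)
              for q in range(1, limit + 1)
              for p in range(q + 1, limit + 1)}
    return sorted(r for r in ratios if r <= 2)


SENARIO = senario_intervals()

for r in SENARIO:
    print(f"{r.numerator}:{r.denominator}  {cents(r):7.2f} cents")
```

The enumeration returns exactly the six ratios listed above, and the product of 5:4 and 4:3 confirms the compounding that yields the major sixth.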
Zarlino’s senario provided a mathematical justification for what composers had been doing in practice since the mid-fifteenth century: treating thirds and sixths as consonances and building harmonies from stacked thirds (triads). The shift from a two-voice, intervallic conception of harmony to a three-voice, triadic conception — evident in the music of Dufay, Josquin, and Willaert — received its theoretical rationalization in Zarlino’s senario. Willaert, notably, was Zarlino’s own teacher at St. Mark’s, making the Istitutioni partly an exercise in theorizing a living compositional tradition the author had inherited directly.
3.3 Just Intonation and the Syntonic Comma
The adoption of the major third as a just consonance at the ratio 5:4 creates a new and different comma. The Pythagorean major third of 81:64 and the just major third of 5:4 differ by the syntonic comma:
\[ \frac{81/64}{5/4} = \frac{81}{64} \cdot \frac{4}{5} = \frac{81}{80}. \]
This ratio of 81:80 corresponds to approximately 21.51 cents. Whenever a scale is constructed using just perfect fifths (3:2) and just major thirds (5:4), some intervals will be narrowed or widened by a syntonic comma relative to their Pythagorean counterparts. The interaction of the syntonic comma with the Pythagorean comma defines the entire landscape of Renaissance and Baroque tuning theory, creating a family of related problems that no single solution fully resolves.
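The arithmetic can be verified with exact fractions. This is a small sketch with illustrative variable names; it assumes the Pythagorean third is reached by four pure fifths less two octaves, and the Pythagorean comma by twelve pure fifths less seven octaves.

```python
from fractions import Fraction
from math import log2


def cents(ratio):
    """Size of a frequency ratio in cents."""
    return 1200 * log2(ratio)


FIFTH = Fraction(3, 2)
JUST_THIRD = Fraction(5, 4)

# Four pure fifths up, two octaves down: C-G-D-A-E.
pythagorean_third = FIFTH ** 4 / 4

# The gap between the Pythagorean and the just major third.
syntonic_comma = pythagorean_third / JUST_THIRD

# Twelve pure fifths overshoot seven octaves by the Pythagorean comma.
pythagorean_comma = FIFTH ** 12 / 2 ** 7

print(pythagorean_third, syntonic_comma, round(cents(syntonic_comma), 2))
print(pythagorean_comma, round(cents(pythagorean_comma), 2))
```

Keeping the ratios as `Fraction` values avoids rounding until the final conversion to cents, so the two commas can be compared exactly.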
The practical impossibility of just intonation on keyboard instruments drove the development of meantone temperament, in which the syntonic comma is distributed across four fifths (each narrowed by 1/4 syntonic comma), producing a pure major third of exactly 5:4 from C to E while slightly mistuning the fifths. Quarter-comma meantone was the standard keyboard tuning from roughly 1500 to 1700, used throughout the period of Palestrina, Monteverdi, Frescobaldi, and Purcell. The meantone fifth measures approximately 696.6 cents, compared to the pure fifth of 702.0 cents and the equal-tempered fifth of 700.0 cents.
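The figures quoted above follow from a single definition: the quarter-comma fifth is the fourth root of 5, so that four of them span two octaves plus a pure 5:4 third. The sketch below is illustrative only. It also computes the size of the one leftover fifth in a twelve-note octave, assuming eleven tempered fifths in a chain, which is where the wolf interval of this temperament arises.

```python
from math import isclose, log2


def cents(ratio):
    """Size of a frequency ratio in cents."""
    return 1200 * log2(ratio)


# Quarter-comma meantone: four fifths must equal 5 (two octaves plus 5:4).
meantone_fifth = 5 ** 0.25

# Stack four tempered fifths and drop two octaves to get the major third.
major_third = meantone_fifth ** 4 / 4

pure_fifth_cents = cents(3 / 2)
meantone_fifth_cents = cents(meantone_fifth)

# Assumption: eleven tempered fifths; the twelfth must close seven octaves.
wolf_fifth_cents = 7 * 1200 - 11 * meantone_fifth_cents

print(round(meantone_fifth_cents, 1), round(pure_fifth_cents, 1))
print(round(wolf_fifth_cents, 1))
```

The tempered fifth falls about 5.4 cents short of pure, and the deficit accumulated over eleven fifths is what makes the remaining interval so wide.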
The transition from meantone to well-tempered and eventually equal-tempered tuning is closely connected to the expansion of harmonic vocabulary in the seventeenth and eighteenth centuries. Meantone temperament produces extremely pure major thirds in the most common keys (C, G, D, F, B♭) but progressively worse thirds and a catastrophic wolf fifth in distant keys. As composers began writing in keys farther from C — particularly in the more chromatic works of late Baroque composers — the wolf became increasingly obtrusive. The solution was well temperament (German: Wohltemperatur): a family of unequal temperaments in which all keys are playable but in which the more common keys have slightly purer intervals than the less common keys. J.S. Bach’s Wohltemperirtes Clavier (1722 and 1742) was composed to demonstrate and celebrate the practical usability of all 24 major and minor keys under such a temperament — not equal temperament, which would not become standard until the nineteenth century, but one of several circulating well temperaments available in his time. The exact tuning Bach intended remains disputed among organologists and performance-practice scholars.
3.4 Vincenzo Galilei and the Florentine Camerata
The most direct challenge to Zarlino’s authority came from his own former student. Vincenzo Galilei (c. 1520–1591), father of the astronomer Galileo, participated in the intellectual circle known as the Florentine Camerata, a group of humanists, musicians, and noblemen who met at the home of Count Giovanni de’ Bardi in the 1570s and 1580s. The Camerata’s project was the revival of ancient Greek practice — or what they believed ancient Greek practice to have been — particularly the claim that Greek tragedy and epic poetry were sung throughout, and that the music heightened the natural speech inflections of the text to produce overwhelming emotional effect.
In his Dialogo della musica antica et della moderna (1581), Galilei attacked Zarlino on two fronts. First, he challenged the senario’s claim to explain consonance empirically: Galilei argued, based on physical experiments, that the ratio determining consonance depends not only on string length (as Zarlino assumed from the monochord tradition) but also on string tension and cross-section, with different consonance ratios resulting from variation in each physical parameter. Second, Galilei argued that the ear — not mathematical ratio — is the proper judge of musical intervals, and that in practice lute fretting and vocal practice demonstrate equal temperament, not just intonation.
3.5 Nicola Vicentino and the Archicembalo
Nicola Vicentino (1511–1575 or 1576) represents the most radical Renaissance application of the ancient Greek tetrachord genera. In L’antica musica ridotta alla moderna prattica (1555), Vicentino argued that the full range of ancient Greek expression (diatonic, chromatic, and enharmonic genera) should be available to contemporary composers. To make this possible, he designed and built the archicembalo, a harpsichord whose keys divided the octave into 31 parts, closely approximating 31-tone equal temperament and providing the very small intervals required by the ancient enharmonic genus. The arciorgano was a related organ built on the same principle.
Vicentino famously participated in a public debate with the Portuguese theorist Vicente Lusitano in Rome in 1551 over whether a particular polyphonic motet was written in the diatonic, chromatic, or combined genera. Lusitano won the debate as judged by the Roman music community, but Vicentino’s theoretical proposals remained a provocative challenge to the mainstream diatonic orientation of Renaissance polyphonic practice. The archicembalo is one of the most extreme artifacts of the Renaissance impulse to recover and surpass ancient musical practice through technology, and it anticipates much later developments in microtonality and alternative tuning systems.
3.6 From Mode to Key: The Late Renaissance Transition
The shift from modal to tonal organization in Western music is one of the most discussed and least agreed-upon transitions in music history. Theorists and historians have variously located it in the early seventeenth century (the generation of Monteverdi), in the mid-seventeenth century (the generation of Lully and Schütz), or as late as the 1680s–1700s (the generation of Corelli and Purcell). What is clear is that the transition involved several interlocking changes: the regularization of bass motion by fifth, the treatment of the dominant seventh chord as a directed, resolution-seeking dissonance, the emergence of the major–minor distinction as the primary tonal polarity (replacing the distinction between authentic and plagal modal variants), and the development of modulation as a formal and expressive resource.
Johannes Lippius (1585–1612), a Lutheran theologian and music theorist, is credited with introducing the term trias harmonica (harmonic triad) in his Synopsis musicae novae (1612), providing the first explicit definition of the triad as a root-position three-note chord built from a fifth with an added third. Lippius’s triad concept — which preceded Rameau’s by more than a century — identified the root-position triad as the normative harmonic unit, with other configurations derived from it by what he called remissio (a reduction or simplification) — a concept closely related to what Rameau would later call inversion.
Michael Praetorius (1571–1621) and Heinrich Baryphonus (1581–1655) extended Lippius’s triad concept into practical compositional theory, while the figured-bass tradition developing simultaneously in Italy provided composers with a new practical notation for harmonic progressions that increasingly oriented itself around triadic root positions and their inversions. The convergence of these theoretical and practical developments — the triad as theoretical unit, figured bass as practical notation, fifth-progression as harmonic syntax — constitutes the emergence of tonal harmonic theory as a system, even before Rameau gave it its definitive conceptual formulation.
Chapter 4: Rameau and the Foundations of Tonal Theory
When Jean-Philippe Rameau (1683–1764) published his Traité de l’harmonie in 1722, he was a relatively obscure provincial organist approaching forty, not yet the celebrated opera composer he would become. The Traité was received with a mixture of admiration and puzzlement: admiration for its systematic ambition, puzzlement at its sometimes obscure argumentation and its occasionally inconsistent application of its own principles. Yet no single text has more deeply shaped the conceptual vocabulary of Western harmonic theory. The term “tonic,” the concept of chord inversion, the identification of root progressions as the fundamental syntax of tonal harmony — all derive ultimately from Rameau’s sustained effort to found harmonic theory on rational principles.
4.1 The Fundamental Bass
Rameau’s central innovation was the basse fondamentale (fundamental bass): a conceptual, not necessarily sounded, bass line representing the succession of chord roots underlying any harmonic progression. The fundamental bass moves by consonant intervals, chiefly the perfect fifth (or its inversion, the fourth) and the third; Rameau regarded motion by fifth, above all the descending fifth of the perfect cadence, as the strongest progression and motion by third as the weakest. Any actual bass note that is not the root of its chord is understood as an elaboration of an underlying fundamental bass movement that has not been placed in the actual bass.
This conceptual move had enormous consequences. Before Rameau, each configuration of intervals above a bass note was treated as a distinct entity in the figured-bass tradition: the \(\binom{6}{3}\) chord was different from the \(\binom{5}{3}\) chord, and the \(\binom{6}{4}\) was different again. There was no concept linking these configurations as versions of “the same chord.” Rameau’s inversion theory provided exactly that link: a \(\binom{6}{3}\) over E and a \(\binom{5}{3}\) over C, for example, both represent the same triad C–E–G, related by what he called renversement (inversion).
The deeper implication is that chord identity is defined by root, not by bass position. Root movement — fifth up, fifth down, third up, third down — becomes the primary syntactic structure of tonal harmony, with voice-leading in the upper parts serving to connect one root-defined chord to the next. Every subsequent chord taxonomy, from Weber’s Roman numerals through contemporary neo-Riemannian analysis, rests on this Rameauian foundation.
4.2 Chord Inversion
The concept of chord inversion is so thoroughly embedded in modern music theory pedagogy that it is difficult to appreciate how revolutionary it was in 1722. To understand the innovation, one must understand the prior framework of the figured-bass tradition that Rameau inherited.
In the figured-bass tradition, every chord was defined by the intervals it placed above its actual bass note. A chord with a third and fifth above the bass was a “five-three” chord; one with a third and sixth was a “six-three” chord; one with a fourth and sixth was a “six-four” chord. These were three distinct entities governed by different rules about their resolution and treatment. The theoretical question of how they were related had no standard answer.
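The relation between the three figures can be made concrete with a short sketch. It is a simplified illustration, not a reconstruction of any historical method: it uses natural note names only, counts generic interval numbers above the bass, and the helpers `figures`, `inversions`, and `root_position` are hypothetical names introduced here.

```python
NOTE_INDEX = {"C": 0, "D": 1, "E": 2, "F": 3, "G": 4, "A": 5, "B": 6}


def figures(notes):
    """Generic intervals above the bass (the first note), highest first."""
    bass = NOTE_INDEX[notes[0]]
    return sorted(((NOTE_INDEX[n] - bass) % 7 + 1 for n in notes[1:]),
                  reverse=True)


def inversions(notes):
    """All rotations of a chord, each putting a different note in the bass."""
    return [notes[i:] + notes[:i] for i in range(len(notes))]


def root_position(notes):
    """The rotation figured 5/3, i.e. the chord stacked in thirds."""
    for chord in inversions(notes):
        if figures(chord) == [5, 3]:
            return chord
    return None


for chord in inversions(["C", "E", "G"]):
    print(chord[0], figures(chord), "root:", root_position(chord)[0])
```

Read as figured bass, the three rotations are three different objects; read through `root_position`, they collapse onto one chord with one root, which is the step the inversion concept supplies.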
Rameau tested his inversion theory against a wide range of repertory, arguing that the fundamental bass he derived from any passage — regardless of the inversions appearing in the actual music — always moved by the expected harmonic progressions (primarily fifth motions). This empirical testing gave the theory inductive support, and Rameau was a careful analyst of the repertory he discussed, even if his theoretical prose could be dense and sometimes circular.
A further implication of the inversion theory, developed more fully in Rameau’s later work, is the concept of harmonic function. If all inversions of a chord share the same root and the same fundamental bass position, then they also share the same harmonic function — the same role in a progression. A six-four chord on the dominant, in Rameau’s framework, is not a chord of a completely different harmonic character from a five-three chord on the dominant; it is the same dominant function in a different voice-leading configuration. This functional equivalence of root-related chords paves the way directly for Riemann’s Funktionslehre, in which the concept of harmonic function is extended and systematized into the three-function T/D/S framework. The intellectual lineage from Rameau’s basse fondamentale to Riemann’s function theory, running through the German reception of French harmonic theory in the work of Vogler, Weber, and Hauptmann, is one of the clearest examples of theoretical cumulation in the history of music theory.
4.3 The Corps Sonore
Rameau was not content with a purely practical theory of chord classification. He sought a physical basis for harmonic consonance in the corps sonore — the resonating body, which Rameau identified with the overtone series. When a resonating body vibrates, it produces not only its fundamental pitch but a series of higher partials:
\[ f, \; 2f, \; 3f, \; 4f, \; 5f, \; 6f, \ldots \]
The pitches \(f, 2f, 4f\) form the octave series; \(2f, 3f\) form a perfect fifth; \(4f, 5f\) a major third; and \(4f, 5f, 6f\) taken together form a major triad in root position. Rameau argued that nature herself, through the corps sonore, generates the major triad: it is not a human construction but a physical given inscribed in the very behavior of vibrating matter. This physical grounding was meant to give harmonic theory a certainty comparable to that of physics or mathematics.
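The interval relationships among the first six partials can be tabulated directly. The sketch below assumes an arbitrary fundamental of 100 Hz, chosen only to keep the numbers readable; the function names are illustrative.

```python
from fractions import Fraction


def partials(fundamental, count=6):
    """Frequencies f, 2f, 3f, ... of the first `count` partials."""
    return [n * fundamental for n in range(1, count + 1)]


def interval(lower, upper):
    """Exact frequency ratio between two partials."""
    return Fraction(upper, lower)


series = partials(100)  # hypothetical 100 Hz fundamental

octave = interval(series[0], series[1])       # f  to 2f
fifth = interval(series[1], series[2])        # 2f to 3f
major_third = interval(series[3], series[4])  # 4f to 5f

# Partials 4, 5 and 6 measured from partial 4: the root-position triad.
triad = [interval(series[3], p) for p in series[3:6]]

print(octave, fifth, major_third, triad)
```

Because the ratios depend only on partial numbers, any other fundamental gives the same result, which is the sense in which the triad is said to be contained in every resonating body.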
The corps sonore argument also raised a deeper epistemological question that Rameau never fully resolved: is the overtone series a fact about nature that music theory must accommodate, or is it a fact about music that nature is fortunate to provide? If the former, then harmonic theory is a branch of natural philosophy and changes as our physical understanding improves. If the latter, then the physics is merely illustrative and the theory has an independent logical status. Rameau tended toward the former interpretation, which made his theory vulnerable to revision as physics advanced — as indeed occurred when Helmholtz’s more rigorous psychoacoustics superseded Rameau’s more speculative corps sonore argument in the mid-nineteenth century.
4.4 Rameau’s Later Theoretical Career
Rameau’s theoretical work did not stop with the Traité. The Nouveau système de musique théorique (1726) refined his treatment of the fundamental bass, and his later writings developed the concept of the double emploi (double employment). In Rameau’s analysis, the chord on the fourth scale degree with an added sixth (F–A–C–D in C major) can be read in two ways depending on its harmonic context: as a subdominant with F as its root, or as a seventh chord on the second degree with D as its root, which then descends by fifth to the dominant. The double emploi allows Rameau to account for this chord’s dual role in cadential progressions without abandoning the fundamental-bass framework, and it anticipates the concept of functional ambiguity that would later be central to Riemann’s function theory.
The Génération harmonique (1737) represents Rameau’s most ambitious attempt to ground harmonic theory in physics and mathematics. He draws on the experiments of Joseph Sauveur with string vibrations and overtones to argue that the corps sonore is the generative principle of all harmonic practice. In Démonstration du principe de l’harmonie (1750), he attempts a more philosophical synthesis, situating music theory as a branch of natural philosophy governed by the same rational principles as Cartesian science. These later works reveal Rameau increasingly drawn toward a priori rationalism at the expense of the empirical flexibility that had made the Traité analytically persuasive.
Jean le Rond d’Alembert’s Éléments de musique théorique et pratique suivant les principes de M. Rameau (1752) popularized and clarified Rameau’s system for the readers of the Encyclopédie. It was primarily through d’Alembert’s simplified presentation — which stripped away Rameau’s more speculative derivations and presented the practical theory in clear Enlightenment prose — that Rameau’s ideas circulated among educated European readers of the period.
4.5 The Legacy of Rameau
Rameau’s legacy permeates every aspect of modern tonal theory. The I–IV–V–I harmonic paradigm as the definition of tonal closure; the identification of chord roots as the primary carriers of harmonic syntax; the concept that inversions are secondary to roots; the idea that tonic, subdominant, and dominant are the three primary harmonic functions — these propositions, developed by Rameau and refined by his successors, are so deeply embedded in standard music theory pedagogy that most students encounter them without knowing they have an author. When an undergraduate theory student labels a chord as I\(^6\) (first inversion tonic) rather than writing a figured-bass number, they are performing an act of Rameauian analysis, whether or not they know it. The entire tradition of harmonic analysis as a discipline distinct from counterpoint and figured-bass pedagogy is Rameau’s invention.
4.6 Rameau’s Contemporaries: Sorge and Tartini
Rameau was not the only eighteenth-century theorist attempting to derive harmonic consonance from physical and mathematical principles. Georg Andreas Sorge (1703–1778), a German organist and theorist, likewise took the overtone series as the basis of consonance and published his own harmonic theory in Vorgemach der musicalischen Composition (1745–47), which parallels Rameau’s Génération harmonique in some respects. Sorge’s contribution has been largely overlooked because of Rameau’s greater fame and more comprehensive systematization, but the parallel development in Germany testifies to how broadly the question of harmonic foundations was pressing on European theoretical consciousness in the first half of the eighteenth century.
Giuseppe Tartini (1692–1770), the Italian violinist and composer, proposed a distinctive theory of harmony in his Trattato di musica secondo la vera scienza dell’armonia (1754). Tartini’s theory centered on the phenomenon of combination tones — the faint lower pitch audible when two higher pitches are sounded simultaneously — which he called the terzo suono (“third sound”). The combination tone of two pitches \(f_1\) and \(f_2\) (with \(f_1 > f_2\)) is the difference tone \(f_1 - f_2\). Tartini argued that the most consonant intervals are those whose combination tone reinforces rather than contradicts the harmonic structure of the interval.
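A brief sketch shows how the difference tone behaves for pure intervals. The base frequency of 110 Hz and the helper names are assumptions made for illustration; the intervals are given as pairs of partial numbers.

```python
from fractions import Fraction


def difference_tone(f1, f2):
    """Frequency of the first-order difference tone of two pitches."""
    return abs(f1 - f2)


def below_lower_note(lower, upper):
    """Ratio of the lower note to the difference tone beneath it."""
    return Fraction(lower, difference_tone(upper, lower))


BASE = 110  # hypothetical common fundamental in Hz
JUST_INTERVALS = {
    "perfect fifth": (2, 3),
    "major third": (4, 5),
    "minor third": (5, 6),
}

for name, (low, high) in JUST_INTERVALS.items():
    f_low, f_high = low * BASE, high * BASE
    print(name, f_low, f_high,
          difference_tone(f_high, f_low),
          below_lower_note(f_low, f_high))
```

For each of these intervals the difference tone lands on the common fundamental: an octave below the lower note of the fifth, two octaves below the lower note of the major third, and two octaves plus a major third below the lower note of the minor third.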
Chapter 5: German Theory from Kirnberger to Riemann
The reception and transformation of Rameauian harmonic theory in the German-speaking lands produced a distinct tradition characterized by greater emphasis on counterpoint, modal retention, and eventually a systematic functional vocabulary. Where French theory centered on the fundamental bass and the corps sonore, German theory centered on figured-bass pedagogy, the authority of Bach’s practice, and — in its culminating figure, Hugo Riemann — a grand philosophical synthesis of harmony, logic, and musical perception. The German tradition also engaged more directly with philosophy — with Kant, Hegel, and the tradition of German idealism — giving its music theory a more explicitly metaphysical flavor than its French counterpart.
5.1 Kirnberger and the Bach Legacy
Johann Philipp Kirnberger (1721–1783) studied with J.S. Bach in Leipzig and devoted much of his theoretical career to codifying and justifying Bach’s harmonic and contrapuntal practice. His Die Kunst des reinen Satzes in der Musik (The Art of Strict Musical Composition, 1771–79) is a comprehensive figured-bass and counterpoint treatise that treats Bach’s chorales and two-part inventions as normative models, making the implicit claim that the greatest theoretical authority is not ancient precedent or mathematical derivation but the practice of a supreme master, in this case one who had died two decades before the treatise appeared.
Kirnberger also made contributions to tuning theory, proposing Kirnberger III temperament — a compromise between pure just intonation and equal temperament, preserving pure fifths and thirds in frequently used keys while distributing impurities into less common ones. This temperament is favored by some period-instrument performers for Bach’s keyboard music. Kirnberger’s theoretical conservatism led him to resist more adventurous harmonic practices of his contemporaries and to advocate for relatively strict voice-leading in the Bachian mold.
5.1b Johann Mattheson and the German Figured-Bass Tradition
Johann Mattheson (1681–1764), a Hamburg musician and prolific writer, represents the other face of early German music theory: not the speculative mathematical tradition descending from Boethius but the practical rhetorical tradition that treated music composition as a branch of rhetoric. His Das neu-eröffnete Orchestre (1713), Critica musica (1722–25), and the massive Der vollkommene Capellmeister (The Complete Music Director, 1739) collectively constitute the most comprehensive treatment of the practical aspects of music composition in the German Baroque.
For Mattheson, musical composition is governed by the same principles as persuasive oration: it must have a dispositio (formal arrangement), an inventio (invention of material), and an elaboration of that material through the musical equivalent of rhetorical figures (Figuren). The doctrine of musical figures (Figurenlehre) — a systematic catalog of melodic and harmonic devices analogous to rhetorical figures like anaphora, antithesis, and climax — was developed by a series of German theorists from Joachim Burmeister (1564–1629) through Johann David Heinichen (1683–1729) and Mattheson himself.
The Affektenlehre (doctrine of the affections) and Figurenlehre traditions represent a different strand of music-theoretical thinking from the ratio-based and function-based traditions charted in this course. Where Rameau grounds harmonic theory in physics and Schenker grounds it in deep voice-leading structure, the German rhetorical tradition grounds compositional practice in emotional expressiveness and communicative effectiveness. The rhetorical strand of music theory has been revived in late-twentieth-century scholarship by theorists including Leonard Ratner (Classic Music, 1980), who developed the concept of musical topics (characteristic styles and genres used as expressive signifiers within Classical-era compositions), and Wye Allanbrook (Rhythmic Gesture in Mozart, 1983), who applied topic theory systematically to Mozart’s operas.
5.2 Gottfried Weber and Roman Numeral Analysis
Gottfried Weber (1779–1839) produced in his Versuch einer geordneten Theorie der Tonsetzkunst (Attempt at a Systematic Theory of Musical Composition, 1817–21) the most comprehensive harmonic theory of the early nineteenth century and the direct ancestor of the Roman numeral analytical notation that is now standard in American undergraduate pedagogy. Weber’s approach was deliberately empirical and inductive: he documented harmonic practice as he found it in the common-practice repertory rather than deriving it from a single generating principle.
Weber systematically labeled chords by the scale degree of their root, using Roman numerals (I through VII for the seven diatonic scale degrees). The system extends readily to secondary dominants, chords that function as dominants of scale degrees other than the tonic. In the notation now standard, V/V (read “five of five”) designates the dominant of the dominant; V/IV is the dominant of the subdominant, and so on.
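The labeling procedure is simple enough to sketch in code. The example below assumes a major key represented by pitch classes (0 = tonic) and follows the modern textbook convention of upper case for major triads, lower case for minor, and a degree sign for diminished; that case convention and all function names are assumptions of this sketch, not a transcription of Weber’s own notation.

```python
MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11]  # pitch classes above the tonic
NUMERALS = ["I", "II", "III", "IV", "V", "VI", "VII"]


def triad_on(degree):
    """Diatonic triad on a scale degree (1-7), stacked in thirds."""
    idx = degree - 1
    return tuple(MAJOR_SCALE[(idx + step) % 7] for step in (0, 2, 4))


def roman(degree):
    """Roman numeral for the diatonic triad, case showing its quality."""
    root, third, fifth = triad_on(degree)
    quality = ((third - root) % 12, (fifth - root) % 12)
    numeral = NUMERALS[degree - 1]
    if quality == (4, 7):
        return numeral            # major
    if quality == (3, 7):
        return numeral.lower()    # minor
    if quality == (3, 6):
        return numeral.lower() + "°"  # diminished
    raise ValueError("unexpected triad quality")


def secondary_dominant(degree):
    """Major triad a perfect fifth above the root of the given degree."""
    root = (MAJOR_SCALE[degree - 1] + 7) % 12
    return (root, (root + 4) % 12, (root + 7) % 12)


print([roman(d) for d in range(1, 8)])
print("V/V:", secondary_dominant(5))
```

With the tonic at C, V/V comes out as pitch classes 2, 6, 9 (D, F♯, A), a chromatic chord that the label nonetheless relates to the diatonic degree it targets.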
Weber’s system is more flexible than Rameau’s fundamental bass in its ability to represent chromatic harmonies through the secondary dominant notation, but it is less theoretically motivated. Weber is content to document and label what he finds in the repertory; explaining why these progressions work the way they do is a task he largely leaves to his successors. This descriptive flexibility has pedagogical advantages but theoretical costs that later theorists, particularly Riemann, would attempt to address through a more principled functional vocabulary.
5.3 A.B. Marx and the Sonata Principle
Adolf Bernhard Marx (1795–1866) shifted German music theory from harmonic syntax toward formal architecture. His four-volume Die Lehre von der musikalischen Komposition (The Theory of Musical Composition, 1837–47) defined the sonata form as the central organizational principle of the Viennese Classical style, and it introduced the terminology that American music theory textbooks still use today.
Marx coined the terms Hauptsatz (main theme), Seitensatz (secondary theme), Schlussgruppe (closing group), and used the term Durchführung (development) for the developmental section. He conceived sonata form not as a key-scheme or formal template but as a dramatic process: the Hauptsatz establishes a character or energy; the Seitensatz provides contrast and opposition; the development creates conflict and instability through fragmentation and tonal destabilization; the recapitulation achieves resolution and synthesis.
The Beethoven symphony — particularly the Eroica (No. 3) and the Fifth — served for Marx as the paradigm of organic form, a term derived from Goethe’s nature philosophy and German Romantic aesthetics. An organism does not have its parts assembled externally but generates them from within through a single generative principle; Marx argued that the greatest musical works similarly develop all their material from a single motivic seed through an organic process. This organicist aesthetic was enormously influential on subsequent German music theory and criticism, most notably on the tradition of motivic-thematic analysis that runs from Marx through Schoenberg’s concept of Grundgestalt to Rudolph Réti’s thematic analysis in the twentieth century.
Marx’s impact was also felt outside Germany, particularly through Eduard Hanslick (1825–1904), whose Vom Musikalisch-Schönen (On the Musically Beautiful, 1854) is the foundational text of musical formalism — the claim that music has no meaning beyond its own structural processes and that the “content” of music is “tonally moving forms.” Hanslick’s formalism is a response to the program music aesthetics of Liszt and Wagner, who claimed that music could express specific extra-musical meanings (narratives, images, ideas). For Hanslick, such claims confuse music with poetry or painting; genuine musical meaning is purely structural, and the form-based tradition of German music theory from Marx onward provides the conceptual vocabulary for describing that structural meaning. The debate between Hanslick’s formalism and Wagner’s expressive aesthetics is a music-theoretical debate as much as an aesthetic one: it concerns the proper object of music theory and the appropriate concepts for describing what music does.
5.4 Hauptmann and Hegelian Dialectics
Moritz Hauptmann (1792–1868), a student of Spohr and a colleague of Mendelssohn at the Leipzig Conservatory, brought Hegelian philosophy explicitly into music theory. His Die Natur der Harmonik und der Metrik (The Nature of Harmony and Meter, 1853) argued that harmonic relationships are structured by the same dialectical logic that Hegel used to analyze historical and philosophical processes: the movement from unity through opposition to synthesis.
For Hauptmann, the tonic (I) is the thesis — the initial, undifferentiated unity. The dominant (V) is the antithesis — difference, the fifth relationship pulling away from the tonic. The subdominant (IV) is the synthesis — a new unity incorporating the opposition of the first two, since IV contains the same fifth relationship as V but directed downward from the tonic rather than upward. Together, tonic, dominant, and subdominant form a dialectical triad whose resolution in the cadence (S–D–T or IV–V–I) mirrors the Hegelian movement of thought.
5.5 Hugo Riemann: Function Theory and Harmonic Dualism
Hugo Riemann (1849–1919) is the most encyclopedic music theorist in the Western tradition since Gioseffo Zarlino. His writings span harmonic theory, counterpoint pedagogy, music history, acoustics, music psychology, and musicology. His most enduring theoretical contribution is the system of Funktionslehre (function theory), developed across several treatises culminating in the Vereinfachte Harmonielehre (Simplified Harmony, 1893) and Handbuch der Harmonielehre (1887).
Riemann’s function theory reduces the entire harmonic vocabulary of tonal music to three fundamental functions: Tonic (T), Dominant (D), and Subdominant (S). Every chord in a key, no matter how chromatic, can be analyzed as one of these three functions or as a chromatic variant indicated by letter superscripts and subscripts. The analytical power of the system is that it focuses attention on harmonic function — what a chord does in a progression — rather than on scale-degree identity alone.
- T: tonic triad (I = C–E–G)
- Tp: tonic parallel — the relative minor, vi (A–C–E)
- Tg: tonic "Gegenklang" (counter-sound) — the mediant, iii (E–G–B)
- D: dominant triad (V = G–B–D)
- Dp: dominant parallel, the relative minor of the dominant, iii (E–G–B); the mediant can thus be read as Tg or Dp depending on context
- S: subdominant triad (IV = F–A–C)
- Sp: subdominant parallel — the supertonic, ii (D–F–A)
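The derived labels can be generated from the three primary triads by two transformations. The sketch below is a simplified model for major keys only: it takes the parallel of a major triad to be the minor triad a minor third below (its relative minor) and the Gegenklang to be the minor triad a major third above. Pitch classes are spelled with sharps for convenience, and every function name is hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F",
              "F#", "G", "G#", "A", "A#", "B"]


def major_triad(root):
    return (root, (root + 4) % 12, (root + 7) % 12)


def minor_triad(root):
    return (root, (root + 3) % 12, (root + 7) % 12)


def parallel(major_root):
    """Relative minor of the major triad on `major_root`."""
    return minor_triad((major_root - 3) % 12)


def gegenklang(major_root):
    """Minor triad a major third above the major triad's root."""
    return minor_triad((major_root + 4) % 12)


def functions(tonic=0):
    """Function labels for a major key, keyed by Riemann-style symbols."""
    t, d, s = tonic, (tonic + 7) % 12, (tonic + 5) % 12
    return {
        "T": major_triad(t), "Tp": parallel(t), "Tg": gegenklang(t),
        "D": major_triad(d), "Dp": parallel(d),
        "S": major_triad(s), "Sp": parallel(s),
    }


def spell(triad):
    return "-".join(NOTE_NAMES[pc] for pc in triad)


for label, triad in functions().items():
    print(label, spell(triad))
```

Under these assumptions Tg and Dp in C major both come out as E–G–B, which illustrates how a single triad can carry two different function labels depending on the harmony it stands in for.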
Underlying Riemann’s function theory is a deeper and more controversial commitment: harmonic dualism. Riemann argued that the major and minor triads are not merely scale-based formations but represent two symmetrically opposite manifestations of the same acoustic principle. The major triad is generated “upward” from a root; the minor triad, Riemann claimed, is generated “downward” from a conceptual “root” (the Klangvertreter, “tone representative”) at the top. In C minor, for example, the minor triad C–E♭–G is understood as built downward from G, making G its structural generator.
Riemann’s notational system remains in use in German-language university music theory curricula today, coexisting with the Roman numeral tradition descended from Weber. The two systems are not easily reconciled: Roman numeral analysis is primarily scale-degree based, while Riemann’s function labels are primarily function based and abstract away from specific scale degrees and inversions. Both traditions remain active and generative in contemporary research and pedagogy, and the tension between them is itself theoretically productive.
5.6 Hermann von Helmholtz and the Science of Consonance
Hermann von Helmholtz (1821–1894) was a physicist and physiologist rather than a music theorist in the traditional sense, but his Die Lehre von den Tonempfindungen als physiologische Grundlage für die Theorie der Musik (On the Sensations of Tone as a Physiological Basis for the Theory of Music, 1863) is one of the most important single contributions to music-theoretical thought in the nineteenth century. Helmholtz’s central thesis is that consonance and dissonance are physiological phenomena determined by the relationship between the overtone structures of simultaneously sounding tones.
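When two tones sound simultaneously, their upper partials may or may not coincide. When partials of the two tones are close but not identical in frequency, they produce beats: periodic amplitude fluctuations whose rate equals the difference between the two frequencies. Very slow beats are heard as a gentle wavering, but rapid beats (on Helmholtz’s account, most disturbing at roughly thirty per second) fuse into a sensation of roughness. Consonance corresponds to minimal roughness (partials coincide or are far apart in frequency); dissonance corresponds to maximal roughness (many pairs of partials beat rapidly against one another).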
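The beating mechanism can be made concrete with a little arithmetic. The following Python sketch lists the pairs of partials that lie close enough to beat for a given interval. It is an illustration only, not Helmholtz’s own roughness calculation: the function names, the choice of six partials per tone, and the 60 Hz cutoff for "close enough to beat" are all assumptions made for the example.

```python
def partials(fundamental_hz, count=6):
    """Frequencies of the first `count` harmonic partials of a tone."""
    return [fundamental_hz * k for k in range(1, count + 1)]


def beating_pairs(lower_hz, ratio, count=6, max_beat_hz=60.0):
    """Pairs of partials (one from each tone) whose frequency difference is
    nonzero but no larger than max_beat_hz. The cutoff is an assumed value."""
    low = partials(lower_hz, count)
    high = partials(lower_hz * ratio, count)
    pairs = []
    for a in low:
        for b in high:
            beat_rate = abs(a - b)
            if 0 < beat_rate <= max_beat_hz:
                pairs.append((a, b, beat_rate))
    return pairs


if __name__ == "__main__":
    for name, ratio in [("perfect fifth 3:2", 3 / 2),
                        ("major third 5:4", 5 / 4),
                        ("minor second 16:15", 16 / 15)]:
        found = beating_pairs(220.0, ratio)
        print(name, "->", len(found), "beating pairs")
```

Under these assumptions the perfect fifth above 220 Hz produces no beating pairs at all, while the minor second produces several, which matches the qualitative ranking the theory predicts.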
Helmholtz’s theory provided a new and more rigorous physical basis for what Zarlino had derived from the senario: the consonance of the major third 5:4 is now explained not by the simplicity of the ratio itself but by the way the partials of the two tones coincide or stay clear of one another, so that little beating arises in the auditory system. This shift from number theory to physiology as the grounding of musical consonance is one of the major transitions in music theory’s intellectual history, opening the discipline to the methods of experimental science.
Helmholtz also investigated the acoustic basis of timbre (which he analyzed in terms of overtone structure), the construction of musical scales across cultures, and the history of tuning systems. His Tonempfindungen is an encyclopedic work that influenced subsequent work in psychoacoustics, ethnomusicology, and music cognition, making Helmholtz — alongside Rameau and Riemann — one of the principal architects of the scientific study of music.
Chapter 6: Schenker and the Organicist Tradition
Heinrich Schenker (1868–1935) developed the most influential and most contested theory of tonal music produced in the twentieth century. His work reshaped graduate music theory pedagogy, particularly in the United States, and the analytical method named after him — Schenkerian analysis — is now simultaneously a standard tool in academic music theory and a focal point of sharp debate about the politics of the classical canon, the boundaries of analytical applicability, and the discipline’s responsibility to its historical origins.
6.1 Biography and Intellectual Context
Schenker was born in Wisniowczyk, Galicia (then part of the Austro-Hungarian Empire, now Ukraine), trained in law at the University of Vienna and in music at the Vienna Conservatory, and spent his adult life in Vienna as a pianist, teacher, editor, and theorist. His intellectual formation was shaped by the conservative Viennese cultural milieu of the fin de siècle: the world of Brahms (whom he knew personally and whose encouragement he prized), of German idealism, and of a profound suspicion toward musical modernism.
Schenker’s relationship to Brahms (he played his own piano compositions for the aging composer) deeply informed his sense of the German classical tradition as a living standard against which all subsequent music must be measured. His antagonism toward Reger, Strauss, and especially Schoenberg was intense and personal. He edited critical editions of Bach keyboard works and Beethoven piano sonatas, adding extensive analytical commentary, and these editorial projects were as important to his theoretical project as his systematic treatises. The Erläuterungsausgaben (annotated editions) of four of the last five Beethoven piano sonatas (Op. 101, 109, 110, and 111) remain important analytical documents in their own right.
6.2 Harmonielehre (1906)
Schenker’s first major theoretical work, the Harmonielehre (published as Part I of a projected Neue musikalische Theorien und Phantasien), is not primarily a theory of chord progressions in the Roman numeral sense. Its central concept is the Stufe (scale step or degree): a harmonic entity representing a complete tonal region within the key, not merely a momentary vertical chord. Where Roman numeral analysis treats each chord as an instantaneous event, the Stufe is a prolonged area of harmonic space that may persist across many surface chords before yielding to the next Stufe.
The Harmonielehre also contains Schenker’s analysis of the scale as a mental (ursächlich) representation — not a physical given but a conceptual construct that the trained musical mind brings to the experience of tonal music. The diatonic major scale is not a collection of equally weighted notes but a hierarchically organized system in which some degrees are structurally primary and others ornamental, a principle systematically developed in his later work. The text is famously demanding: Schenker’s prose is dense and polemical, his examples are extensive, and his implicit assumptions often exceed what he makes explicit.
Schenker’s concept of the Stufe has an important polemical dimension: it was directed against the prevailing fundamental-bass harmonic theory of his time, particularly as represented in the textbooks of Simon Sechter (1788–1867), who taught Bruckner and, very briefly, the dying Schubert, and as continued in Bruckner’s own theory teaching. Sechter’s system, based on root-progression theory derived from Rameau, treated every chord as a distinct harmonic entity with a determinate root, and analyzed complex passages by assigning a root to every chord. Schenker objected that this approach produced analyses of absurd complexity for simple passages: a short Beethoven phrase might receive a dozen or more Roman numeral labels where Schenker would identify only two or three Stufen, with the intermediate chords serving as passing or neighboring elaborations of a single sustained harmonic area.
6.3 Kontrapunkt (1910–22)
Schenker’s two-volume Kontrapunkt undertakes a rereading of strict species counterpoint as the foundation of free tonal composition. Unlike most counterpoint pedagogy, which treats strict counterpoint as a pedagogical exercise largely distinct from free composition, Schenker argues that the voice-leading relationships codified by Fux (and behind Fux by Palestrina) in strict counterpoint are literally operative — in a conceptually elaborated or “prolonged” form — in every tonal masterwork. This is one of Schenker’s most heterodox and difficult claims.
The key concept is prolongation: a structural interval or voice-leading motion can be extended over a much longer span of musical time through the interpolation of elaborating figures. Passing tones, neighbor tones, and arpeggiations that in strict counterpoint are local ornaments become, in free composition, the mechanisms by which short structural voice-leading motions are expanded into passages of indefinite length. The simplest two-voice strict counterpoint, prolonged through these mechanisms, is for Schenker the deep structural reality of an entire symphonic movement.
6.4 Der freie Satz (1935)
The culminating synthesis of Schenker’s theoretical project, Der freie Satz (Free Composition, published posthumously in 1935 and translated into English by Ernst Oster in 1979), presents the complete theory of musical structure in terms of a fundamental structure with two components and three hierarchical levels (background, middleground, and foreground) through which it is elaborated.
- Ursatz (fundamental structure): the deepest structural level of any tonal work, consisting of two components: the Urlinie (fundamental line) — a stepwise melodic descent from scale degree \(\hat{3}\), \(\hat{5}\), or \(\hat{8}\) down to \(\hat{1}\) in the soprano voice — and the Bassbrechung (bass arpeggiation) — a motion from I up to V and back to I in the bass. Every tonal masterwork has exactly one Ursatz.
- Hintergrund (background): the level at which the Ursatz operates. Background structure is universal: all tonal works share the same background, differing only in which Urlinie descent form obtains.
- Mittelgrund (middleground): intermediate structural levels at which the Ursatz is elaborated by prolongations — arpeggiations, linear progressions, voice-exchanges — expanding the time-span over which background events unfold.
- Vordergrund (foreground): the surface of the music as actually heard, with all its ornamental figuration, motivic detail, and rhythmic articulation.
The analytical procedure of Schenkerian analysis is essentially a process of “reduction”: progressively removing foreground elaborations to reveal the middleground prolongations, then removing those to reveal the background Ursatz. Graphical representation uses open noteheads for the structurally deepest tones, filled noteheads for subordinate tones, and slurs and beams to indicate prolongational spans and linear progressions; note shapes in a graph indicate structural rank, not duration. The reductive graph is the primary medium of Schenkerian analysis, and learning to read and produce such graphs is the central task of Schenkerian pedagogy.
6.5 Schenker’s Cultural Politics
Any serious engagement with Schenker’s work must confront his cultural politics. Schenker, himself Jewish, was an ardent German nationalist who held explicitly racist views about other peoples and cultures. His diaries and correspondence, now being published and translated by the Schenker Documents Online project, contain numerous expressions of national and racial contempt, some of them directed at major composers and performers. His published writings repeatedly claim that the capacity for deep tonal thinking (the capacity to generate organic masterworks from a background Ursatz) is uniquely Germanic, exemplified by Bach, Haydn, Mozart, Beethoven, and Brahms.
6.6 American Reception and Post-Schenkerian Critiques
Felix Salzer’s Structural Hearing (1952) was the first comprehensive English-language presentation of Schenkerian analysis, and it deliberately extended the method to medieval polyphony and contemporary music. Allen Forte and Steven Gilbert’s Introduction to Schenkerian Analysis (1982) codified an orthodox version of the method for graduate pedagogy. The English terminology consolidated in these textbooks, including “interruption” (Schenker’s own Unterbrechung) for the I–V || I–V–I background pattern, became standard in American doctoral programs.
Critical responses have been wide-ranging and substantive. Joseph Straus’s “The Problem of Prolongation in Post-Tonal Music” (1987) argued that the concept of prolongation, as Schenker defined it, requires conditions of triadic tonality that do not obtain in atonal music; attempts to apply Schenkerian analysis to Schoenberg or Bartók are therefore conceptually incoherent unless the concept of prolongation is fundamentally reconceived. Feminist critiques have examined the gendering of musical form in Schenker’s theory: Susan McClary and others have pointed out that the discourse of structural “penetration” and “masculine” structural descent versus “feminine” ornamental elaboration in Schenker’s writing reflects and reinforces gender ideologies that have nothing to do with the musical structures being described. Philip Ewell’s “Music Theory’s White Racial Frame,” delivered as a plenary talk at the 2019 meeting of the Society for Music Theory and published in 2020, argued that American music theory’s privileging of Schenkerian analysis is inseparable from the discipline’s historical exclusion of non-white composers, theorists, and repertories, and that addressing this exclusion requires more than simply diversifying the analytical canon while leaving the methods unchanged. The subsequent exchange, published as a multi-article symposium in the Journal of Schenkerian Studies, drew responses ranging from careful scholarly engagement to statements that many readers found indefensible in their dismissiveness; the editorial handling of those responses became itself a subject of controversy and led to significant institutional consequences within the discipline.
6.7 Expanding the Schenkerian Method: Prolongation in Extended Tonal Music
Felix Salzer’s decision to apply Schenkerian concepts to medieval and Renaissance music was not merely an act of scholarly imperialism; it was a genuine theoretical argument. Salzer contended that structural hearing — the ability to perceive hierarchical levels of musical structure — is a property of trained musical perception in general, not exclusively of Viennese Classical tonal perception. Medieval polyphony, he argued, exhibits prolongational structures: a voice-leading motion in one part can be prolonged through elaboration in a passage of several measures, just as in Bach or Beethoven.
This claim has provoked substantial methodological debate. William Rothstein and others have argued that Salzer’s approach mistakes ornamental elaboration for structural prolongation, applying the formal categories of Schenkerian analysis to repertory that lacks the triadic-tonal harmonic framework those categories presuppose. The debate illuminates a fundamental ambiguity in Schenker’s own concept of prolongation: it is never entirely clear whether prolongation is a property of the tonal system (requiring triadic tonality as a precondition) or a property of voice-leading in general (potentially operating across any pitch system that has registral and directional preferences).
Carl Schachter’s analytical writings, collected in Unfoldings: Essays in Schenkerian Theory and Analysis (1999), represent perhaps the most nuanced application of Schenkerian concepts to a wide range of tonal repertory, including music by Chopin, Brahms, and Schubert that Schenker himself rarely analyzed in depth. Schachter’s work demonstrates how prolongational concepts can illuminate music at the edges of common-practice tonality without either abandoning Schenkerian principles or applying them mechanically. His analyses consistently show how rhythmic and formal processes interact with prolongational structure — a dimension of musical organization that the purely pitch-based Schenkerian graphic notation tends to underrepresent.
6.8 Form Theory: Caplin, Hepokoski, and Darcy
Schenker’s influence on music theory generated not only a tradition of voice-leading analysis but also a renewed interest in musical form — in the large-scale architecture of tonal works and the relationship between formal units and harmonic structure. William Caplin’s Classical Form: A Theory of Formal Functions for the Instrumental Music of Haydn, Mozart, and Beethoven (1998) represents the most systematic recent treatment of form in the Viennese Classical style.
Caplin’s theory is organized around the concept of formal function: the role that a passage of music plays within a larger formal design. His primary unit is the theme, which itself is articulated into smaller functional units: presentation phrases (introducing the basic idea and its repetition), continuation phrases (developing and fragmenting the idea), and cadential phrases (providing harmonic and formal closure). These smaller units combine into complete formal types (the sentence, the period, the hybrid) that in turn function as large-scale sections within movements.
- The sentence: a two-part structure consisting of a presentation (which states a basic idea and repeats it, usually at the same or a different harmonic level) followed by a continuation (which develops the idea through fragmentation and harmonic acceleration) and a cadence.
- The period: a two-part structure consisting of an antecedent phrase (which ends with a weak, typically half cadence) followed by a consequent phrase (which repeats or varies the antecedent's opening and ends with a strong, typically authentic cadence). The period creates a question-and-answer structure whose rhetorical logic Caplin traces to the antecedent-consequent phrase pairs of Baroque rhetoric.
James Hepokoski and Warren Darcy’s Elements of Sonata Theory: Norms, Types, and Deformations in the Late-Eighteenth-Century Sonata (2006) offers a complementary but substantially different approach to Classical form. Where Caplin emphasizes local formal function, Hepokoski and Darcy emphasize genre conventions and their systematic deformation. Their theory is explicitly dialogic: a sonata movement is not simply a realization of a formal template but a negotiation between the norms of the genre and the specific choices a composer makes in any given instance.
The concept of deformation, a departure from genre norms that acquires expressive meaning precisely by violating what the genre leads listeners to expect, is central to their approach. A passage that ends in the wrong key, that fails to produce an expected cadence, or that arrives at the recapitulation in an unexpected way is not a mistake but a communicative act, exploiting the listener’s internalized knowledge of the sonata norm to create surprise, tension, or irony. This dialogic model of form connects Hepokoski and Darcy’s work to the broader tradition of rhetorical music theory (Mattheson, the Figurenlehre tradition) and to the reception-theoretic turn in musicology associated with Lawrence Kramer and Susan McClary.
Chapter 7: Twentieth-Century Theory — Serialism, Set Theory, and Transformational Theory
The collapse of functional tonality in the early twentieth century was not merely a compositional event but a theoretical crisis. If the harmonic syntax that Rameau had articulated, Riemann had systematized, and Schenker had claimed to find operating at the deepest structural level of all great music was no longer operative in new composition, what principles governed musical organization? Three major theoretical responses emerged across the twentieth century: the extension of serial thinking to a general theory of pitch-class organization, the development of pitch-class set theory as an analytical tool for atonal repertory, and the emergence of transformational theory as a general mathematical framework for musical relationships of all kinds.
7.1 Schoenberg’s Twelve-Tone Method
Arnold Schoenberg (1874–1951) developed the twelve-tone method between approximately 1920 and 1923 as a “method of composing with twelve tones related only to one another” — a phrase that signals both the method’s structural principle and its implicit rejection of the tonal hierarchy in which all twelve tones are related to a single privileged tonic. The method is built around the tone row (German: Reihe), an ordering of all twelve pitch classes in a fixed sequence serving as the generative material for an entire composition.
- Prime (P): the row in its original order. \(P_n\) is P transposed by \(n\) semitones.
- Inversion (I): the row inverted: each ascending interval is replaced by a descending interval of the same size, and vice versa. \(I_n\) is I transposed by \(n\) semitones.
- Retrograde (R): the row reversed in order. \(R_n\) is P reversed and transposed.
- Retrograde-Inversion (RI): the inversion reversed. \(RI_n\) is I reversed and transposed.
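The four row forms listed above reduce to simple arithmetic on pitch-class numbers (C = 0, C♯ = 1, and so on up to B = 11). The following Python sketch shows one way to compute them. The example row and the function names are illustrative assumptions, and the inversion here is taken around the row’s first pitch class, which is only one of several labeling conventions in use.

```python
def transpose(row, n):
    """Shift every pitch class up by n semitones, modulo 12."""
    return [(pc + n) % 12 for pc in row]


def invert(row):
    """Mirror every interval around the row's first pitch class."""
    first = row[0]
    return [(2 * first - pc) % 12 for pc in row]


def retrograde(row):
    """Reverse the order of the row."""
    return list(reversed(row))


def retrograde_inversion(row):
    """Reverse the order of the inverted row."""
    return retrograde(invert(row))


# An illustrative twelve-tone row (an assumption made for this example).
ROW = [10, 5, 0, 11, 9, 6, 1, 3, 7, 8, 2, 4]

if __name__ == "__main__":
    print("P :", ROW)
    print("I :", invert(ROW))
    print("R :", retrograde(ROW))
    print("RI:", retrograde_inversion(ROW))
    print("P transposed by 2:", transpose(ROW, 2))
```

Every form remains an ordering of all twelve pitch classes, and inverting twice returns the original row, which is why the operations can be combined freely.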
Schoenberg insisted that the twelve-tone method did not abolish musical sense but reorganized it: the row replaces the scale as the source of pitch relationships, counterpoint and register still create texture and hierarchy, and large-scale form still articulates musical time. Combinatoriality — the property of certain row hexachords that, combined with a transformed version, yield an aggregate of all twelve pitch classes — was an important structural resource for Schoenberg and became central to Babbitt’s subsequent theorization.
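Combinatoriality as described above can be checked mechanically. The sketch below searches for transposition levels at which the first hexachord of an inverted row form complements the first hexachord of the prime form, so that the two together yield the aggregate. The row is the same illustrative one used for the row forms, the function names are hypothetical, and inversion is again taken around the row’s first pitch class; other labeling conventions would number the result differently.

```python
AGGREGATE = set(range(12))


def inversion_at(row, n):
    """Invert the row around its first pitch class, then transpose by n."""
    first = row[0]
    return [(2 * first - pc + n) % 12 for pc in row]


def combinatorial_inversions(row):
    """Transposition levels n at which the first hexachords of the prime
    form and of the inverted form together contain all twelve pitch classes."""
    first_hexachord = set(row[:6])
    return [
        n for n in range(12)
        if first_hexachord | set(inversion_at(row, n)[:6]) == AGGREGATE
    ]


# Illustrative row (an assumption made for this example).
ROW = [10, 5, 0, 11, 9, 6, 1, 3, 7, 8, 2, 4]

if __name__ == "__main__":
    print(combinatorial_inversions(ROW))
```

A row for which this list is empty has no inversional hexachordal combinatoriality of this kind; the same test could be repeated with transposed prime or retrograde forms.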
Schoenberg’s use of the method across his late works demonstrates remarkable flexibility. The String Quartet No. 4 (1936) uses the twelve-tone row to generate music with many surface similarities to late Romantic harmonic language; the Piano Concerto (1942) achieves a large-scale formal clarity that Schoenberg explicitly linked to Classical models. These works complicate any simple narrative treating the twelve-tone method as the negation of all prior musical values.
Schoenberg’s students Alban Berg (1885–1935) and Anton Webern (1883–1945) adopted the twelve-tone method but used it in markedly different ways, producing music whose divergence demonstrates the method’s compositional flexibility. Berg’s twelve-tone works — the Violin Concerto (1935), Lulu (1935) — retain obvious surface connections to tonal harmony, using rows that generate tonal triads and using large-scale tonal planning to articulate formal structure. Webern’s twelve-tone works — the Symphony Op. 21 (1928), the Variations Op. 27 (1936) — are characterized by extreme brevity, pointillistic texture, and rigorous application of the row’s symmetrical properties, producing music of great structural economy. The contrast between Berg’s and Webern’s twelve-tone aesthetics generated two distinct lineages in post-war new music: a Bergian line emphasizing expressive continuity with the Romantic tradition, and a Webernian line emphasizing structural rigor and the emancipation of new music from its historical past.
7.2 Babbitt and the Formalization of Twelve-Tone Theory
Milton Babbitt (1916–2011) transformed Schoenberg’s compositional method into a rigorous theoretical system, drawing on set theory, group theory, and combinatorics. His Princeton doctoral dissertation, The Function of Set Structure in the Twelve-Tone System (written 1946, accepted 1992 — the long delay reflects the music department’s initial resistance to mathematical music theory as a legitimate dissertation topic), established the foundations of what would become pitch-class set theory.
Babbitt introduced the term pitch class to refer to the equivalence class of all pitches related by octave transposition: C4, C5, and C3 are all instances of pitch class C. This abstraction allowed him to treat twelve-tone rows as mathematical objects in modular arithmetic (arithmetic mod 12) and to analyze their structural properties precisely.
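The octave equivalence behind the pitch-class concept is a one-line computation. This minimal Python sketch assumes MIDI note numbering with C4 = 60 (a common convention, not something the pitch-class concept itself requires); the function names are chosen for illustration.

```python
def pitch_class(midi_note):
    """Reduce a MIDI note number to its pitch class (0 = C, ..., 11 = B)."""
    return midi_note % 12


def ordered_interval(pc_from, pc_to):
    """Ascending distance in semitones from one pitch class to another, mod 12."""
    return (pc_to - pc_from) % 12


if __name__ == "__main__":
    # C3, C4, and C5 under the assumed numbering C4 = 60.
    for note in (48, 60, 72):
        print(note, "->", pitch_class(note))
    print("C up to G:", ordered_interval(0, 7))
    print("G up to C:", ordered_interval(7, 0))
```

Because everything is computed modulo 12, the interval from G up to C (5) and the interval from C up to G (7) sum to 12, that is, to 0.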
Babbitt extended serial organization beyond pitch to rhythm through the time-point system: rhythmic positions within a measure are mapped onto the same mod-12 arithmetic, so that a twelve-tone row can simultaneously determine both pitch and rhythmic structure. This “total serialism” — extending serial organization to dynamics, articulation, register, and timbre — was pursued in parallel by Pierre Boulez and Karlheinz Stockhausen in Europe, though with somewhat different theoretical frameworks. Babbitt’s own compositions from the 1950s onward — All Set (1957), the Partitions (1957), and the string quartets — demonstrate the compositional possibilities of this expanded serialist framework.
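The mapping from row to rhythm can be sketched as follows. This is a deliberately simplified reading of the time-point idea, assuming a measure divided into twelve equal units and assuming that each successive row element is placed at the next available occurrence of its time point; the function names are hypothetical, and Babbitt’s actual practice involves further refinements.

```python
def time_point_durations(row, modulus=12):
    """Distance in units from each time point to the next one in the row.
    A distance of 0 would mean the same time point recurring, which is
    read here as a full measure."""
    durations = []
    for current, following in zip(row, row[1:]):
        gap = (following - current) % modulus
        durations.append(gap if gap != 0 else modulus)
    return durations


def onsets(row, modulus=12):
    """Absolute onset positions in units, counting from the first barline."""
    position = row[0]
    result = [position]
    for duration in time_point_durations(row, modulus):
        position += duration
        result.append(position)
    return result


if __name__ == "__main__":
    fragment = [0, 3, 11, 4]  # an illustrative row fragment
    print(time_point_durations(fragment))
    print(onsets(fragment))
```

In the fragment above the last onset falls at unit 16, which is time point 4 of the second measure: the row element determines where in the measure an attack falls, while the interval between elements determines the duration.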
7.3 Forte and Pitch-Class Set Theory
Allen Forte (1926–2014) published The Structure of Atonal Music in 1973, providing a comprehensive analytical system for the atonal repertory of Schoenberg, Webern, and Berg from roughly 1908 to 1923. Forte’s system — pitch-class set theory — treats any collection of pitch classes as a set subject to analysis through its prime form and interval vector. The system provides a common vocabulary for comparing harmonic collections across different atonal works, replacing the scale-degree framework of tonal analysis with a more abstract combinatorial framework.
The interval vector of a set is a six-element array \([ic_1, ic_2, ic_3, ic_4, ic_5, ic_6]\) counting how many times each interval class (1 through 6) appears between pairs of elements. For the major/minor triad (3-11): interval vector [001110], indicating one instance each of interval classes 3, 4, and 5.
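The interval vector can be computed directly from its definition. The Python sketch below counts interval classes over all unordered pairs of pitch classes in a set; the function names are chosen for illustration and do not come from Forte’s text.

```python
from itertools import combinations


def interval_class(a, b):
    """Smallest distance between two pitch classes, from 0 to 6."""
    distance = (a - b) % 12
    return min(distance, 12 - distance)


def interval_vector(pitch_classes):
    """Six-element count of interval classes 1 through 6 in a pitch-class set."""
    unique = sorted({pc % 12 for pc in pitch_classes})
    vector = [0] * 6
    for a, b in combinations(unique, 2):
        vector[interval_class(a, b) - 1] += 1
    return vector


if __name__ == "__main__":
    print("C major triad:", interval_vector([0, 4, 7]))
    print("C minor triad:", interval_vector([0, 3, 7]))
    print("Chromatic trichord:", interval_vector([0, 1, 2]))
```

The major and minor triads produce the same vector, which is the computational face of their membership in a single set class.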
Forte’s system attracted both enthusiastic adoption and sustained critique. George Perle argued that Forte’s prime forms did not adequately capture the way composers actually worked with their pitch material, since set-class equivalence groups together collections that composers treated quite differently in practice. David Lewin made the point more specifically: treating transposition and inversion as equivalences makes a major triad and a minor triad members of the same set class (3-11) despite their very different behavior in musical context. He proposed a richer transformational framework that keeps them distinct as musically significant objects.
Robert Morris’s Composition with Pitch Classes: A Theory of Compositional Design (1987) extended Forte’s framework in the direction of compositional theory, developing the concepts of pitch-class space (a geometric representation of pitch-class relationships) and voice-leading in pitch-class set terms. Morris’s work represents an attempt to bridge the gap between analytical set theory and compositional practice: to show that the same mathematical framework that explains the structure of atonal music also provides tools for compositional planning.
The influence of Forte’s Structure of Atonal Music on graduate music theory curricula in North America was enormous. For roughly two decades (the late 1970s through the early 1990s), set theory was the dominant analytical paradigm for twentieth-century music in American doctoral programs. Its decline relative to transformational and neo-Riemannian theory in the 1990s and 2000s was not because set theory was shown to be wrong but because alternative frameworks — particularly Lewin’s transformation theory — proved capable of asking more musically interesting questions about the same repertory. Set theory remains a fundamental tool, but it is now more often used as one component of a broader analytical toolkit than as a comprehensive method in itself.
7.4 Lewin and Transformational Theory
David Lewin (1933–2003) is the most mathematically sophisticated and philosophically ambitious music theorist of the twentieth century. His Generalized Musical Intervals and Transformations (1987) proposed a fundamental reconception of what music theory is about.
Lewin’s starting point is the generalized interval system (GIS): a set of musical objects, a mathematical group of intervals, and a function that assigns to each ordered pair of objects the interval leading from the first to the second. His deeper proposal was philosophical. He argued that the GIS framework, while mathematically natural, embeds a problematic spatial metaphysics treating musical objects as fixed positions in a space and intervals as distances between them, a metaphysics derived from our experience of moving through physical space. An alternative framework treats music as a network of transformations: operations that carry one musical object to another. The analyst’s question shifts from “what is this note?” to “what transformation connects this note to that one?”, that is, from spatial to processual thinking about musical experience.
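The contrast between the GIS view and the transformational view can be illustrated with the most familiar case, the twelve pitch classes with intervals measured modulo 12. The Python sketch below checks the two defining GIS conditions for that case and then recasts an interval as an operation. It is a toy instance only; the names are hypothetical and nothing here reproduces Lewin’s general formalism.

```python
from itertools import product

SPACE = range(12)  # the twelve pitch classes


def interval(s, t):
    """GIS view: the interval from s to t, measured as a distance mod 12."""
    return (t - s) % 12


def satisfies_gis_conditions():
    """Check that intervals compose (int(r, s) + int(s, t) = int(r, t)) and
    that from any s each interval leads to exactly one t."""
    composes = all(
        (interval(r, s) + interval(s, t)) % 12 == interval(r, t)
        for r, s, t in product(SPACE, repeat=3)
    )
    unique_target = all(
        sum(1 for t in SPACE if interval(s, t) == i) == 1
        for s in SPACE for i in range(12)
    )
    return composes and unique_target


def transposition(n):
    """Transformational view: the same interval treated as an operation."""
    return lambda s: (s + n) % 12


if __name__ == "__main__":
    print(satisfies_gis_conditions())
    t7 = transposition(7)
    print(t7(0), interval(0, t7(0)))
```

The two views carry the same information in this simple case: the interval from C to G is 7, and the operation "transpose by 7" carries C to G. The transformational view generalizes to operations, such as inversions, that do not correspond to a single interval.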
Transformation networks — directed graphs whose nodes are labeled with musical objects and whose arrows are labeled with transformations — represent this processual conception analytically. A passage of Brahms harmony can be represented as a network of transpositions and inversions; a Schenkerian reduction can be recast as a transformation network; a twelve-tone row table is a transformation network. Lewin’s framework is a meta-theory subsuming many existing analytical approaches as special cases, and it has generated an enormous body of subsequent research in mathematical music theory.
7.5 Neo-Riemannian Theory and the Tonnetz
Neo-Riemannian theory emerged in the 1990s from the convergence of Lewin’s transformational framework with renewed analytical interest in the chromatic harmonic language of nineteenth-century music: Schubert, Brahms, Wagner, Liszt. Its central figures are Brian Hyer and Richard Cohn. The name refers to Hugo Riemann’s harmonic dualism, from which neo-Riemannian theory borrows the symmetrical treatment of major and minor triads while abandoning Riemann’s physical and psychological claims about undertones. Its three basic operations each hold two common tones and move the remaining voice by step:
- P (Parallel): maps a triad to its parallel major or minor. C major ↔ C minor. One voice (the third) moves by semitone; root and fifth are held as common tones.
- L (Leading-tone exchange): maps a major triad to the minor triad a major third above, and vice versa. C major ↔ E minor. The root of the major triad moves down by semitone to become the fifth of the minor triad.
- R (Relative): maps a triad to its relative major or minor. C major ↔ A minor. The fifth of the major triad moves up by whole tone to become the root of the minor triad.
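The three operations above are easy to state as functions on triads. In the following Python sketch a triad is represented as a pair of root pitch class and quality; that representation and the function names are assumptions made for illustration rather than a standard encoding.

```python
def P(triad):
    """Parallel: same root, opposite quality."""
    root, quality = triad
    return (root, "minor" if quality == "major" else "major")


def L(triad):
    """Leading-tone exchange: C major <-> E minor."""
    root, quality = triad
    if quality == "major":
        return ((root + 4) % 12, "minor")
    return ((root + 8) % 12, "major")


def R(triad):
    """Relative: C major <-> A minor."""
    root, quality = triad
    if quality == "major":
        return ((root + 9) % 12, "minor")
    return ((root + 3) % 12, "major")


def pitch_classes(triad):
    """The three pitch classes of a triad, as a set."""
    root, quality = triad
    third = 4 if quality == "major" else 3
    return {root, (root + third) % 12, (root + 7) % 12}


if __name__ == "__main__":
    c_major = (0, "major")
    for name, operation in (("P", P), ("L", L), ("R", R)):
        result = operation(c_major)
        shared = pitch_classes(c_major) & pitch_classes(result)
        print(name, result, "common tones:", sorted(shared))
```

Each operation is its own inverse, and each preserves exactly two pitch classes, which is the sense in which these transformations model maximally smooth voice-leading between triads.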
The Tonnetz (“tone network”) is a lattice of pitch classes arranged along axes of perfect fifths, major thirds, and minor thirds, in which each triangle represents a triad and P, L, and R flip a triangle across one of its edges. As a geometric representation it has antecedents in Euler’s Tentamen novae theoriae musicae (1739) and in Riemann’s own theoretical writings, but its modern neo-Riemannian form is primarily the work of Hyer and Cohn. The geometric representation has been extended computationally to explore voice-leading spaces for seventh chords, ninth chords, and other chord types, generating a rich mathematical theory of harmonic proximity and voice-leading efficiency. This work has produced important analytical results for Wagnerian harmony, Liszt’s late piano music, and the chromatic style of Schubert’s late instrumental works.
7.5b Spectral Music and the Theorization of Timbre
While neo-Riemannian theory was reviving and transforming aspects of the Riemannian tradition, a parallel theoretical development was emerging from the spectral music movement in France. Spectral music — associated with composers including Gérard Grisey (1946–1998) and Tristan Murail (b. 1947), and theorized at the Institut de Recherche et Coordination Acoustique/Musique (IRCAM) in Paris — takes the overtone series as its primary compositional material. Rather than building musical structure from scales, modes, or twelve-tone rows, spectral composers build it directly from the acoustic properties of sounds: the specific frequency ratios, durations, and amplitudes of individual partials.
Theorizing spectral music requires a conceptual vocabulary quite different from either Schenkerian or set-theoretic analysis. The fundamental concepts include spectral analysis (the decomposition of a complex sound into its component frequencies), inharmonicity (the deviation of upper partials from whole-number multiples of the fundamental, characteristic of bells, metallophones, and stiff piano strings), and temporal envelope (the amplitude contour of a sound from attack through decay, sustain, and release).
Spectral music theory connects to the broader history of music theory in revealing ways. Rameau’s appeal to the corps sonore — the claim that the overtone series is the physical foundation of harmonic consonance — is both vindicated and complicated by spectral music: vindicated because spectral composers take the overtone series seriously as a compositional resource, complicated because they find in it not just the major triad but an entire world of microtonal, inharmonic, and spectrally complex sounds that Rameau never considered.
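The microtonal character of the overtone series can be quantified by comparing each harmonic with the nearest pitch of twelve-tone equal temperament. The Python sketch below does this in cents (hundredths of an equal-tempered semitone). It assumes an ideal harmonic spectrum and equal temperament as the reference grid; the function name is chosen for illustration.

```python
import math


def partial_deviation_cents(n):
    """How far the n-th harmonic lies from the nearest equal-tempered pitch,
    in cents, measured relative to the fundamental."""
    cents_above_fundamental = 1200 * math.log2(n)
    nearest_tempered = 100 * round(cents_above_fundamental / 100)
    return cents_above_fundamental - nearest_tempered


if __name__ == "__main__":
    for n in range(1, 17):
        print(f"harmonic {n:2d}: {partial_deviation_cents(n):+6.1f} cents")
```

Octave-related harmonics (2, 4, 8, 16) land exactly on tempered pitches and the third harmonic is only about two cents sharp, while the seventh and eleventh fall roughly a third and a half of a semitone away from any tempered note. These are the partials that spectral composers approximate with quarter-tones and other microtonal inflections.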
Murail’s theoretical writings, collected in Modèles et artifices (2004), provide the most systematic account of spectral compositional technique from the inside. His concept of the modulant — a pitch or sound whose spectrum serves as the harmonic raw material for a passage of music — is the spectral analogue of the tonal “tonic,” functioning as the generative center around which musical events are organized. The theoretical vocabulary of spectral music — spectra, modulants, temporal envelopes, and their interactions — represents a genuinely new contribution to music theory, not merely an application of existing tools to new music.
7.6 Lerdahl, Jackendoff, and Generative Music Theory
Fred Lerdahl and Ray Jackendoff’s A Generative Theory of Tonal Music (1983) brought the methods of generative linguistics — particularly Noam Chomsky’s distinction between competence (implicit grammatical knowledge) and performance (actual linguistic behavior) — to bear on tonal music. Their theory aims to describe the implicit musical knowledge of an “experienced listener” through a set of well-formedness rules (specifying which structural descriptions are logically possible) and preference rules (specifying which possible description is most preferred given the musical input).
The theory posits four hierarchical components: grouping structure (how musical events are segmented into motives, phrases, and sections), metrical structure (the hierarchy of strong and weak beats at multiple metric levels), time-span reduction (a hierarchy of structural importance across successive time spans, analogous to Schenkerian reduction but derived from explicit preference rules rather than analytical intuition), and prolongational reduction (a hierarchy of tonal tension and relaxation connecting temporally distant but structurally related events).
7.7 Current Debates and Future Directions
The early twenty-first century finds music theory in a state of productive self-examination touching simultaneously on its methods, its materials, and its politics. Empirical music cognition — the experimental study of how listeners actually perceive and process musical structures — provides a scientific check on theoretical claims previously argued from introspection or analytical authority alone. Experimental work by researchers including David Huron (Sweet Anticipation, 2006) and Carol Krumhansl (Cognitive Foundations of Musical Pitch, 1990) has tested claims about key-finding, expectation, and tonal hierarchy against behavioral and neuroimaging data, sometimes confirming and sometimes revising the claims of traditional theory.
Computational tools including the Humdrum toolkit (developed by David Huron) and the music21 Python library (developed by Michael Cuthbert) enable corpus analysis of large repertories, testing claims about harmonic frequency and progression patterns against statistical evidence from thousands of pieces. This work has produced important revisions of received wisdom: corpus studies have revealed that the “common practice” harmonic conventions described in textbooks do not hold uniformly across the repertory but vary significantly by composer, genre, and historical period.
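The kind of counting that underlies such corpus studies can be illustrated with a minimal sketch. It does not use Humdrum or music21; it assumes that each piece has already been reduced to a list of Roman-numeral labels, and the two "composers" and their progressions below are invented toy data, not results from any real corpus. The function names are likewise hypothetical.

```python
from collections import Counter

# Hypothetical toy data: each progression is a list of Roman-numeral labels.
# These are invented for illustration and do not come from a real corpus.
toy_corpus = {
    "composer_a": [["I", "IV", "V", "I"], ["I", "ii", "V", "I"]],
    "composer_b": [["I", "V", "IV", "I"], ["I", "bVII", "IV", "I"]],
}


def transition_counts(progressions):
    """Count each chord-to-chord succession (bigram) across progressions."""
    counts = Counter()
    for progression in progressions:
        # Pair every chord with the chord that immediately follows it.
        counts.update(zip(progression, progression[1:]))
    return counts


def transition_frequencies(progressions):
    """Return each bigram's share of all successions in the progressions."""
    counts = transition_counts(progressions)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


if __name__ == "__main__":
    for name, progressions in toy_corpus.items():
        print(name)
        for pair, n in transition_counts(progressions).most_common():
            print("  ", pair[0], "->", pair[1], n)
```

Comparing the two tables shows the point made above in miniature: the succession V to IV never occurs in the first toy repertory but does occur in the second, so a single "textbook" transition table would misdescribe at least one of them. A real study would need far larger samples and a defensible method of chord labeling.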
The politics of the canon, that is, the question of whose music gets theorized and by whose analytical standards, has become central to the discipline’s self-understanding. The long dominance of an Austro-German repertory in music theory pedagogy reflects historical exclusions whose intellectual costs are increasingly recognized. Research in the theory of jazz harmony (Steve Larson, Henry Martin, Kent Williams), popular music theory (Richard Middleton, John Covach, Walter Everett), and non-Western music theory (building on ethnomusicological foundations) is expanding the discipline’s analytical toolkit and its sense of what counts as musically interesting structure worth theorizing.
7.8 Jazz Theory, Popular Music Theory, and the Expansion of the Canon
The broadening of music theory’s analytical scope beyond the European common-practice canon has required not only political will but genuine theoretical invention. The harmonic language of jazz, for example, shares some features with common-practice tonality (functional progressions, tonal centers, voice-leading norms) but also differs in fundamental ways that existing theory did not adequately capture.
Steve Larson’s Analyzing Jazz: A Schenkerian Approach (2009) argued that jazz improvisation exhibits prolongational structures analogous to Schenkerian middleground motions, with improvisers navigating between structural chord tones through elaborating passing and neighbor motions. Henry Martin’s work on Charlie Parker and bebop identified motivic-harmonic relationships that persist across improvised variations, showing how jazz improvisation generates new melodic material through systematic transformation of underlying patterns. Kent Williams and Mark Levine (the latter through practical instructional texts like The Jazz Theory Book, 1995) developed the pedagogical theory of jazz harmony, codifying concepts like the tritone substitution — replacing a dominant seventh chord with the dominant seventh chord whose root is a tritone away, since the two chords share the same guide tones (the third and seventh, which merely exchange roles) — and modal interchange (borrowing chords from parallel modes).
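The claim about shared guide tones in the tritone substitution can be checked with simple pitch-class arithmetic. The following is a minimal sketch, assuming the usual integer encoding of pitch classes (C = 0, C-sharp/D-flat = 1, and so on up to B = 11), which ignores enharmonic spelling, so that the C-flat of a D-flat seventh chord is treated as identical to B. The function names are illustrative only.

```python
# Pitch classes as integers mod 12; flat spellings chosen arbitrarily.
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]


def dominant_seventh(root):
    """Return [root, third, fifth, seventh] of a dominant seventh chord."""
    return [(root + interval) % 12 for interval in (0, 4, 7, 10)]


def guide_tones(root):
    """Return the third and seventh of the dominant seventh chord on root."""
    chord = dominant_seventh(root)
    return {chord[1], chord[3]}


def tritone_substitute(root):
    """Return the root a tritone (six semitones) away."""
    return (root + 6) % 12


if __name__ == "__main__":
    g = NOTE_NAMES.index("G")
    sub = tritone_substitute(g)
    for root in (g, sub):
        names = [NOTE_NAMES[pc] for pc in dominant_seventh(root)]
        print(NOTE_NAMES[root] + "7:", names)
    print("shared guide tones:", guide_tones(g) == guide_tones(sub))
```

For G7 the third is B and the seventh is F; for the substitute on D-flat the third is F and the seventh is C-flat, the same pitch class as B. The two tones exchange roles because a major third plus a tritone equals a minor seventh (4 + 6 = 10), and a minor seventh plus a tritone equals a major third an octave higher (10 + 6 = 16, which is 4 mod 12).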
Walter Everett’s The Foundations of Rock: From “Blue Suede Shoes” to “Suite: Judy Blue Eyes” (2008) applied Schenkerian and set-theoretic tools to the analysis of rock harmony, identifying the characteristic features of rock’s harmonic language — modal mixture, pentatonic scale relations, power-chord progressions — and showing how they differ from common-practice harmonic norms. John Covach and Andrew Flory’s What’s That Sound? An Introduction to Rock and Its History provides a model for integrating rock harmonic theory into undergraduate pedagogy.
The theoretical analysis of non-Western music presents different challenges. Victor Kofi Agawu’s African Rhythm: A Northern Ewe Perspective (1995) and Representing African Music (2003) both engage with the theoretical frameworks that Western ethnomusicology has applied to African music and argue that many such frameworks distort the music by imposing inappropriate Western analytical categories. Agawu advocates instead for analytical approaches that emerge from within the musical traditions being studied, drawing on indigenous theoretical concepts where they exist. This methodological argument, over whether analysis should use externally or internally derived theoretical frameworks, is a version of the broader epistemological debate that has run through the history of music theory since the dispute between the Pythagoreans and Aristoxenus.
7.9 The History of Music Theory as Intellectual History
The history of music theory is not merely a chronicle of analytical methods and their development; it is an intellectual history in the full sense — a history of ideas about what music is, how it works, and why it matters. Each of the frameworks surveyed in these chapters embodies a specific set of philosophical commitments: about the relationship between mathematics and perception, between structure and expression, between the authority of tradition and the demands of new compositional practice.
The Pythagorean tradition committed music theory to mathematics as its grounding discipline and to cosmology as its ultimate context. Aristoxenus committed it to perception and trained musicianship. Boethius committed medieval theory to the authority of ancient learning and the priority of the intellect over the senses. Zarlino committed Renaissance theory to the reconciliation of mathematical justification with compositional practice. Rameau committed tonal theory to the derivation of harmonic syntax from physical law. Schenker committed organicist theory to the identification of deep structural unity as the defining property of musical greatness. Babbitt and Forte committed post-tonal theory to mathematical rigor and the methods of formal science. Lewin committed transformational theory to the processual, relational character of musical experience.
The multiplicity of perspectives is not a sign of theoretical failure but of theoretical richness. Music theory is at its most productive when it maintains awareness of its own perspectival character — when it knows that the categories it uses are choices made against a background of alternatives, and that different choices illuminate different aspects of a complex phenomenon. The recurring debates in music theory’s history — ratio vs. perception, mathematical structure vs. expressive function, depth vs. surface, canonical repertory vs. expanded canon — are not merely professional controversies but genuine philosophical problems about the nature of musical experience and the conditions of musical understanding.
7.10 Methodological Pluralism and the Future of the Discipline
Contemporary music theory is characterized by a healthy, if sometimes contentious, methodological pluralism. Schenkerian analysis, pitch-class set theory, transformational theory, neo-Riemannian theory, topic theory, form-functional analysis, empirical corpus analysis, and psychoacoustic music cognition coexist within a single discipline, each illuminating different aspects of musical experience and each making different philosophical assumptions about what counts as musical structure, musical meaning, and musical knowledge.
This pluralism has not always been comfortable. In the 1980s and 1990s, debates between Schenkerian analysts and set theorists over the appropriate methodology for post-tonal music could be acrimonious. In the 2000s and 2010s, debates between speculative theorists and empirical music cognition researchers over the empirical testability of theoretical claims raised fundamental methodological questions. In the 2020s, debates about whose music gets analyzed and by whose standards have raised political and ethical questions that cut to the foundations of the discipline’s self-understanding.
Looking forward, several developments seem likely to shape music theory’s next generation. Machine learning and corpus analysis will continue to expand the scale at which harmonic and melodic patterns can be studied, potentially revealing regularities in large repertories (popular music, world music, music of the distant past) that have been invisible to analysis of individual works. Cross-cultural music theory — the comparative study of music-theoretical systems from non-Western traditions (Indian raga theory, maqam theory from the Arab world and Turkey, Chinese and Japanese modal systems) alongside Western theory — will require new conceptual frameworks that are neither purely Western nor purely relativistic. Computational composition and artificial intelligence raise new theoretical questions: if a machine can generate music indistinguishable (to listeners) from human composition, what does this tell us about what music theory has been modeling? Does it model the cognitive processes of composers and listeners, or abstract structural properties of musical surfaces, or something else entirely?
The questions that motivated the earliest Greek music theorists — what is the relationship between number and sound, between mathematical structure and expressive power, between reason and perception — remain at the center of music theory’s intellectual project. They have been answered differently by every generation, and every answer has revealed something true and something partial. The history of music theory is, in this sense, not a linear progress toward final answers but a continuing conversation — interrupted by discoveries, distorted by ideologies, expanded by new voices and new repertories — about what it means to understand music.
The history of Western music theory, from the Pythagorean monochord to the neo-Riemannian Tonnetz, from Boethian musica mundana to transformational networks and computational corpus analysis, is a history of recurring questions: What is the relationship between mathematical structure and sonic experience? What authority should mathematical derivation have over the trained ear? Whose musical practices deserve theoretical articulation, and by whose standards? What counts as structural depth versus surface ornament, and who decides? These questions do not admit of final answers, but their history (the sequence of frameworks, polemics, revisions, and expansions charted in these pages) constitutes one of the richest intellectual traditions in the history of Western thought about the arts. The discipline’s current self-examination, however uncomfortable, is itself a continuation of this tradition: music theory has always been, among other things, an argument about what matters in music and why.
What unites Pythagoras’s experiment with the monochord, Guido’s hexachord hand, Zarlino’s senario, Rameau’s basse fondamentale, Schenker’s Ursatz, and Lewin’s transformation network is not any single answer to these questions but a common conviction: that music is not just sound but structure, and that structure can be understood, described, and argued about with rigor and precision. The specific forms that rigor and precision take have changed dramatically across two and a half millennia of music-theoretical thought. They will continue to change. But the conviction that musical understanding is possible — that the trained mind can grasp something real and transmissible about how music works — is the continuous thread that runs from the Pythagorean Brotherhood through every tradition, debate, and revision chronicled in these pages. Graduate study in the history of music theory is, ultimately, an initiation into that conviction and into the responsibility it carries: to understand music deeply, to analyze it rigorously, and to remain open to the possibility that the categories we use today will one day require revision by thinkers we have not yet imagined.