MUSIC 373: Form and Musical Analysis
Estimated study time: 3 hr 24 min
These notes draw on William E. Caplin’s Classical Form: A Theory of Formal Functions for the Instrumental Music of Haydn, Mozart, and Beethoven (1998), James Hepokoski and Warren Darcy’s Elements of Sonata Theory: Norms, Types, and Deformations in the Late-Eighteenth-Century Sonata (2006), Charles Rosen’s Sonata Forms (revised ed., 1988) and The Classical Style: Haydn, Mozart, Beethoven (expanded ed., 1997), and supplementary material from Harvard Music 175r (Form and Analysis) and Cambridge Faculty of Music Music Analysis 1 and 2 course resources.
Chapter 1: Phrase Structure — The Elements of Musical Syntax
1.1 The Phrase as the Fundamental Unit
To analyze musical form is to understand how composers organize time — how they guide listeners through a succession of events that feel, in retrospect, inevitable. Every large-scale formal structure, from the smallest rounded binary to the most expansive Beethoven symphony, is built from smaller units, and those smaller units are in turn built from the most fundamental of all formal objects: the phrase. A phrase is a relatively complete musical thought ending in a cadence; it is the minimum unit of musical syntax that delivers a sense of formal closure, however provisional. The phrase bears a loose analogy to the clause in language: it can stand alone (a single statement), combine with others to form longer thoughts (a period), and chain together into yet larger structures (a theme, a section, a movement).
William Caplin, drawing on the German theoretical tradition of Hugo Riemann and Arnold Schoenberg, places the phrase at the center of his analytical system. But Caplin is careful to note that “phrase” in the German Phrase or Satz sense does not map perfectly onto colloquial usage. In his framework, what matters is not length but function: does the passage end in a genuine cadence? Does it express a self-contained harmonic motion from a point of departure to a point of arrival? These questions, not bar-count, determine phrase identity. A four-bar unit that ends on a half cadence is a phrase — an open one. A sixteen-bar unit that ends with a perfect authentic cadence is also a phrase — a closed one, however extended. What cannot be a phrase, in this strict sense, is a passage that trails off into the middle of a harmonic progression, or one that dissolves into a transition before any cadential goal is reached.
The history of phrase as a theoretical concept stretches back to the eighteenth century itself. Heinrich Christoph Koch’s Versuch einer Anleitung zur Composition (1782–1793) contains the first sustained theoretical treatment of phrase structure in the Classical style, and Koch’s categories of Einschnitt (incise), Absatz (phrase), and Periode (period) anticipate many of the distinctions that later theorists would systematize. Koch was writing as pedagogy for students of composition, not as analysis for scholars, and his categories reflect the practical concerns of a working composer: how do you build a satisfying musical sentence? How long should it be? What should its internal divisions feel like? These pedagogical origins remind us that phrase analysis is not merely an academic exercise but a practical skill, intimately connected to the craft of composition.
The normative four-bar phrase is not a mere convention — it has a perceptual basis in the way listeners organize musical time into groupings. The psychologist of music David Huron, in his Sweet Anticipation: Music and the Psychology of Expectation (2006), argues that listeners form hierarchical temporal groupings automatically, and that the four-bar phrase engages these grouping tendencies by providing rhythmic, melodic, and harmonic closure at a timescale that feels “natural” — neither so short as to feel incomplete nor so long as to strain attention. The four-bar phrase, in other words, is not arbitrary: it is calibrated to the perceptual architecture of the listener.
The phrase’s relationship to meter deserves particular attention. In Common Practice tonal music, phrases are metrically organized: they begin on a metrically strong beat (most commonly the downbeat of the first measure) and end with their cadential goal on a metrically strong beat (most commonly the downbeat or the third beat of the final measure). This metrical alignment of phrase boundaries with metrical strong points is not merely a convention but a formal principle: a phrase whose cadence arrives on a metrically weak beat (an “upbeat cadence”) is marked as formally incomplete, requiring a subsequent metrically strong cadential arrival to achieve full closure. The relationship between the phrase’s internal harmonic organization and its outer metrical organization creates a two-dimensional formal grammar: the harmony governs what kind of cadence ends the phrase, and the meter governs where in the measure that cadence arrives. Both dimensions must be attended to simultaneously for a complete phrase analysis.
1.2 Cadential Types and Their Formal Implications
No concept is more fundamental to the analysis of tonal form than the cadence. A cadence is a harmonic-melodic formula that articulates the end of a formal unit; it is the punctuation mark of musical syntax. Just as punctuation in language differentiates the comma (a pause, continuation expected) from the period (full stop, thought complete), cadences in tonal music create a graded hierarchy of closure from the provisional to the definitive. The analyst who can identify cadences and assign them to their types can immediately perceive the formal articulations of any tonal piece — not because cadences determine form unilaterally, but because form is built from cadential landmarks, and form cannot be understood without them.
The concept of “cadence” derives from the Latin cadere (to fall), reflecting the falling motion of the tenor voice in medieval polyphonic cadence formulas. In Renaissance polyphony, the cadence involved a specific voice-leading pattern — typically a suspension resolving to a consonance over a bass that moves by descending fifth — and its recognition depended as much on voice-leading as on harmonic content. In the tonal music of the Common Practice period (roughly 1650–1900), the harmonic dimension of the cadence becomes primary: cadences are defined by specific harmonic progressions, with the voice-leading serving to articulate and confirm those progressions rather than defining them independently.
The deceptive cadence is put to particularly striking use in the first movement of Beethoven’s Piano Sonata in C minor, Op. 13 (Pathétique). Near the end of the exposition’s second theme group, Beethoven prepares what sounds like a definitive PAC in E-flat major, only to swerve to C minor (\(\text{vi}\) relative to E-flat) — a move that simultaneously frustrates the expected cadential closure and recalls, by harmonic color, the turbulent C-minor opening of the movement. The listener feels the formal sleight-of-hand even without knowing the technical term for it: something that should have arrived has been withheld, and in its place is an unexpected reminder of the movement’s shadowed opening world.
The broader analytical principle that cadence types illustrate is the gradation of harmonic closure from weakest (the HC, which stops on the dominant) through intermediate (the IAC, which reaches a root-position tonic but leaves \(\hat{3}\) or \(\hat{5}\) in the soprano) to strongest (the PAC, with tonic in the bass and \(\hat{1}\) in the soprano). The deceptive cadence stands outside this scale, since it withholds the expected tonic altogether. This gradation is not merely theoretical but practically important for formal analysis: only at a PAC does Caplin consider a formal unit to have achieved genuine closure. A passage that ends on an IAC is, in Caplin’s terms, at best provisionally closed: the IAC acknowledges the tonic but does not fully confirm it with the soprano arrival on \(\hat{1}\) that gives the PAC its conclusive authority. The analyst who conflates PAC and IAC, treating any motion to the tonic as equivalent closure, misses the crucial formal hierarchy that governs the Classical style’s phrase organization and risks misidentifying the structural moments of each formal level.
1.3 The Sentence (Satz)
The sentence — Satz in the German theoretical tradition — is one of the two fundamental phrase types in Classical instrumental music. Schoenberg describes the sentence in his posthumous Fundamentals of Musical Composition (1967) as a phrase type characterized by the interplay between “statement,” “restatement,” and “liquidation” — a process in which a clearly characterized motivic idea is stated, confirmed by repetition, and then gradually dissolved into generic continuation material that drives toward a cadence. Caplin refines and systematizes this description in Classical Form, providing a more analytically precise account of the sentence’s internal structure and its relationship to formal function.
A sentence at normative length occupies eight bars and consists of two clearly distinct halves: a four-bar presentation followed by a four-bar continuation→cadence unit. This bipartite structure is not merely a matter of bar-count; it reflects a fundamental opposition between two types of formal activity. The presentation is characterized by motivic identity and harmonic stability: it introduces the theme’s basic idea and confirms it through restatement, remaining close to the tonic. The continuation-cadence unit is characterized by motivic dissolution and harmonic motion: it takes the basic idea apart, accelerates the harmonic rhythm, and drives toward the cadential goal.
- Presentation (measures 1–4 at normative length): a two-measure basic idea (BI) followed by an immediate restatement of that basic idea — either exact (a "repeated" BI) or with slight variation — in measures 3–4. The presentation is harmonically stable, typically remaining on or oscillating between tonic and dominant within the home key. Its rhythm of statement-restatement establishes the theme's motivic identity before the continuation dissolves it.
- Continuation (measures 5–6): Caplin identifies four hallmarks — (a) fragmentation: the basic idea is broken down into a smaller motivic cell; (b) harmonic acceleration: chord changes occur more frequently than in the presentation; (c) increased surface rhythmic activity; (d) sequential or developmental treatment of the fragmented motive.
- Cadential (measures 7–8): a standard cadential progression — most typically \(\text{I}^{6/4} \to \text{V}^7 \to \text{I}\) — delivers the PAC (or HC, or IAC) that ends the sentence. The cadential function is embedded within the continuation section, so the two often blend into a single "continuation→cadential" unit.
The opening theme of Beethoven’s Piano Sonata in G major, Op. 49, No. 2 provides a textbook illustration of the sentence that is worth tracing measure by measure. Measures 1–2 present the basic idea: a simple, stepwise ascending motion from G to D over a tonic chord, harmonically inert and immediately memorable. Measures 3–4 repeat this basic idea with only minimal surface variation, again over the tonic, confirming its identity and creating the statement-restatement symmetry of the presentation. Measures 5–6 fragment the basic idea, seizing on its upward stepwise motion and subjecting it to a descending sequence that simultaneously provides forward momentum and dissolves the clear motivic profile of the presentation. Crucially, the harmonic rhythm accelerates in these measures: whereas the presentation was governed by a single tonic harmony, the continuation moves through three or four harmonies per measure. Measures 7–8 deliver the standard cadential formula — \(\text{I}^{6/4}\) on the third beat of measure 7, resolving to \(\text{V}^7\) and thence to \(\text{I}\) in measure 8 — closing the sentence with an unambiguous PAC.
The sentence principle operates not only at the eight-bar level but at many formal scales. A single two-bar basic idea can itself exhibit a miniature sentence structure if it contains an internal statement-restatement pattern. Conversely, a thirty-two-bar main theme can exhibit sentence organization if its first sixteen bars function collectively as a “presentation” (introducing the theme’s material without significant harmonic departure) and its second sixteen bars function collectively as a “continuation→cadence.” Caplin calls these “large-scale sentences,” and they are common in the slow movements of Beethoven’s late sonatas and string quartets.
The practical task of identifying a sentence in an unfamiliar piece requires the analyst to make a sequence of decisions. First: is there a clearly characterized basic idea (BI) — a two-bar or four-bar gesture that can be heard as a coherent motivic unit? Second: is this BI immediately repeated (either exactly or with variation) without any significant harmonic departure — the presentation’s characteristic statement-restatement pattern? Third: does the repetition give way to a continuation that is more harmonically active, more motivically fragmented, and more metrically propulsive than the presentation? Fourth: does the continuation drive toward a clearly articulated cadential arrival? If the answer to all four questions is yes, the passage is a sentence. The practical difficulty arises most often at the second and third steps: a passage that begins with what seems like a restatement but then departs from the BI’s character immediately is better analyzed as a period’s consequent (beginning with the same material as the antecedent but proceeding differently) than as the restatement of a sentence presentation. The distinction between these cases — sentence restatement versus period consequent — is one of the most common analytical challenges in Classical phrase analysis, and it requires careful attention to both the harmonic and the formal-function dimensions of the passage.
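The four questions above can be read as a short checklist. The following is only an illustrative sketch: the function names and boolean parameters are hypothetical, and each answer still has to come from the analyst's own hearing of the passage, not from any automated detection.

```python
def first_failed_question(has_basic_idea, bi_restated, continuation_follows,
                          reaches_cadence):
    """Return the number (1-4) of the first question answered 'no', or None.

    The four arguments are the analyst's yes/no answers, in the order the
    questions are asked in the text (hypothetical parameter names).
    """
    answers = [has_basic_idea, bi_restated, continuation_follows, reaches_cadence]
    for number, answer in enumerate(answers, start=1):
        if not answer:
            return number
    return None


def is_sentence(has_basic_idea, bi_restated, continuation_follows,
                reaches_cadence):
    """A passage counts as a sentence only if all four answers are yes."""
    return first_failed_question(has_basic_idea, bi_restated,
                                 continuation_follows, reaches_cadence) is None


# A passage whose apparent restatement departs at once from the basic idea
# fails at question 2, the point where a period consequent becomes the
# likelier reading.
print(first_failed_question(True, False, True, True))
print(is_sentence(True, True, True, True))
```

Reporting the first failed question, rather than a bare yes or no, reflects the point made above that most difficulties arise at the second and third steps.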
1.4 The Period (Periode)
The period is the second fundamental phrase type. Where the sentence is organized by the logic of statement-restatement-dissolution, the period is organized by the logic of question and answer: an antecedent phrase ending on a weaker cadence is “answered” by a consequent phrase ending on a stronger cadence. The question-answer structure of the period is one of the most deeply embedded formal archetypes in Western music, and its logic — provisional statement followed by conclusive resolution — appears at every formal level from the phrase to the movement.
- Antecedent phrase (measures 1–4): begins with the basic idea, proceeds through a brief harmonic motion, and ends with a weaker cadence — most typically a half cadence (HC) in the home key, though an IAC is also possible. The HC creates the harmonic "question": something has been initiated but not completed.
- Consequent phrase (measures 5–8): begins with the same basic idea as the antecedent (often literally, occasionally varied in its continuation), proceeds through a harmonic motion that is more conclusive than the antecedent's, and ends with a PAC in the home key. The PAC provides the harmonic "answer" that the antecedent's HC solicited.
The opening of Mozart’s Piano Sonata in A major, K. 331 — the theme of the set of variations that forms the first movement — is perhaps the most frequently cited example of a period in pedagogical literature, and justifiably so. The antecedent (measures 1–4) presents a graceful, singing melody in A major that arrives on the dominant (E major) with a clear HC at measure 4. The melody’s closing gesture — a cadential trill over the dominant harmony — leaves the phrase open: something has been said, but not resolved. The consequent (measures 5–8) begins with the same melodic gesture as the antecedent, follows a parallel harmonic motion, but diverges at measures 7–8 to close definitively on the tonic (A major) with a PAC. The symmetry of question and answer, the parallel construction of the two phrases, and the neat division into 4+4 bars make this one of the most transparent demonstrations of period structure in the repertoire.
The period’s characteristic question-answer logic is not merely a formal convention but a reflection of deeply embedded patterns of musical expectation. The half cadence that ends the antecedent is perceptually incomplete: listeners trained in tonal music automatically hear it as open-ended, as requiring continuation. This incompleteness is not a deficiency but a resource — it directs attention forward and creates anticipation for the consequent’s resolution. The consequent’s PAC, when it arrives, does not merely provide harmonic closure; it fulfills a promise that the antecedent’s HC has made. The satisfaction of this fulfillment is one of the most fundamental pleasures in tonal music, and the period is its most concentrated formal vehicle.
This antecedent-consequent logic extends far beyond the individual eight-bar period. At larger scales, entire sections of a movement can be understood as antecedent and consequent: an exposition that ends in the dominant is the harmonic equivalent of an antecedent (harmonically open), and a recapitulation that ends in the tonic is the harmonic equivalent of a consequent (harmonically closed). In this sense, Caplin’s period principle is a microcosm of sonata form’s tonal logic: the period, at the phrase level, enacts in miniature the same harmonic drama of departure and return that sonata form enacts at the movement level. The period’s pedagogical importance lies partly in this homology: understanding the period’s question-answer structure is the first step toward understanding the much larger tonal logic of the entire Classical formal tradition.
1.5 Hybrid Theme Types
Caplin identifies a set of hybrid theme types that combine elements of sentence and period organization in ways that resist clean categorization as either one or the other. These hybrids are common in the Classical repertoire and reveal the flexibility with which composers deployed the fundamental phrase types.
The hybrid types are important not merely as taxonomic categories but as analytical tools for understanding the expressive choices a composer makes. When a composer uses a sentence, the formal implication is one of motivic drive — the basic idea is stated, confirmed, and then dissolved into urgent continuation. When a composer uses a period, the implication is one of balanced dialogue — the question of the antecedent is answered by the consequent in a relationship of formal symmetry. The hybrid types capture the wide range of expressive possibilities between these poles: a theme may have the motivic drive of a sentence (fragmented continuation) combined with the harmonic profile of a period (antecedent ending on HC), or the balanced antecedent of a period combined with the developmental continuation of a sentence. These intermediate types are among the most interesting formal objects in the Classical repertoire, because they create ambiguity about the theme’s formal implication — the listener is uncertain, on first hearing, whether to expect a consequent (period logic) or a continuation (sentence logic), and the composer exploits this ambiguity for expressive effect.
The compound basic idea (CBI) of Hybrid 3 and Hybrid 4 deserves particular attention because it is the most common of Caplin’s intermediate units. The CBI consists of two sub-ideas, a basic idea (BI) followed by a contrasting idea (CI), that together span four bars without cadential closure. The BI establishes the theme’s primary motivic character; the CI provides a contrasting gesture (different rhythm, different contour, different harmonic implication). Together they form a four-bar unit that feels complete as a gesture but remains harmonically open. What distinguishes the CBI from the antecedent is the absence of cadential articulation: the antecedent ends with a clear HC (harmonically open but cadentially articulated), while the CBI ends with an inconclusive harmonic gesture that does not achieve even HC-level articulation. The CBI is, in a sense, a phrase that has not yet committed to any particular cadential goal, a gesture that is awaiting its continuation. Composers use CBIs in themes that they want to feel more exploratory and less symmetrical than a period, but more motivically cohesive and less driving than a sentence.
- Hybrid 1 (antecedent + continuation): the first four bars function as an antecedent phrase (ending on HC), but the second four bars do not begin with the basic idea (as a consequent would) — instead, they provide a continuation function, fragmenting and developing the material before driving to a PAC. The phrase type has the cadential profile of a period (HC followed by PAC) but the internal organization of a sentence (the second half is developmental rather than parallel to the first).
- Hybrid 2 (antecedent + cadential): the first four bars function as an antecedent (ending on HC), and the second four bars consist entirely of cadential material — a prolonged approach to the PAC without intervening thematic content. This hybrid is rare but found in some Haydn themes.
- Hybrid 3 (compound basic idea + continuation): the first four bars present a "compound basic idea" (CBI) — a two-bar basic idea followed by a two-bar contrasting idea, neither of which contains a full cadence — and the second four bars provide a continuation that drives to a PAC. The CBI differs from the antecedent in that it does not end with an HC; it is melodically and harmonically incomplete, requiring continuation rather than "answering."
- Hybrid 4 (compound basic idea + consequent): the first four bars present a CBI (as in Hybrid 3), and the second four bars begin with the basic idea (as a consequent would) and close with a PAC. This is arguably the most common hybrid type in the Classical repertoire.
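The sentence, the period, and the four hybrids can be summarized as a lookup on the formal function of each four-bar half. The sketch below is a mnemonic, not an analytical tool: the function names and string labels are hypothetical, and it assumes the analyst has already decided what the second two-bar unit is and whether bar 4 brings a cadence.

```python
# Theme type as a function of (first half, second half), following the
# definitions in sections 1.3 to 1.5.
THEME_TYPES = {
    ("presentation", "continuation"): "sentence",
    ("antecedent", "consequent"): "period",
    ("antecedent", "continuation"): "hybrid 1",
    ("antecedent", "cadential"): "hybrid 2",
    ("compound basic idea", "continuation"): "hybrid 3",
    ("compound basic idea", "consequent"): "hybrid 4",
}


def first_half_function(second_unit, ends_with_cadence):
    """Name the first four bars from what follows the opening basic idea.

    second_unit: "repeated basic idea" or "contrasting idea" (bars 3-4).
    ends_with_cadence: True if bar 4 brings a cadence (typically an HC).
    """
    if second_unit == "repeated basic idea":
        return "presentation"
    if second_unit == "contrasting idea":
        return "antecedent" if ends_with_cadence else "compound basic idea"
    raise ValueError(f"unrecognized second unit: {second_unit!r}")


def classify_theme(first_half, second_half):
    """Look up the theme type; combinations not listed return 'unclassified'."""
    return THEME_TYPES.get((first_half, second_half), "unclassified")


first = first_half_function("contrasting idea", ends_with_cadence=False)
print(first, "->", classify_theme(first, "consequent"))
```

The only difference between the antecedent and the compound basic idea in this table is the cadence flag, which mirrors the distinction drawn in the discussion of the CBI.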
1.6 Phrase Extension: Prefix, Suffix, and Internal Expansion
Real musical phrases rarely conform perfectly to the normative four- or eight-bar models. Composers of the Classical period — particularly Haydn, whose wit and formal originality are extensively documented by Rosen — routinely manipulate phrase lengths through three mechanisms: prefix (or anacrusis), suffix (or cadential extension), and internal expansion.
A prefix is material that precedes the basic idea proper and does not itself participate in the phrase’s harmonic or motivic identity. A slow introduction before a sonata-allegro movement is a large-scale prefix; a single measure of tonic prolongation before the melodic material enters is a small-scale one. Prefixes do not “count” as part of the phrase proper: the phrase’s length and internal organization are measured from the first bar of the basic idea, not from the prefix’s beginning. This distinction matters for hypermeter: a two-bar prefix before a four-bar phrase creates a six-bar unit, but the hypermetric organization is still four-bar (the prefix is metrically absorbed into the first hypermeasure of the phrase).
Suffix material (also called cadential extension, closing extension, or post-cadential material) follows the cadence that ends the phrase and prolongs the sense of arrival. A suffix may repeat the cadential formula (V–I several times in succession), add a brief codetta figure, or simply extend the tonic chord reached by the cadence. Suffixes are extremely common in the closing sections of exposition and recapitulation, where repeated cadential affirmations are needed to confirm the formal closure of the exposition or, in the recapitulation, the essential structural closure (ESC) of Hepokoski and Darcy’s Sonata Theory. The suffix does not add new formal content; it elaborates and confirms the cadential arrival already achieved.
Internal expansion occurs when a phrase is lengthened by the insertion or repetition of material within the phrase itself, before the cadence is reached. This is the most formally complex of the three extension types, because the expansion changes the phrase’s internal proportions without altering its fundamental logical structure. A four-bar continuation might be expanded to six bars by repeating a two-bar harmonic sequence; the phrase is now six bars in its continuation section, but it still has the same beginning (the presentation) and the same ending (the cadence). The expanded phrase is, so to speak, stretched in its middle.
These three extension types interact in important ways in the Classical repertoire, and the most dramatically effective phrases often combine all three. A phrase may begin with a two-bar prefix that establishes the tonic atmosphere before the basic idea enters; the basic idea and its continuation proceed for the normative eight bars; the continuation section is then internally expanded by a repeated sequential unit that adds two additional bars; and the cadence, when it finally arrives, is followed by a three-bar suffix of repeated V–I progressions that confirms the arrival. The resulting phrase is fifteen bars long (2 + 8 + 2 + 3) where the normative model would produce eight, and every one of the additional bars has a specific formal justification within the extension framework. Beethoven’s mastery of phrase extension is precisely this: the ability to expand phrases far beyond their normative lengths while maintaining the phrase’s internal formal logic at every point of expansion.
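The bar count in this example can be checked with simple addition. The sketch below takes the eight-bar reading of the normative core and treats each extension type as a separate term; the function and parameter names are hypothetical.

```python
def extended_length(core=8, prefix=0, internal_expansion=0, suffix=0):
    """Total bars of a phrase: the normative core plus each kind of extension."""
    return prefix + core + internal_expansion + suffix


def added_bars(core=8, prefix=0, internal_expansion=0, suffix=0):
    """How many bars the extensions add beyond the normative core."""
    return extended_length(core, prefix, internal_expansion, suffix) - core


# The worked example from the text: 2-bar prefix, 8-bar core,
# 2 bars of internal expansion, 3-bar suffix.
total = extended_length(core=8, prefix=2, internal_expansion=2, suffix=3)
print(total)  # 15
print(added_bars(core=8, prefix=2, internal_expansion=2, suffix=3))  # 7
```

Keeping the three extension types as separate terms matches the analytical claim that each added bar can be attributed to one specific mechanism.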
1.7 Elision and Phrase Overlap
A particularly important formal device in Classical phrase structure is the elision — a point at which the ending of one phrase and the beginning of the next coincide on a single beat or measure. In a normal phrase succession, there is a clear articulation between the cadential arrival (ending the first phrase) and the onset of the next phrase’s basic idea: the PAC arrives, the tonic chord is held for a moment, and only then does the new phrase begin. In an elision, the tonic chord that ends the first phrase simultaneously serves as the first beat of the second phrase’s basic idea, so that there is no moment of rest between the two phrases.
Elision creates a sense of forward propulsion and energy: the formal argument is never allowed to rest, and the listener is carried from phrase to phrase without the brief moment of contemplative closure that a non-elided phrase succession would provide. Haydn and Mozart use elisions with particular frequency at the juncture between the transition and the secondary-theme zone (the S-zone of Hepokoski and Darcy’s Sonata Theory), and between the antecedent and consequent of certain compressed periods.
The formal and perceptual effect of elision depends on context. When the elided cadence is a PAC — the strongest possible cadential arrival — the elision is particularly striking: the listener anticipates a moment of rest (the tonic arrival), and that moment is simultaneously a new beginning, with no intervening silence or held note to separate them. The effect is one of relentless forward motion: the music does not pause to register the arrival because a new departure has already begun. This forward propulsion is one of the defining characteristics of the Classical style’s energetic surface, and it explains why Classical music so often seems to be in motion even when its themes are intrinsically lyrical.
In contrast, when elision occurs at a half cadence — the transitional HC that ends the antecedent of a period, for example — the effect can be more subtle. The antecedent’s open-ended HC seems to demand continuation, and when the consequent begins immediately without any gap, the elision merely accelerates what was already an expected continuation. The formal result is a compressed period whose two phrases move together more seamlessly than the normative eight-bar period’s clear articulation. Schubert and Brahms occasionally use this technique to create lyrical phrases of unusual continuity, in which the internal phrase boundaries are felt as gentle undulations in the melodic line rather than clear structural articulations.
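Elision also affects bar counts: when the last bar of one phrase doubles as the first bar of the next, the passage is one bar shorter than the sum of its phrases. The following sketch assumes that every elision overlaps by exactly one full measure, which is a simplification (an overlap on a single beat within a bar would need finer bookkeeping). The function name and the index convention are hypothetical.

```python
def total_bars(phrase_lengths, elided_joins=()):
    """Bars spanned by consecutive phrases, some of which are elided.

    phrase_lengths: bar count of each phrase, in order.
    elided_joins: indices i such that the final bar of phrase i is also
        the first bar of phrase i + 1 (one shared bar per elision).
    """
    joins = set(elided_joins)
    for i in joins:
        if not 0 <= i < len(phrase_lengths) - 1:
            raise ValueError(f"no phrase boundary at index {i}")
    # Each elision makes one bar count for two phrases at once.
    return sum(phrase_lengths) - len(joins)


print(total_bars([4, 4]))                 # 8: normal succession
print(total_bars([4, 4], elided_joins=[0]))  # 7: compressed period
```

The second call corresponds to the compressed period described above: two four-bar phrases that occupy only seven notated bars.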
1.8 Cadential Evasion and Deceptive Closure
Beyond the standard deceptive cadence (DC: \(\text{V} \to \text{vi}\)), composers employ a range of cadential evasion strategies — techniques for approaching and then avoiding a PAC — that are important for phrase-level analysis. Understanding these strategies requires distinguishing between the preparation for a cadence (the approach through the pre-dominant and dominant harmonies) and the cadential arrival itself (the final \(\text{V} \to \text{I}\) resolution).
- Deceptive resolution: \(\text{V} \to \text{vi}\) (or another non-tonic chord) instead of \(\text{V} \to \text{I}\).
- Abandoned cadence: the dominant is reached but then abandoned without resolving, moving instead to a new harmonic progression.
- Dissolving cadence: the \(\text{V} \to \text{I}\) resolution occurs, but the tonic is reached in an unstable position (inversion, weak beat) that prevents full closure.
- Evaded cadence (Caplin): the dominant is followed by a return to the tonic in first inversion (\(\text{I}^6\)), which is not stable enough to serve as a cadential arrival. In Caplin's framework, this is the most common form of evasion and typically leads to a repetition or extension of the cadential approach.
The evaded cadence is particularly important in Caplin’s analytical framework. When a cadential approach ends on \(\text{V} \to \text{I}^6\), reaching the tonic in first inversion rather than root position, the formal argument is not complete: the phrase must continue and attempt another cadential arrival, typically at the same pitch level or in the same harmonic region. Each renewed attempt adds bars to the phrase, and these extensions after an evaded cadential approach are common in Beethoven’s more dramatic themes.
The practical skill of identifying cadential evasions requires careful attention to register and bass voice. A \(\text{V} \to \text{I}\) resolution in which the tonic note is in the bass (root position) and \(\hat{1}\) is in the soprano is a PAC; the same harmonic motion with the tonic in first inversion (\(\text{I}^6\), third of the chord in the bass) is not a PAC: in Caplin’s framework it is an evaded cadence, though some textbooks label it an inverted IAC. The bass note is often the most reliable indicator of cadential strength: root-position tonic in the bass, with \(\hat{1}\) in the soprano, is the maximum degree of closure; any other configuration is weaker. Analysts who focus exclusively on the harmonic content (\(\text{V} \to \text{I}\)) without attending to the voice-leading positions risk treating structurally weak cadences as equivalent to structurally strong ones. Developing the habit of always checking the bass and soprano voices at moments of cadential approach is one of the most important practical skills for cadential analysis.
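The habit of checking bass and soprano can be written out as a small decision procedure. This is a deliberately reduced sketch: it assumes the arrival has already been prepared by a dominant, it follows the evaded-cadence definition given in the list above for \(\text{I}^6\), and it ignores metric placement, so the abandoned and dissolving types are not modeled. The function name and its inputs (a Roman numeral string and two scale-degree integers) are hypothetical.

```python
def classify_arrival(goal_chord, bass_degree, soprano_degree):
    """Label a cadential arrival from its goal chord and outer voices.

    goal_chord: Roman numeral reached at the arrival, e.g. "V", "I", "vi".
    bass_degree, soprano_degree: scale degrees (1-7) of the outer voices.
    Assumes a dominant precedes the arrival unless goal_chord is "V" itself.
    """
    if goal_chord == "V":
        return "HC"            # phrase stops on the dominant
    if goal_chord != "I":
        return "deceptive"     # dominant resolves to a non-tonic chord
    if bass_degree != 1:
        return "evaded"        # tonic arrives in inversion (e.g. I6)
    # Root-position tonic: the soprano decides between PAC and IAC.
    return "PAC" if soprano_degree == 1 else "IAC"


print(classify_arrival("I", bass_degree=1, soprano_degree=1))   # PAC
print(classify_arrival("I", bass_degree=1, soprano_degree=3))   # IAC
print(classify_arrival("I", bass_degree=3, soprano_degree=1))   # evaded
print(classify_arrival("vi", bass_degree=6, soprano_degree=1))  # deceptive
```

The bass is tested before the soprano, in line with the remark that the bass note is usually the more reliable indicator of cadential strength.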
Chapter 2: Small Forms — Binary and Ternary
2.1 Binary Form: The Two Reprises
Binary form is, historically, one of the oldest formal structures in Western music, underlying the dances of the Baroque suite — allemande, courante, sarabande, gigue — and the keyboard sonatas of Domenico Scarlatti. In its simplest definition, binary form consists of two sections (reprises), each typically repeated: ||: A :||: B :||. The two sections are distinguished primarily by their tonal relationship: the A section begins in the tonic and moves to a related key, while the B section returns from that key to the tonic. The repetition signs are functionally important: each reprise must be heard twice for the binary logic to register fully, since the first hearing establishes the material and the second allows the listener to hear it against the backdrop of their now-formed expectations.
- The first reprise (A) begins in the tonic key and moves, typically, to the dominant (in major-mode pieces) or to the relative major (in minor-mode pieces) by its close. If the first reprise ends away from the tonic, whether with a PAC in the new key or with an HC or tonicized dominant arrival that does not achieve PAC-level closure, the form is harmonically open (or "continuous"); if it instead closes with a PAC in the tonic key, the form is harmonically closed (or "sectional").
- The second reprise (B) begins in or near the new key, passes through a section of tonal instability, and returns to the tonic key, closing with a PAC in the tonic. Typically the B section is longer than the A section — often substantially so — reflecting the greater formal weight of the journey home compared to the initial departure.
The distinction between continuous and sectional binary form is crucial for understanding the form’s tonal logic. In a continuous binary form, the first reprise ends with a PAC in a non-tonic key (most often the dominant), and the piece cannot be said to have returned home until the end of the second reprise. The listener is pulled forward across the double bar: the first reprise’s ending, though locally closed, is globally open. In a sectional binary form, the first reprise closes with a PAC in the tonic — the piece momentarily returns home before the second reprise begins. The effect is more self-contained: the two reprises feel like equal halves of a whole rather than a departure and a return.
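The sectional/continuous test reduces to a single question about the end of the first reprise, which can be sketched as follows. The function name and the string labels for cadence types and keys are assumptions made for this illustration.

```python
def binary_tonal_type(cadence: str, cadence_key: str) -> str:
    """Classify a binary form by the cadence that ends its first reprise.

    cadence: "PAC" or "HC" (labels assumed for this sketch)
    cadence_key: "tonic", "dominant", "relative major", ...
    """
    closes_at_home = cadence == "PAC" and cadence_key == "tonic"
    return "sectional" if closes_at_home else "continuous"


print(binary_tonal_type("PAC", "dominant"))
```

Note that a PAC in the dominant still yields "continuous": the reprise is locally closed but globally open, exactly as described above.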
The binary form’s characteristic proportional asymmetry — the second reprise is typically longer than the first — reflects a fundamental truth about tonal music’s formal logic: departing from the tonic is easier (and faster) than returning to it. The first reprise needs only to establish the tonic and move to the secondary key, a process that in a well-designed binary takes relatively few bars. The second reprise, by contrast, must take its starting point in the secondary key, venture through a region of tonal instability or exploration, and then return convincingly to the tonic — a more complex formal trajectory that naturally requires more bars. This asymmetry is not a defect but a feature: the second reprise’s greater length gives the return of the tonic at the end of the piece its formal weight. If the two reprises were equal in length, the tonic return would feel merely symmetrical; its asymmetric arrival, after a longer and more harmonically complex journey, gives it the character of an achieved resolution rather than a symmetric echo.
The binary form’s two-reprise repeat structure also has an important perceptual dimension. In the standard performance tradition, each reprise is repeated: the form is heard as A–A–B–B (with first and second endings differentiating the repeat exit from the original). This means the listener hears the first reprise twice, then the second reprise twice, rather than the single traversal the notation suggests. The repeated hearings serve a specific perceptual function: on the first hearing of each reprise, the listener is learning the material; on the second hearing, they can hear it against the backdrop of now-established expectations, perceiving the harmonic and phrase-level organization with greater clarity. The repeat convention is therefore pedagogically integrated into the form itself: the form teaches the listener how to hear it through the repetition it requires.
2.2 Simple Binary and the Baroque Dance Suite
The simple binary form — in which the second reprise develops the material of the first but does not bring back the opening material in any recognizable way — is characteristic of the Baroque dance suite. Bach’s six English Suites, six French Suites, and six Partitas for keyboard each consist of a collection of dances (allemande, courante, sarabande, gigue, and optional insertions) in simple or rounded binary form, all in the same key. The unity of key across all movements, combined with the variety of meter, tempo, and character within each dance, creates the suite’s distinctive formal character.
The sarabande — a slow, triple-meter dance of Spanish-Caribbean origin — consistently exhibits the most harmonically exploratory second reprises in Bach’s suites, often passing through remote flat-side tonalities and employing extended harmonic sequences before returning to the tonic. The gigue, by contrast, tends to move more quickly through its harmonic detour, often inverting the subject at the beginning of the second reprise (a contrapuntal technique that creates an immediate sense of motivic freshness without departing radically from the first reprise’s material). These genre-specific conventions reveal that “binary form” is not a single, fixed entity but a family of related forms whose internal characteristics are shaped by the expressive conventions of each dance genre.
Scarlatti’s more than five hundred keyboard sonatas are the supreme achievement of simple binary in the late Baroque. Each sonata is a single movement in binary form, and Scarlatti’s technical and harmonic inventiveness ensures that no two are alike in their internal organization despite sharing the same outer formal container. In Scarlatti’s mature sonatas, the first reprise frequently ends in the dominant, and each reprise turns on what Ralph Kirkpatrick called the “crux”: the point at which the closing key is established and from which the two halves proceed in parallel. The second reprise often introduces new thematic material rather than developing the material of the first reprise. This practice of introducing genuinely new themes at the beginning of the second reprise is one of the most striking features of Scarlatti’s formal style: it gives his binary forms an episodic, adventurous quality quite different from the developmental second-reprise logic of Bach’s suites, and it anticipates (without directly causing) the two-theme contrast of Classical sonata form’s exposition.
The analytical challenge of simple binary is to track the relationship between the first and second reprises without the help of the “return” event that marks rounded binary so clearly. In simple binary, the second reprise never restates the opening material in the tonic: its relationship to the first reprise is primarily tonal (moving from the dominant back to the tonic) rather than a matter of thematic return. The analyst must therefore attend carefully to the second reprise’s harmonic trajectory (how it departs from the secondary key established at the end of the first reprise, what harmonies it passes through, and how it manages the return to the tonic) rather than waiting for a recognizable thematic event to mark the formal structure.
2.3 Rounded Binary Form
The most important variant of binary form for the history of Western music is rounded binary — the formal template from which sonata form eventually evolved. In rounded binary, the second reprise is divided into two subsections: a departure section (the “B section” proper) that ventures harmonically away from the home key, followed by a partial or complete return of the opening A material, now in the tonic, that completes the second reprise. The return of A material is the formal event that distinguishes rounded binary from simple binary.
- A is the first reprise: a phrase or small theme in the tonic, closing in the dominant (continuous) or returning to the tonic (sectional).
- B is the first half of the second reprise: harmonically unstable, often brief, ending with a dominant preparation (retransition) that sets up the return of A'.
- A' is a partial or complete return of the opening A material, now harmonically adjusted to remain in or return to the tonic throughout. The return of A' is the formal "rounding" that gives the form its name.
The relationship between rounded binary and sonata form is a matter of formal genealogy. Sonata form grew out of rounded binary: the A section of rounded binary corresponds to the exposition of a sonata; the B section corresponds to the development; and the A’ section corresponds (loosely) to the recapitulation. The key difference is that sonata form elaborates each of these sections enormously and introduces the crucial new element of a second theme in the dominant (the S-zone), whose recapitulation in the tonic creates the form’s defining tonal argument. Simple rounded binary lacks this two-theme opposition; its A section is a single theme, and its formal argument is purely about the return of that theme in the tonic.
The transitional zone between rounded binary and sonata form is occupied by what theorists call the sonatina or “small sonata form”: a form with two themes in the exposition (tonic and dominant) but no development section between the exposition and the recapitulation. Sonatina movements, found in some early Classical keyboard works and in many slow movements of the mature Classical style, thus have the harmonic logic of sonata form (two themes, dominant and tonic) without the developmental complexity that is sonata form’s most distinctive formal feature. The slow movement (Adagio molto) of Beethoven’s Piano Sonata in C minor, Op. 10, No. 1, is a well-known example: the exposition presents two themes in tonic and dominant, the recapitulation returns both in the tonic, and there is no intervening development section, only a brief link. Identifying whether a movement is a rounded binary, a sonatina, or a full sonata form thus requires careful attention to the presence or absence of two features: the two-theme contrast (distinguishing rounded binary from sonatina/sonata) and the development section (distinguishing sonatina from full sonata form).
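The two-feature test just described can be written out as a short decision function. This is only a sketch: the function name and its boolean parameters are assumptions introduced here, and it presupposes that the movement already shows a first part followed by a tonic return of the opening material.

```python
def classify_small_form(two_theme_contrast: bool, has_development: bool) -> str:
    """Apply the two diagnostic features in the order given in the text."""
    if not two_theme_contrast:
        # A single theme whose return in the tonic is the whole argument.
        return "rounded binary"
    # Two themes (tonic and dominant): the development decides the rest.
    return "sonata form" if has_development else "sonatina"


print(classify_small_form(two_theme_contrast=True, has_development=False))
```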
2.4 Ternary Form and the Da Capo Aria
Ternary form — ABA’ — is superficially similar to rounded binary but structurally distinct in one crucial respect. In ternary form, the three sections are genuinely independent: each is a complete, harmonically closed entity. The A section closes in the tonic; the B section is a separate, contrasting piece (often in a contrasting key, frequently in a contrasting mode, always in contrasting character) that also achieves formal closure on its own terms; and the return of A (A’) is a literal or elaborated restatement of the first A section, again fully closed. The three sections of a ternary form are like three separate rooms in a house: each is self-contained, and the house is unified by the fact that one room (A) comes both first and last.
In rounded binary, by contrast, the B section is not self-contained — it is harmonically dependent on what follows. The B section of a rounded binary ends on the dominant and requires the A’ section to resolve it. The difference is not merely academic: it determines whether the middle section can “stand alone” as a musical statement. In ternary form, the B section could be performed independently without harmonic incompleteness; in rounded binary, the B section could not.
- A: a complete, tonally closed formal unit (a period, small binary, or rounded binary) in the tonic. Establishes the piece's primary character and thematic material.
- B: a contrasting, tonally closed formal unit, typically in a related key and of contrasting character (different mode, tempo, texture, or affect). The B section may be simpler or more complex than A; its function is contrast, not development.
- A': a literal (da capo) or ornamented restatement of the A section, in the tonic. The return of A' closes the ternary form and restores the initial character after the contrast of B.
The da capo aria of the Baroque period, the dominant form of operatic and oratorio arias in the age of Handel and Bach, is the archetypal ternary form. The singer performs the A section (presenting the aria’s primary affect and text in the tonic key), proceeds to a contrasting B section (shorter, in a related key, often with a contrasting text expressing a secondary emotion), and then returns to the A section from the beginning (da capo = “from the head”), typically adding improvised ornaments to demonstrate expressive and technical virtuosity. The da capo structure creates a satisfying formal symmetry while providing ample scope for affective contrast within the B section and ornamental elaboration at the A’ return.
Handel’s “Lascia ch’io pianga” from the opera Rinaldo (HWV 7, 1711) exemplifies the form: a serene A section in F major, built on a sarabande rhythm in slow triple meter, is followed by a shorter B section that turns toward the relative minor, and then by the da capo return of A. (The still more famous Ombra mai fu from Serse, HWV 40, 1738, is known as “Handel’s Largo” although Handel marked it larghetto; the misnomer dates from nineteenth-century arrangements. It is sometimes cited in this connection, but it is a short through-composed aria in F major without a da capo.) The ternary structure provides the formal frame within which Handel’s long melodic arches can unfold.
The dal segno aria (dal segno = “from the sign”) is a modification of the da capo in which, instead of returning to the very beginning of the A section, the singer returns to a marked point (the segno, 𝄋) partway through the A section, omitting the initial ritornello (orchestral introduction) that often preceded the vocal entry. This convention shortened da capo arias considerably in performance and was particularly common in mid-eighteenth-century opera seria. The dal segno procedure does not alter the fundamental ternary logic (the formal argument is still ABA’), but it changes the proportional balance between the three sections, since A’ is shorter than A.
A critical analytical issue in ternary form is the distinction between the B section as a formal entity and the trio as a formal entity. Both are “middle sections” of a compound structure, but the B section of a small ternary and the trio of a compound ternary differ in scope and internal organization. The B section of a small ternary (as in a da capo aria) is itself a complete, harmonically closed section that may be of any length and internal structure. The trio of a compound ternary (as in a minuet-with-trio or a scherzo-with-trio) is typically a complete rounded binary form in its own right — a formal unit of comparable length and organizational complexity to the outer sections. The distinction matters for analysis because the internal formal organization of the trio must be traced on its own terms, not merely characterized as “contrasting material.”
2.5 Compound Ternary: The Minuet-with-Trio
The most important compound form in the Classical period is the minuet-with-trio: both the minuet and the trio are themselves rounded binary forms, and these two rounded binaries are combined into a larger ternary scheme: Minuet (rounded binary) — Trio (rounded binary) — Minuet da capo. The result is a compound ternary whose outer sections (the Minuet) and inner section (the Trio) are each internally complex, hierarchical formal units.
The trio’s function within the compound ternary requires careful analytical attention. The trio is conventionally in a related key — most commonly the subdominant or the parallel minor if the minuet is in major, the relative major if the minuet is in minor — and its character is typically contrasting with the minuet’s: lighter in texture, slower in surface rhythm, more lyrical in melodic style, or more static harmonically. The key-area contrast between minuet and trio is a structural feature, not merely an aesthetic preference: the movement to a related key in the trio creates a large-scale tonal departure analogous to the movement to the secondary key in a binary or sonata exposition. When the minuet returns da capo, the tonic key is restored, and the compound ternary’s large-scale tonal argument — minuet in tonic, trio in related key, minuet da capo in tonic — is analogous to a large-scale binary or ternary at the movement level.
The conventional omission of repeats in the da capo minuet return is a practical as well as a formal convention. Since the minuet has already been heard in full (with all its internal repeats) at the movement’s beginning, repeating its repeats in the da capo return would make the movement excessively long; the convention of playing the da capo “senza replica” (without repeats) saves time and prevents the return from outstaying its welcome. Analytically, the da capo without repeats is still formally a complete rounded binary — all the material is present — but its proportions are compressed, giving the movement’s third section a quality of abbreviated recall rather than full restatement.
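The effect of the senza replica convention on the movement's proportions can be made concrete by listing what is actually heard. The sketch below assumes that minuet and trio are each a two-reprise form with both reprises repeated on first hearing; the function names and labels are hypothetical.

```python
def reprises(name: str, with_repeats: bool) -> list[str]:
    """Return the performed order of one two-reprise dance."""
    first, second = f"{name}: reprise 1", f"{name}: reprise 2"
    if with_repeats:
        return [first, first, second, second]
    return [first, second]


def performed_order(da_capo_repeats: bool = False) -> list[str]:
    """Minuet and Trio with repeats, then the Minuet da capo."""
    return (
        reprises("Minuet", True)
        + reprises("Trio", True)
        + reprises("Minuet da capo", da_capo_repeats)
    )


for unit in performed_order():
    print(unit)
```

With the default setting the da capo contributes two units instead of four, which is the compression of proportions described above.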
Beethoven’s replacement of the minuet by the scherzo in his mature works is formally conservative and expressively radical. The scherzo preserves the minuet-with-trio’s compound ternary structure (a scherzo, a trio, and the da capo of the scherzo, with both scherzo and trio normally built as rounded binaries), but it replaces the minuet’s courtly grace with a fierce, often violent rhythmic energy and an unpredictable, sometimes genuinely bizarre character. The scherzo of Symphony No. 9, Op. 125 (in D minor, marked Molto vivace, and moving so rapidly that the 3/4 meter sounds like a brutal perpetual motion) could not be more different in character from a Baroque minuet, yet its large-scale plan of dance, trio, and da capo is the same as that of the paired Minuets from Bach’s Cello Suite No. 1 in G major, BWV 1007.
2.6 Formal Contrast in Binary and Ternary
The analyst must develop sensitivity to the distinction between contrast that is genuinely binary or ternary in formal implication and contrast that is merely thematic. Two themes that are different in character but that participate in a continuous harmonic motion toward a single cadential goal do not constitute a binary or ternary form: they are a two-theme exposition, or a period with a contrasting continuation, or some other formal unit. The formal distinction between binary and ternary rests not on the presence of contrasting material but on the tonal and cadential structure: are the sections harmonically closed and self-sufficient? Do they each achieve PAC-level closure before the next section begins? These questions, not questions about thematic character alone, determine the form’s classification.
The key diagnostic criterion for ternary form is the harmonic independence of the middle section. In a true ternary, the B section opens after a full authentic cadence has closed the A section in its own key, and the B section itself may establish a contrasting key: most commonly the relative major if the outer sections are in minor, or the subdominant or relative minor if the outer sections are in major. The B section closes, typically with a PAC in its own key or a half cadence preparing the return, before A’ begins. This pattern of [close, new key or harmonic area, close, return] is the hallmark of genuine ternary logic. In a binary form, by contrast, the two sections share a continuous harmonic trajectory: the first section opens in the tonic and typically closes on the dominant or in a secondary key, and the second section opens from that point, making it harmonically dependent on the first section. The second section of a binary form cannot stand alone harmonically the way the B section of a true ternary can.
This distinction has practical analytical consequences. The rounded binary — in which the second section begins with developmental material before returning to the opening theme in the tonic — is frequently confused with ternary by students, because the returning A’ theme creates an audible three-part texture (A — developmental middle — A’). But the rounded binary is binary, not ternary, for a decisive tonal reason: the developmental middle of the second section is harmonically dependent on the first section (it begins in the secondary key where the first section ended) and therefore cannot stand alone as a self-sufficient B section. The ternary’s B section, by contrast, begins in a harmonically fresh area after the A section has closed completely.
Caplin’s analytical vocabulary is useful here. In his framework, small ternary form requires that the A section end with a PAC in the tonic (a full “tight-knit” close), the B section constitute a “contrasting middle” that introduces new material in a different key or at least a different harmonic area, and A’ return with the opening theme’s tonic-establishing cadential closure. The rounded binary, in Caplin’s terms, has its second section begin with middle function rather than beginning function — it is a continuation of the first section’s harmonic journey, not a new harmonic beginning. The formal-function distinction (beginning function vs. middle function at the opening of the second large section) thus provides an analytical criterion that supplements the tonal criterion of harmonic independence.
Practically, the analyst should ask three questions when encountering a potentially binary or ternary form: First, does the first section end with a PAC in the tonic, or does it end in a secondary key? If the latter, the form is likely binary. Second, does the middle section open with beginning function in a new tonal area, or does it open with middle function continuing from the secondary key? If the former, the form is likely ternary. Third, could the middle section be performed in isolation and make harmonic sense as a self-sufficient musical unit? If yes, the ternary interpretation is strongly supported. These three questions together give the analyst a reliable framework for distinguishing formal types in the absence of explicit composer labeling.
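The three questions can be tallied mechanically as a first pass before closer analysis. The sketch below is a deliberate simplification: the function name, the parameter names, and the rule of counting answers as equal votes are all assumptions made here, and the "mixed evidence" result stands for the cases the text would resolve by judgment.

```python
def binary_or_ternary(first_section_pac_in_tonic: bool,
                      middle_begins_in_new_area: bool,
                      middle_stands_alone: bool) -> str:
    """Tally the three diagnostic questions; True points toward ternary."""
    votes = sum([first_section_pac_in_tonic,
                 middle_begins_in_new_area,
                 middle_stands_alone])
    if votes == 3:
        return "ternary"
    if votes == 0:
        return "binary"
    return "mixed evidence"   # weigh the individual answers by hand


print(binary_or_ternary(False, False, False))
```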
2.7 The Scherzo and Its Formal Character
The scherzo — which Beethoven systematically substituted for the minuet in his mature symphonies, string quartets, and piano sonatas — is formally identical to the minuet-with-trio at the structural level (compound ternary, each section a rounded binary) but differs radically in character, tempo, and expressive intent. The word scherzo derives from the Italian for “joke” or “play,” and the form indeed carries a quality of unpredictability, rhythmic violence, and sometimes genuinely comic wit that is absent from the stately minuet.
- Tempo extremes: the scherzo of Symphony No. 9, Op. 125, is marked Molto vivace and moves so rapidly in 3/4 that it sounds like an aggressive perpetual motion — the triple meter is felt as a single beat per measure rather than three.
- Repeated sections: in some scherzos (Symphony No. 4, Op. 60; Symphony No. 7, Op. 92), Beethoven repeats the Trio and the da capo Scherzo a second time, creating a five-part compound structure: Scherzo–Trio–Scherzo–Trio–Scherzo. This extends the conventional three-part plan into a five-part arch.
- Metric displacement: the scherzo of Symphony No. 3 ("Eroica"), Op. 55, places strong sforzando accents on metrically weak beats, systematically disorienting the listener's sense of the downbeat for extended passages.
- Programmatic integration: in Symphony No. 6 ("Pastoral"), Op. 68, the scherzo ("Merry gathering of country folk") runs without a break into a sudden storm (the fourth movement), which cuts off the scherzo's final return before it can cadence, so that the compound ternary structure is left open by the programmatic narrative.
Brahms’s scherzos — in the Piano Sonata in F-sharp minor, Op. 2, the Piano Quintet in F minor, Op. 34, and others — carry the Beethoven scherzo tradition into a Romantic harmonic language of considerable density and weight. The Scherzo of Op. 34 is formally orthodox (compound ternary with rounded binary sections) but harmonically adventurous: its B section moves through a sequence of chromatic mediant relations that would have been unthinkable in a Classical minuet.
Chapter 3: Rondo Forms
3.1 The Rondo Principle
The rondo principle is one of the most intuitive in all of musical form: a recurring refrain (R) alternates with contrasting episodes (A, B, C), creating a form of the type R–A–R–B–R (five-part rondo) or R–A–R–B–R–A–R (seven-part rondo). The refrain is always in the tonic; the episodes provide harmonic and thematic contrast. The rondo’s appeal is partly psychological: the return of the familiar, after a contrasting episode, carries an expressive weight that grows with each iteration. By the final return of the refrain, the listener has a history with the theme — has heard it depart and return, has experienced its stability against the episodes’ instability — and its final restatement carries an accumulated significance that a single statement could not.
The rondo’s appropriateness as a finale form is not accidental. In multi-movement works (symphonies, sonatas, concertos, string quartets), the finale faces a formal challenge quite different from that of the first movement: it must provide a sense of culmination and conclusion for the entire work, not merely a satisfying internal formal structure. The rondo achieves this in a way that sonata form does not: the refrain’s final return, after all the episodes and their departures, carries the accumulated formal weight of the entire movement’s journey. The finale’s rondo logic — departure and return, multiple times — mirrors the larger journey of the multi-movement work: from opening to close, through multiple movements of varying character, back to the home key for the finale’s affirmative conclusion. The rondo’s spirit of light return, of playful insistence on the home key, makes it the natural finale form for works that end in a spirit of resolution and joy rather than tragic finality.
The key-area logic of the episodes also contributes to the rondo’s finale function. The first episode in the dominant and the second episode in the relative minor or subdominant retrace, in miniature, the harmonic journey that the whole multi-movement work has undertaken. When the refrain returns for the last time, it does so after having confirmed that all the harmonically distant territories visited during the rondo have been fully explored and left behind: the home key is not merely a starting point but a confirmed destination. This “confirmatory” formal logic distinguishes the rondo finale from the sonata-form first movement, whose drama is about the generation and resolution of tonal conflict. The rondo’s drama is about the persistence of the home key against all departures: a different, more affirmative tonal argument, perfectly suited to a formal conclusion.
- R: Refrain in the tonic — a closed, self-contained theme (period, small binary, or similar).
- A: First episode, typically in the dominant (major) or relative major (minor). Contrasts with the refrain in character, texture, or thematic material.
- R: Return of the refrain in the tonic (often somewhat shortened or with minor variation).
- B: Second episode, often in a more distant key (the subdominant, submediant, or parallel minor). Sometimes more developmental in character than A.
- R: Final return of the refrain in the tonic, often extended or with a coda.
3.2 Classical vs. Baroque Rondeau
The Classical rondo must be distinguished from the Baroque rondeau, which François Couperin cultivated extensively in his four volumes of Pièces de clavecin (1713–1730). The Baroque rondeau is shorter and more intimate than the Classical rondo: its refrains are typically four or eight bars long, its episodes are brief harmonic excursions rather than substantial contrasting themes, and the form moves through its R–A–R–B–R–C–R plan with a lightness and speed quite different from the weight of a Classical rondo finale. Couperin’s “Les Barricades mystérieuses” from the sixth ordre of the second book (1717) is one of the most celebrated examples: a mysterious, dark-textured piece in B-flat major whose refrain returns like a recurring enigma after each brief, contrasting episode.
The Classical rondo, as developed by Haydn, Mozart, and Beethoven in the finales of their piano concertos, string quartets, and sonatas, is a considerably more substantial form. The episodes are full-scale thematic areas, often in related keys and with independent melodic content; the refrains may be substantially longer than those of the Baroque rondeau; and the overall proportions of the movement are comparable to those of a sonata-form movement. A rondo finale in a Mozart piano concerto typically has an elaborate first episode in the dominant, a varied second refrain, a second episode that may develop material from the refrain, and a final refrain that expands into a substantial coda: a complete, multi-section movement of considerable formal complexity.
The distinction between the Baroque rondeau and the Classical rondo is not merely one of scale but of formal logic. In the Baroque rondeau, the episodes are typically short harmonic excursions that do not establish an independent tonal identity: they are closer to developmental episodes within the refrain’s tonal sphere than to genuinely contrasting secondary key areas. The Classical rondo episode, by contrast, is typically a complete thematic area in a specific related key — the dominant, the relative major, or the subdominant — with its own thematic material, internal phrase structure, and cadential closure before the refrain returns. This key-area logic gives the Classical rondo a tonal dimension that the Baroque rondeau lacked, and it is this tonal dimension that allows the sonata-rondo to emerge as a hybrid of the rondo’s thematic return principle and the sonata’s tonal opposition principle. The Baroque rondeau could not serve as the basis for a sonata-rondo precisely because its episodes lack the tonal definition necessary to serve as S-zone equivalents.
3.3 Seven-Part Rondo
The seven-part rondo (R–A–R–B–R–A–R) expands the five-part plan by adding a return of the first episode (A) before the final return of the refrain. This creates a palindromic arch (the form is symmetrical around the B section at its center) and significantly extends the overall duration. The second appearance of the A episode may be identical to its first appearance, may be varied or ornamented, or may be in a different key from its first appearance.
The structural advantage of the seven-part over the five-part rondo is primarily one of proportional balance. In a five-part rondo, each episode appears only once, so neither A nor B is ever confirmed by a return, and the form simply ends with the third statement of the refrain. In the seven-part rondo, the A episode appears twice and the final return of R comes after A’s second appearance, so the form’s palindromic symmetry is felt as a formal completion: A’s return has the character of a recapitulation, and the final R rounds off the structure with a sense of conclusive arrival. The B section at the center of this symmetry acquires a special weight as the formal and harmonic apex of the movement, the point of maximum departure from which the second half of the rondo is a gradual return.
The key conventions governing the A episode’s second appearance are instructive. In the Classical period, the A episode typically appeared first in the dominant key (in major-mode movements) and returned in the tonic at its second appearance — a transposition that carries the same structural significance as the recapitulation in sonata form: the episode that was “wrong” the first time (in the dominant) is “corrected” the second time (in the tonic). This tonal correction, combined with the refrain’s stable tonic returns, gives the seven-part rondo a hidden sonata-like logic beneath its surface rondo plan. Many analysts identify the seven-part rondo as structurally equivalent to the sonata-rondo (§3.4) when the A episode’s transposition follows this dominant-to-tonic pattern.
3.4 Sonata-Rondo
The most sophisticated rondo variant is the sonata-rondo, a hybrid that combines the rondo plan’s refrain-and-episode alternation with the harmonic logic of sonata form’s exposition-development-recapitulation. The formal scheme R–A–R–B–R–A–R is reinterpreted so that the first R–A pair functions as a sonata exposition (R as P-zone in tonic, A as S-zone in dominant), the B section functions as a development, and the final R–A pair functions as a recapitulation (R and A both now in the tonic).
- R–A: Exposition — refrain in tonic (P-zone), first episode in dominant (S-zone).
- R: Return of refrain in tonic (corresponding to end of exposition, before development).
- B: Development — harmonically unstable, developmental treatment of R or A material.
- R–A: Recapitulation — refrain in tonic (P-zone), first episode now also in tonic (S-zone "corrected" to the tonic, providing the ESC).
- R: Final coda or closing refrain.
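The scheme listed above can be represented as data, which makes the dominant-to-tonic "correction" of the A episode easy to check. This is an illustrative sketch for a major-mode movement; the names `SONATA_RONDO`, `letter_scheme`, and `a_is_tonally_corrected` are assumptions introduced here.

```python
# Each entry: (rondo letter, key area, sonata function), major-mode case.
SONATA_RONDO = [
    ("R", "tonic", "exposition: P-zone"),
    ("A", "dominant", "exposition: S-zone"),
    ("R", "tonic", "refrain return before the development"),
    ("B", "unstable", "development"),
    ("R", "tonic", "recapitulation: P-zone"),
    ("A", "tonic", "recapitulation: S-zone"),
    ("R", "tonic", "closing refrain or coda"),
]


def letter_scheme(plan) -> str:
    """Join the rondo letters into a scheme such as R-A-R-B-R-A-R."""
    return "-".join(letter for letter, _, _ in plan)


def a_is_tonally_corrected(plan) -> bool:
    """True if A is heard twice: first away from the tonic, then in it."""
    keys = [key for letter, key, _ in plan if letter == "A"]
    return len(keys) == 2 and keys[0] != "tonic" and keys[1] == "tonic"


print(letter_scheme(SONATA_RONDO), a_is_tonally_corrected(SONATA_RONDO))
```

A plain seven-part rondo whose A episode returns in its original key would fail the second check, which is one way of stating the difference between the two forms.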
The finale of Mozart’s Piano Concerto No. 21 in C major, K. 467, is a beautifully realized sonata-rondo. The refrain — a crystalline, dancing theme in C major — appears throughout the movement with the predictability and tonal stability that define the rondo spirit. Yet the first episode is in G major (the dominant), exactly as a sonata second theme should be; at the recapitulation, this episode returns in C major. The middle B episode develops material from the refrain through harmonic sequences. The form satisfies both the rondo listener and the sonata listener, which is precisely why the sonata-rondo became the preferred finale form of the Classical concerto.
The analytic challenge posed by the sonata-rondo is one of dual accountability: the analyst must assess both its success as a rondo (are the refrain returns sufficiently frequent and tonally stable to anchor the form?) and its success as a sonata (does the B section function as a genuine development, and does the final A section genuinely correct the earlier tonal imbalance?). In Hepokoski-Darcy’s framework, the sonata-rondo counts as a Type 4 sonata — a full sonata form with an added “false” recapitulation before the development — and its EEC and ESC are identified in the same way as in a Type 3 sonata, with the added complexity that the post-B refrain must be parsed for its formal status (is it a “false” recapitulation or a genuine structural moment?).
3.5 The Rondo Refrain: Internal Structure and Variation on Return
The refrain is the formal and expressive anchor of the rondo: its returns are the events around which the episodes pivot, and its character determines the movement’s overall affective profile. Analytically, the refrain deserves careful internal scrutiny: what formal type does it use? How is it harmonically organized? And how is it modified on each of its returns? Three formal types are especially common:
- A parallel period: antecedent (HC) + consequent (PAC), eight bars in total. The period's question-answer logic creates a self-contained, balanced statement ideal for frequent repetition. Mozart's piano concerto finales frequently use this structure.
- A rounded binary: A–BA' with each section repeated. This larger structure (typically sixteen or more bars) creates a richer, more self-contained refrain whose internal return of A' provides a miniature formal satisfaction before the episode begins.
- A sentence: presentation + continuation + cadence. The sentence's internal momentum — its drive from presentation through continuation to cadential closure — gives the refrain an energy that launches naturally into the transition to the first episode.
3.6 Haydn’s Rondo Finales and Formal Wit
Haydn’s use of the rondo in his mature symphonies and string quartets is distinguished by a formal wit — a delight in surprising the listener by violating or subverting the expected rondo logic — that is characteristic of his compositional personality as a whole. Where Mozart tends to fulfill the rondo’s formal expectations with graceful precision and Beethoven tends to transform them with dramatic force, Haydn characteristically plays with them, treating formal expectations as the setup for a joke.
Understanding Haydn’s formal wit requires understanding the conventions against which it operates. Haydn’s rondo listeners — the aristocratic and middle-class audiences of Esterházy and the London concert halls — were deeply familiar with the rondo’s expectations: refrain in the tonic, episodes providing contrast, final refrain providing conclusion. These expectations were not abstract theoretical constructions but lived musical experience accumulated through hundreds of encounters with rondo movements. Haydn’s jokes are possible only because this experience is reliable: if the audience did not confidently expect a particular formal outcome, subverting that expectation would have no comic effect. The formal joke is therefore a form of flattery — it acknowledges and depends upon the audience’s sophisticated musical knowledge.
Haydn’s most characteristic device for rondo wit is the false close — a moment that presents all the acoustic hallmarks of a final refrain (full orchestra, tonic key, recognizable opening of the refrain) but that proves premature when the music continues rather than ending. The false close exploits the fact that listeners use multiple cues to determine when a movement is over: the refrain’s return in the tonic is one such cue, but it is not the only one. Haydn’s false closes isolate this single cue and make it mislead: the refrain has returned, but the movement is not over. The subsequent continuation — whether another episode, a developmental passage, or an extension of the refrain itself — reveals the deception and produces the comic effect of a formally trained audience being outsmarted by the composer. Rosen describes this aspect of Haydn’s formal personality as evidence of his underlying seriousness: the formal wit is not decoration but argument, revealing through comedy the arbitrariness of formal conventions and the difference between convention and necessity.
Chapter 4: Theme and Variations
4.1 The Variation Principle
Theme and variations is among the oldest formal procedures in Western music, with roots in the Renaissance practice of divisions — elaborating a cantus firmus melody in progressively faster note values. In its Classical manifestation, a theme — typically a simple, memorable melody in a closed binary or ternary form — is followed by a series of variations that transform it through a wide range of compositional techniques while preserving its underlying harmonic scheme, phrase structure, and length. The result is a form that is simultaneously maximally constrained (every variation must account for the theme’s harmonic skeleton) and infinitely flexible (any compositional technique may be applied within those constraints).
What distinguishes great variation writing from merely competent variation writing is the management of the set as a whole — the sense of cumulative argument, of trajectory, of shape across the entire sequence of variations. A set of variations is not merely a collection of independent pieces that happen to share the same harmonic skeleton; it is a formal entity in its own right, with a beginning (the theme), a middle (the sequence of variations creating contrast, intensification, or narrative), and an ending (the final variation or group of variations that resolves the formal argument). Beethoven, in particular, was a master of the variation set’s large-scale architecture: he typically organized his sets in roughly ternary fashion, with early variations establishing the ornamental and technical range of possibility, a medial climax or series of contrasting variations providing a central dialectic, and a final variation (or group) achieving a qualitative transformation of the theme rather than merely an incremental accumulation of ornament.
The choice of theme for a variation set is itself a formal and expressive decision. A theme of great intrinsic beauty — like the theme of Mozart’s K. 331 variations — creates a standard against which every variation will be implicitly measured; some listeners feel that no variation can quite equal the theme’s melodic perfection, and the entire set must reconcile itself to working in the shadow of its opening. A theme of deliberate simplicity or even banality — like Diabelli’s waltz, or the simple theme of Beethoven’s Op. 34 variations — creates a different situation: the theme is a vehicle, not an end, and the entire set is a demonstration of the compositional imagination working on unpromising material. The choice between a beautiful theme and a simple one shapes the entire expressive strategy of the variation set.
The formal preservation requirements of variation form also create a clear analytical framework for the listener. Because every variation must be the same length as the theme, have the same phrase structure, and visit the same harmonic goals at the same structural moments, the listener can always hear the theme “through” the variation: the variation’s surface is what changes, but the harmonic skeleton and phrase rhythm remain constant. This transparency is one of the form’s great pedagogical virtues — it provides a concrete demonstration of the distinction between musical surface and musical structure — and one of its great expressive virtues: the composer can transform the surface of the theme as dramatically as imagination allows, confident that the theme’s underlying structure will hold the variation together and keep it intelligible.
4.2 Ornamental Variation
Ornamental variation is the oldest and most common type. The theme’s melody is elaborated with faster notes — trills, turns, passing tones, neighbor tones, and other ornamental figurations — while the harmonic content remains essentially unchanged. The melody is embellished but remains audible beneath the decoration. The technique derives from the Renaissance and Baroque improvisatory practice of divisions (English) or diminutions (Italian), in which a simple melodic line was elaborated by dividing each long note into several shorter ones.
Handel’s Harmonious Blacksmith air with variations from Suite No. 5 in E major, HWV 430 (1720), is among the most celebrated examples in the keyboard repertoire. The air is a simple binary form in E major. Variation I sets the right hand in continuous sixteenth notes, quickening the surface motion while the melody's outline stays in place. Variation II transfers the running sixteenths to the left hand. Variations III and IV quicken the motion again to sixteenth-note triplets, first in the right hand and then in the left. Variation V extends to thirty-second-note scales, creating an almost continuous surface motion against the air’s harmonic skeleton. The form moves from simplicity to elaboration in a single unbroken trajectory, with the theme’s outline remaining just barely audible beneath the ornamental filigree of the later variations.
The analytical challenge of ornamental variation lies in tracking what the variation preserves and what it transforms. The harmonic skeleton is always preserved: every chord change of the theme recurs at the corresponding rhythmic position in the variation, though the chord may be arpeggiated, elaborated with passing tones, or decorated with non-harmonic neighbor notes. The phrase structure is preserved: if the theme is a sixteen-bar binary (8+8), every variation is also sixteen bars long, with the same internal phrase articulation. The cadential pattern is preserved: wherever the theme has a half cadence or a perfect authentic cadence, the variation must arrive at the same cadential goal, though it may approach it through a more elaborate melodic path. What the ornamental variation transforms is exclusively the melodic surface — the specific notes that fill in the harmonic and rhythmic framework.
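The three preservation requirements can be treated as a checklist. The following sketch assumes a deliberately crude encoding (one Roman numeral per bar, cadences as bar-and-type pairs); the `Outline` class, the sample data, and the function names are hypothetical and describe no particular piece.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outline:
    bars: int
    phrase_lengths: tuple  # e.g. (4, 4) for two four-bar phrases
    cadences: tuple        # (bar number, cadence type) pairs
    harmonies: tuple       # one Roman numeral per bar (a simplification)


def preserved_features(theme, variation):
    """Report which structural features the variation keeps from the theme."""
    return {
        "length": theme.bars == variation.bars,
        "phrase structure": theme.phrase_lengths == variation.phrase_lengths,
        "cadences": theme.cadences == variation.cadences,
        "harmony": theme.harmonies == variation.harmonies,
    }


def is_ornamental_variation(theme, variation):
    """Under this sketch, only the melodic surface is allowed to differ."""
    return all(preserved_features(theme, variation).values())


# Hypothetical eight-bar theme: half cadence at bar 4, PAC at bar 8.
THEME = Outline(8, (4, 4), ((4, "HC"), (8, "PAC")),
                ("I", "IV", "I", "V", "I", "IV", "V", "I"))
# A variation with new figuration but the same outline.
VARIATION = Outline(8, (4, 4), ((4, "HC"), (8, "PAC")),
                    ("I", "IV", "I", "V", "I", "IV", "V", "I"))
# A passage that drops a bar and therefore breaks the phrase structure.
TRUNCATED = Outline(7, (4, 3), ((4, "HC"), (7, "PAC")),
                    ("I", "IV", "I", "V", "I", "V", "I"))

print(is_ornamental_variation(THEME, VARIATION))
print(preserved_features(THEME, TRUNCATED))
```

The melodic surface is intentionally absent from `Outline`: in an ornamental variation it is the one thing permitted to change.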
A pedagogically useful exercise is to strip away an ornamental variation’s figurations and reduce it to its harmonic skeleton, then to compare that skeleton with the original theme. The comparison reveals which notes in the variation are structural (harmonic tones present in the original) and which are ornamental (passing tones, neighbor tones, anticipations, escape tones). This hierarchical distinction between structural and ornamental tones is directly related to the Schenkerian concept of structural levels, and analyzing ornamental variations provides an accessible entry point into understanding how melodic elaboration and underlying harmonic structure interact at multiple levels simultaneously. The student who can identify the theme’s melodic skeleton within a densely ornamented variation has already internalized one of the central analytical skills of tonal music analysis.
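A first pass at this reduction exercise can be sketched mechanically: keep the notes that belong to the prevailing chord and set the rest aside as ornaments. This is a rough approximation that assumes C major, a three-chord vocabulary, and pitch content alone as the criterion; a real reduction also weighs metric position and voice leading. `CHORD_TONES` and `split_structural` are hypothetical names.

```python
# Pitch classes (C = 0) of three triads in C major; an assumed, minimal vocabulary.
CHORD_TONES = {
    "I": {0, 4, 7},    # C, E, G
    "IV": {5, 9, 0},   # F, A, C
    "V": {7, 11, 2},   # G, B, D
}


def split_structural(pitches, numeral):
    """Split MIDI pitches into chord tones and non-chord tones for one harmony."""
    tones = CHORD_TONES[numeral]
    structural = [p for p in pitches if p % 12 in tones]
    ornamental = [p for p in pitches if p % 12 not in tones]
    return structural, ornamental


# A stepwise run C-D-E-F-G (MIDI 60, 62, 64, 65, 67) over a tonic chord.
skeleton, ornaments = split_structural([60, 62, 64, 65, 67], "I")
print(skeleton)   # [60, 64, 67]: C, E, G
print(ornaments)  # [62, 65]: the passing tones D and F
```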
4.3 Harmonic and Modal Variation
Harmonic variation reharmonizes the theme’s melody, substituting new chords for the original ones while retaining the melodic contour. This technique requires considerable compositional skill: the new harmonies must remain compatible with the melodic line (no chord substitution may introduce a dissonant clash with the melody that cannot be resolved) while providing sufficient contrast to justify the variation’s existence as a distinct formal entity.
Modal variation is the wholesale recasting of a major-mode theme in the parallel minor or vice versa: the tonic stays fixed while the scale built on it changes. The shift of mode fundamentally alters the character of the theme: a bright, dancing major-mode melody becomes darker and more introspective in minor; a solemn minor-mode theme becomes resolved and luminous in major. Schubert exploits modal variation with extraordinary expressive power: in the slow movement of the Piano Sonata in B-flat major, D. 960, a major-mode theme is suddenly heard in the parallel minor, with an effect of sudden emotional shadow that is among the most haunting moments in the piano literature.
The relationship between harmonic variation and reharmonization is a topic with important implications for improvisatory practice as well as composed variation form. In jazz, the practice of “reharmonizing” a standard melody — substituting new chords (including tritone substitutions, secondary dominants, and chromatic alterations) for the original changes while maintaining the melody — is essentially a form of harmonic variation. The jazz musician’s skill in reharmonization is closely analogous to the Classical composer’s skill in harmonic variation: both require understanding which notes of the original melody are structurally essential (and therefore must be supported by harmonically compatible chords) and which are ornamental (and therefore offer more harmonic freedom). The Classical harmonic variation tradition and the jazz reharmonization tradition are, in this sense, different manifestations of the same underlying musical-compositional insight: the melody and its harmony are separable, and replacing one with a new version — while keeping the other intact — creates a new musical perspective on familiar material.
4.4 Rhythmic and Character Variation
Rhythmic variation applies augmentation (doubling the note values), diminution (halving them), or metric displacement (shifting the melody to a different beat of the measure) to the theme’s melodic content. A theme that originally moved in quarter notes may be presented in eighth notes (diminution), creating a sense of acceleration; or in half notes (augmentation), creating a sense of expansion and grandeur. Metric displacement — placing the theme’s strong beats in metrically weak positions — creates a syncopated, unsettled effect that can be extremely dramatic.
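These three operations are simple arithmetic on durations and onsets. The sketch below measures durations in quarter-note units using exact fractions; the function names and the four-quarter-note sample rhythm are illustrative assumptions.

```python
from fractions import Fraction


def augment(rhythm, factor=2):
    """Augmentation: multiply every duration (doubling by default)."""
    return [duration * factor for duration in rhythm]


def diminish(rhythm, factor=2):
    """Diminution: divide every duration (halving by default)."""
    return [duration / factor for duration in rhythm]


def beat_positions(rhythm, bar_length, shift=Fraction(0)):
    """Onset of each note within the bar, optionally displaced by `shift`."""
    positions = []
    onset = Fraction(0)
    for duration in rhythm:
        positions.append((onset + shift) % bar_length)
        onset += duration
    return positions


# Four quarter notes in a bar of four quarters.
theme = [Fraction(1)] * 4

print(augment(theme))    # four half notes
print(diminish(theme))   # four eighth notes
# Displace every onset by an eighth note: each attack now falls off the beat.
print(beat_positions(theme, 4, shift=Fraction(1, 2)))
```

With the eighth-note shift, the onsets move from beats 0, 1, 2, 3 to the offbeats 1/2, 3/2, 5/2, 7/2, which is the syncopated effect described above.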
Character variation is perhaps the most conceptually interesting type: rather than applying a specific compositional technique to the theme, the composer reimagines the theme in an entirely new character — as a march, a fugue, a chorale, a pastorale, a barcarolle — while preserving only the harmonic skeleton and phrase length. The theme becomes a skeleton on which entirely different musical flesh is hung. Character variation requires the composer to think not just about how to elaborate a theme but about what entirely different musical world could be built on its harmonic foundation.
The relationship between rhythmic variation and expressive transformation is one of the most direct available to a composer. A theme originally in a moderate common time takes on entirely different implications when presented in a slow triple meter (as a sarabande or a slow waltz), or in a rapid compound meter (as a gigue), or in a march rhythm with dotted notes and precise articulation. The rhythm alone, without any change to the underlying harmonies or phrase lengths, is sufficient to produce what amounts to a change of genre: the same harmonic progression that generated a lyrical andante melody can, in diminution with crisp staccato articulation, generate the impression of a military march. Beethoven exploits this capacity most dramatically in the Diabelli Variations: Variation 1 transforms the waltz into a pompous march by superimposing march rhythms (dotted quarter, eighth, quarter) over the waltz’s underlying harmonic grid.
The fugal variation deserves special mention as one of the most technically demanding forms of character variation. A fugue on the theme’s harmonic skeleton requires the composer to derive a fugue subject from the theme’s melodic or rhythmic content, to treat this subject in strict imitation among the voices, and to navigate the conventional stages of fugal form (exposition, episodes, stretto, climax, final cadence) within the theme’s prescribed length and harmonic outline. Brahms’s concluding fugue in the Handel Variations, Op. 24, and Beethoven’s Variation 32 of the Diabelli Variations both illustrate how the fugal character variation can serve as a formal culmination of a variation set: the fugue’s density and contrapuntal energy gather together all the preceding variations’ accumulated momentum and release it in a single concentrated contrapuntal statement.
4.5 The Chaconne and Passacaglia
The Baroque chaconne and passacaglia are variation forms built not on a melodic theme but on an ostinato — a repeated bass pattern or harmonic progression that recurs continuously beneath varied upper voices. The terms were used interchangeably in seventeenth-century sources, but modern usage tends to reserve “passacaglia” for forms built on a repeated bass line and “chaconne” for forms built on a repeated harmonic progression (which may migrate to upper voices).
The ostinato-based variation forms present a formal challenge quite different from the theme-and-variations: since the bass pattern or harmonic progression returns over and over without the melodic theme’s distinctive presence, the listener must track variation at the level of the upper-voice texture rather than at the level of melodic transformation. In a theme-and-variations, the theme’s melody provides a perceptual anchor for each variation — even densely ornamented variations retain the melody’s contour as a recognizable reference point. In a chaconne or passacaglia, no such melodic anchor is present: the variations are textural and harmonic elaborations of the same recurring bass or progression, and the listener must track form through changes in texture, register, dynamics, and contrapuntal complexity rather than through a recognizable melodic identity. This textural tracking is a more demanding perceptual skill, and it is one reason that chaconnes and passacaglias — despite their formal simplicity (always the same bass or progression) — are often experienced as among the most formally complex and demanding pieces in the repertoire.
4.6 Beethoven’s Diabelli Variations
Beethoven’s Diabelli Variations, Op. 120 (1823), are the apotheosis of the character variation principle. The theme, a banal, mechanically repetitive waltz by Anton Diabelli, is, as Beethoven reportedly put it, a “cobbler’s patch” (Schusterfleck), marked by its hammering bass pattern, its square-cut cadences, and its complete absence of harmonic sophistication. Yet from this unpromising material, Beethoven extracts thirty-three variations that collectively constitute one of the most exhaustive explorations of variation technique in the literature. Variation 1 is a grandly ceremonial march; Variation 2 is a furiously contrapuntal exercise; Variation 10 is a Pralltriller study; Variation 20 is a hushed, enigmatic Andante; Variation 31 is a slow, searching meditation in the minor mode; Variation 32 is a fugue; and the final Variation 33 is a stately minuet in C major, the waltz’s own key, in which the cobbler’s waltz is transformed into something approaching the sublime.
The origin of the work is itself illuminating. In 1819, the Viennese music publisher and amateur composer Anton Diabelli circulated a simple waltz of his own composition to prominent composers in the Austrian Empire, inviting each to contribute a single variation to a collective publication he planned to call Vaterländischer Künstlerverein (“Patriotic Artists’ Association”). Fifty composers eventually supplied one variation apiece, among them Schubert, Czerny, Archduke Rudolph, and the eleven-year-old Franz Liszt. Beethoven, characteristically, could not limit himself to one variation: what began as a single response eventually grew into a set of thirty-three variations that fills the entire first part of the published collection, dwarfing the combined contributions of the fifty other composers, which make up the second part.
The philosophical argument of the set is inseparable from its ironic relationship to the theme. Diabelli’s waltz is in C major, in a simple binary form of two sixteen-bar halves, each repeated, with a bass line that hammers the same broken-octave C-E-G-E figure almost incessantly. Its harmonic vocabulary is primitive: tonic, dominant, subdominant, and little else. Beethoven’s irony operates at multiple levels simultaneously. Some variations treat the theme with a kind of affectionate derision: Variation 1’s pompous march transforms the waltz’s stumbling gait into a procession; Variation 22 explicitly quotes “Notte e giorno faticar,” Leporello’s opening aria from Mozart’s Don Giovanni (not the later catalogue aria), a deliberately absurd juxtaposition of Diabelli’s banality with one of the greatest opening numbers in opera. Other variations take the theme’s harmonic implications with complete seriousness and explore them to their logical extremes: Variation 20’s Andante in C major strips the texture to a sparse, sustained chorale that seems to have nothing to do with Diabelli’s waltz until one realizes that every harmonic move is still governed by the theme’s underlying progression.
The middle and later variations (roughly Variations 14–31) constitute the set’s emotional and philosophical core. Variation 14 is a Grave e maestoso of weighty, ceremonial breadth; Variation 15 is a light, two-voice invention in the style of a Bach two-part invention, an implicit claim that the theme’s harmonic skeleton is robust enough to bear the weight of strict counterpoint. Variation 20 slows to an Andante of contemplative depth; Variation 29 slows further to an Adagio ma non troppo of such harmonic richness and inward concentration that it seems to belong to a wholly different aesthetic universe from the waltz that generated it. Variations 29, 30, and 31 follow one another in C minor, the parallel minor of the waltz’s key, and together they create a passage of concentrated emotional darkness that is among Beethoven’s most searching late utterances.
The culminating gesture of the set is the final Variation 33, a Tempo di Menuetto moderato in C major. It is a minuet, not a waltz: a deliberate formal displacement. The minuet is the aristocratic dance form of a century past, the dance that Beethoven himself had systematically replaced with the scherzo in his mature symphonies. To end the Diabelli Variations with a minuet is to invoke an entire history of formal and social meaning: the waltz was the popular, democratic dance of Diabelli’s Vienna; the minuet is its aristocratic antecedent. Beethoven’s final transformation of the waltz into a minuet, combined with the return from the E-flat major of the preceding fugue to C major (the waltz’s own key, now heard as if for the first time), constitutes a kind of formal apotheosis: the banal popular waltz has been elevated, through the labor of thirty-three variations, into something approaching the serene and the timeless. The work is not merely a formal exercise but an existential statement about the transformative power of compositional imagination working on the most intractable material.
4.7 Mozart’s Variation Movements
Mozart composed several major sets of keyboard variations as independent works, and several variation-form slow movements in his larger chamber and orchestral works. His approach to the variation form is generally more conservative than Beethoven’s — closer to ornamental variation than character variation — but within these constraints he demonstrates an extraordinary command of the variation sequence as a cumulative formal argument.
Mozart’s variation sets are distinguished by their sensitivity to the theme’s character and their management of the variation sequence as a narrative arc. Unlike Beethoven — who often treats the theme as raw material to be transformed beyond recognition — Mozart tends to maintain the theme’s melodic identity through most of his variations, elaborating rather than displacing the original melodic line. This fidelity to the theme is not merely a stylistic conservatism but a formal strategy: by keeping the theme audible, Mozart ensures that each variation is heard as a perspective on a single melodic truth, and the sequence of variations becomes a multifaceted portrait of the theme rather than a progressive departure from it.
The slow-movement variation sets in Mozart’s piano concertos are particularly instructive. The Andante of Piano Concerto No. 17 in G major, K. 453, presents a theme in C major and follows it with five variations in which the piano and orchestra exchange roles, the piano becomes increasingly elaborate in its ornamentation while the orchestra becomes increasingly sparse, and the final variation restores a simplified, reflective character that echoes the theme’s original simplicity without exactly repeating it. The formal arc of this variation slow movement — from simplicity through elaboration back to reflective simplicity — is characteristic of Mozart’s approach to the variation sequence as a formal argument about the relationship between the theme’s original character and the transformative potential of ornamental variation.
4.8 Schubert’s Trout Quintet Variations
Franz Schubert’s Piano Quintet in A major, D. 667 (“Trout”), takes its nickname from the fourth movement, a set of variations on Schubert’s own song Die Forelle (“The Trout”), D. 550. The song’s melody is borrowed wholesale into the quintet: the strings alone state the theme in D major, with the violin carrying the melody, and the subsequent five variations subject the theme to the full range of Schubert’s variation techniques.
The five variations of the “Trout” Quintet movement are a compressed survey of variation types. Variation I gives the melody to the piano, with figuration in the strings. Variation II transfers the melody to the viola and cello in a more contrapuntal texture, with the violin adding triplet figuration above. Variation III gives the melody to the cello and double bass while the piano spins rapid ornamental figuration over it. Variation IV shifts to D minor (the parallel minor), darkening the mood with a stormy, agitated texture. Variation V moves to B-flat major, where the cello sings the melody in a broad, expansive treatment. A concluding Allegretto returns to D major and restores the song’s original rippling accompaniment figure, recalling the song’s imagery, as violin and cello share the melody before a quiet, satisfied close. The variation sequence is not dramatic in the way that Beethoven’s variation sets are dramatic; it is lyrical and picturesque, perfectly suited to the song’s subject matter and to the quintet’s overall character of Schubertian warmth and spontaneity.
Chapter 5: Sonata Form — The Exposition
5.1 Sonata Form as the Preeminent Classical Structure
Sonata form is the most studied, debated, and influential formal structure in the history of Western music. From its emergence in the mid-eighteenth century through the late nineteenth, it governed the first movements (and often the slow movements and finales) of symphonies, string quartets, piano sonatas, and other instrumental genres. Its influence on Romantic composers who nominally rejected it (Liszt, Berlioz) was hardly less than on those who embraced it (Brahms, Schumann). To understand sonata form is to understand not merely a formal pattern but a way of conceiving large-scale tonal drama.
The term “sonata form” itself is a product of nineteenth-century theory: composers of the Classical period did not use the term, and early theorists (Koch, Riepel) described what we now call sonata form in different and less unified terms. The first theoretical systematization that resembles modern descriptions appears in A. B. Marx’s Die Lehre von der musikalischen Komposition (1845), which popularized the tripartite division into “Aufstellung” (exposition), “Durchführung” (development), and “Rückführung” (recapitulation). The term “first movement form” (Sonatenhauptsatzform) reflects the genre association, but the form is not restricted to first movements: many slow movements (e.g., the slow movements of Beethoven’s Op. 10 No. 3 and Op. 22 sonatas) and even some finales are in sonata form.
Charles Rosen’s Sonata Forms (1980) and The Classical Style (1971) remain among the most illuminating prose accounts of how sonata form works as a dramatic and expressive — not merely structural — procedure. Rosen’s central argument is that sonata form is not a template but an action: the action of establishing a tonal contrast (exposition), exploring its implications (development), and resolving it (recapitulation). For Rosen, the exposition’s crucial tonal act is not merely moving to the dominant but creating a genuine polarity — making the dominant feel like an alternative tonal world, not just a subsidiary key region. This polarity is what gives the recapitulation its weight: the return of all material in the tonic is not mere repetition but the resolution of a genuine tonal conflict. The formal drama of sonata form is ultimately the drama of tonal conflict and resolution, and every structural feature of the form — the transition’s drive to the dominant, the development’s tonal explorations, the retransition’s patient dominant preparation, the recapitulation’s tonal correction of the exposition — is in service of that large-scale drama.
- Exposition: presents the movement's thematic and tonal contrasts, moving from the home tonic key to a secondary key (the dominant in major-mode works; the relative major or, less often, the minor dominant in minor-mode works).
- Development: departs from the expositional material into a region of tonal instability and harmonic exploration.
- Recapitulation: returns to the tonic key and presents the expositional material — now all in the tonic — in a way that resolves the tonal tension created by the exposition's move to the secondary key.
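The default choice of secondary key is a matter of interval arithmetic on the tonic. The sketch below covers only the two most common conventions (dominant for major-mode works, relative major for minor-mode works) and ignores the alternatives; `secondary_key`, `key_name`, and the flat-leaning note spellings are hypothetical choices made for illustration.

```python
# One spelling per pitch class; an assumed convenience, not an enharmonic policy.
NOTE_NAMES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]


def secondary_key(tonic_pc, mode):
    """Return (pitch class, mode) of the conventional default secondary key."""
    if mode == "major":
        return ((tonic_pc + 7) % 12, "major")  # dominant: a perfect fifth up
    if mode == "minor":
        return ((tonic_pc + 3) % 12, "major")  # relative major: a minor third up
    raise ValueError(f"unknown mode: {mode!r}")


def key_name(pitch_class, mode):
    """Format a key as text, e.g. 'G major'."""
    return f"{NOTE_NAMES[pitch_class]} {mode}"


print(key_name(*secondary_key(0, "major")))  # C major -> G major
print(key_name(*secondary_key(0, "minor")))  # C minor -> Eb major
print(key_name(*secondary_key(9, "minor")))  # A minor -> C major
```

In the recapitulation the same material returns with the secondary key replaced by the tonic, so the function describes only the exposition's tonal goal.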
5.2 The P-Zone: Primary Theme Area
The exposition begins with the P-zone (primary theme zone, in Hepokoski-Darcy’s terminology; Caplin’s “main theme”). The P-zone is in the tonic key and introduces the movement’s principal thematic material. It is characterized by beginning function at the level of the exposition: it initiates the formal argument, establishes the tonal home base, and typically presents its material in a tight-knit organization (sentence, period, or hybrid). The P-zone may consist of a single theme or a small group of related ideas that together constitute the first thematic area.
The P-zone’s thematic character has significant implications for the movement’s formal trajectory. A tight-knit, forte P-zone creates a starting point of maximal formal stability, so that the transition’s dissolution and the development’s instability represent a genuine formal journey away from that stability. A more ambiguous or searching P-zone — such as the opening of Schubert’s Piano Sonata in B-flat major, D. 960, which begins with a soft, ruminative melody immediately undermined by a mysterious trill in the bass — establishes an expressive world in which formal stability is already elusive from the outset. The recapitulation of such a P-zone does not provide the reassurance that a tight-knit opening theme’s return would: the searching material returns having found no definitive answer to the questions it posed at the beginning. The P-zone is thus not merely a formal “first theme” but an expressive premise that determines the entire movement’s formal and emotional argument.
5.3 The Transition (TR) and Medial Caesura
Following the P-zone, the transition (TR) begins the process of moving from the tonic key to the secondary key. The transition serves a dual formal function: it closes the P-zone’s world (by loosening the tight-knit organization of the P-zone) and opens toward the S-zone’s world (by building harmonic momentum toward the secondary key). Caplin characterizes the transition as having a “dissolution” character: it takes the tight-knit P-zone material and gradually loosens it, dissolving the basic idea into sequential passages and increasing the harmonic instability until the secondary key’s dominant is prepared.
The transition typically ends at the medial caesura (MC): a rhetorical break, usually a half cadence followed by a brief gap in the texture, that closes the transition and opens space for the S-zone. Identifying the MC is one of the most analytically demanding tasks in the application of Hepokoski-Darcy’s framework, because many expositions elide the MC: the transition ends and the S-zone begins without a clear break, the two zones merging in a continuous flow. The elision of the MC is not a fault but a formal choice: it creates a more continuous, seamless exposition in which the listener cannot easily locate the internal boundaries. Mozart frequently elides the MC in his piano concerto expositions, creating the impression of a single, continuous formal arc from the opening P-zone to the S-zone’s entry.
Hepokoski and Darcy classify MCs by their harmonic content; the two half-cadence types matter most here. The dominant-lock MC (or “V:HC MC”) arrives on the dominant of the secondary key and is the most common type in the mature Classical style. After this HC, the S-zone theme enters in the secondary key, confirming the tonal destination that the MC’s dominant had implied. The home-dominant MC (or “I:HC MC”) is less common: the transition ends with a half cadence in the home key (not the secondary key), and the S-zone then begins directly in the secondary key. This type, found in some early Classical expositions and in Haydn’s more unconventional formal experiments, creates a more abrupt tonal shift than the V:HC MC, since the secondary key is asserted without a preceding dominant lock in that key.
The transition’s thematic content is also a matter of considerable variety. A continuous transition derives its material entirely from the P-zone theme, subjecting it to fragmentation and sequence until the MC arrives; the transition has no independent thematic identity. An independent transition introduces new material not derived from the P-zone (sometimes called the “transition theme”) before driving to the MC. The distinction between these transition types has expressive implications: the continuous transition suggests that the passage from P-zone to S-zone is one of formal dissolution (the P-zone dissolving into the transition’s instability), while the independent transition suggests a more complex formal space with its own thematic identity between the two tonal worlds of the exposition.
5.4 The S-Zone: Secondary Theme Area
The S-zone (secondary theme zone) presents contrasting thematic material in the secondary key. By convention, the S-zone in a major-mode movement is in the dominant; in a minor-mode movement, it is most typically in the relative major (III), though some movements, particularly among Beethoven’s minor-mode works, turn instead to the minor dominant (v) or darken the relative major with its own parallel minor for a more dramatic tonal contrast.
The S-zone’s relationship to the P-zone is one of the most discussed topics in Classical formal theory. The conventional account — derived ultimately from the “two-theme” terminology of nineteenth-century pedagogical literature — describes the S-zone as “the second theme,” implying a simple binary opposition between a first theme (P-zone) and a second theme (S-zone). Rosen, Caplin, and Hepokoski/Darcy all resist this oversimplification. The P-zone and S-zone are not simply two themes placed in opposition; they are two tonal worlds — the home tonic and the secondary key — and their contrast is primarily tonal, only secondarily thematic. An S-zone that uses the same thematic material as the P-zone (transposed to the dominant) is still a genuine S-zone, because the tonal contrast between the two zones is maintained regardless of thematic similarity.
The S-zone is also internally more varied than the simple “second theme” label implies. In many Classical expositions, the S-zone is not a single theme but a complex of several related ideas, which Hepokoski and Darcy call the S1, S2, and S3 thematic groups within the S-zone. Each sub-group may have its own thematic character, but they are all in the secondary key, and they collectively constitute the S-zone as a formal unit. The internal complexity of the S-zone reflects the formal weight it must carry: it must establish the secondary key as a fully realized tonal world, not merely assert its presence through a single theme. This complexity is also what makes locating the essential expositional closure (EEC) analytically challenging. With multiple thematic groups in the secondary key, the analyst must determine which PAC among the many available constitutes the true EEC: the one followed by new, closing (non-thematic) material.
The EEC concept is analytically powerful because it reveals how composers manipulate the moment of formal closure in the exposition. A composer may “attempt” the EEC multiple times — offering a PAC in the secondary key that is then followed by more S-zone material rather than closing material — before the “true” EEC finally arrives. Each failed attempt raises the listener’s expectation for the eventual true EEC, so that when it arrives, its force is amplified by the accumulated expectation. Beethoven’s Op. 57 (Appassionata) first movement is a particularly dramatic example: the S-zone material makes several approaches to a PAC in A-flat major before the true EEC is finally achieved.
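The selection rule described above (the first PAC in the secondary key that is followed by closing rather than further S-zone material) can be sketched as a small search. This is only an illustration: the `Cadence` record, the `followed_by` label, and the bar numbers are hypothetical, and the label itself stands in for an analytical judgment that no program makes for the analyst.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cadence:
    bar: int          # bar in which the cadence arrives (hypothetical data)
    kind: str         # "PAC", "IAC", "HC", ...
    key: str          # key in which the cadence is heard
    followed_by: str  # analyst's label for what comes next: "S" or "C"


def find_eec(cadences: List[Cadence], secondary_key: str) -> Optional[Cadence]:
    """Return the first PAC in the secondary key that is followed by closing material."""
    for cad in cadences:
        if cad.kind == "PAC" and cad.key == secondary_key and cad.followed_by == "C":
            return cad
    return None


# Hypothetical exposition: two PACs are "undone" by further S material.
cadences = [
    Cadence(bar=40, kind="HC", key="G major", followed_by="S"),   # the MC
    Cadence(bar=48, kind="PAC", key="G major", followed_by="S"),  # failed attempt
    Cadence(bar=56, kind="PAC", key="G major", followed_by="S"),  # failed attempt
    Cadence(bar=64, kind="PAC", key="G major", followed_by="C"),  # EEC
    Cadence(bar=70, kind="PAC", key="G major", followed_by="C"),  # C-zone cadence
]

eec = find_eec(cadences, "G major")
print(eec.bar if eec else "no EEC found")
```

The earlier PACs are skipped not because they are weaker cadences but because of what follows them, which is exactly the point of the "failed attempt" reading.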
5.5 The C-Zone and Codetta
Following the EEC, the C-zone (closing zone) confirms the secondary key with closing gestures. The C-zone material is typically less thematically distinctive than the S-zone material — it consists of cadential progressions, scale passages, repeated perfect authentic cadences, fanfare figures, and other gestures whose function is confirmatory rather than expressive. The C-zone’s function is to drive the tonal establishment of the secondary key home, so that when the exposition repeat begins (or when the development follows), the listener has no doubt about which key has been established.
The codetta — a brief, post-cadential suffix — often ends the exposition with a final emphatic PAC in the secondary key, sometimes followed by a short silence or a sustained chord that marks the formal boundary of the exposition. In movements with an exposition repeat, the codetta typically contains a first-ending bar that leads back to the beginning of the exposition, and a second-ending bar that leads into the development.
In Hepokoski and Darcy’s framework, the C-zone’s relationship to the EEC is critical for understanding how the exposition fulfills its tonal obligations. The EEC achieves the primary tonal goal — PAC in the secondary key followed by new material — but the C-zone is necessary to ensure that this tonal achievement is consolidated rather than merely asserted. The C-zone can be understood as the zone of tonal affirmation: where the S-zone argued for the secondary key by presenting thematic material in it, the C-zone confirms that key through the sheer weight of cadential repetition. A C-zone with multiple iterated PACs in the secondary key leaves the listener with no doubt that the secondary key has been fully and securely established.
The length and thematic content of the C-zone varies considerably from composer to composer and from movement to movement. In Mozart, C-zones are frequently quite long relative to the exposition’s total length, often comprising twenty to thirty percent of the exposition’s bar count. They may include several distinct “closing themes” (sometimes labeled C1, C2, C3 in the analytical literature), each contributing a distinct closing gesture: an arpeggiated fanfare figure, a running scale passage in the right hand, a piano dynamic echo of the preceding forte cadence, a chromatic turn figure, and finally a bare unison statement of the tonic chord. These layered closing gestures in Mozart’s expositions — as in the first movement of the Piano Sonata in C major, K. 330, or the String Quartet in G major, K. 387 — give the C-zone a cumulative ceremonial quality, as though the secondary key is being formally installed rather than merely arrived at.
In Beethoven’s expositions, the C-zone is sometimes remarkably compact — the arrival of the EEC is followed by only a few bars of confirmatory material before the exposition closes — or it may be dramatically extended. The first movement of the Piano Sonata in C minor, Op. 13 (Pathétique), provides a textbook C-zone: following the EEC in E-flat major, a sequence of descending parallel thirds and a cadential trill figure repeat the PAC in E-flat major three times in quick succession, leaving the secondary key thoroughly confirmed before the exposition repeats.
Codettas also serve a rhetorical function that is easy to overlook: they signal to the listener that the exposition is formally complete. The codetta’s characteristic gesture — typically a final tonic chord or the same brief figure repeated two or three times at diminishing dynamic levels — is a formal marker that says “the exposition argument has concluded.” This signal is important for the listener’s formal comprehension, since it clarifies that the imminent return of the opening P-zone material (if the exposition is repeated) represents a formal repeat, not a new development of the musical argument. Where a composer omits a clear codetta and allows the exposition to end with the S-zone’s EEC alone, the formal boundary of the exposition can feel ambiguous — a deliberate compositional effect in some cases, as in several of Schubert’s exposition endings, where the formal boundary dissolves quietly rather than being marked with a decisive closing gesture.
5.6 Minor-Mode Sonata Expositions
The minor-mode sonata exposition presents a distinctive formal challenge: the convention of placing the S-zone in the dominant (standard for major-mode expositions) creates an awkward harmonic situation in minor, since the diatonic dominant of a minor key is itself a minor triad, harmonically weaker and less conclusive than a major dominant. For this reason, the most common convention for minor-mode expositions is to place the S-zone in the relative major (\(\text{III}\)): C minor → E-flat major, G minor → B-flat major.
The harmonic weakness of the minor dominant (\(\text{v}\)) is not merely theoretical: it is perceptible to any trained ear. A minor dominant triad (in G minor, the diatonic dominant is a D-minor triad) lacks the leading tone (F-sharp in G minor) that gives the major dominant its directed, cadential quality. The minor dominant pulls toward the tonic less urgently than the major dominant, which is why an authentic cadence in minor raises the seventh degree to make its dominant major. As a secondary key area, the minor dominant therefore points back to the home tonic only weakly, and it offers no change of mode. A PAC in the minor dominant, though complete within its own key, tends to feel like a tentative arrival rather than a definitive tonal establishment with the full weight expected of the EEC. The relative major solves this problem: it supplies modal contrast as well as firm cadential closure, and the EEC in the relative major has the same finality as the EEC in the dominant of a major-mode exposition.
The expressive implications of the relative-major convention are profound, and they explain its near-universality in the Classical minor-mode sonata. When an exposition moves from the tonic minor to the relative major for the S-zone, the tonal contrast is simultaneously harmonic and affective: the relative major is not merely a different key but a brighter, more consoling harmonic world. The transition from the dark tonic minor to the luminous relative major carries the expressive weight of a move from shadow to light — a move that is then reversed, with great drama, by the recapitulation’s insistence that the S-zone material must now be heard in the tonic minor. The recapitulation’s “darkening” of the originally bright S-zone material — taking themes that were first heard in E-flat major and forcing them into C minor — is one of the most powerful expressive gestures available in the minor-mode Classical sonata.
The question of which specific PAC in the relative major constitutes the EEC is particularly nuanced in minor-mode expositions, because the relative major’s greater tonal stability (as a major key, it achieves cleaner cadential closure than the minor dominant would) means that the S-zone often achieves multiple strong PACs in rapid succession. The analyst must identify the first PAC that is followed by new, non-thematic closing material — a judgment that requires careful attention to whether the material following a given PAC represents a fresh start (C-zone) or a continuation of the S-zone’s thematic argument.
The secondary-key options in a minor-mode exposition can be summarized as follows:
- Relative major (\(\text{III}\)): the most common convention in the mature Classical style. The relative major provides a tonally bright, affectively contrasting area that complements the tonic minor's darkness.
- Minor dominant (\(\text{v}\)): less common, harmonically weaker. Found in some Baroque and early Classical minor-mode pieces.
- Relative major darkened by modal mixture: Beethoven occasionally inflects the relative major with its own parallel minor for a more dramatic tonal contrast, as in the first movement of Op. 57, where the S-zone begins in A-flat major (the relative major of F minor) and the exposition closes in A-flat minor.
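The relative-major and minor-dominant options listed above reduce to simple pitch-class arithmetic: the relative major lies three semitones above a minor tonic, the minor dominant seven. The following is a minimal sketch; the flat-side spellings in `PITCH_NAMES` and the function name are assumptions chosen to suit the keys mentioned in these notes, not a general enharmonic speller.

```python
# Flat-side spellings only: an assumption that suits C, G, and F minor.
PITCH_NAMES = ["C", "D-flat", "D", "E-flat", "E", "F",
               "G-flat", "G", "A-flat", "A", "B-flat", "B"]


def secondary_key_options(minor_tonic: str) -> dict:
    """Return the two conventional secondary keys for a minor-mode exposition."""
    pc = PITCH_NAMES.index(minor_tonic)
    return {
        "relative major (III)": PITCH_NAMES[(pc + 3) % 12] + " major",
        "minor dominant (v)": PITCH_NAMES[(pc + 7) % 12] + " minor",
    }


for tonic in ("C", "G", "F"):
    print(tonic, "minor:", secondary_key_options(tonic))
```

For C minor and G minor the relative majors come out as E-flat major and B-flat major, matching the pairs given earlier in this section.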
5.7 The Double Exposition in the Concerto
The Classical concerto first movement employs what Hepokoski and Darcy call a Type 5 sonata — a modified sonata form in which the exposition is presented twice: an orchestral exposition (ritornello) followed by a solo exposition.
- The orchestral exposition (Ritornello 1) presents the P-zone and S-zone both in the tonic key — no modulation to the secondary key occurs. This is the critical difference from a standard sonata exposition.
- The solo exposition follows and carries out the standard tonal argument: P in tonic (with solo instrument), TR modulating to secondary key, S in secondary key (with EEC), and C confirming the secondary key.
- The cadenza appears near the end of the recapitulation's C-zone, marked by a \(\text{I}^{6/4}\) "cadenza chord" before the final \(\text{V}^7 \to \text{I}\).
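The contrast between the two expositions is easiest to see as a key plan. The sketch below simply encodes the description above as data; the function name and dictionary layout are hypothetical, and the D minor / F major pairing is used only as a sample input.

```python
def type5_key_plan(tonic: str, secondary: str) -> dict:
    """Key of each zone in the two expositions of a Type 5 (concerto) first movement."""
    return {
        "orchestral exposition (Ritornello 1)": {"P": tonic, "S": tonic},
        "solo exposition": {
            "P": tonic,
            "TR": tonic + " -> " + secondary,
            "S": secondary,
            "C": secondary,
        },
    }


def modulates(zones: dict) -> bool:
    """True when the S-zone is in a different key from the P-zone."""
    return zones["S"] != zones["P"]


plan = type5_key_plan("D minor", "F major")
for name, zones in plan.items():
    print(name, "| modulates:", modulates(zones), "|", zones)
```

Only the solo exposition reports a modulation, which is the "critical difference" noted in the first bullet.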
Mozart’s piano concertos are the supreme achievements of this form. Piano Concerto No. 20 in D minor, K. 466, is particularly remarkable: the orchestral exposition opens with a quietly sinister repeated-note figure that Rosen calls “the first truly Romantic orchestral opening,” and the solo exposition’s S-zone in F major (the relative major) provides an expressive contrast of almost operatic power.
The formal role of the cadenza in the Classical concerto first movement deserves special attention. The cadenza appears at the structural point corresponding to the approach to the essential structural closure (ESC) in the recapitulation: the orchestra arrives at the \(\text{I}^{6/4}\) “cadenza chord” (the cadential six-four, which functions as a dominant and is the conventional signal for a cadential approach) and then falls silent, leaving the soloist to improvise (or perform a composed cadenza) over the sustained expectation of the dominant seventh. The cadenza is therefore formally a prolonged approach to the movement’s ESC: it elaborates and extends the harmonic approach to the final \(\text{V}^7 \to \text{I}\) resolution, using all the soloist’s technical and expressive resources to intensify the preparation for the movement’s structural goal. When the cadenza concludes with its trill on the dominant, resolving to the tonic with the orchestra’s re-entry, the ESC arrives, all the more powerful for the extended, elaborate approach the cadenza has provided. In this sense, the cadenza is not an interruption of the formal structure but an amplification of its most structurally critical moment.
Chapter 6: Sonata Form — Development and Recapitulation
6.1 The Development Section
The development section is the locus of harmonic adventure, thematic transformation, and formal instability in sonata form. Having established two tonal poles in the exposition (tonic and secondary key), the development departs from both, exploring a range of keys that may include remote tonalities, chromatic progressions, and extended sequential passages. Thematic material from the exposition — particularly P-zone and S-zone themes — is fragmented, combined, inverted, and subjected to sequential treatment in ways that the more orderly context of the exposition would not permit.
The development’s characteristic features can be summarized as follows:
- Tonal instability: the absence of prolonged cadential confirmation in any single key; rapid motion through multiple tonal areas; chromatic progressions and enharmonic reinterpretations.
- Thematic fragmentation: the dissolution of complete themes into smaller motivic cells (basic ideas, characteristic rhythms, melodic gestures) that are then subjected to sequential, contrapuntal, and developmental treatment.
- Sequential activity: the mechanical repetition of a harmonic-melodic pattern at successive pitch levels, traversing a large portion of the tonal circle and creating momentum through systematic harmonic descent or ascent.
- Retransition: a passage near the development's end that stabilizes on the home dominant (V of the tonic key), building an expectation of the tonic's return at the recapitulation.
The development section’s emotional character tends toward intensity, instability, and conflict — a quality that is not incidental but is the formal purpose of the section. The exposition established a tonal argument (the move to the secondary key); the development explores the consequences and complications of that argument; and the recapitulation resolves it. The development is the period of maximum harmonic tension, and its instability is the precondition for the recapitulation’s relief.
The development’s formal instability operates not only harmonically but also at the phrase level. The tight-knit eight-bar themes of the exposition, with their clear internal organization and their well-articulated cadential goals, give way in the development to passages of indeterminate length, irregular phrase rhythm, and evaded or abandoned cadences. A development passage may consist of a two-bar motivic cell that is sequenced seven or eight times — each iteration moving the bass down a fifth or a step — without any clear phrase articulation between iterations. The sequence is felt not as a series of phrases but as a continuous harmonic motion, a single directed gesture that spans many bars without internal punctuation. This contrast between the exposition’s phrase-articulated tight-knit themes and the development’s phrase-obliterating sequential passages is one of the most viscerally audible formal features of Classical sonata form, and it is what gives the development its characteristic character of purposive but boundary-dissolving forward motion.
The analyst approaching a development section for the first time should begin by identifying the development’s thematic sources: which expositional material is being used? Then trace the harmonic path: what keys are visited, in what order, and through what harmonic pivots? Then identify the development’s internal structure: are there discrete “developmental units” (a model presented in one key and then sequenced to others), or is the development a single continuous harmonic trajectory? Finally, locate the retransition: where does the development’s exploratory phase end and the dominant-pedal preparation of the recapitulation begin? This four-step analytical protocol — source, path, structure, retransition — provides a reliable framework for engaging with developments of any complexity.
6.2 The Entry of the Development and Tonal Strategies
The development typically begins by departing from the secondary key that ended the exposition — often by reinterpreting the codetta’s closing chord as the beginning of a new harmonic journey, or by beginning with P-zone or S-zone material in the secondary key and then moving through a modulatory sequence. The “entry of the development” is an important analytical moment: it signals the formal departure from the ordered world of the exposition into the exploratory world of the development.
Composers have widely varying strategies for how far into harmonic “outer space” to venture during the development. Some — particularly Haydn — prefer to stay relatively close to the home key and its near relations (dominant, relative major, subdominant), creating a compact development whose function is harmonic intensification rather than exploration. Others — particularly Beethoven in his mature works — venture into strikingly remote tonalities: the development of the first movement of the Eroica Symphony, Op. 55, introduces a new theme (unheard in the exposition) in a remote E minor, one of the most dramatic formal surprises in the symphonic literature.
The development’s opening move is a particularly revealing analytical moment, because it shows the composer’s hand regarding the development’s overall strategy. A development that begins with P-zone material in the secondary key, then immediately moves toward the subdominant (IV), announces a conservative, inward-looking strategy that will stay close to home. A development that begins with S-zone material in the secondary key, then pivots abruptly to a remote chromatic area, announces a more adventurous exploratory strategy. Mozart’s developments often begin with a quiet, somewhat mysterious restatement of the P-zone’s opening in the dominant or another related key, then move through a carefully planned sequence of modulations back toward the home dominant. Beethoven’s developments often begin at the dynamic extreme of the preceding exposition — if the exposition ended forte, the development begins piano, creating a moment of formal and dynamic reorientation — and then build systematically toward the development’s central crisis.
The most characteristic harmonic technique of the development is the sequential modulation: a repeated harmonic pattern (most typically a sequence of falling fifths: I–IV–VII–III–VI–II–V–I) that carries the music rapidly through multiple key areas. Sequential modulations are efficient for development purposes because they provide both harmonic momentum (each iteration of the sequence moves the music forward harmonically) and motivic coherence (the same melodic-rhythmic pattern is repeated at each new pitch level, maintaining thematic identity while the harmony changes). The development of Mozart’s Piano Sonata in F major, K. 332, passes through a chain of fifth-related areas (F, C, G, D, A, E), giving the development a sense of purposive harmonic direction even as it departs from the tonal areas established in the exposition.
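The diatonic falling-fifths chain quoted above can be generated mechanically: a diatonic fifth down lands on the same scale degree as a fourth up, that is, three scale steps higher modulo seven. The following is a minimal sketch; the function name and the use of uppercase numerals without chord quality are simplifying assumptions.

```python
NUMERALS = ["I", "II", "III", "IV", "V", "VI", "VII"]


def falling_fifths(start_degree: int = 1, steps: int = 7) -> list:
    """Roots of a diatonic descending-fifths sequence, as Roman numerals."""
    degree = start_degree - 1          # zero-based scale degree
    chain = [NUMERALS[degree]]
    for _ in range(steps):
        degree = (degree + 3) % 7      # fifth down = fourth up = 3 scale steps up
        chain.append(NUMERALS[degree])
    return chain


print("-".join(falling_fifths()))
```

Seven steps return the chain to its starting degree, which is why the complete diatonic sequence both departs from and arrives back on I.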
The analytical task of understanding a development’s harmonic strategy is closely linked to understanding its thematic strategy. In developments that subject a single motive to exhaustive sequential treatment, the motivic unity compensates for harmonic diversity: the listener can track the same motive as it passes through a range of keys, maintaining a thread of motivic identity even amid harmonic instability. In developments that use multiple thematic fragments simultaneously — presenting P-zone material in one key, then S-zone material in another, then a new combination of both — the harmonic and thematic diversity interact to create a sense of controlled formal complexity. The analyst’s goal is always to understand how the development’s thematic and harmonic choices interact to create the movement’s specific formal and expressive argument at this pivotal section of the form.
6.3 The Retransition
The retransition (RT) is the terminal passage of the development, whose function is to stabilize the harmony on the home dominant and to prepare the listener’s ear for the recapitulation’s tonic return. In most Classical developments, the RT consists of a prolonged dominant pedal — the bass holding the home dominant (the fifth degree of the tonic key) while the upper voices move restlessly above it, creating an accumulation of harmonic tension that is released only when the tonic arrives at the recapitulation’s opening.
The dominant pedal of the retransition is one of the most powerful formal devices in the Classical arsenal. By holding the same bass note for an extended period while the upper voices move chromatically or sequentially, the composer creates a double tension: the harmonic tension of an unresolved dominant seventh, and the formal tension of a sustained expectation of the recapitulation. The longer the dominant pedal is sustained, the more powerful the recapitulation’s arrival will feel. Beethoven, always sensitive to the formal energetics of this moment, sustains the dominant pedal in the retransition of the Eroica Symphony’s first movement for a remarkable sixteen bars — an eternity of suspended harmonic expectation before the recapitulation bursts in.
The length of the retransition is one of the most variable formal parameters in sonata form, and its variation has direct expressive consequences. A very short retransition — as little as a single bar of dominant harmony before the tonic arrives — creates a sense of abruptness, of the recapitulation arriving before the listener is fully prepared for it. This effect can be comic (a joke about the convention of the retransition, as Haydn exploits it) or dramatic (a sense of formal urgency that the development cannot afford to linger). A very long retransition, by contrast, creates an almost unbearable formal suspense: the listener knows the recapitulation is coming, can feel its imminence in the sustained dominant harmony, and is kept waiting long past the point of comfortable expectation.
The harmony of the retransition frequently involves more than a simple dominant pedal. Composers often approach the dominant pedal through a chromatic process: the development may have ventured into remote tonal areas (\(\flat\text{VI}\), \(\flat\text{VII}\), or other chromatic regions), and the retransition must navigate back to the home dominant from wherever the development has ended. This navigation may involve a dominant of the dominant (V/V → V), an enharmonic respelling that reinterprets a chromatically distant chord as the home dominant seventh, or a diminished seventh pivot that slides smoothly from the development’s final chord to the dominant seventh of the home key. Mozart’s retransition in the first movement of the Piano Sonata in D major, K. 311, provides an elegant example of the last technique: after development material in B minor, a diminished seventh chord is reinterpreted to become the dominant seventh of D major, and the retransition pedal begins.
Haydn’s retransitions deserve special mention because Haydn frequently uses them as sites of wit and formal play. In several of his string quartets and symphonies, the retransition’s dominant pedal becomes a kind of comic waiting game: the dominant is prolonged far past the expected point, the texture strips down to a bare minimum (sometimes a single instrument sustaining the dominant note), and the silence before the recapitulation becomes a formal theatrical pause — the listener hanging in expectation. The recapitulation’s eventual arrival, after this protracted waiting, carries an explosive formal release whose comic timing Haydn controls with the precision of a master dramatist.
6.4 The Recapitulation
The recapitulation restates the exposition’s material in a fundamentally revised tonal context. The P-zone returns in the tonic (as expected). The transition is rewritten to be non-modulating (since the S-zone must now be in the tonic rather than the secondary key). The S-zone is presented in the tonic, delivering the essential structural closure (ESC). The C-zone confirms the tonic. The tonal argument initiated by the exposition — the departure to the secondary key — is resolved by the recapitulation’s tonic-centered presentation of all the expositional material.
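In the simplest major-mode case, presenting the S-zone in the tonic amounts to transposing it by the interval between the secondary key and the home key. The sketch below assumes a hypothetical S-zone incipit given as MIDI note numbers and always transposes downward; real recapitulations may move material up a fourth instead, recompose it, or (in minor-mode movements) change its mode, none of which this illustration attempts.

```python
PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def recap_shift(secondary_pc: int, home_pc: int) -> int:
    """Downward semitone shift (0 to -11) carrying S from the secondary key to the tonic."""
    return -((secondary_pc - home_pc) % 12)


def transpose(midi_pitches: list, shift: int) -> list:
    """Shift every MIDI pitch by the same number of semitones."""
    return [p + shift for p in midi_pitches]


# Hypothetical S-zone incipit in G major, from the exposition of a C-major movement.
s_expo = [67, 71, 74, 72, 71, 69, 67]
shift = recap_shift(PITCH_CLASS["G"], PITCH_CLASS["C"])  # down a perfect fifth
s_recap = transpose(s_expo, shift)
print(shift, s_recap)
```

The melodic contour is untouched by the shift; what changes is the tonal allegiance of every pitch, which is the sense in which the recapitulation resolves rather than merely repeats the exposition.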
The recapitulation is not — as nineteenth-century formalistic accounts sometimes suggested — a simple “repeat” of the exposition in the tonic. It is a creative transformation of the exposition’s material: the same themes, in a new tonal context, with revised harmonic content, and with a profoundly different formal meaning. The listener who has heard the S-zone material in the dominant during the exposition now hears it in the tonic during the recapitulation, and this harmonic difference changes the material’s expressive character. A lyrical S-zone theme that had a quality of gentle contrast (offering the brightness of the dominant against the tonic’s solidity) acquires, when heard in the tonic during the recapitulation, a quality of homecoming: it is the same melody, but it now belongs to the tonic world rather than the dominant world, and its closure in the tonic carries an emotional weight that its earlier dominant-key version could not.
The recapitulation’s most technically demanding task is the rewriting of the transition. In the exposition, the transition moved from the tonic to the secondary key. In the recapitulation, this modulation is not needed, because the S-zone must remain in the tonic, and so the transition must be substantially revised. Beethoven’s approaches to this problem are instructive in their variety. In Op. 13, he truncates the transition dramatically: the S-zone’s material arrives with unusual abruptness, entering in F minor before settling into the tonic C minor (instead of E-flat). In Op. 57, he rewrites the transition with an entirely new harmonic detour through D-flat major, creating a recapitulation transition that is even longer and more harmonically intense than the exposition’s transition. In Op. 67 (Symphony No. 5), the recomposition begins even earlier: a brief oboe cadenza, absent from the exposition, interrupts the returning P-zone at its fermata, commenting on the exposition’s material with a melancholy new voice.
6.5 The Coda
The coda is material that follows the ESC and extends the movement beyond its structural completion. Beethoven’s mature codas are among the most formally significant in the repertoire: rather than merely confirming the tonic with post-cadential gestures (as Classical codas typically do), they sometimes revisit the development’s harmonic territory, introduce new thematic material, or dramatically reinterpret expositional themes in the light of all that has preceded them.
The coda’s formal status in the Classical and Romantic tradition is a contested question. Hepokoski and Darcy regard the coda as formally optional: movements may end at the ESC, or may extend beyond it with a coda of any length. Caplin treats the coda as having ending function at the movement level — it is the movement’s final formal unit, providing post-ESC tonic confirmation and concluding the formal argument. Both frameworks agree that the coda’s content is more free than any section before the ESC: since the ESC has already resolved the movement’s large-scale tonal argument, the coda does not need to maintain the tight-knit organization of the recapitulation’s key zones. It can be as brief as a few bars of tonic confirmation or as expansive as a second development section.
Haydn’s codas in his mature symphonies often make a final formal joke: an unexpected dynamic or textural change at the very end, a sudden quiet after a long forte passage, or a single unexpected harmonic gesture that reframes the entire movement’s formal argument. Mozart’s codas tend to be relatively brief, adding a few bars of post-ESC confirmation that round off the movement cleanly without altering the expressive argument. Beethoven’s most famous codas — the Eroica’s 140-bar coda, the Appassionata’s developmental coda, the Fifth Symphony’s breathtaking fortissimo extended coda — are formal events in their own right, sometimes longer than the development section and carrying a weight of formal and expressive significance that transforms the movement’s overall architecture. In these grand Beethoven codas, the coda is not a postscript but a culmination, and the movement’s formal argument is not truly complete until the coda has spoken.
6.6 The “Eroica” Recapitulation as Case Study
The first movement of Beethoven’s Symphony No. 3 in E-flat major, Op. 55 (“Eroica”), is perhaps the most analyzed sonata-form movement in the literature, and for good reason: its exposition, development, and recapitulation each contain formal features that push against the boundaries of Classical convention while remaining firmly grounded in the sonata-form logic they are subverting.
The Eroica’s coda (measures 551–691) is formally audacious in its own right. Following the ESC, the coda introduces a completely new theme (a flowing E-flat major melody in the woodwinds and brass that has never appeared before in the movement), as if the movement’s argument, having been resolved, now has the freedom to explore entirely new territory. This “new theme in the coda” is a formal strategy that Beethoven uses in several mature works (including the first movement of the Fifth Symphony) and that represents a significant expansion of the coda’s traditional confirming function into something closer to a “second development.”
The Eroica’s development section is also a case study in developmental formal organization. Its opening (measures 152–184) begins in the secondary key (B-flat major, where the exposition ended) and gradually moves away through a sequence of modulations. The famous crisis passage (measures 248–280), a massive fortissimo dissonance that is among the most harmonically violent passages in all of Classical music, arrives in the middle of a sequence leading through E minor, C major, and F major. The brass then sustain a dissonant chord (B-flat against A in the strings) for two bars, and the texture collapses into a quiet, hushed statement of what at first appears to be the P-zone theme. This quiet statement, in E minor rather than the home key of E-flat major, is what Rosen identifies as the development’s most formally audacious gesture: the introduction of a new theme at the development’s center, so far removed from the expositional material as to constitute an entirely new thematic entity. The new E-minor theme is the development’s formal and emotional climax, and its return, combined with the retransition’s long dominant preparation, creates the formal context from which the premature horn call and the eventual true recapitulation emerge. The Eroica first movement is, in this sense, the founding document of the Romantic sonata’s expansion of formal possibility: it demonstrates that sonata form is not a template to be filled but a framework to be explored, subverted, and ultimately transcended from within.
6.7 Hepokoski and Darcy’s Dialogic Framework
Hepokoski and Darcy’s most significant contribution to sonata theory is their insistence that sonata movements conduct an implicit dialogue with the generic norm. The “norm” is not a rigid template but a flexible set of expectations — the normative script of a sonata-form movement, with its expected zones, its expected cadential moments, its expected tonal logic. Every deviation from this script is meaningful precisely because the norm is known: a “deformation” is not a mistake but a deliberate formal argument that depends on the listener’s awareness of the norm being violated.
The concept of the dialogic form has important implications for the analyst’s method. In Hepokoski and Darcy’s framework, the analyst does not merely describe what a movement does — identifying P-zone, TR, S-zone, EEC, C-zone, and so on — but interprets the movement as a series of choices made against a background of normative possibilities. A movement that achieves its EEC exactly where the norm predicts is making a statement about conformity — its formal argument is that the normative script is appropriate to this material. A movement that repeatedly delays its EEC, offering multiple failed attempts before the true EEC arrives, is making a very different formal argument — that the normative script must be earned, that its conventions are not automatically applicable but must be striven for. The formal argument is not merely a description of what the movement does but a claim about the significance of the choices the movement makes.
The dialogic framework also explains why Hepokoski and Darcy’s analytical vocabulary centers on intentions and responses: the movement “responds” to the normative template, it “succeeds” or “fails” to fulfill the template’s expectations, it “refuses” certain normative moves. These quasi-volitional metaphors are not anthropomorphisms but analytical tools: they capture the way the movement’s formal choices acquire their meaning from the dialogue with the norm. Beethoven’s Appassionata first movement does not accidentally fail to achieve its EEC on the first attempt; it deliberately, repeatedly, and dramatically refuses each initial PAC in the secondary key, making each refusal carry expressive weight and making the eventual true EEC feel genuinely achieved rather than merely expected.
Hepokoski and Darcy also sort sonata-form movements into five types:
- Type 1: No development; the exposition is followed directly by the recapitulation. Essentially a large-scale rounded binary. Found in some late Haydn movements.
- Type 2: The recapitulation begins with S-zone material (P is omitted from the recapitulation). Rare but found in some Haydn movements.
- Type 3: The standard three-part exposition-development-recapitulation sonata form. The default understanding of "sonata form" in pedagogical literature.
- Type 4: The sonata-rondo (see Chapter 3).
- Type 5: The concerto sonata, in which the double exposition (orchestral ritornello followed by solo exposition) modifies the standard three-part plan significantly.
Chapter 7: Caplin’s Theory of Formal Functions
7.1 Formal Function as Analytical Concept
William Caplin’s Classical Form (1998) is built on a single foundational insight: that every passage of music can be understood in terms of its formal function — the role it plays within a larger formal organization. Formal functions are not labels applied after the fact; they are intrinsic properties of the music, generated by the composer’s choices about melody, harmony, rhythm, and phrase structure. To analyze a passage’s formal function is to ask not merely what the passage contains — what themes, what harmonies — but what it does: does it begin something? Continue something? End something? How does it relate to what precedes and follows it?
This functional approach to form has precedents in the German theoretical tradition: in Riemann’s theory of harmonic functions (Tonic, Dominant, Subdominant as functions rather than scale-degree labels), in Schoenberg’s analysis of theme types as functional categories, and in the broader German tendency to understand musical form in terms of logical process rather than spatial architecture. Caplin systematizes these precedents and applies them with unusual rigor to a specific repertoire (the instrumental music of Haydn, Mozart, and Beethoven), producing a framework of extraordinary analytical power within its chosen domain.
Caplin groups formal functions into three broad families:
- Beginning functions: passages that initiate formal units. Characterized by harmonic stability (tonic prolongation), clear thematic profile, and tight-knit organization. Examples: the presentation of a sentence; the antecedent of a period; the P-zone of a sonata exposition.
- Middle functions: passages that continue or develop formal units, transitioning between beginning and ending. Characterized by harmonic instability, motivic development or fragmentation, and loose organization. Examples: the continuation of a sentence; the transition of a sonata exposition; the development section.
- Ending functions: passages that close formal units. Characterized by cadential progressions, harmonic arrival, and post-cadential prolongation. Examples: the cadential function of a sentence; the consequent's PAC in a period; the ESC of a recapitulation; a coda.
Caplin’s crucial observation is that formal functions operate simultaneously at multiple hierarchical levels. A single eight-bar theme may have beginning function at the level of the exposition (it initiates the exposition’s argument) while having internally differentiated beginning, middle, and ending functions at the phrase level (the presentation begins, the continuation continues, the cadence ends). This nested hierarchy of functions is what gives Classical music its extraordinary density of formal meaning: every passage is simultaneously doing work at several levels of organization, and the analyst must specify which level they are addressing.
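The nested hierarchy described above can be pictured as a small tree in which each unit carries the function it serves within its parent. The following Python sketch is only an illustration: the class `FormalUnit`, the helper `functions_by_level`, and the eight-bar sentence it models are hypothetical, not part of Caplin's own apparatus.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FormalUnit:
    """A span of music plus the function it serves inside its parent unit."""
    name: str
    function: str  # "beginning", "middle", or "ending", relative to the parent
    children: List["FormalUnit"] = field(default_factory=list)


def functions_by_level(unit: FormalUnit, target: str,
                       path: Tuple[Tuple[str, str], ...] = ()) -> list:
    """Return the (name, function) pairs from the root down to `target`."""
    here = path + ((unit.name, unit.function),)
    if unit.name == target:
        return list(here)
    for child in unit.children:
        found = functions_by_level(child, target, here)
        if found:
            return found
    return []


# Hypothetical example: an eight-bar sentence acting as a main theme.
theme = FormalUnit("main theme", "beginning", [
    FormalUnit("presentation", "beginning"),
    FormalUnit("continuation", "middle"),
    FormalUnit("cadential", "ending"),
])
exposition = FormalUnit("exposition", "beginning", [theme])

chain = functions_by_level(exposition, "cadential")
for name, function in chain:
    print(f"{name}: {function}")
```

Reading the printed chain from top to bottom shows the same bars answering differently at each level: the cadential idea ends the theme, while the theme as a whole begins the exposition.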
One of Caplin’s most important methodological contributions is the insistence that formal functions are not merely labels but explanations. To say that a passage has “middle function” is not just to place it in a taxonomic category; it is to explain why the passage sounds unstable, why its phrase boundaries are irregular, why its harmonic content is more complex than passages with beginning or ending function. The formal function is the explanatory principle that accounts for the music’s local character. A passage with middle function sounds the way it sounds because middle function requires harmonic instability, motivic development, and loose organization — properties that are as audible as they are analytically describable. Caplin’s framework thus bridges the divide between technical analysis and perceptual description: what the analyst labels as “middle function” is what the listener hears as “transitional,” “developmental,” or “searching.” The framework gives the listener’s intuition a precise technical correlate.
The relationship between Caplin’s formal functions and Schenker’s structural levels is a matter of ongoing theoretical discussion. Schenker’s system organizes tonal music into background, middleground, and foreground levels, with the background being the fundamental structure (Ursatz) that underlies the entire piece and the foreground being the actual note-by-note surface. Caplin’s formal functions operate at what might be called the “phrase-to-movement” middleground: above the note-to-note foreground (which Caplin largely ignores) but below the whole-movement background (which Schenker addresses at his deepest structural level). The two approaches are complementary: Schenker explains how the notes connect across vast spans of musical time, and Caplin explains what formal roles the local passages play within their immediate formal context. Together, they provide a multi-level account of tonal formal organization that is more complete than either approach alone.
7.2 Tight-Knit vs. Loose Organization
- A tight-knit passage exhibits: regular phrase lengths (four-bar norms maintained); simple harmonic content (primarily \(\text{I}\) and \(\text{V}\)); strong cadential closure (PAC); unified and coherent thematic content. Tight-knit passages typically function as beginnings or endings in the formal hierarchy — they are the stable poles of formal organization.
- A loose passage exhibits: irregular phrase lengths (extended, compressed, or elided); more complex harmonic content (secondary dominants, chromatic progressions, sequential passages); weak cadential closure (IAC, HC, evaded cadence); fragmented or varied thematic content. Loose passages typically function as middles or transitions in the formal hierarchy — they are the dynamic, unstable regions between stable poles.
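The four criteria in the two bullets above can be read as a checklist. As a rough sketch only, the Python below counts how many criteria point toward loose organization. The names `PassageProfile` and `looseness`, and the equal weighting of the four criteria, are assumptions made for illustration; Caplin treats tight-knit and loose as a continuum judged in context, not as a numerical score.

```python
from dataclasses import dataclass


@dataclass
class PassageProfile:
    regular_phrase_lengths: bool  # four-bar norms maintained
    simple_harmony: bool          # primarily I and V
    closing_cadence: str          # "PAC", "IAC", "HC", or "evaded"
    unified_theme: bool           # coherent rather than fragmented material


def looseness(p: PassageProfile) -> int:
    """Count the criteria (0 to 4) that point toward loose organization."""
    score = 0
    score += 0 if p.regular_phrase_lengths else 1
    score += 0 if p.simple_harmony else 1
    score += 0 if p.closing_cadence == "PAC" else 1
    score += 0 if p.unified_theme else 1
    return score


# Hypothetical profiles for a stable main theme and a destabilizing transition.
main_theme = PassageProfile(True, True, "PAC", True)
transition = PassageProfile(False, False, "HC", False)

print(looseness(main_theme), looseness(transition))  # 0 4
```

Most real passages fall between the two extremes, which is why an intermediate count (for example, regular phrase lengths but an evaded cadence) is often the more interesting analytical result.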
The contrast between tight-knit and loose organization is one of the most immediately audible dimensions of Classical phrase structure. A listener who does not know any of Caplin’s terminology will nonetheless hear the difference between the stable, regular P-zone theme (tight-knit) and the destabilizing transition (loose): the former sounds settled and self-contained; the latter sounds like it is going somewhere, building toward something, not yet arrived. Caplin’s analytical framework gives this intuitive perception a precise technical vocabulary.
The tight-knit/loose distinction also has important implications for the analyst’s understanding of expressive form. A movement that begins with a highly tight-knit, secure P-zone and then subjects that material to extensive loosening in the development creates an expressive trajectory from security to instability to (recapitulatory) resolution. A movement that begins with an unusually loose, uncertain P-zone — as in many of Schubert’s sonata first movements, where the opening theme may be relatively extended and harmonically ambiguous — establishes a different expressive premise: the movement begins already in a searching, unsettled state, and the recapitulation’s return to that same loose, searching material does not carry the same sense of reassurance and resolution that a tight-knit P-zone recapitulation would provide. The expressive implications of tight-knit versus loose organization at the movement’s formal poles shape the entire movement’s expressive trajectory, and attending to these implications is an important dimension of formal analysis that purely structural accounts of sonata form often overlook.
7.3 The Transition’s Formal Function
Caplin’s analysis of the transition is among his most influential contributions. Older accounts describe the transition as a “bridge” between the first and second themes — neutral connective tissue whose only job is to move between keys. Caplin reveals that the transition has a specific and distinctive formal function: it combines middle function at the level of the exposition (it is between P and S) with a characteristic harmonic goal (the medial caesura). The transition’s formal function is “dissolution”: it loosens the tight-knit organization of the P-zone and builds momentum toward the secondary key’s dominant.
Caplin distinguishes two types of transition based on their thematic source: the continuous transition derives its material from the P-zone itself (as in the Appassionata example above, where the opening four-note descent is the transition’s primary material), while the independent transition introduces new thematic material not heard in the P-zone. The continuous transition creates a stronger sense of P-zone dissolution — the familiar material is being broken apart before the listener’s ears — while the independent transition creates a cleaner formal boundary between P and TR. Composers mix both strategies: a transition may open with P-zone material (establishing continuity with the opening theme) and then introduce new material as it approaches the MC (signaling that the P-zone world has been left behind). The distinction matters analytically because it affects how the listener tracks thematic identity and formal function simultaneously — a continuous transition asks the listener to hear the “same” material functioning differently, while an independent transition announces by its novelty that a new formal phase has begun.
7.4 Formal Functions in the Development and Recapitulation
The development section has an overwhelmingly middle function at the level of the movement: it is the unstable, developmental middle between the beginning (exposition) and the ending (recapitulation). Within the development, there may be internal formal functions — a passage of beginning function (a “model” or new initiating idea), followed by a middle-function sequence, followed by an ending-function retransition. These internal functions create a sense of formal organization within the development’s instability, preventing it from becoming a structureless continuum of harmonic meandering.
The recapitulation has a complex dual function. At the movement level, it has ending function: it is the moment at which the formal argument of the movement is resolved, the tonic is confirmed, and the movement’s large-scale harmonic drama concludes. But internally, the recapitulation restates the beginning-function material of the exposition (the P-zone), creating a local sense of “beginning again” within the overall context of formal conclusion. This dual function — ending-as-beginning — is what gives the recapitulation its distinctive emotional quality: it is simultaneously a resolution and a return, simultaneously the end of the argument and the restoration of the beginning.
Caplin’s concept of dissolution is central to understanding the development’s formal behavior. In the exposition, the P-zone and S-zone are characterized by what Caplin calls “tight-knit” formal organization: clear phrase beginnings and endings, predictable cadential patterns, well-defined formal boundaries between presentation and continuation, and a secure sense of tonal center throughout each section. The development systematically loosens this tight-knit organization. Motivic fragments are extracted from their original phrase contexts and subjected to sequential treatment, so that the same four-note gesture that began a complete phrase in the exposition now generates a sequence of five or six sequential repetitions with no stable arrival point. Cadences that were definitive in the exposition are now avoided or undermined: the development is, among other things, a systematic avoidance of full authentic cadences in the home key. This “dissolution” of tight-knit structure is not a failure of formal logic but a deliberate and positive formal procedure — the development earns its right to return to the tight-knit clarity of the recapitulation precisely by abandoning that clarity so thoroughly.
In the recapitulation, formal functions differ from their expositional counterparts in several important ways. The P-zone in the recapitulation often differs from the expositional P-zone in ways that reflect the harmonic journey just completed: it may be compressed (suggesting the composer wishes to arrive at the now-tonic-key S-zone as quickly as possible), expanded (suggesting the composer wishes to linger in the tonic arrival), or texturally transformed (adding a countermelody, altering the orchestration, or restating the theme in a different register). The transition in the recapitulation must be substantially rewritten: where the expositional transition modulated away from the tonic, the recapitulatory transition must either avoid modulating or modulate in a direction that allows the S-zone to arrive in the tonic. The S-zone in the recapitulation — now in the tonic key — carries the formal weight of the ESC, the structural goal of the entire movement. Caplin notes that the S-zone’s tonic-key presentation in the recapitulation often feels subtly different in character from its dominant-key presentation in the exposition: the resolution of the tonal argument gives the familiar theme a quality of arrival and completion that it could not have had when it was first heard in the “wrong” key.
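The tonal adjustment the recapitulation must make can be expressed as simple pitch-class arithmetic. The sketch below assumes twelve-tone equal temperament and integer pitch classes (C = 0); the names `NOTE_TO_PC`, `recap_shift`, and `transpose` are hypothetical helpers. It captures only the level of transposition and ignores changes of mode, register, and any recomposition of the theme.

```python
# Pitch classes in semitones above C (enharmonic spellings share a number).
NOTE_TO_PC = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10,
    "B": 11,
}


def recap_shift(home_tonic: str, secondary_tonic: str) -> int:
    """Semitones (mod 12) to raise expositional S material so that its
    tonic lands on the home tonic in the recapitulation."""
    return (NOTE_TO_PC[home_tonic] - NOTE_TO_PC[secondary_tonic]) % 12


def transpose(pitch_classes: list, shift: int) -> list:
    """Transpose a list of pitch classes upward by `shift` semitones."""
    return [(pc + shift) % 12 for pc in pitch_classes]


# Major-mode norm: S in the dominant (G) returns in the tonic (C).
shift = recap_shift("C", "G")      # 5: up a fourth, equivalently down a fifth
triad_in_g = [7, 11, 2]            # G, B, D
print(shift, transpose(triad_in_g, shift))  # 5 [0, 4, 7]

# Minor-mode case like the Pathetique: S around E-flat returns around C.
print(recap_shift("C", "Eb"))      # 9
```

The transition has to absorb exactly this difference: whatever modulation the expositional transition performed, the recapitulatory transition must end a fifth lower (or a fourth higher) in the major-mode norm.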
7.5 Applying Caplin’s Framework: A Reading of Beethoven Op. 13
A complete application of Caplin’s analytical framework to the first movement of the Pathétique Sonata, Op. 13, reveals the depth and explanatory power of the formal-function approach. The Pathétique is an especially instructive case for Caplin’s framework because its formal organization is simultaneously conventionally clear (the three-part sonata structure is unambiguous) and expressively complex (Beethoven introduces a slow Grave introduction that returns within the Allegro, creating an unusual relationship between introductory and main-movement material). Caplin’s framework must account not only for the normative formal organization of the Allegro but also for the introduction’s formal status and its integrating role across the movement’s formal structure.
Before turning to the Allegro, the Grave introduction (measures 1–10) deserves careful Caplinian analysis. The introduction is formally a large-scale prefix at the movement level: it precedes the sonata-form argument without participating in it structurally. But Caplin’s analysis reveals more. Within the introduction there is a sentence-like organization, with a basic idea (measures 1–2), a restatement (measures 3–4), and a continuation→cadential unit (measures 5–10), that mirrors the formal logic of the Allegro’s tight-knit themes while remaining harmonically open and unresolved. The introduction ends not with a PAC but with a diminished seventh chord over the dominant bass: a deliberately unstable close, an unresolved “question” that makes the Allegro’s entry feel like an urgent formal and harmonic response to something the introduction has posed but cannot answer on its own terms.
The Allegro exposition (measures 11–132) has beginning function at the movement level and contains three sub-sections with their own internal functions:
- The P-zone (measures 11–22): beginning function at the exposition level; internally a tight-knit sentence (presentation, continuation, cadential).
- The transition (measures 23–50): middle function at the exposition level; internally loose, with fragmentation and harmonic acceleration.
- The S-zone (measures 51–88): beginning function (from the S-zone’s perspective) at the exposition level; internally a period-like organization with a lyrical antecedent and consequent, beginning in E-flat minor and reaching E-flat major at the EEC.
- The C-zone (measures 88–132): ending function at the exposition level; internally post-cadential closure material confirming E-flat major before the exposition repeat.
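The measure spans in the list above can be checked mechanically for gaps and overlaps. The following Python sketch takes the spans exactly as given in these notes; the function name `seams` and the labels "adjacent" and "elided" are hypothetical. A shared boundary bar is treated as an elision, on the assumption that the bar closing one zone also opens the next.

```python
# Spans as stated in the notes: (label, first measure, last measure).
exposition = (11, 132)
zones = [("P", 11, 22), ("TR", 23, 50), ("S", 51, 88), ("C", 88, 132)]


def seams(whole, parts):
    """Classify each boundary between consecutive zones.

    Raises ValueError if the zones do not cover `whole` or if two zones
    are separated by a gap or overlap by more than one shared bar.
    """
    start, end = whole
    if parts[0][1] != start or parts[-1][2] != end:
        raise ValueError("zones do not span the whole section")
    result = []
    for (a, _, a_end), (b, b_start, _) in zip(parts, parts[1:]):
        if b_start == a_end + 1:
            kind = "adjacent"   # next zone starts in the following bar
        elif b_start == a_end:
            kind = "elided"     # one bar ends a zone and begins the next
        else:
            raise ValueError(f"gap or overlap between {a} and {b}")
        result.append((a, b, kind))
    return result


for a, b, kind in seams(exposition, zones):
    print(f"{a} -> {b}: {kind}")
```

Run on these spans, the only elided seam is the one between S and C at measure 88, which matches the reading of that bar as both the EEC and the start of the closing material.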
The development (measures 133–194) has middle function at the movement level. Beethoven subjects the P-zone’s arpeggio figure to sequences through E minor, G major, and G minor before the retransition builds a prolonged dominant pedal on G that prepares the return of C minor.
The recapitulation (measures 195–294) has ending function at the movement level. The P-zone returns in C minor at measure 195; the transition is rewritten and substantially compressed, leading to measure 220; the S-zone appears in C minor (measures 221–252), delivering the ESC at measure 252; and the C-zone (measures 252–294) confirms C minor before the final cadential close.
A particularly rich analytical observation about the Pathétique’s formal structure concerns the reappearance of the Grave introduction within the Allegro. The Grave material returns twice during the movement: once at the beginning of the development section (measure 133), where it appears in G minor, signals the development’s opening, and turns toward the E minor in which the Allegro resumes; and once more after the recapitulation has closed (measure 295), just before the brief final Allegro. In Caplin’s framework, these Grave reappearances within the Allegro are unusual: the introduction, formally a prefix at the movement level, is not supposed to participate in the sonata form’s argument. Beethoven’s reintroduction of the Grave within the Allegro blurs this distinction: the introductory material has been drawn into the formal argument, serving as a developmental gesture within the development section itself. This formal integration of the introduction into the main movement’s argument is a distinctive Pathétique strategy, and it gives the movement a cyclical quality, with the Grave material framing the entire movement and periodically returning to comment on the ongoing formal argument, that anticipates the Romantic cyclic form procedures of later composers.
7.6 Caplin’s Framework Applied to the Late Beethoven Style
Caplin’s framework was designed for the “classical style” of Haydn, Mozart, and early Beethoven, and it encounters some analytical challenges when applied to the late Beethoven style of the Op. 101–111 sonatas. The late style is characterized by formal procedures that deliberately defy or transcend the standard sentence/period/hybrid categories: extremely slow tempos that make phrase lengths ambiguous; continuous variation of thematic material that blurs the distinction between “statement” and “development”; and a tendency toward what Caplin calls “loosely knit” formal organization throughout, even in passages that would conventionally have tight-knit (beginning-function) character.
These challenges reveal the limits of any purely functional analytical approach when applied to music that is deliberately resistant to functional clarity. The late Beethoven style’s formal ambiguities are not failures of formal logic but a higher-order formal strategy: Beethoven deliberately creates passages that could be heard as either ending one formal unit or beginning another, either completing a phrase or initiating a new one, either providing closure or generating new departure. This deliberate formal indeterminacy is the stylistic correlate of the late works’ philosophical depth: formal certainty would imply that the music’s large-scale formal argument has been settled, whereas formal indeterminacy implies that the music’s argument is still open, still in progress, still refusing the neat cadential closures and well-defined formal boundaries of the middle-period style.
Caplin himself acknowledges these limitations: his framework is presented as a theory of the Classical style, not a universal theory of musical form. Applied to the late Beethoven style, it functions best as a set of normative expectations against which the late style’s deviations can be measured: knowing what a sentence “should” do, one can identify the specific ways in which the opening theme of Op. 109 departs from those expectations and assess what expressive and formal meaning those departures carry. The normative framework remains analytically useful even when the music violates it; indeed, violations of the norm are meaningful only against the backdrop of the norm. This is precisely the lesson of Hepokoski and Darcy’s dialogic framework, transferred from the level of the movement to the level of the phrase: formal meaning is always relational, always defined by the dialogue between what the music does and what the normative framework would lead one to expect.
Chapter 8: Form in the Nineteenth and Twentieth Centuries
8.1 Sonata Form Transformed: Romantic Expansions
The history of sonata form in the nineteenth century is not a history of its abandonment but of its transformation under new pressures — larger orchestras, longer movements, more complex harmonic languages, and new aesthetic imperatives rooted in the Romantic ideal of organic form, in which a whole work grows from a single motivic seed. The fundamental architecture of exposition-development-recapitulation remained the dominant formal framework throughout the century, but its internal relationships were significantly reinterpreted.
The expansion of the development section is the most visible Romantic transformation. Where a Classical development might occupy thirty to sixty bars, a Romantic development (in Schubert, Brahms, or Bruckner) may extend to several hundred bars, exploring a far wider range of tonal areas and subjecting the expositional themes to more exhaustive motivic elaboration. Brahms’s Symphony No. 1 in C minor, Op. 68, first movement, has a development section that rivals the exposition in length and exceeds it in harmonic scope, passing through E-flat major, D-flat major, and A minor before the retransition’s long dominant preparation. The development’s expansion reflects the Romantic aesthetic priority of exploration over architecture: the process of harmonic searching becomes as important as, and sometimes more important than, the formal goal of the recapitulation.
Cyclic form — the recall of material from earlier movements in later ones — is a Romantic innovation that profoundly altered the relationship between a work’s movements. César Franck’s Symphony in D minor (1888) is a classic of cyclic form: a single generating theme appears in all three movements, transformed in character but recognizable as the same musical idea, creating an inter-movement formal unity that supplements and transforms the intra-movement formal structures. Berlioz’s Symphonie fantastique (1830) uses its idée fixe — a melody representing the beloved — as a cyclic link between all five movements. In these works, formal coherence is no longer purely intra-movement: the formal argument spans the entire work, and the recurrence of cyclic material creates long-range formal parallels that override the individual movements’ internal formal structures.
The delayed recapitulation is another characteristically Romantic manipulation of sonata form. In Classical sonata form, the recapitulation arrives at a predictable juncture: after the development has run its course and the retransition has established the dominant, the P-zone theme enters in the tonic in a moment of formal clarity that listeners familiar with the genre have been anticipating throughout the development. Romantic composers discovered expressive possibilities in withholding or ambiguating this arrival. The first movement of Schubert’s Symphony No. 9 in C major delays the recapitulation through an exceptionally long retransition that loops repeatedly around the dominant, creating an almost hallucinatory sense of suspension. Brahms’s Symphony No. 3 in F major, first movement, introduces the recapitulation in a hushed, subterranean register that initially sounds more like a developmental episode than a formal return: the recapitulation arrives before the listener has fully registered it.
Tonal pairing (the systematic use of two tonal centers in a non-hierarchical or deliberately ambiguous relationship) is a Schubertian and Romantic formal strategy that challenges the Classical sonata’s binary tonic-dominant logic. Where Classical sonata form defines itself by the opposition between tonic (\(\text{I}\)) and dominant (\(\text{V}\)), Romantic forms often substitute a third-related opposition: tonic and mediant (\(\text{I}\) and \(\text{III}\)), or tonic and submediant (\(\text{I}\) and \(\text{VI}\)), creating a “flat-side” or “sharp-side” tonal narrative in place of the Classical tonic-dominant one. Robert Bailey, Harald Krebs, and other theorists have analyzed this Romantic tonal pairing as a reflection of a broader Romantic aesthetic of duality and irresolution: where Classical tonality resolves its tensions through a clear hierarchical return to the tonic, Romantic tonal pairing often leaves the two tonal centers in productive tension rather than resolving one into the other.
The Romantic expansion of the introduction is another transformation worth noting. Classical slow introductions, like that of Haydn’s Symphony No. 104, are typically formally discrete entities, clearly separated from the Allegro that follows; the Grave of Beethoven’s Op. 13, which returns within the Allegro, is a notable exception. Romantic introductions tend to be longer, more motivically integrated with the main movement’s thematic material, and more formally ambiguous in their relationship to the main movement. Brahms’s Symphony No. 1 opens with a long introduction whose material is intimately connected to the main Allegro’s thematic content, so intimately that the boundary between introduction and Allegro is itself a matter of analytical debate. In Bruckner’s symphonies, the slow introduction is replaced by a “Bruckner beginning”: a tremolando or sustained tone from which the main theme gradually emerges, as though the music is crystallizing out of silence rather than beginning with a formal gesture. These Romantic introductory strategies reflect a broader Romantic interest in formal genesis: the idea that a work’s material grows or emerges from a seed rather than being formally presented at the outset.
8.2 Schubert’s Formal Innovations
Franz Schubert’s relationship to sonata form is one of the most debated topics in nineteenth-century music theory. Schubert preserves the fundamental three-part structure of sonata form (exposition, development, recapitulation) but transforms its internal logic in several distinctive ways that collectively create what might be called a “Schubertian” formal style.
The most important Schubertian formal innovation is the use of third-related key areas in the exposition. Where Classical expositions typically move from tonic to dominant (I to V in major), Schubert’s expositions often move from the tonic to a key a third away (the mediant, a third above, or the submediant, a third below), creating a more colorful, harmonically adventurous expositional argument. Other expositions step outside the tonic-dominant axis altogether: the first movement of the Piano Sonata in C major, D. 840, begins its S-zone in B minor rather than in the dominant, a key that the Classical sonata would rarely employ as a primary expositional key contrast.
A related Schubertian strategy is what analysts have called the three-key exposition: an exposition that moves through not two but three distinct tonal areas before the C-zone closes the exposition. The first movement of the String Quintet in C major, D. 956, moves from C major (P-zone) through E-flat major (a brief, transitional area) to G major (the S-zone in the dominant), creating an exposition with a richer and more complex tonal geography than the Classical norm. The three-key exposition is not merely a proliferation of tonal areas for its own sake: each key carries its own expressive associations, and the accumulation of tonal contrasts within a single exposition creates a larger harmonic canvas against which the development and recapitulation can operate.
Schubert also transforms the development section in characteristic ways. Where Classical developments are often compact and highly focused, taking a single motivic idea and subjecting it to a rapid sequence of transpositions and harmonizations, Schubert’s developments tend to move more slowly and dwell more contemplatively in each tonal area. The development of the first movement of the Piano Sonata in B-flat major, D. 960, begins in C-sharp minor, a key remote from the home key of B-flat major and reached through an enharmonic respelling, and moves through a series of dreamy, softly scored passages in remote keys before the retransition’s long dominant preparation eventually returns to B-flat major. The development’s character is exploratory and ruminative, not dynamically driven: Schubert’s developments dwell in tonal spaces, creating atmosphere rather than generating dramatic momentum.
The recapitulation in Schubert frequently introduces harmonic surprises that Classical norms would not allow. In the first movement of the Symphony No. 8 in B minor, D. 759 (“Unfinished”), the recapitulation transposes the S-zone material not to the expected tonic (B minor or B major) but to D major, the relative major, rather than to the tonic that the Classical norm prescribes. This “wrong-key” recapitulation (from a strict Classical perspective) is in fact a deliberate expressive choice: the D-major S-zone material in the recapitulation has a quality of luminous, bittersweet resolution that the B-minor tonic would not have provided, and it leaves the movement’s tonal argument subtly unresolved even after the recapitulation is complete.
8.3 Liszt and the Single-Movement Sonata
Franz Liszt’s Piano Sonata in B minor (1853) is among the most audacious formal experiments of the nineteenth century: a single-movement work of approximately thirty minutes’ duration that simultaneously represents multiple formal types (sonata form, four-movement symphonic form, and theme and variations), layered upon each other in a structure of exceptional complexity.
The formal principle that makes the Liszt Sonata’s multi-layered structure coherent is thematic transformation: the three generative motives announced in the slow introduction are not merely themes to be stated and developed but seeds from which the entire work’s formal and expressive content is grown. Each transformation of a motive is simultaneously a new formal event (a different section of the sonata form) and a new expressive statement (a different character or mood derived from the same fundamental material). Transformation is thus both a compositional technique (the same basic pitches and intervals, transformed in rhythm, tempo, harmony, and articulation) and a formal principle (the sequence of transformations constitutes the work’s formal architecture). The sonata form’s exposition-development-recapitulation provides the large-scale formal framework, but within that framework the transformations of the three motives provide the local formal articulations.
The Liszt Sonata’s influence on subsequent composers was substantial. A long line of single-movement piano works by composers from Fauré to Scriabin reflects, directly or indirectly, the Lisztian model of the single-movement work as a compressed symphony whose formal complexity is managed through thematic transformation rather than through the conventional multi-movement architecture. The single-movement form became one of the most characteristic formal innovations of the late Romantic period, and its challenge (how to maintain coherence and direction across a very long span of time without the multi-movement structure’s conventional formal architecture) was one of the central formal problems of nineteenth-century music.
8.4 The Symphonic Poem and Through-Composed Form
The symphonic poem (or tone poem) — a one-movement orchestral form representing an extra-musical subject — presented composers with the challenge of inventing or adapting a musical form suited to a specific narrative or pictorial program. Richard Strauss’s Don Juan, Op. 20 (1888), uses a heavily modified sonata form to trace the protagonist’s amorous adventures; the programmatic narrative and the formal logic reinforce each other, the sonata’s P-zone representing Don Juan’s vital, questing energy and the S-zone his lyrical encounters with his various loves. Sibelius’s symphonic poems take a different approach: the arch form of En saga, Op. 9, and the continuous transformation of Tapiola, Op. 112, suggest formal paradigms derived from Finnish narrative traditions rather than the German symphonic tradition.
Through-composed form in song abandons repetitive formal structures entirely in favor of a musical form that follows the poem’s evolving narrative from beginning to end. Schubert’s Der Erlkönig, D. 328 (1815) — one of over six hundred Lieder Schubert composed in his short life — is the canonical example. Goethe’s ballad presents four dramatically distinct characters (narrator, father, child, Erlking) in a racing, terrifying narrative of a child’s death. Schubert’s through-composed setting gives each character its own melodic style: the father’s rational, calming tones; the child’s increasingly desperate cries; the Erlking’s seductive, honeyed phrases; the narrator’s objective account in G minor. The formal through-composition reflects the narrative urgency of the poem — a strophic setting (the same music for each stanza) would falsify the drama by making each encounter feel formally equivalent.
The invention of the symphonic poem as a distinct genre is most closely associated with Franz Liszt, who composed thirteen such works between approximately 1848 and 1882 (twelve of them during his Weimar years), establishing the generic conventions that Strauss and others would later inherit and transform. Liszt’s symphonic poems are each formally idiosyncratic: each work invents or adapts a formal structure suited to its specific poetic or pictorial subject. This idiosyncrasy is itself part of the genre’s aesthetic argument: the symphonic poem rejects the idea that any pre-existing formal schema (sonata form, rondo, ternary) is adequate to the demands of a specific literary or programmatic content. The form must be discovered through the content.
Les Préludes, S. 97 (after a poem by Lamartine, c. 1854), is Liszt’s most frequently performed symphonic poem and a clear illustration of his formal approach. The work is organized around a single generative motive — a three-note ascending figure — that appears in different tempos, harmonizations, and orchestral textures to represent the poem’s succession of images: the prelude-like quality of life before death, pastoral love, the trials of war, and the consolations of nature. The formal structure is roughly ternary at the large scale — slow introduction, fast middle section, slower coda — but within each section the music is through-composed, following the poem’s imagery rather than a predetermined formal template. The work’s organizing principle is thematic transformation (Liszt’s term, thematische Verwandlung): the same basic motive is transformed throughout, appearing now as a lyrical melody, now as a martial fanfare, now as a pastoral horn-call, creating formal unity out of expressive diversity.
Mazeppa, S. 100 (after a poem by Victor Hugo, 1851/1858), represents a more dramatic approach. The work draws on Liszt’s earlier piano étude of the same name and opens by depicting Mazeppa’s wild ride (the Cossack hero tied to a galloping horse) in a breakneck perpetual motion that has almost no formal articulation beyond the drive of its rhythm. The “formal” structure here is essentially narrative: the music runs with the horse until both collapse, and then, after a passage of desolation, a triumphant march announces Mazeppa’s rise as a Cossack leader. The form is generated entirely by the narrative’s three phases (ride, collapse, triumph) rather than by any abstract formal scheme.
Orpheus, S. 98 (1854), provides perhaps the most formally meditative of Liszt’s symphonic poems: a slow, lyrical work in which Orpheus’s lyre is represented by soft harp and woodwind sonorities, and the overall form is a gradual intensification and then withdrawal of lyrical energy — an arch shape without dramatic conflict, purely contemplative and elegiac.
Richard Strauss brought the symphonic poem to its greatest formal complexity and orchestral elaboration. His tone poems of the 1880s and 1890s, given here with their dates of composition, are Don Juan (1888), Tod und Verklärung (1889), Till Eulenspiegels lustige Streiche (1895), Also sprach Zarathustra (1896), Don Quixote (1897), and Ein Heldenleben (1898); each employs a different formal approach suited to its specific subject. Till Eulenspiegel uses a loose rondo form, with Till’s mischievous theme returning repeatedly between adventures. Don Quixote is a theme with variations, each variation depicting one of the knight’s adventures. Also sprach Zarathustra juxtaposes a grand C-major opening (the famous fanfare representing the cosmos) with a series of episodes in B major, creating a tonal opposition that remains unresolved at the work’s end: an open form that reflects the philosophical irresolution of Nietzsche’s text.
For the analyst, through-composed and symphonic poem forms present a distinctive methodological challenge: the appropriate analytical vocabulary is not purely musical but necessarily involves the program or text that the music follows. A purely formalist analysis of Les Préludes that ignores Lamartine’s poem will miss the structural logic that organizes the work’s successive sections. Yet a purely programmatic analysis that treats the music as direct illustration of the poem risks reducing a complex musical structure to a simple translation exercise. The most productive analytical approaches treat program and musical form as independent but interacting systems: the program provides a sequence of images or events, and the composer responds to that sequence by inventing musical structures that both follow the program and achieve their own musical coherence.
8.5 Bartók’s Arch Form
Béla Bartók employed a distinctive large-scale formal principle in several of his major works: arch form (ABCBA), in which the formal sections of a multi-movement work are palindromically symmetrical around a central axis.
- First and last sections correspond (A).
- Second and penultimate sections correspond (B).
- Central section is unique (C), forming the arch's keystone.
Bartók’s String Quartet No. 4 (1928) is the locus classicus of arch form in the twentieth century. The five movements are arranged ABCBA: Movements I and V are analogous (both vigorous and rhythmically aggressive, Movement V recalling thematic material from Movement I); Movements II and IV are analogous (both scherzos with similar character and material); and Movement III is the central slow movement of almost unbearable intensity. Bartók explicitly recalls specific themes and textures at each corresponding point, making the formal symmetry clearly audible on repeated hearings.
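The ABCBA layout described above can be stated as a small check on a list of section labels. The following is only an illustrative sketch: the function names `is_arch_form` and `corresponding_movements` are hypothetical, and the labels record an analyst's prior judgment about which movements correspond, not anything computed from the score.

```python
def is_arch_form(labels):
    """True if the labels read the same in both directions around a unique centre.

    `labels` is an analyst's labelling of sections or movements, e.g. "ABCBA".
    """
    labels = list(labels)
    if len(labels) % 2 == 0:
        return False  # an arch needs a single central keystone
    keystone = labels[len(labels) // 2]
    return labels == labels[::-1] and labels.count(keystone) == 1


def corresponding_movements(count):
    """Pair up movement numbers (1-based) that mirror each other across the centre."""
    return [(i, count + 1 - i) for i in range(1, count // 2 + 1)]


if __name__ == "__main__":
    print(is_arch_form("ABCBA"))        # five-movement arch
    print(corresponding_movements(5))   # outer pair and inner pair
```

For a five-movement arch the pairs are I with V and II with IV, leaving III as the keystone. Whether each pair corresponds literally or only gesturally remains an analytical decision that the labels cannot settle.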
The arch form’s relationship to tonal symmetry is a recurring feature of Bartók’s use of the principle. In the String Quartet No. 4, the movements’ tonal centers are commonly read as C–E–A–A-flat–C: the outer movements share C, while the second and fourth movements sit a major third above and below it (E and A-flat), framing the central movement. This tonal symmetry reinforces the formal palindrome, so that the arch is simultaneously a formal and a harmonic structure. This double symmetry is characteristic of Bartók’s formal thinking more broadly: he tends to construct his large-scale formal architectures with multiple interlocking symmetrical principles rather than a single organizing axis. The result is a formal coherence of unusual density: the arch form is not merely an external template imposed on the music but an expression of the music’s own internal organizational logic.
The analytical challenge that arch form presents is the question of correspondence: in what sense are the outer sections and the inner sections “the same”? In Bartók’s works, the correspondence is sometimes literal (exact recurrences of thematic material) and sometimes gestural (the same emotional or textural character without exact thematic recall). The analyst must specify which kind of correspondence obtains and at what formal level — whether the symmetry is thematic, harmonic, textural, or merely temporal (the same number of bars). These distinctions are analytically important because they determine whether the arch form is structurally binding (exact recurrence) or merely expressively evocative (gestural return).
8.6 Additive Form in Minimalism
The Minimalist composers of the 1960s and 1970s — Steve Reich, Philip Glass, Terry Riley, La Monte Young — developed a radically different approach to musical form: process form (also called additive form or phase form), in which the formal structure emerges from the gradual, systematic application of a compositional process to a simple initial pattern. The form is not imposed from outside as a template; it is generated from within by the process itself.
- Phasing: two identical patterns begin in unison; one gradually accelerates, causing the patterns to drift out of synchrony, traversing a range of rhythmic relationships before (sometimes) returning to unison.
- Additive process: a short pattern is gradually lengthened by adding notes or beats, so that each repetition is slightly longer than the previous one.
- Substitution: notes within a pattern are gradually replaced by rests (or vice versa), creating a gradual change in the pattern's density.
Steve Reich’s Piano Phase (1967) is the canonical example of phase form: two pianos begin playing an identical twelve-note melodic pattern in unison; one gradually accelerates, causing the two parts to drift out of phase. As the phasing proceeds, new composite rhythmic patterns emerge from the interference between the two lines — patterns that are not “composed” by Reich but generated by the process. The form has a beginning (unison), a middle (the phasing sequence, traversing multiple phase relationships), and an end (another phase relationship, though not necessarily a return to unison). The aesthetic of gradual transformation — of form as process rather than architecture — represents a fundamental departure from the architectonic formal thinking of the Classical tradition.
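The phasing process can be approximated in a discrete model, which may help make the idea of "composite patterns generated by the process" concrete. This is a minimal sketch under stated assumptions: the twelve-element pattern is an arbitrary stand-in (not Reich's pitches), the function names are hypothetical, and the gradual acceleration between lock-in points is ignored, so only the moments when the faster part has gained a whole number of notes are modelled.

```python
def rotate(pattern, shift):
    """Return the pattern as heard when one part is `shift` notes ahead."""
    shift %= len(pattern)
    return pattern[shift:] + pattern[:shift]


def phase_states(pattern):
    """List the note-against-note pairings at every whole-note phase offset.

    Offset 0 is unison; offset len(pattern) is unison again.
    """
    return [list(zip(pattern, rotate(pattern, k)))
            for k in range(len(pattern) + 1)]


if __name__ == "__main__":
    # Arbitrary illustrative pattern of twelve events (an assumption, not the score).
    pattern = list("abcdefghijkl")
    for offset, pairs in enumerate(phase_states(pattern)):
        composite = " ".join(steady + moving for steady, moving in pairs)
        print(f"offset {offset:2d}: {composite}")
```

Each printed row is one composite pattern. None of the rows is written out by hand; all follow from the single rule "shift one part by one more note", which is the sense in which the form is generated rather than composed.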
Philip Glass’s Music in Fifths (1969) and Music with Changing Parts (1970) explore additive form through different processes: beginning with a short melodic-rhythmic cell and gradually extending it by adding notes at the beginning and end, or by varying the number of repetitions of each section. The form is literally visible in Glass’s notation: each section is a longer or different version of the previous one. The aesthetic of gradual transformation, of immersive stasis punctuated by incremental change, defines the Minimalist formal aesthetic.
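An additive process of the kind described here can be sketched as a rule that lengthens a cell one note at a time. The code below is an assumption-laden illustration, not a transcription of either Glass score: the cell, the added notes, the alternation between adding at the end and at the beginning, and the name `additive_process` are all hypothetical.

```python
def additive_process(cell, extra_notes):
    """Return the successive stages of a cell that grows by one note per stage.

    Notes from `extra_notes` are added alternately at the end and at the
    beginning of the current figure (an illustrative rule, chosen for clarity).
    """
    current = list(cell)
    stages = [list(current)]
    for index, note in enumerate(extra_notes):
        if index % 2 == 0:
            current = current + [note]   # extend at the end
        else:
            current = [note] + current   # extend at the beginning
        stages.append(list(current))
    return stages


if __name__ == "__main__":
    for stage in additive_process(["C", "D"], ["E", "B", "F"]):
        print(" ".join(stage))
```

Printed one stage per line, the output reproduces in miniature what the notation makes visible: every figure is the previous figure plus one increment.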
8.7 Open Form and Aleatory
The post-war avant-garde developed the concept of open form: a formal structure in which some or all of the parameters of the performance are left undetermined — left “open” to chance operations, to performer choice, or to environmental conditions. Open form represents the most radical possible departure from the deterministic formal structures of the Classical tradition.
John Cage’s Music of Changes (1951) is among the first fully realized aleatory works: composed using the I Ching (a Chinese divination text using coin-tossing to generate hexagrams), the work’s every parameter — pitch, duration, dynamics, silence — was determined by Cage’s interpretation of randomly generated I Ching hexagrams. The result has no “form” in the traditional sense: its succession of events follows no internal logic, no motivic development, no tonal argument. Yet the very absence of these traditional formal principles is itself a formal statement — a deliberate refusal of the European tradition of purposive, teleological formal organization, grounded in Cage’s Zen-influenced philosophy that music should present sounds “as themselves,” without the imposition of human will or expression.
Earle Brown’s December 1952 goes further still: a single page of graphic notation — black rectangles of varying sizes and positions on a white field — that performers may interpret in any orientation (the page may be held vertically or horizontally) and in any manner they choose. Brown called this approach “mobile form,” drawing an analogy with Alexander Calder’s sculptural mobiles, which take different configurations depending on air currents. The score imposes only the most general constraints (the rectangles suggest durations and/or pitches; their sizes suggest dynamics or prominence); the work’s form is entirely recreated at each performance.
8.8 Spectral Form in Grisey
The French spectral composers of the 1970s and 1980s — Gérard Grisey, Tristan Murail, Horatiu Radulescu — developed a formal approach derived from the acoustic analysis of sound spectra. Rather than organizing form through motivic development, tonal harmonic relations, or a compositional process, spectral composers organize time through the transformation of acoustic spectra: a work’s form is the trajectory of spectral change, from consonance to dissonance, from stasis to motion, from the “pure” sound of a simple harmonic series to the “noisy” sound of inharmonic spectra.
Grisey’s complete cycle Les espaces acoustiques (1974–1985) unfolds over six works of increasing ensemble size, from solo viola to large orchestra. The cycle’s overall formal logic is spectral: beginning from a single note and its natural overtones, the works progressively complexify, distort, and eventually resolve the spectral argument. The form of the entire cycle — spanning over seventy-five minutes of music — is a macro-level spectral trajectory, a form that has no precedent in Western music before the spectral school.
8.9 Network Form and Electronic Music
The rise of electroacoustic music and computer-assisted composition in the late twentieth century created new formal possibilities. Network form describes a formal structure in which the work consists of a network of possible paths through a set of musical modules, with the actual performance traversing a path determined by performer choice, algorithmic decision, or real-time audio analysis.
Stockhausen’s Klavierstück XI (1956) is an early prototype: the score is a single large sheet containing nineteen musical fragments. The performer begins anywhere on the sheet, plays a fragment, then proceeds to whatever fragment their eye falls on next. The work’s form is not fixed but probabilistic — defined by the network of possible orderings rather than by any single realization. Each performance is a unique traversal of the same network, and the “form” of the work is the totality of the possible paths through that network.
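The idea that the work's form is "the totality of the possible paths" can be illustrated by simulating one traversal of a fully connected network of nineteen fragments. This is a rough sketch only: the uniform random choice stands in for the performer's wandering eye, and the stopping rule used here (end when some fragment has been reached a third time) is a simplifying assumption supplied for the example, since the paragraph above does not specify how a performance ends. The name `traverse` and its parameters are hypothetical.

```python
import random


def traverse(num_fragments=19, stop_on_visit=3, seed=None):
    """Simulate one realization as a list of fragment indices.

    Each next fragment is chosen uniformly at random. The traversal stops as
    soon as any fragment has been reached `stop_on_visit` times (an assumed rule).
    """
    rng = random.Random(seed)
    visits = [0] * num_fragments
    path = []
    while True:
        fragment = rng.randrange(num_fragments)
        visits[fragment] += 1
        path.append(fragment)
        if visits[fragment] == stop_on_visit:
            return path


if __name__ == "__main__":
    for seed in range(3):
        path = traverse(seed=seed)
        print(f"realization {seed}: {len(path)} fragments -> {path}")
```

Different seeds give different paths of different lengths through the same nineteen nodes, which is the sense in which each performance is a unique traversal of a fixed network.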
In computer-assisted interactive works — such as George Lewis’s Voyager (1987), a composition for improviser and autonomous computer — the form is determined in real time by the interaction between human and machine, each responding to the other’s gestures. The “score” of such works is not a fixed sequence of events but an algorithm: a set of rules governing how the computer will respond to what the human plays. The form that emerges is genuinely unpredictable and unrepeatable — a form that could not have existed before digital computation made real-time musical intelligence possible.
8.10 Post-Minimalism and the Return of Repetitive Forms
Post-minimalism — the style of composers like John Adams, David Lang, Michael Gordon, and Julia Wolfe — represents a partial return to repetitive, tonally organized formal structures, inflected by the lessons of minimalism but embedded within a more expressive, emotionally varied musical language. Post-minimalist works typically employ clear phrase structure and regular meter (often borrowed from minimalism’s even, pulsed rhythmic surface) but combine these with harmonic progressions, dynamic contrasts, and emotional arcs that are far more varied and dramatically weighted than the gradual transformations of classic minimalism.
8.11 Conclusion: Form as Argument
These late-twentieth-century formal experiments remind us that “form” in music is not a neutral descriptive concept but a culturally situated one — grounded in specific assumptions about the nature of musical time, the roles of composer and performer, and the relationship between music and meaning. The analytical tools developed in this course — cadence types, phrase models (sentence, period, hybrid), small form categories (binary, ternary, rondo, variations), Caplin’s formal functions (beginning, middle, ending), Hepokoski-Darcy’s sonata-form zones (P, TR, S, C) and dialogic norms — were developed to analyze a specific repertoire and a specific set of compositional priorities. They remain indispensable for understanding the music of the Classical and Romantic periods. But the music of the twentieth and twenty-first centuries demands supplementary analytical frameworks, and the analytical spirit that animates the entire history of form analysis — the habit of asking what a passage does rather than merely what it contains — is as relevant to the analysis of Grisey’s spectral transformations or Reich’s phase processes as it is to the analysis of a Mozart period or a Beethoven sonata exposition.
Form is never merely a container. It is always an argument — a claim about how musical time should be organized, how tonal tension should be generated and resolved, how the relationship between familiar and unfamiliar, between expectation and surprise, should be calibrated. Every formal structure studied in this course represents a different answer to the same fundamental question: given that music unfolds in time, what is the best way to organize that unfolding so that the listener’s experience is shaped, directed, and ultimately fulfilled? The diversity of answers — from the period’s question-and-answer symmetry to the chaconne’s ostinato variations to the sonata’s tonal drama to the phase piece’s processual transformation — testifies to the extraordinary richness of the human musical imagination and to the inexhaustibility of the formal question itself.
The study of musical form is ultimately the study of musical meaning — not meaning in the narrow sense of programmatic content or extramusical reference, but meaning in the deeper sense of how music organizes its own internal relationships to create coherence, drama, contrast, and resolution. When Beethoven delays the recapitulation of the Eroica, when Mozart arranges his C-zone closing themes in a sequence of decreasing register and dynamic, when Bach organizes an entire Chaconne as a vast arch from D minor through D major and back — these decisions are not arbitrary preferences but formal arguments: claims about how this particular musical material should be organized to achieve the most powerful formal and expressive effect. The analyst who can read these arguments, who can hear the form as an active process rather than a static container, has acquired one of the most valuable skills in musical understanding.
Understanding formal processes also deepens the experience of listening. A first-time listener to a Beethoven sonata may enjoy the surface of the music — the drama, the beauty, the energy — without being able to articulate why one moment feels transitional and another feels conclusive, why the return of an opening theme provokes a different response the second time it is heard than the first. A listener trained in formal analysis can articulate these intuitions, can place specific moments within the formal logic of the whole, and can understand why the music unfolds as it does rather than merely accepting that it does. This articulate understanding does not diminish the music’s emotional power; on the contrary, it deepens it, by revealing the extraordinary craft and formal intelligence that underlies the most compelling moments of musical experience.
End of MUSIC 373 notes. The analytical frameworks developed here — phrase structure, cadence types, formal functions, sonata theory — provide the tools for engaging with the full range of Western music from the Baroque through the contemporary. Students wishing to extend these analyses should consult the primary sources listed above, particularly Caplin’s chapter-by-chapter analyses of specific works, the Hepokoski-Darcy analyses in their full appendices, and Rosen’s detailed readings of individual movements in The Classical Style.
Key Analytical Vocabulary Summary
The following core terms from this course’s analytical frameworks should be firmly in the analyst’s toolkit:
From Caplin’s Classical Form: phrase, sentence (basic idea, presentation, continuation, cadential), period (antecedent, consequent, HC, PAC), hybrid theme types (Hybrid 1–4, compound basic idea), phrase extension (prefix, suffix, internal expansion), elision, evaded cadence, tight-knit vs. loose organization, formal functions (beginning, middle, ending), formal hierarchy (phrase–theme–section–movement), the five formal types (small binary, small ternary, sentence, period, hybrid).
From Hepokoski and Darcy’s Elements of Sonata Theory: P-zone, TR (transition), medial caesura (MC), S-zone, EEC (essential expositional closure), C-zone, codetta, development, retransition, recapitulation, ESC (essential structural closure), coda, the five sonata types (Type 1–5), dialogic form, normative script, deformation.
From Rosen’s Sonata Forms and The Classical Style: tonal polarity (tonic vs. dominant), exposition as tonal argument, development as harmonic consequence, recapitulation as resolution, the expressive significance of formal proportions, the relationship between form and style in the Classical period.
These frameworks are complementary, not competing: Caplin addresses phrase-level formal organization from the inside out; Hepokoski and Darcy address movement-level formal organization from the outside in (normative expectations and dialogic deviations); Rosen situates both within the expressive and stylistic conventions of the Classical period. The most complete formal analysis draws on all three, recognizing that musical form is simultaneously a phrase-level phenomenon, a movement-level phenomenon, and a stylistically and historically conditioned phenomenon.
In practice, a well-executed written analysis typically proceeds in three stages. First, the analyst identifies the movement’s large-scale formal type (sonata, rondo, variation, binary, ternary) and establishes its principal divisions with measure numbers and key areas. Second, the analyst examines each major formal section internally using Caplin’s formal-function vocabulary: identifying phrase types (sentence, period, hybrid), cadential types (PAC, IAC, HC, evaded), and the tight-knit/loose character of each sub-section. Third, the analyst evaluates the movement against the appropriate normative script — the Hepokoski-Darcy sonata-type template, or the rondo or variation paradigm — noting where the specific work confirms the norm, where it extends or embellishes it, and where it departs from it in ways that constitute compositionally meaningful “deformations.” The most illuminating analytical claims are those that connect formal observations to expressive ones: not just identifying that the EEC arrives later than expected, but explaining what the delay accomplishes — what effect the extended S-zone has on the listener’s sense of arrival and what that effect contributes to the movement’s expressive trajectory. This integration of structural description and expressive interpretation is the hallmark of genuinely analytical music criticism.
The same method can also be worked from the bottom up, again in three stages that integrate all three approaches. First, at the local level: identify phrase types (sentence, period, hybrid), locate cadences and assess their strength (PAC, HC, IAC, DC, evaded), and trace phrase extensions and elisions. Second, at the section level: identify formal sections (exposition, development, recapitulation; or A, B, A’ in binary/ternary), locate the structurally crucial cadential moments (EEC in the exposition, ESC in the recapitulation), and assess the tonal organization of each section and the movement between them. Third, at the movement level: characterize the movement’s formal dialogue with the normative script: what does it do that is expected, what does it do that is unexpected, and how do the unexpected choices contribute to the movement’s specific expressive argument? Only when all three levels of analysis are complete does the form reveal its full depth and complexity, and it is at that point that the music reveals, in the most precise terms available, exactly what it is arguing and how it is making its case.
Mastery of these analytical tools is not the end of musical understanding but its beginning. The frameworks of Caplin, Hepokoski-Darcy, and Rosen provide a shared critical language through which analysts can communicate precisely about formal phenomena that would otherwise be difficult to describe. But formal analysis in the deepest sense is always in the service of musical understanding: the goal is not to fill in a formal diagram correctly but to arrive at a richer, more articulate, and more fully realized experience of the music itself.