MUSIC 676: Ethnomusicological Methods and Fieldwork

These notes draw on Gregory Barz and Timothy Cooley (eds.) Shadows in the Field: New Perspectives for Fieldwork in Ethnomusicology (2nd ed., 2008), Helen Myers (ed.) Ethnomusicology: An Introduction (1992), Bruno Nettl’s The Study of Ethnomusicology: Thirty-One Issues and Concepts (2005), Jeff Todd Titon (ed.) Worlds of Music: An Introduction to the Music of the World’s Peoples (6th ed., 2017), Alan Merriam’s The Anthropology of Music (1964), and supplementary materials from Columbia University GR8412 and UCLA ethnomusicology doctoral seminars.


Chapter 1: Defining Ethnomusicology

1.1 Music as Culture, Not Autonomous Object

Ethnomusicology begins with a deceptively simple but philosophically radical proposition: music is not an autonomous aesthetic object but a human practice embedded in social life, cultural meaning, and historical circumstance. To study music ethnomusicologically is therefore not only to analyze its pitches, rhythms, and formal structures, but to ask who makes it, when and where and why, what participants say it means, how it is taught and learned, how it is bought and sold, how it encodes and reinforces social distinctions, and what it feels like from the inside — from the perspective of those whose music it is.

This orientation distinguishes ethnomusicology sharply from the mainstream of Western academic musicology as it developed through the nineteenth and early twentieth centuries. Classical musicology, at least in its dominant positivist form, treated the musical work — typically a written score, typically by a named European composer — as its primary object. The analyst’s task was to describe the internal logic of that object: its thematic architecture, harmonic language, counterpoint, large-scale formal organization. The social conditions of the work’s production, circulation, and reception were, in this framework, interesting biographical background at best, and at worst a distraction from the music itself.

Ethnomusicology refuses this abstraction. It insists that music cannot be separated from the human beings who make it and the social worlds in which it lives. This does not mean that ethnomusicologists are uninterested in musical sound, or that they regard formal and structural analysis as irrelevant. Rather, they insist that structural analysis must be integrated with social and cultural analysis: a melody is not only a sequence of intervals but also a vehicle for particular kinds of emotional expression, a marker of ethnic or religious identity, a pedagogical tool, an economic commodity, and perhaps a legally contested form of intellectual property, all at once.

Ethnomusicology is the scholarly study of music as a form of human culture and social practice. It combines musical analysis with ethnographic fieldwork, drawing on methods and theories from cultural anthropology, linguistics, sociology, and music theory. The field's defining commitments are: (1) taking music of all cultures as equally worthy of scholarly attention; (2) studying music in its living social and cultural context, not only as a text or score; and (3) generating understanding through sustained firsthand engagement with musical communities.

1.2 Origins: Comparative Musicology

The discipline’s institutional prehistory lies in what German-speaking scholars called Vergleichende Musikwissenschaft — comparative musicology — a project that flourished in Berlin, Vienna, and Leipzig from roughly the 1880s through the 1930s. Its intellectual roots were simultaneously in natural science, evolutionary anthropology, and colonial administration. The new availability of the phonograph cylinder from the 1890s onward made it possible, for the first time, to preserve and study musical performances from distant cultures in the library and the laboratory rather than only in the field.

Hermann von Helmholtz had already demonstrated in On the Sensations of Tone (1863) that musical acoustics could be treated as a branch of physics, and that the scales and intervals of European music were not natural facts but culturally contingent solutions to a universal problem of organizing the harmonic series. Alexander J. Ellis, the English phonetician and translator of Helmholtz, pushed this insight further in his landmark 1885 paper “On the Musical Scales of Various Nations,” in which he systematically measured the pitches of non-European instruments using a device called a monochord. To report the results he used a unit he invented, the cent: one hundredth of an equal-tempered semitone, so that an octave spans 1,200 cents. Because the cent is a logarithmic measure of frequency ratio, it allowed scalar structures from different cultures to be compared precisely without presupposing the primacy of any one system.
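
The cent can be made concrete with a short calculation. The sketch below assumes the standard definition (1,200 cents to the octave, so an interval's size is 1200 times the base-2 logarithm of its frequency ratio). The function name and the sample frequencies are illustrative only and are not drawn from Ellis's own measurements.

```python
import math


def interval_in_cents(f_low: float, f_high: float) -> float:
    """Return the size of the interval between two frequencies, in cents.

    Assumes the standard definition: 1200 cents per octave, i.e.
    cents = 1200 * log2(f_high / f_low).
    """
    if f_low <= 0 or f_high <= 0:
        raise ValueError("frequencies must be positive")
    return 1200.0 * math.log2(f_high / f_low)


# An octave (frequency ratio 2:1) is 1200 cents by definition.
octave = interval_in_cents(220.0, 440.0)

# One equal-tempered semitone (ratio 2**(1/12)) is 100 cents.
semitone = interval_in_cents(440.0, 440.0 * 2 ** (1 / 12))

# A just perfect fifth (ratio 3:2) comes out slightly wider than the
# 700-cent equal-tempered fifth.
just_fifth = interval_in_cents(200.0, 300.0)

print(round(octave, 3), round(semitone, 3), round(just_fifth, 3))
```

Because the measure depends only on the ratio of the two frequencies, the same function can describe a step in any tuning system without first mapping it onto European note names, which is the property that made the unit useful for cross-cultural comparison.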

At the Berlin Phonogram Archive, founded in 1900, Carl Stumpf and his student Erich Moritz von Hornbostel built what was for a time the world’s largest collection of cylinder recordings of non-Western music. Hornbostel, working with the organologist Curt Sachs, developed a systematic classification of musical instruments that became the foundation of comparative organology. The Hornbostel-Sachs classification system, first published in 1914, divided all musical instruments into four primary categories — idiophones, membranophones, chordophones, and aerophones — according to the physical mechanism that produces their sound, with electrophones added later.

The Hornbostel-Sachs system was designed partly as a tool for diffusion studies: if one could establish that a particular instrument type (say, a spike fiddle with a resonating membrane) appeared in both South Asia and Southeast Asia, one could begin to reconstruct the historical routes by which musical technologies traveled. The system remains in use today in museums, archives, and catalogues, though its limitations — particularly its difficulty with electronic instruments and with instruments that produce sound through multiple mechanisms — are well recognized.

Diffusionism — the theory that cultural traits, including musical ones, spread outward from points of origin through contact, trade, migration, and conquest — animated much of the comparative musicologists’ work. Their ambition was nothing less than a global atlas of musical cultures that would allow scholars to trace the prehistoric migrations of human populations by following the distribution of instrument types, scale systems, and melodic patterns.

1.3 The Shift to Ethnomusicology

The term ethnomusicology was coined by Jaap Kunst, the Dutch scholar of Indonesian music, who used it (hyphenated, as “ethno-musicology”) in the subtitle of his 1950 booklet Musicologica. He proposed it as a replacement for “comparative musicology,” a name he felt was misleading because the discipline’s primary aim was not comparison but ethnographic description and analysis. The new term was picked up and popularized in the United States, above all by Mantle Hood at UCLA, who in 1960 founded the first dedicated graduate program in ethnomusicology in North America.

Hood’s conception of the discipline was distinguished from its comparative predecessor in at least two fundamental ways. First, it insisted on fieldwork as the methodological foundation: where the comparative musicologists had largely worked in archives and laboratories, analyzing recordings sent back by colonial administrators and missionaries, the ethnomusicologist was expected to go to the community, live among its members, and learn its music firsthand. Second, Hood introduced the concept of bi-musicality — the idea that the ethnomusicologist should strive to become a competent performer in the music she studies, not merely a detached analyst of recordings.

1.4 Merriam’s Tripartite Model

The most influential single theoretical framework in the discipline’s first generation was proposed by the ethnomusicologist and anthropologist Alan Merriam in his 1964 book The Anthropology of Music. Merriam argued that music, as an object of study, must be understood simultaneously at three levels.

Merriam's tripartite model analyzes music at three interlocking levels:
  • Music as concept: the ideas, beliefs, and values that a community holds about music — what music is for, who should make it, when and where it is appropriate, what distinguishes good music from bad, how it relates to the sacred or the political.
  • Music as behavior: the social activities surrounding music-making — rehearsal, performance, instruction, ceremony, commerce, and the social roles that musicians and listeners play within the community.
  • Music as sound: the acoustic and structural properties of the music itself — melody, rhythm, timbre, texture, form.
Merriam's key claim was that all three levels must be studied together, and that any analysis that focuses on only one will be distorted.

Merriam’s model was immensely generative, in part because it was clear enough to function as a template for fieldwork planning. A researcher arriving in a new community could organize her initial questions around all three levels: what do people say about music? what do they do in musical contexts? what does the music sound like? But critics, particularly Timothy Rice in a 1987 article that became a touchstone of the field, argued that Merriam’s framework was too static, that it treated music as a social fact rather than a dynamic process, and that it marginalized the individual — the creative, choosing, historically situated human being who actually makes music.

1.5 Seeger, Blacking, and the Emic/Etic Problem

Charles Seeger — father of the folk singer Pete Seeger, and himself one of the founding figures of American musicology — drew a sharp distinction between what he called musicology and ethnomusicology. Musicology, in his usage, was the study of “music about music”: the analytical and historical disciplines that treat music as an internally organized system. Ethnomusicology was the study of “music in culture”: the inquiry into music as a dimension of social life, irreducible to its internal logic.

John Blacking, the British ethnomusicologist whose work in South Africa on Venda children’s songs and adult ceremonial music became a model for the field, offered in How Musical Is Man? (1973) a definition that has become almost a disciplinary motto: music is humanly organized sound. The definition is deliberately inclusive: it encompasses the simplest lullaby and the most complex symphonic score, jazz improvisation and ritual drumming, and it places the emphasis on human agency and cultural organization rather than on any particular acoustic property. Blacking also made the radical claim that all human beings are musical, that musicality is not a rare talent but a universal human capacity, and that cross-cultural comparison can reveal this universality beneath the surface diversity of musical traditions.

The emic/etic distinction, borrowed by ethnomusicologists from the linguist Kenneth Pike, contrasts two analytical perspectives:
  • An emic analysis describes music in terms meaningful to the culture's own members — using their categories, their vocabulary, their evaluative criteria.
  • An etic analysis describes music in terms imposed from outside — using the analyst's own categories, typically drawn from Western music theory or social science.
Both perspectives have value, but the tension between them is never fully resolvable: an outsider analyst can never achieve a purely emic understanding, because she cannot entirely shed her own cultural formation. Reflexive ethnomusicology takes this impossibility as a methodological premise rather than a problem to be solved.

Chapter 2: Theoretical Frameworks

2.1 Functionalism and Its Discontents

The earliest systematic theoretical framework available to ethnomusicologists from anthropology was functionalism: the idea that cultural institutions, practices, and beliefs persist because they serve the functional needs of societies. In ethnomusicology, Merriam’s 1964 account of the social functions of music remains the most cited example. He identified ten functions: emotional expression, aesthetic enjoyment, entertainment, communication, symbolic representation, physical response, enforcing conformity to social norms, validation of social institutions and religious rituals, contribution to the continuity and stability of culture, and contribution to the integration of society.

Example — Functional analysis of a Zulu ceremonial song cycle: A functionalist analysis might observe that the songs performed at a Zulu initiation ceremony serve several of Merriam's ten functions simultaneously: they enforce conformity to social norms (by dramatizing the expected behavior of an initiated adult), validate religious ritual (by invoking ancestral protection), and contribute to the integration of society (by gathering dispersed kin in collective performance). The functionalist account illuminates why the community performs these songs and what would be lost if they disappeared — a practically important insight for cultural preservation work.

The critique of functionalism in ethnomusicology parallels the broader critique in anthropology. If music is analyzed only in terms of its social functions, two things are lost: the aesthetic dimension — the fact that music is valued partly for its intrinsic sonic properties, not only its social utility — and the possibility of dysfunction, conflict, and change. A musical tradition that is maintained by some members of a community against the resistance of others, or one that is undergoing rapid transformation under commercial pressure, cannot be adequately described in functionalist terms alone.

2.2 Cultural Relativism

Cultural relativism, as a methodological principle, holds that every cultural practice must be understood on its own terms, in its own context, before it can be evaluated or compared. In ethnomusicology, this means that every musical system — however alien to the analyst’s ear — must be treated as internally coherent and locally meaningful before any comparative claims are made. The principle has Boasian roots: Franz Boas, the German-American anthropologist who shaped American cultural anthropology in the early twentieth century, insisted against the evolutionary hierarchies of his time that no culture was simply more advanced than another, and that the anthropologist’s first obligation was sympathetic description rather than evaluative ranking.

Cultural relativism as a methodological stance does not commit the researcher to moral relativism — the view that no cross-cultural moral judgments are possible. It is a disciplinary norm about how to begin analysis, not a metaphysical claim about value. Nevertheless, it has generated ongoing debates in ethnomusicology about whether the researcher is obligated to maintain neutrality toward musical practices she finds ethically problematic.

2.3 Diffusionism and Organology

Diffusionism as a theory of musical culture holds that musical traits — instruments, melodic formulas, rhythmic patterns, tuning systems — spread geographically through contact between populations. The ambition of the early comparative musicologists was to use the geographic distribution of instrument types to reconstruct prehistoric population movements. While grand diffusionist narratives have largely been abandoned — they tended toward speculative just-so stories about “cradles of civilization” — diffusionist methods remain valuable in more modest forms: tracing the spread of the lute family from the Middle East through Europe, documenting how the banjo traveled from West Africa to the American Southeast, or mapping the distribution of pentatonic scales.

The Hornbostel-Sachs system remains the standard organological tool for these kinds of studies. Its decimal classification (chordophones subdivide into simple chordophones, composite chordophones, and so on, each with its own numerical designation) allows instruments from different traditions to be catalogued and compared in a way that does not presuppose any single culture’s own instrument taxonomy.
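
A small sketch may help show how the decimal notation supports comparison. It assumes the conventional assignment of leading digits (1 idiophones, 2 membranophones, 3 chordophones, 4 aerophones, 5 electrophones) and uses two commonly cited codes, 321.322 (necked box lutes such as the violin or guitar) and 321.321 (necked bowl lutes), purely as illustrations. The function names are hypothetical, and the parsing ignores the suffixes that some published versions of the scheme append.

```python
# Top-level Hornbostel-Sachs classes, keyed by the leading digit of a code.
HS_TOP_LEVEL = {
    "1": "idiophones",
    "2": "membranophones",
    "3": "chordophones",
    "4": "aerophones",
    "5": "electrophones",  # added after the 1914 publication
}


def hs_digits(code: str) -> str:
    """Strip separators, so '321.322' becomes '321322'."""
    digits = "".join(ch for ch in code if ch.isdigit())
    if not digits or digits[0] not in HS_TOP_LEVEL:
        raise ValueError(f"not a recognizable Hornbostel-Sachs code: {code!r}")
    return digits


def top_level_class(code: str) -> str:
    """Return the primary category named by the first digit."""
    return HS_TOP_LEVEL[hs_digits(code)[0]]


def shared_depth(code_a: str, code_b: str) -> int:
    """Count how many leading digits two codes have in common.

    A larger number means the two instruments sit closer together
    in the classification tree.
    """
    depth = 0
    for x, y in zip(hs_digits(code_a), hs_digits(code_b)):
        if x != y:
            break
        depth += 1
    return depth


print(top_level_class("321.322"))          # a chordophone
print(shared_depth("321.322", "321.321"))  # differ only at the last digit
print(shared_depth("321.322", "4"))        # no shared branch at all
```

Each additional digit narrows the category, so comparing prefixes is enough to say how closely two instruments from unrelated traditions are grouped, without reference to either tradition's own instrument names.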

2.4 Nettl’s Pluralism and Rice’s Revised Model

Bruno Nettl, whose The Study of Ethnomusicology: Thirty-One Issues and Concepts is the closest thing the field has to a canonical methodological textbook, argues for a theoretically pluralist ethnomusicology. No single theory, he insists, is adequate to the full range of musical phenomena the discipline encounters; the researcher must be conversant with multiple frameworks — functionalist, semiotic, cognitive, political-economic — and must be willing to select and combine them based on the specific questions being asked about a specific musical tradition.

Timothy Rice’s 1987 article “Toward the Remodeling of Ethnomusicology” proposed replacing Merriam’s tripartite model with a framework that locates music at the intersection of three formative processes:
  • Historically constructed: music exists in a tradition that has been shaped by past events, migrations, encounters, and choices — the researcher must understand this historical sediment.
  • Socially maintained: music is kept alive by the ongoing practices, institutions, and social relationships of a community — the researcher must observe these in the present.
  • Individually created and experienced: within the historical and social constraints, individual musicians make creative choices and individual listeners have private experiences — the researcher must attend to this personal dimension.

2.5 Semiotics, Marxism, and Post-Structuralism

Thomas Turino’s application of the American philosopher Charles S. Peirce’s semiotic framework to ethnomusicology, developed most fully in Music as Social Life (2008), offers a particularly productive theoretical tool. Turino distinguishes between different types of musical signs — icons (which resemble their objects), indices (which point to their objects through physical or experiential connection), and symbols (arbitrary conventions). He argues that music is especially powerful as an indexical sign: particular instruments, rhythms, melodies, or timbres become directly connected in memory and experience to the social occasions on which they are heard, and thereby acquire the capacity to evoke those occasions, identities, and communities whenever they are performed.

Turino also distinguishes between participatory and presentational musical performance. The former is characterized by the integration of performers and listeners in a common activity, the latter by a clear separation between performers on a stage and an audience. This distinction carries important consequences for how music functions as social glue: on Turino’s account, participatory performance tends to build stronger community bonds than presentational performance, because it puts all participants in the same active role.

Turino's distinction between participatory and presentational performance has been influential in applied ethnomusicology — particularly in debates about community music, music therapy, and the design of musical education programs. It raises the question of whether a concert tradition that seats the audience in rows facing a stage can produce the same kind of social bonding as a circle drum session, and suggests that the architectural and social organization of musical events is never neutral.

Marxist and political economy approaches to ethnomusicology examine music as a commodity produced and consumed within capitalist structures. The rise of the “world music” industry in the 1980s and 1990s — in which major record labels marketed music from Africa, Latin America, and South Asia to Western consumers — provided an especially rich object of analysis: who profits from the sale of Malian kora music on a CD distributed by a London label? What happens to the musicians, and to the local musical economy, when their music is extracted and repackaged for export? Post-structuralist and feminist approaches ask how power relations shape the archive of ethnomusicological knowledge itself: whose music gets recorded, by whom, archived where, analyzed how, and published for what readership?


Chapter 3: Ethnographic Fieldwork

3.1 What Fieldwork Means

Fieldwork in ethnomusicology is not a data-collection technique but a mode of inquiry — a way of being in relation to a musical community over time. It is defined by sustained presence, reciprocal relationship, and the willingness to be changed by what one encounters. The contrast with laboratory or archive-based research is fundamental: the ethnomusicologist cannot control her research environment, cannot replicate her encounters, and cannot maintain the distance that experimental design presupposes.

The conception of fieldwork that has governed ethnomusicology since the 1960s draws heavily on cultural anthropology’s development of participant observation. The method is associated above all with Bronisław Malinowski, whose extended residence among the Trobriand Islanders of Papua New Guinea between 1915 and 1918 produced a model of intensive, long-term fieldwork that transformed the discipline.

3.2 Participant Observation and Bi-musicality

Participant observation involves being present in a community as a participant — taking part in the activities one is studying — while simultaneously maintaining an analytical observer’s perspective. The tension between these two roles is constitutive of the method, and is never fully resolved. As a participant, the researcher risks losing the analytical distance that allows her to notice what community members take for granted. As an observer, she risks the superficiality of a tourist — present but not really involved, unable to understand what she sees because she has not experienced it from the inside.

Mantle Hood’s concept of bi-musicality addresses this tension by insisting that genuine musical participation is analytically essential, not merely a courtesy or a diplomatic gesture. Learning to play the Javanese gamelan, or the Indian sitar, or the Zimbabwean mbira — not to virtuoso performance level, but to a level of competent participation — gives the researcher access to the music’s logic from the inside: the physical sensations of producing the sounds, the cognitive challenges of learning the repertoire, the social experience of playing with others, the embarrassments of making mistakes, and the pleasures of getting something right.

Example — Bi-musicality in gamelan study: Mantle Hood's own training in Javanese gamelan at the court of Yogyakarta allowed him to observe, from personal experience, that the instruments of the gamelan orchestra are not simply interchangeable but occupy hierarchically ordered positions — that learning to play a given instrument is also learning one's place within a complex social and musical structure. This experiential knowledge could not easily have been obtained by listening alone, or by interviewing musicians who found the hierarchy too obvious to articulate.

3.3 Preparing for Fieldwork

Preparation for ethnomusicological fieldwork involves several parallel streams of work that ideally begin one to two years before departure. Language learning is non-negotiable for serious fieldwork: even a modest command of the community’s language transforms the researcher’s access to informal conversation, overheard commentary, and song texts. Literature review should cover not only the ethnomusicological literature on the relevant musical tradition but also the historical, anthropological, and area studies scholarship on the region and community.

Institutional requirements have grown substantially in the past two decades. At U.S. universities, research involving human subjects requires review by an Institutional Review Board (IRB); Canadian universities route the equivalent review through a Research Ethics Board (REB). In ethnomusicology, most fieldwork (observing and recording musical performances) falls into the “minimal risk” category, but any research that involves interviewing about sensitive topics, that works with vulnerable populations (children, prisoners, people with disabilities), or that involves a significant deception component typically requires full board review. Beyond ethics review, the researcher must obtain appropriate visas, and in many countries must also obtain permission from a national research council and/or a local institutional host.

Identifying communities and initial contacts is often facilitated by the diaspora networks available in university cities: students, visiting musicians, community organizations, and religious institutions associated with the target culture. Arriving in the field with some existing relationships is enormously valuable; arriving as a complete stranger with no introduction is both harder practically and ethically more fraught.

Example — Preparing fieldwork on Carnatic music: A researcher planning a year of fieldwork on the performance culture of Carnatic (South Indian classical) music in Chennai would ideally begin with two years of preliminary preparation: learning Tamil to conversational proficiency, undertaking private lessons in a Carnatic instrument (most commonly vocal, veena, violin, or mridangam), reading the existing ethnomusicological literature (including work by Matthew Allen, Amanda Weidman, and T. Viswanathan), obtaining the relevant ICCR (Indian Council for Cultural Relations) clearance, identifying a host institution in Chennai, and establishing contact with musicians and music organizations through Tamil cultural organizations in the researcher's home city. The researcher who arrives in Chennai with even a rudimentary ability to sing swaras (solfège syllables) and discuss raga characteristics will be received very differently from one who arrives with no musical engagement whatsoever.

3.4 Field Notes and Research Journals

The most fundamental documentary practice of ethnographic fieldwork is the field note: a written record, made during or immediately after an observation or encounter, describing what the researcher saw, heard, and experienced. Field notes are both more and less than recordings: more, because they capture the researcher’s interpretive responses and contextual observations that a microphone cannot; less, because they are inevitably selective, compressed, and filtered through the researcher’s categories and expectations.

The practice of field notes has been extensively theorized since Roger Sanjek’s edited collection Fieldnotes: The Makings of Anthropology (1990). Ethnomusicologists typically distinguish between two complementary forms of written documentation. Descriptive field notes aim at factual accuracy: who was present, what was performed, how long it lasted, what instruments were used, what was said during breaks, who spoke to whom. Research journals or personal diaries record the researcher’s own emotional responses, analytical confusions, interpersonal difficulties, and emerging theoretical ideas — the subjective and reflexive dimension of the fieldwork experience.

The relationship between field notes taken in the moment and expanded notes written up later the same day is a recurrent practical dilemma. Notes taken during a performance are inevitably fragmentary — the researcher cannot write and listen attentively at the same time. Notes written from memory several hours later are more complete but vulnerable to reconstruction effects: the mind fills in gaps with plausible-sounding details that may not be accurate. Many experienced fieldworkers recommend a combination: cryptic jottings during the event, expanded write-up within two hours, cross-referenced with any recordings made.

3.5 Duration, Depth, and the Seasonal Cycle

The question of how long fieldwork must last is both practical and epistemological. Short-term fieldwork — a few weeks or months — can yield useful data on the surface features of a musical tradition: instrument types, repertoire, performance contexts, basic social organization. But it cannot access the seasonal rhythms of ceremonial life, the variation in musical practice between wet and dry season, harvest and planting, or the multi-year cycles of initiation, funerary commemoration, and communal celebration that structure the musical calendar in many societies.

The classic model of long-term fieldwork — one to two years or more of continuous residence, with return visits in subsequent years — remains the gold standard for doctoral dissertation fieldwork in ethnomusicology, even though funding realities and changing professional norms have made shorter fieldwork increasingly common at the master’s level and in applied ethnomusicology. The intellectual case for extended residence is straightforward: musical communities are not static, and the changes one observes over time — new instruments adopted, old ceremonies abandoned, young musicians experimenting with fusion — are as ethnomusicologically interesting as the stable features of a tradition.

3.6 Gaining Access and Negotiating Trust

The fieldwork literature consistently emphasizes that gaining genuine access to a community’s musical life is a gradual process that cannot be forced or shortcut. The first few weeks in the field are typically a period of uncertainty: the researcher does not yet know who the important musicians are, what the significant performance occasions will be, or how her presence is being perceived by the community. She attends public events, introduces herself cautiously, listens more than she speaks, and tries to make herself useful in whatever ways are appropriate to the context.

A common practical challenge is the gatekeeper — the person who controls access to a community’s inner life, and whose disposition toward the researcher will decisively shape what she can observe and with whom she can speak. Gatekeepers may be community leaders, religious authorities, senior musicians, or cultural brokers who are accustomed to dealing with outsiders. The researcher’s relationship with a gatekeeper is necessarily somewhat instrumental — she needs access, and the gatekeeper controls it — but it must also be genuinely respectful. The gatekeeper who is treated purely as a means to access will typically withhold exactly what the researcher most needs to find.

Experienced fieldworkers note the paradox that the most accessible and articulate initial contacts in the field are often not the most representative or knowledgeable members of the community. The person who is comfortable speaking to foreigners, who has a repertoire of explanations designed for outsiders, and who is eager to be helpful may be culturally marginal — someone whose relative distance from the community's inner life makes him confident and available. The researcher must balance the practical value of early access against the analytical risk of allowing the most accessible voice to become the most prominent voice in her analysis.

Chapter 4: Recording and Documentation

4.1 Audio Recording in the Field

Audio recording is the primary technical tool of ethnomusicological fieldwork, and the choices made about recording equipment have direct consequences for the quality and usability of the resulting archive. The contemporary fieldworker has access to compact, high-quality digital recorders (the Zoom H5 and H6, the Tascam DR-40X, the Sound Devices MixPre series) that were not available to previous generations; the practical barriers to high-quality recording have been dramatically lowered. But the conceptual challenges remain unchanged: the researcher must decide where to place her microphones, what stereo configuration to use, how to manage the inevitable background noise of the field environment, and how to balance the goal of documenting what happens with the goal of capturing high-quality audio.

The choice of microphone configuration depends on the acoustic situation. Mid-side stereo (one cardioid microphone pointing forward and one bidirectional microphone pointing sideways, combined in a matrix that allows the stereo width to be adjusted in post-production) is flexible and useful in unpredictable acoustic environments. XY stereo (two cardioid microphones angled at 90–135 degrees) provides accurate stereo imaging and is particularly effective in smaller, controlled spaces. Mono recording with a high-quality condenser microphone close to the sound source is often the most reliable choice for documenting solo instruments and interviews in noisy environments.
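
The mid-side "matrix" mentioned above is, in its usual form, a simple sum and difference of the two microphone signals. The sketch below assumes that common decoding (left = mid + width × side, right = mid − width × side) and works on plain lists of sample values. The function name and the `width` parameter are illustrative and do not describe the controls of any particular recorder or editing program.

```python
def decode_mid_side(mid, side, width=1.0):
    """Convert mid/side samples into left/right stereo samples.

    mid   -- samples from the forward-facing (cardioid) microphone
    side  -- samples from the sideways-facing (bidirectional) microphone
    width -- 0.0 collapses the image to mono; larger values widen it
    """
    if len(mid) != len(side):
        raise ValueError("mid and side must have the same number of samples")
    left = [m + width * s for m, s in zip(mid, side)]
    right = [m - width * s for m, s in zip(mid, side)]
    return left, right


# Made-up sample values, chosen only to keep the arithmetic easy to follow.
mid_samples = [0.5, 0.25, -0.5]
side_samples = [0.25, -0.25, 0.0]

left, right = decode_mid_side(mid_samples, side_samples, width=1.0)
mono_left, mono_right = decode_mid_side(mid_samples, side_samples, width=0.0)

print(left, right)
print(mono_left == mono_right)  # with width 0 both channels equal the mid signal
```

Because the width is applied at decode time rather than fixed by microphone placement, the fieldworker can keep the raw mid and side tracks and decide later how wide the stereo image should be, which is the flexibility that makes the technique attractive in unpredictable acoustic settings.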

The researcher should not mistake technical quality for ethnographic adequacy. A recording that captures perfect studio-quality audio of a ritual song may be ethnographically impoverished if it was made by moving participants into an acoustically controlled room, or by positioning the microphone so close to one instrument that the communal texture of the ensemble is destroyed. The recording should document the musical event as it actually occurs, even if this means accepting some acoustic compromise.

4.2 Video Recording and Its Limits

Video recording has become an essential tool of ethnomusicological documentation for any music that has a significant visual dimension: dance, instrumental technique, the choreography of ensemble performance, audience interaction, the spatial organization of ceremonies. The visual record captures dimensions of musical practice that audio alone cannot — the physical technique of the mbira player’s thumbs, the gestural communication between ensemble members, the way participants’ bodies respond to rhythmic patterns.

At the same time, video recording is in many contexts more intrusive than audio recording. A small audio recorder can often be placed inconspicuously and forgotten; a video camera requires direction, focus, and framing decisions that make the researcher’s presence more visible. Many communities permit audio recording of events that they restrict for video, and some restrict all recording. The researcher must be attentive to these distinctions and must seek specific permission for video rather than assuming that consent to audio recording extends automatically.

4.3 Permissions, Protocols, and Cultural Restrictions

Recording permissions in ethnomusicological fieldwork are governed by a combination of ethical principles and practical negotiation. The general principle is that the researcher must obtain informed consent from the appropriate parties before recording. But “the appropriate parties” is not always obvious: in some communities, an individual performer can consent to her own recording; in others, a ceremony’s owner, a religious authority, or a village chief must give permission that covers all participants.

Many communities maintain explicit prohibitions on recording certain categories of music or performance. Sacred music intended only for initiates may be restricted. Music performed at funerals may be considered inappropriate for archival preservation. Music associated with particular social roles — women’s working songs, men’s hunting songs — may be restricted to performers of the relevant gender. The ethnomusicologist’s obligation is to respect these prohibitions regardless of whether they could practically be circumvented.

Cultural protocols around recording encompass the community-specific norms, beliefs, and legal frameworks that govern who may record what, under what conditions, and for what purposes. They may be based on beliefs about the spiritual properties of sound (some communities believe that a recording captures and imprisons the performer's spirit), on social norms about secrecy and initiation, or on practical concerns about commercial exploitation. The researcher's obligation to understand and follow these protocols precedes any other methodological consideration.

4.4 Annotation, Archiving, and Repatriation

A recording without annotation is of almost no ethnomusicological use: an unidentified audio file on a hard drive cannot tell future researchers who performed, what was performed, when, where, under what social circumstances, in what language, or for what audience. Annotation should be created at the time of recording, either through spoken logs captured on a parallel track or through written notes linked to each recording file.

Archiving field recordings according to professional standards is both an ethical obligation to future researchers and a practical necessity for long-term preservation. Metadata standards such as Dublin Core — a set of fifteen basic descriptive elements (title, creator, date, format, rights, and so on) — provide a common framework for describing recordings in machine-readable ways that allow them to be found and used across different archival systems. The Library of Congress American Folklife Center, the British Library Sound Archive, and the Smithsonian Center for Folklife and Cultural Heritage are among the leading institutions for ethnomusicological archiving.
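As an illustration of machine-readable description, the sketch below builds a small annotation record keyed by the fifteen Dublin Core element names and serializes it as JSON, for example to store alongside an audio file. The helper make_record and every field value are invented placeholders; an actual archive will have its own schema and controlled vocabularies, so treat this as a sketch under those assumptions rather than an archival standard.

```python
import json

# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_record(**fields):
    """Build a record, rejecting keys that are not Dublin Core elements."""
    unknown = set(fields) - set(DUBLIN_CORE_ELEMENTS)
    if unknown:
        raise KeyError(f"not Dublin Core elements: {sorted(unknown)}")
    # Keep the canonical element order; omit elements left unfilled.
    return {name: fields[name] for name in DUBLIN_CORE_ELEMENTS if name in fields}


# All values below are invented placeholders, not a real recording.
record = make_record(
    title="Evening song session (placeholder)",
    creator="Performer name (recorded with consent)",
    date="2024-01-15",
    format="audio/wav",
    language="und",  # ISO 639 code for an undetermined language
    rights="Community access only; see consent notes",
)
sidecar_json = json.dumps(record, indent=2, ensure_ascii=False)
```

Note that the rights element is where access restrictions agreed with the community can be recorded, so that they travel with the file.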

Repatriation — returning copies of recordings to the communities that were recorded — has become a central ethical norm of contemporary ethnomusicology. In the era of the phonograph cylinder, recordings made in the field typically ended up in European archives to which the recorded communities had no access. The International Association of Sound and Audiovisual Archives (IASA) has issued guidelines that explicitly identify repatriation as a professional obligation. In practice, repatriation is complicated by questions of format (the community must have equipment capable of playing back the recordings), cost (creating and shipping archival copies is expensive), and access (recordings of restricted ceremonies must be repatriated in ways that do not make them available to non-initiates within the community).

Example — Hawaiian repatriation projects: The Bishop Museum in Honolulu has undertaken systematic digitization and repatriation of early wax cylinder recordings of Hawaiian chant (oli) and hula music made by ethnographers in the late nineteenth and early twentieth centuries. Working with Hawaiian cultural organizations, the museum has developed access protocols that honor both the principle of open scholarly access and the community's assertion that certain sacred recordings should be available only to trained hula practitioners and their teachers. The project has become a model for balancing archival openness with cultural sovereignty.

4.5 Photography and Visual Documentation

Photography in ethnomusicological fieldwork follows the same permission protocols as audio and video recording, but with important additional considerations. A photograph freezes a moment in a way that a recording does not: it creates a persistent, easily reproduced image of a person’s face and body that can circulate without the subject’s knowledge or control. The distinction between documentation photography — images of instruments, spatial arrangements, performance contexts, material culture — and portraiture — images in which an individual person’s face and identity are central — is ethically significant. Documentation photography can often be conducted with general group consent; portraiture requires individual, explicit, informed consent.

Visual documentation can capture dimensions of musical practice that neither audio recording nor verbal description can adequately convey: the precise finger position of a lutenist, the spatial arrangement of an ensemble in a ceremonial context, the material construction of an instrument, the costume and bodily adornment of performers. As a supplement to audio recording, systematic visual documentation substantially enriches the resulting archive. The challenge is to ensure that visual documentation does not reduce performers to exotic spectacle — a risk that has been noted repeatedly in the critique of photography in anthropological and ethnomusicological contexts.


Chapter 5: Transcription and Musical Analysis

5.1 The Problem of Transcription

Transcription — the conversion of a musical performance into a written notation — is simultaneously one of ethnomusicology’s most essential tools and one of its most contested practices. It is essential because written notation creates an analyzable and communicable artifact: a transcription can be read by scholars who have not heard the recording, included in a published article, subjected to comparative analysis, and stored in ways that survive the degradation of the original recording medium. The entire analytical tradition of ethnomusicology — its theories of scale, melodic contour, phrase structure, rhythmic pattern — has been built on the practice of transcription.

At the same time, transcription is always a transformation. Western staff notation — the system of five-line staves, clef signs, note values, bar lines, and dynamic markings developed in European art music — encodes certain musical parameters with great precision (relative pitch names, approximate rhythmic durations, basic dynamic levels) while systematically failing to encode others (microtonal inflection, timbral variation, the precise degree of rhythmic deviation from a nominal beat, ornamentation styles, the physical gestures of performance). When this system is applied to music organized by fundamentally different principles — music that uses microtonal scales, music in which the “beat” is an emergent property rather than a metronomic grid, music in which timbre is more structurally significant than pitch — the transcription necessarily distorts.

Transcription bias refers to the systematic distortions introduced when one notational system is applied to music organized by a different logic. Common forms include: (1) pitch normalization, in which microtonal inflections are rounded to the nearest equal-tempered pitch; (2) rhythmic regularization, in which flexible or groove-based timing is mapped onto a fixed meter; and (3) timbral erasure, in which the notation captures only pitch and rhythm while the acoustically most distinctive feature of the music (the buzzing timbre of a kora, the nasalized vocal quality of certain Mongolian singing styles) goes unrepresented.

5.2 Pitch Transcription and Measurement

Pitch transcription presents different challenges depending on the musical tradition. For music organized around a scale closely approximating twelve-tone equal temperament — including much urban popular music globally, and music from traditions that use Western instruments — standard staff notation provides an adequate representation. For music using distinctly different tuning systems, the transcriber must choose between simplification (using the nearest equal-tempered pitch) and precision (marking deviations in cents above or below conventional note heads).
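The cents deviation mentioned above follows from the standard definition of the cent (1200 cents per octave, 100 per equal-tempered semitone). The sketch below assumes a reference of A4 = 440 Hz, which is a convention and not a property of any particular tradition; the function names are illustrative.

```python
import math

A4_HZ = 440.0  # assumed reference pitch; many traditions use other references
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def cents_between(freq_hz, ref_hz):
    """Interval from ref_hz up to freq_hz in cents (1200 per octave)."""
    return 1200.0 * math.log2(freq_hz / ref_hz)


def nearest_equal_tempered(freq_hz, a4_hz=A4_HZ):
    """Return (note name, deviation in cents) for the nearest 12-TET pitch."""
    semitones_from_a4 = 12.0 * math.log2(freq_hz / a4_hz)
    nearest = round(semitones_from_a4)
    deviation = 100.0 * (semitones_from_a4 - nearest)
    midi = 69 + nearest  # MIDI convention: A4 is note number 69
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, deviation


# A measured frequency 30 cents above A4 (constructed for the example).
measured = 440.0 * 2 ** (30 / 1200)
note, deviation = nearest_equal_tempered(measured)
```

A transcriber choosing precision over simplification would write the note head for the returned name and annotate it with the signed deviation, for example "+30".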

Spectrogram analysis using software such as Sonic Visualiser allows the analyst to display the frequency content of a recording visually and to measure pitch with far greater accuracy than the ear alone permits. The strobotuner — a stroboscopic device that displays small pitch deviations visually — was widely used in earlier decades and remains useful in contexts where computer analysis is impractical. Charles Seeger’s melograph, developed in the 1950s, was an analogue device that automatically produced graphic representations of a melody’s pitch contour and amplitude over time — one of the first attempts at automated transcription. Its achievements were real, but its output was difficult to read and did not resolve the fundamental problem of translating acoustic fact into analytical symbol.

5.3 Rhythmic Transcription

Rhythmic transcription raises its own set of problems. Western staff notation represents rhythm relative to a regular beat organized into measures of equal duration. For music in which this framework applies — including a great deal of sub-Saharan African drumming, much Latin American popular music, and Western art music — the framework is serviceable. For music in which the concept of a regular beat is absent or irrelevant — some plainchant traditions, certain kinds of South Asian alap improvisation, some speech-song — it actively misleads.

Example — Proportional notation for free-rhythm music: In transcribing a North Indian (Hindustani) vocalist's alap — the slow, unmeasured opening of a raga performance in which the musician explores the characteristic phrases and ornaments of the mode without rhythmic accompaniment — many transcribers prefer proportional notation: notes are spaced horizontally on the page in rough proportion to their actual duration, without bar lines or time signatures. This approach represents exact durations less precisely than conventional measured notation, but it does not impose a metric framework that the music does not possess.
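The spacing rule behind proportional notation can be expressed as a simple calculation: each note's horizontal position is its elapsed time multiplied by a fixed scale. The sketch below is illustrative only; the function name, the page width, and the durations are assumptions, not values from any published transcription.

```python
def proportional_positions(durations_sec, page_width_mm=180.0):
    """Map note durations to horizontal positions proportional to elapsed time.

    Returns one (x_start_mm, width_mm) pair per note. No bar lines or meter
    are implied. The default page width is an arbitrary assumption.
    """
    total = sum(durations_sec)
    if total <= 0:
        raise ValueError("durations must sum to a positive value")
    scale = page_width_mm / total  # millimetres per second
    positions = []
    elapsed = 0.0
    for duration in durations_sec:
        positions.append((elapsed * scale, duration * scale))
        elapsed += duration
    return positions


# Invented durations (seconds) for a short unmeasured phrase.
phrase = [2.0, 0.5, 0.5, 3.0]
layout = proportional_positions(phrase, page_width_mm=120.0)
```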

Swing and groove present a subtler version of the same problem. Jazz musicians and West African popular musicians routinely play rhythms that are notated as equal eighth notes or quarter notes but performed with systematic inequalities — the first of each pair slightly longer, the second slightly shorter, or vice versa — that are integral to the music’s feel. Conventional staff notation cannot represent this without cumbersome textual annotations; tablature and graphic notation systems have been proposed as alternatives.
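One way to make these systematic inequalities explicit is to measure them. Given onset times for a run of nominally equal eighth notes (marked by hand or exported from analysis software), the long-short ratio of each pair can be computed directly. This is a minimal sketch; the function name and the onset values are invented for illustration.

```python
def swing_ratios(onsets_sec):
    """Long-short ratio for each consecutive pair of eighth notes.

    `onsets_sec` lists onset times for eighth notes starting on a beat, so
    onsets 0, 2, 4, ... fall on beats and 1, 3, 5, ... fall between them.
    A ratio of 1.0 means even eighths; 2.0 approximates a triplet feel.
    """
    ratios = []
    for i in range(0, len(onsets_sec) - 2, 2):
        first = onsets_sec[i + 1] - onsets_sec[i]
        second = onsets_sec[i + 2] - onsets_sec[i + 1]
        if first <= 0 or second <= 0:
            raise ValueError("onsets must be strictly increasing")
        ratios.append(first / second)
    return ratios


# Invented onset times (seconds), chosen to give exact ratios.
even = swing_ratios([0.0, 0.25, 0.5, 0.75, 1.0])
swung = swing_ratios([0.0, 0.5, 0.75, 1.25, 1.5])
```

Reporting the measured ratio alongside a conventional transcription is one alternative to textual annotations such as "swing feel".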

5.4 Software Tools and Analytical Methods

Contemporary ethnomusicological analysis is substantially aided by digital tools. Sonic Visualiser, developed at the Centre for Digital Music at Queen Mary University of London, allows the analyst to display spectrograms, pitch tracks, and other acoustic analyses overlaid on the waveform of a recording. Praat, developed by Paul Boersma and David Weenink at the University of Amsterdam, is primarily a phonetics tool but is extensively used by ethnomusicologists working on vocal music for pitch analysis and formant tracking. Transcribe! is simpler software designed specifically to assist with transcription: it allows recordings to be slowed down by a large factor without pitch shift, looped, and annotated with markers.

Beyond the act of transcription, analysis of transcribed material draws on the core analytical vocabulary of ethnomusicology, which includes:
  • Scale / mode: the set of pitches available in a musical system and the hierarchy among them.
  • Melodic contour: the overall shape of a melodic line — ascending, descending, arch-shaped, wave-shaped — independent of its specific pitches.
  • Phrase structure: the division of a melodic line into smaller units, analogous to sentences in speech.
  • Cyclic form: formal organization based on the repetition of a rhythmic or harmonic cycle, characteristic of much African, Indian, and Latin American music.
  • Heterophony: simultaneous performance of variants of the same melody by different performers, common in many Asian and Middle Eastern traditions.
  • Call and response: an antiphonal structure in which a leader's phrase is answered by a group, widespread in West African and African American musical traditions.
  • Drone: a sustained pitch or pitches held below or above the melody, structurally central to many South Asian, Central Asian, and Celtic traditions.
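Some of these concepts lend themselves to simple computational description. As one example, melodic contour can be approximated by reducing a pitch sequence to its changes of direction. The sketch below is a deliberately crude illustration: the category boundaries, the extra labels "level" and "inverted arch", and the function name are assumptions made for the example, not an established analytical standard.

```python
def classify_contour(pitches):
    """Classify a pitch sequence (e.g. MIDI numbers or cents) by overall shape."""
    # Record each change of direction: +1 for up, -1 for down.
    # Repeated pitches and consecutive steps in the same direction are merged.
    directions = []
    for a, b in zip(pitches, pitches[1:]):
        if b != a:
            d = 1 if b > a else -1
            if not directions or directions[-1] != d:
                directions.append(d)
    if not directions:
        return "level"
    if directions == [1]:
        return "ascending"
    if directions == [-1]:
        return "descending"
    if directions == [1, -1]:
        return "arch"
    if directions == [-1, 1]:
        return "inverted arch"
    return "wave"


# Invented pitch sequences given as MIDI note numbers.
shapes = [
    classify_contour([60, 62, 64, 67]),
    classify_contour([60, 64, 67, 64, 60]),
    classify_contour([60, 62, 60, 62, 60]),
]
```

Because the classification ignores interval size and duration, it captures only the "independent of specific pitches" sense of contour defined in the list above.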

5.5 Analysis Beyond Sound: Text, Context, and Intertextuality

Ethnomusicological analysis is not exhausted by the analysis of transcribed musical sound. A full analysis of a song or a performance must also address the text — the verbal content, where present — and the context — the social occasion, the participants, the spatial setting, the historical moment. The relationship between text and music is itself analytically significant: does the musical phrase enhance the semantic content of the words, contradict it, render it ambiguous, or make it possible to interpret it in multiple ways simultaneously?

Intertextuality — the way a piece of music quotes, alludes to, or dialogues with other pieces of music or other cultural texts — is as important in oral and traditional music as in written literature or art music. A griot’s praise song in West Africa may allude to formulaic melodic patterns associated with particular lineages; a Javanese court dance piece may quote a melody that listeners associate with a specific episode of the Mahabharata; a country music song may use a characteristic guitar lick associated with a foundational recording from a previous generation. These intertextual connections are part of the music’s meaning, and they are accessible only to listeners — and analysts — who know the wider repertoire.

The analysis of intertextuality in non-Western music traditions highlights an important asymmetry in ethnomusicological practice: the analyst who studies a tradition that is not her own typically arrives without the background knowledge that allows intertextual connections to be heard. The community member who has grown up surrounded by the music perceives these connections effortlessly; the fieldworker must build this knowledge slowly through observation, listening, and conversation with knowledgeable consultants. The implication for research design is that the most analytically rich fieldwork often begins from a position of cultivated ignorance — the researcher's willingness to acknowledge what she does not hear is the precondition for learning to hear it.

Chapter 6: Interviewing and Linguistic Competence

6.1 The Ethnographic Interview

The ethnographic interview is categorically different from both the journalistic interview and the social scientific survey. It is not a structured instrument designed to elicit predetermined types of information from a sample of respondents, and it is not a question-and-answer session organized around the interviewer’s agenda. It is, ideally, a conversation in which the researcher’s genuine curiosity — about what a musician means by a term she has used, about why a particular song is always performed at the end of a ceremony rather than the beginning, about what the musicians themselves consider the most important thing an outsider should understand — guides the exchange, while the interlocutor’s own priorities, emphases, and interpretive frameworks are allowed to emerge.

Ethnomusicologists typically distinguish three types of ethnographic interview:
  • An unstructured interview is essentially a guided conversation. The researcher may have a broad topic in mind but no specific questions; she follows her interlocutor's lead, pursuing whatever threads emerge as the most fruitful.
  • A semi-structured interview is organized around a topic guide — a list of themes or questions that the researcher intends to cover, but in no fixed order and with considerable flexibility to pursue unexpected directions.
  • A structured interview uses a predetermined set of questions asked in a fixed order. This format sacrifices depth and flexibility for comparability: it allows data from multiple interviews to be systematically compared, which is useful when the researcher needs to document how different members of a community understand a particular practice.

6.2 Building Rapport

Effective interviewing presupposes rapport — a relationship of mutual trust, respect, and ease — and rapport cannot be rushed. A researcher who arrives in a new community, introduces herself, and immediately produces a recorder for a formal interview is likely to receive guarded, formulaic responses. The community member cannot yet know whether the researcher is trustworthy, whether she will use what she hears responsibly, or even whether she is capable of understanding the nuances of what is said. The initial weeks of fieldwork are typically more about being present — attending events, helping with practical tasks, sharing meals — than about conducting formal interviews.

The person or persons with whom the researcher develops the most sustained and productive relationships in the field are typically called key consultants (the older term “informants” has largely been retired in contemporary ethnomusicology because of its associations with surveillance and intelligence work). A key consultant is not simply someone who answers questions but someone who has agreed to serve as an ongoing interlocutor, who is willing to explain things the researcher does not understand, to correct her mistakes, and to guide her toward the aspects of the musical tradition that the community itself considers most important.

6.3 Language, Translation, and Linguistic Competence

Language competence is the single most important methodological variable in ethnomusicological fieldwork, and it is the one most frequently underestimated. Working through a translator is always a substantial limitation: the translator inevitably filters, selects, and reinterprets what is said, and the researcher has no independent check on the accuracy or completeness of the translation. Even an excellent translator may miss the significance of a word or phrase that has specific musical or ritual meaning not shared by the wider speech community.

This does not mean that all ethnomusicological fieldwork without language competence is worthless — for some traditions, the musical and social dimensions of performance can be substantially documented through careful observation, recording, and limited conversation in a shared language. But it does mean that the depth of access available to a researcher working through a translator is categorically different from — and in most respects inferior to — that available to one working in the community’s own language.

Translation as interpretation is particularly acute in the case of song texts. A song lyric is not merely a sequence of semantic statements; it is a cultural artifact in which particular words are chosen for their sound, their rhythm, their associations, their poetic register, and their relationship to musical phrases. A translation that captures the semantic content but misses all of this — that renders a Yoruba praise song as a plain English list of a chief’s attributes — has captured very little of what makes the song culturally significant.

The growing interest in collaborative translation — in which the ethnomusicologist works together with a community member who is also a skilled writer or poet in both languages — reflects an awareness that the translation of song texts requires literary skill as well as linguistic competence. Steven Feld's translations of Kaluli song texts in Sound and Sentiment (1982), produced with the sustained assistance of local poets and verbal artists, set a standard for collaborative translation that remains influential.

6.4 Recording Interviews and Sensitive Topics

Recording interviews requires its own consent procedure: the researcher must explain that the interview will be recorded, what the recording will be used for, who will have access to it, and under what conditions it can be quoted. Consent must be documented — typically through a signature on a consent form, or, where literacy is limited or writing is culturally inappropriate, through a witnessed verbal consent recorded at the beginning of the session.

The question of verbatim transcription versus summarized transcription of recorded interviews is both practical and methodological. Full verbatim transcription is enormously time-consuming — a one-hour interview may take four to eight hours to transcribe accurately. But summary notes inevitably introduce the researcher’s interpretive choices into the documentary record: what she judged important enough to include, what she compressed, what she paraphrased in her own vocabulary.

Sensitive topics in ethnomusicological research require special care. Music that is sacred, ritually restricted, or gender-exclusive may be a subject that the consultant is willing to discuss only in broad terms, or not at all. Music that is politically contested — songs associated with independence movements, protest traditions, or inter-ethnic conflict — may be sensitive for reasons of personal safety rather than cultural prohibition. Commercially contested music — repertoire that is the subject of ongoing copyright disputes or that has been appropriated by the recording industry — may involve legal complexity. In all such cases, the researcher’s obligation is to follow the consultant’s lead and to resist the temptation to press for information that is not freely offered.

6.5 The Collaborative Turn

Contemporary ethnomusicology has undergone what many scholars call a collaborative turn: a move from research conducted on communities, in which the ethnomusicologist is the sole analytical authority, toward research conducted with communities, in which community members are active partners in shaping the research questions, the analytical frameworks, and the form of the research output.

In practice, this turn takes a range of forms. At a minimum, it involves member checking: returning draft analyses to key consultants, asking them to review the researcher’s interpretations for accuracy and adequacy, and treating their responses seriously rather than as interesting data to be analyzed in turn. At a maximum, it involves co-authorship: publications formally authored by both the ethnomusicologist and community members, representing a genuine negotiation between academic and community perspectives.

The collaborative turn also raises challenging questions about the relationship between academic and community knowledge claims. A community member may insist on an interpretation of a song’s meaning that the ethnomusicologist, based on comparative evidence, finds analytically incomplete. Who has the authority to resolve this disagreement? The honest answer is that neither party has absolute authority: the community member’s emic understanding is irreplaceable, but it is not necessarily complete (community members may not know or may choose not to share all of the historical or comparative context that shapes a tradition); the ethnomusicologist’s analytical perspective adds something, but it is also limited by her outsider position. The collaborative ideal is a genuine dialogue in which both perspectives contribute to a richer account than either could produce alone.

Autoethnomusicography refers to first-person scholarly accounts by musician-scholars who are simultaneously insider members of the tradition they analyze and trained academic researchers. Such works — including Deborah Wong's writing on Thai American music, Felicia Sandler's accounts of her own Hasidic musical upbringing, and a growing body of work by Indigenous scholars on their own communities' musical traditions — occupy a unique epistemological position: the author has access to the emic perspective in a way that is genuinely impossible to replicate through fieldwork alone, while also being trained in the analytical frameworks of the academic discipline. The tensions between these two positions — the community member's obligations of loyalty and discretion and the scholar's obligations of transparency and rigor — are often themselves the central subject of autoethnomusicographical writing.

Chapter 7: Ethics and Positionality

7.1 The Legacy of Extractive Research

The ethical framework of contemporary ethnomusicology has been shaped, above all, by the discipline’s own uncomfortable history. The comparative musicologists of the colonial era recorded the music of colonized peoples with little or no consultation with those peoples, no sharing of the resulting recordings, and no acknowledgment of any obligation to the communities that were the objects of study. The phonograph cylinder recordings made in colonial Africa, Asia, and the Americas — now held in European archives — were produced within a relationship of profound power inequality, in which the researcher’s access was often facilitated by colonial administrators and in which the recorded community had no say in how the material would be used or represented.

This history cannot be cleanly separated from the intellectual content of the early discipline. The evolutionary schema that placed “primitive” music at one end of a developmental spectrum and European art music at the other was not merely a scientific error but an ideology that served the interests of colonial domination by naturalizing racial hierarchies. Recognizing this history is not a matter of blaming the comparative musicologists for the moral norms of their time; it is a matter of understanding how the discipline’s current practices are shaped by the need to repudiate its earlier extractive model.

7.2 IRB Requirements and Research Ethics Protocols

Institutional Review Board review is the formal mechanism by which North American universities regulate research involving human subjects. The IRB framework has its origins in the response to notorious violations of research ethics — the Tuskegee syphilis study being the most widely cited — and its core principles are derived from the Belmont Report (1979): respect for persons, beneficence, and justice.

For ethnomusicological research, the most relevant provisions concern informed consent — the obligation to ensure that research participants understand what they are participating in and agree freely, without coercion — and confidentiality — the obligation to protect participants’ privacy by anonymizing data or restricting access to identifying information. Many kinds of ethnomusicological observation fall into the IRB’s “minimal risk” category, which allows for expedited review or exemption; formal interviews, recording sessions, and any research involving vulnerable populations require more rigorous review.

The IRB framework was designed for biomedical and psychological research and fits ethnomusicological practice imperfectly. Ethnomusicologists frequently encounter situations in which the IRB's standard informed consent procedures are culturally inappropriate — in communities where written consent forms are unfamiliar, where signing one's name to a document has associations with land appropriation or legal disadvantage, or where the concept of individual consent is superseded by community-level decision-making. Professional organizations (the Society for Ethnomusicology, the American Anthropological Association) have developed ethical guidelines that supplement IRB requirements with discipline-specific norms.

7.3 Positionality and Reflexivity

Positionality refers to the researcher’s social location: the complex of identities, experiences, assumptions, and privileges she brings to the field. Race, gender, nationality, class background, musical training, institutional affiliation, and physical appearance all shape what a researcher can observe, what consultants will reveal, and how she interprets what she sees. A white American woman studying Haitian Vodou ceremonies occupies a different social position than a Haitian-American woman studying the same ceremonies, and both occupy a different position than a Haitian man doing the same research; these differences are not incidental but constitutive of what each researcher will be able to learn.

Reflexivity is the practice of making positionality explicit — not as a confession or an apology, but as an analytical move that allows readers to calibrate the researcher’s perspective and account for its limitations. A reflexive ethnomusicological account does not merely describe what the researcher observed; it also reflects on how the researcher’s presence and identity shaped what she was shown, what was withheld, how people behaved in her presence, and what interpretive assumptions she brought to her observations.

The reflexive turn in ethnomusicology (and anthropology more broadly) refers to the methodological commitment to include the researcher herself as an explicit object of analysis and to treat the fieldwork encounter as a triangular relationship among the researcher, the community, and the knowledge produced, rather than treating the researcher as a neutral recording instrument whose personal characteristics are irrelevant to the research output.

7.4 Research Benefits and Intellectual Property

The question of research benefits — what a community receives in return for its members’ time, knowledge, and hospitality — has moved to the center of contemporary ethnomusicological ethics. The model of extractive research, in which the researcher departs with recordings and analytical insight while the community receives nothing, is widely regarded as ethically indefensible. What the researcher owes the community varies with context: it may mean paying musicians for their time, providing copies of recordings (repatriation), assisting with grant applications for community cultural programs, teaching instrument-making or other skills, writing liner notes for community-produced recordings, or providing testimony in legal proceedings relating to land rights or cultural claims.

Intellectual property is one of the most contested areas of ethnomusicological ethics. The question is deceptively simple: who owns a community’s music? The practical answer under current law is often complex and unsatisfying. Western intellectual property regimes protect only fixed, individual creative works; they do not easily accommodate communal, oral, traditional forms of creativity in which authorship is collective and texts are inherently variable. The World Intellectual Property Organization (WIPO) Intergovernmental Committee on Genetic Resources, Traditional Knowledge and Folklore has been working since 2000 on international legal instruments that would provide greater protection for traditional cultural expressions, but progress has been slow.

7.5 The Ethnomusicologist as Advocate

Ethnomusicologists have increasingly understood their role as extending beyond scholarly documentation to include advocacy for the communities they study. The documentation of a musical tradition can serve as evidence in legal proceedings relating to cultural rights; recorded performances can support claims to indigenous land that require demonstrating continuous cultural occupation; published analyses can raise public awareness of cultural practices threatened by commercial exploitation or political suppression.

Barz and Cooley’s concept of “the shadow in the field” captures an inescapable dimension of ethnomusicological work: the researcher’s presence changes what she observes. A ceremony performed with a foreign scholar present is not quite the same event as the ceremony would have been without her; an interview conducted with a recorder running elicits responses shaped by awareness of the recording. Every fieldwork situation is thus a triangulation between what would happen without the researcher, what happens with her present, and what the recorded and written representation captures — and the honest researcher acknowledges all three dimensions in her analysis.


Chapter 8: Writing Ethnomusicography

8.1 What Ethnomusicography Is

Ethnomusicography — writing about music ethnographically — is the genre in which fieldwork observation, musical analysis, theoretical argument, and the consultant’s own voice are woven into a scholarly text. It is a hybrid form: not purely humanistic (it draws on social scientific methods and aims at systematic documentation) but not purely scientific (it relies on thick description, narrative, and interpretation). Its difficulties are partly technical (how do you write about sound in words?) and partly political (whose perspective does the text represent, and who gets to authorize its claims?).

The relationship between ethnomusicography and the practice of thick description — the phrase coined by Gilbert Ryle and elaborated by Clifford Geertz in The Interpretation of Cultures (1973) — is foundational. Thick description does not mean simply detailed description; it means the layered account that moves from what happened (the behavioral surface) to what it meant (the cultural logic), reading a social event as a text and unpacking the codes that make it intelligible to its participants.

Example — Geertz's Balinese cockfight: Geertz's celebrated essay on the Balinese cockfight demonstrates thick description by treating what appears to be an animal-fighting spectacle as a text about Balinese social structure, masculine status competition, and cultural aesthetics. Applied to ethnomusicological writing, thick description means treating a performance not only as a musical event but as a social text: who performs, in what order, with what instruments, in what spatial arrangement, in front of what audience, with what expected outcomes — and what all of this communicates about the community's values, social structure, and sense of itself.

8.2 The Ethnomusicological Monograph

The ethnomusicological monograph is the field’s primary scholarly form, and its structure has been relatively stable since the 1960s. A typical monograph opens with an introduction that establishes the field site, the musical tradition, the research questions, the theoretical framework, and the nature of the author’s fieldwork. It then proceeds through a series of chapters that combine ethnographic description with musical analysis, typically organized by topic (instruments, repertoire, performance contexts, social roles, ritual function) or by an analytical thread. It concludes with a chapter or epilogue that synthesizes the theoretical contribution and situates it in the broader disciplinary conversation.

The arrival story — the opening vignette in which the ethnomusicologist describes her first encounter with the music she will study — has become almost a genre convention, and like all conventions it carries the risk of cliché. The arrival story can be genuinely illuminating: a naive outsider’s first encounter with an unfamiliar musical practice dramatizes the strangeness that the analysis will work to explain. But it can also sentimentalize the fieldwork experience, position the researcher as a heroic discoverer, and reduce a complex community to the backdrop for a personal awakening narrative. Thoughtful ethnomusicographers have become self-conscious about these risks, and some have deliberately subverted the arrival story form.

8.3 Vignettes and Thick Description

Vignettes — brief, vivid narrative accounts of specific fieldwork events — serve multiple functions in ethnomusicological writing. They ground the abstract theoretical argument in concrete experience; they introduce the human beings who are the actual subjects of the research rather than treating the community as an undifferentiated mass; they convey to the reader the texture of musical life in ways that systematic description cannot.

The best ethnomusicological vignettes are specific: they name individuals (or give them carefully chosen pseudonyms), describe the physical setting in precise detail, convey the sounds and sensations of the event, and render the moment’s emotional quality without sentimentalizing it. They are also theoretically motivated: the vignette is not illustrative decoration but analytical argument in a different register.

The politics of representation in ethnomusicological writing has become an explicit site of scholarly debate. Ruth Behar's The Vulnerable Observer (1996), though written from within anthropology, articulates a concern that applies directly to ethnomusicography: the researcher who writes about communities she has lived in is exercising power — the power to represent, to interpret, to publish — and that power is never neutral. It can be exercised responsibly (by being accurate, by giving consultants voice, by representing complexity rather than flattening it) or irresponsibly (by exoticizing, by presenting a community as a unified, timeless entity, by centering the researcher's experience at the expense of the community's).

8.4 Collaborative Ethnomusicography

The collaborative turn discussed in Chapter 6 finds its most ambitious expression in collaborative ethnomusicography: the production of scholarly texts that are genuinely co-authored by the ethnomusicologist and members of the community she studies. Such works are rare, because they require an unusual combination of circumstances: a community member willing and able to write for academic audiences, a scholarly community willing to grant legitimacy to non-credentialed authorial voices, and an adequate framework for negotiating the inevitable disagreements between the ethnomusicologist’s analytical concerns and the community member’s own priorities.

Examples of collaborative ethnomusicography include work on Inuit throat singing in which Inuit performers have co-authored analytical essays with academic ethnomusicologists, and the growing body of autoethnomusicography — first-person accounts by musician-scholars who are themselves members of the traditions they analyze. Anthony Seeger’s work with the Suyá of Brazil, which includes his support for the Suyá’s own recording and publication projects, represents an adjacent model: not co-authored writing but collaborative practice.

8.5 The Dissertation and Publication Venues

The doctoral dissertation in ethnomusicology follows a recognizable structure: an introduction establishing the theoretical stakes and the fieldwork context; two or three substantive chapters, each organized around a specific analytical problem; a chapter that synthesizes the argument; and appendices containing full transcriptions, maps, glossaries, and, increasingly, links to accompanying audio and video archives. Transcriptions in the dissertation serve both analytical and documentary functions: they are analyzed in the text as evidence for specific claims about musical structure, but they also constitute a record of the music that readers can consult independently of the argument.

The primary journal of the discipline is Ethnomusicology, the official publication of the Society for Ethnomusicology (SEM, founded in 1955). It publishes articles combining musical analysis with ethnographic fieldwork across all world musical traditions, as well as methodological and theoretical essays and book reviews. Asian Music, the journal of the Society for Asian Music, focuses on the musics of South, East, and Southeast Asia. The Yearbook for Traditional Music (formerly the Yearbook of the International Folk Music Council) publishes work on folk and traditional musics worldwide. Popular Music, published by Cambridge University Press, applies ethnomusicological and critical methods to popular music in commercial contexts. Black Music Research Journal focuses on African American music in historical and ethnomusicological perspective.

The Society for Ethnomusicology (SEM), founded in 1955 in the United States, is the primary professional organization for ethnomusicologists in North America and has substantial international membership. Its annual conference is the central professional gathering of the field, bringing together scholars working on every world musical tradition and every major theoretical framework. Membership involves obligations beyond attendance: the SEM's ethics statement, regularly revised, articulates the community norms that govern fieldwork, publication, and professional practice.

8.6 Writing Sound: The Central Challenge

The fundamental challenge of ethnomusicographical writing — writing sound in words — has no complete solution, but it has many partial ones. Musical examples (transcribed scores or graphic representations included in the text) allow readers with musical literacy to follow the analysis. Audio examples (now routinely made available online as supplementary material, and sometimes embedded directly in digital publications) allow readers to hear what is being described. Verbal description of sonic qualities — the strategy of using language to evoke the acoustic experience — draws on an extensive tradition of music criticism and poetic description that ethnomusicographers inherit and adapt.

The tension between analytical precision and experiential vividness pervades ethnomusicological writing at every level. The language of music theory — “a rising fourth followed by a descending major scale segment” — is precise but bloodless; it does not convey what it is like to hear or perform the passage. The language of phenomenological description — “the melody lifts suddenly, as if catching its breath, before plunging through a cascade of descending tones” — is vivid but imprecise; it does not adequately specify what occurred acoustically. Skilled ethnomusicographers learn to move between these registers, using the language of analysis when precision is required and the language of description when evocation is required, and to signal clearly which mode they are in at any given moment.

The emergence of online publication and multimedia ethnomusicography has opened new possibilities for the integration of sound into text. Jonathan Sterne has argued that the constraint of print publication imposed a systematic bias on ethnomusicological writing: the music was necessarily represented in reduced, secondary form (transcription, verbal description) rather than as sound itself. As ethnomusicological publishing moves toward digital platforms, the opportunity to embed audio and video directly in the text transforms not only the presentation of research but potentially its underlying analytical logic: when readers can hear for themselves, the researcher's obligation to evoke sound verbally changes character.

Example — Multimedia ethnomusicography: Steven Feld's Rainforest Soundwalks and its associated recordings constitute a kind of multimedia ethnomusicography avant la lettre: the recordings are not illustrations of the written analysis but an independent form of ethnomusicological argument, presenting the acoustic ecology of the Bosavi rainforest in Papua New Guinea as itself a form of knowledge about Kaluli musical aesthetics. More recently, journals such as SEM Student News and online platforms associated with Ethnomusicology have experimented with embedded audio, interactive transcriptions, and video supplements that allow the analytical text and the sound it analyzes to be accessed simultaneously. These experiments raise productive questions about what ethnomusicological writing is for, and who it is written for.

8.7 Situating One’s Work in the Field

A final, indispensable skill of ethnomusicographical writing is the ability to situate one’s own work within the existing scholarly conversation — to write, as it were, in dialogue with predecessors and contemporaries. This is partly a matter of citation and acknowledgment: the researcher must demonstrate familiarity with the relevant literature and acknowledge those on whose work she builds. But it is also a matter of intellectual positioning: articulating how one’s own research question and theoretical approach differ from, extend, or challenge what has been done before.

For many ethnomusicological projects, the existing literature is sparse: the musical tradition in question has been studied by few or no previous scholars, or it has been studied only in passing as part of a larger regional survey. In such cases, the researcher’s contribution is partly cartographic — simply mapping a musical terrain that was previously unmarked on the disciplinary map. For traditions that have been more extensively studied, the researcher must engage more critically with the existing literature, identifying its gaps, its theoretical limitations, its methodological blind spots, and the changed social circumstances that make a new study timely.

The best ethnomusicological writing models a kind of intellectual honesty that is worth aspiring to: an honesty about the limitations of one’s fieldwork access (what one did not see, what one was not shown, what one could not understand), about the provisional character of one’s interpretations (which are always based on incomplete evidence and are always subject to revision), and about the politics of the representational act itself — the fact that writing about a community is an exercise of power, and that this power carries obligations of accuracy, fairness, and respect.

The discipline’s vitality — reflected in the continued growth of SEM membership, the proliferation of ethnomusicology programs globally, and the expanding range of musical traditions represented in the literature — is a product of precisely this combination: rigorous method, theoretical reflection, ethical seriousness, and genuine curiosity about the extraordinary diversity of ways in which human beings organize sound into meaning.

The chapters that follow in the course seminar will examine specific case studies — fieldwork in West Africa, South and Southeast Asia, the Middle East, Indigenous North America, and the global urban popular music industries — that concretize the methods and theoretical debates surveyed here. In each case, the reader should ask not only what the researcher found but how she found it: what methods she deployed, what choices she made, what she could and could not access, and how the conventions of ethnomusicographical writing shaped the knowledge she was able to produce and communicate. Ethnomusicology is, in the end, a discipline that takes seriously both the rigor of method and the humanity of its subject matter — the irreducible fact that music is made by and for human beings living in time, in relationship, and in culture.
