AMATH 852: Mathematical Geophysics and Seismic Wave Propagation
Estimated study time: 2 hr 18 min
Table of contents
These notes synthesize material from K. Aki and P.G. Richards’s Quantitative Seismology, F.A. Dahlen and J. Tromp’s Theoretical Global Seismology, A. Ben-Menahem and S.J. Singh’s Seismic Waves and Sources, J. Shearer’s Introduction to Seismology, and supplementary material from Caltech Ge/ACM seismology notes (Tromp, Helmberger), Cambridge DAMTP geophysics lecture notes, and Princeton geosciences course materials.
Chapter 1: Elastic Waves in Homogeneous Media
1.1 Linearized Elastodynamics and the Stress Tensor
The mathematical study of seismic waves begins with continuum mechanics. The Earth, to a first approximation, behaves as an elastic solid: when deformed by a passing seismic wave, each material element experiences restoring forces proportional to the strain. This section develops the linearized equations of motion that govern small-amplitude elastic disturbances, the foundation upon which all of seismology rests.
Consider a continuous medium occupying a region \(\Omega \subset \mathbb{R}^3\). Let \(\mathbf{u}(\mathbf{x}, t)\) denote the displacement of the material particle originally at position \(\mathbf{x}\) at time \(t\). The deformation is characterized by the strain tensor.
where \(u_i\) denotes the \(i\)-th component of the displacement field \(\mathbf{u}\). This is the symmetric part of the displacement gradient \(\nabla \mathbf{u}\).
The strain tensor captures how distances between nearby material points change under the deformation. It is a linearization valid for small deformations, which is an excellent approximation for seismic waves far from the rupture zone: typical seismic strains are of order \(10^{-6}\) or smaller. The antisymmetric part \(\frac{1}{2}(\partial_j u_i - \partial_i u_j)\) corresponds to rigid rotation and produces no restoring force.
The internal forces in the medium are described by the Cauchy stress tensor \(\sigma_{ij}\), where \(\sigma_{ij}\) represents the force per unit area in the \(i\)-direction acting on a surface element with outward normal in the \(j\)-direction. For a linearly elastic material, stress is related to strain by the generalized Hooke’s law.
where \(C_{ijkl}\) is the fourth-order elastic stiffness tensor (81 components). The symmetries \(C_{ijkl} = C_{jikl} = C_{ijlk} = C_{klij}\) reduce the number of independent components to at most 21. For an isotropic medium, the elastic tensor takes the form
\[ C_{ijkl} = \lambda\, \delta_{ij}\delta_{kl} + \mu\,(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}), \]where \(\lambda\) and \(\mu\) are the Lame parameters.
The parameter \(\mu\) is the shear modulus (rigidity), measuring resistance to shear deformation, and \(\lambda\) is related to resistance to volumetric change. In the Earth’s mantle, typical values are \(\mu \sim 70\) GPa and \(\lambda \sim 100\) GPa. The bulk modulus is \(\kappa = \lambda + \frac{2}{3}\mu\). Cauchy’s equation of motion, obtained by applying Newton’s second law to an infinitesimal material element, gives the fundamental dynamical equation.
equivalently written as
\[ \rho\, \ddot{\mathbf{u}} = (\lambda + \mu)\,\nabla(\nabla \cdot \mathbf{u}) + \mu\,\nabla^2 \mathbf{u} + \mathbf{f}, \]where \(\mathbf{f}(\mathbf{x}, t)\) is the body-force density and \(\ddot{\mathbf{u}} = \partial^2 \mathbf{u}/\partial t^2\).
Since \(\partial_j \partial_j u_i = \nabla^2 u_i\) and \(\partial_i \partial_j u_j = \partial_i(\nabla \cdot \mathbf{u})\), this becomes \((\lambda + \mu)\,\partial_i(\nabla \cdot \mathbf{u}) + \mu\,\nabla^2 u_i + f_i\), which in vector form is the stated result. The first form follows from the vector identity \(\nabla^2 \mathbf{u} = \nabla(\nabla \cdot \mathbf{u}) - \nabla \times (\nabla \times \mathbf{u})\). \(\square\)
This equation, derived by Navier in 1821 and placed on rigorous footing by Cauchy and Poisson, governs all elastic wave phenomena in a homogeneous isotropic medium. Its structure immediately reveals that elastic waves come in two fundamental types.
1.2 P-Wave and S-Wave Decomposition
The Navier equation, while compact, couples all three components of displacement. The Helmholtz decomposition provides the key to decoupling them into independently propagating wave types, a decomposition of profound physical significance first recognized by Poisson (1831) and Stokes (1849).
where \(\phi\) is the scalar potential and \(\boldsymbol{\psi}\) is the vector potential (with the gauge condition \(\nabla \cdot \boldsymbol{\psi} = 0\)).
Substituting this decomposition into the source-free Navier equation (\(\mathbf{f} = 0\)) and using the identities \(\nabla \times (\nabla \phi) = 0\) and \(\nabla \cdot (\nabla \times \boldsymbol{\psi}) = 0\), one obtains two decoupled wave equations.
where the P-wave velocity (compressional) and S-wave velocity (shear) are
\[ \alpha = \sqrt{\frac{\lambda + 2\mu}{\rho}}, \qquad \beta = \sqrt{\frac{\mu}{\rho}}. \]The P-wave (primary wave, or pressure wave) is irrotational: \(\nabla \times \mathbf{u}_P = 0\). The particle motion is parallel to the direction of propagation (longitudinal). The S-wave (secondary wave, or shear wave) is solenoidal: \(\nabla \cdot \mathbf{u}_S = 0\). The particle motion is perpendicular to the propagation direction (transverse). Since \(\lambda + 2\mu > \mu\) for all positive elastic moduli, we always have \(\alpha > \beta\): P-waves travel faster than S-waves. In the Earth’s crust, \(\alpha \approx 6\) km/s and \(\beta \approx 3.5\) km/s. The ratio \(\alpha/\beta = \sqrt{2(1-\nu)/(1-2\nu)}\), where \(\nu\) is Poisson’s ratio, equals \(\sqrt{3} \approx 1.73\) for a Poisson solid (\(\nu = 0.25\), \(\lambda = \mu\)).
The P- and S-wave velocities in the Earth vary by roughly a factor of two from the crust to the inner core: \(\alpha\) ranges from about 5.8 km/s in the upper crust to 13.7 km/s at the base of the mantle and 11.0 km/s in the inner core, while \(\beta\) ranges from 3.2 km/s to 7.3 km/s in the mantle. The velocity ratio \(\alpha/\beta\) is sensitive to rock composition and state: it is approximately 1.73 for most crustal rocks, but increases significantly in the presence of fluids or partial melt (since fluids reduce \(\mu\) without strongly affecting \(\lambda + 2\mu\)). Monitoring temporal changes in \(\alpha/\beta\) is therefore a tool for detecting fluid migration in volcanic and geothermal systems.
1.3 Plane Waves, Phase Velocity, and Energy Flux
We now examine the simplest solutions of the elastic wave equations: plane waves. These are the building blocks from which more complex wavefield solutions are constructed via superposition.
where \(\mathbf{A}\) is the polarization vector (constant amplitude and direction), \(\mathbf{k}\) is the wave vector, \(\omega\) is the angular frequency, and \(f\) is an arbitrary twice-differentiable function. The phase velocity is \(c = \omega / |\mathbf{k}|\), and the surfaces of constant phase are planes perpendicular to \(\mathbf{k}\).
Substituting a monochromatic plane wave \(\mathbf{u} = \mathbf{A}\, e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}\) into the Navier equation leads to the Christoffel equation for the polarization.
This has two families of solutions: (i) \(\mathbf{A} \parallel \mathbf{k}\) with \(\omega^2 = \alpha^2 |\mathbf{k}|^2\) (P-wave), and (ii) \(\mathbf{A} \perp \mathbf{k}\) with \(\omega^2 = \beta^2 |\mathbf{k}|^2\) (S-wave, with two independent polarizations).
The energy carried by elastic waves is characterized by the elastic energy density and the energy flux vector.
The energy flux vector (seismic intensity) is
\[ \mathcal{F}_i = -\sigma_{ij}\, \dot{u}_j. \]These satisfy the energy conservation equation \(\partial_t E + \nabla \cdot \boldsymbol{\mathcal{F}} = \mathbf{f} \cdot \dot{\mathbf{u}}\).
For a plane P-wave \(\mathbf{u} = A\, \hat{\mathbf{k}}\, \cos(\mathbf{k}\cdot\mathbf{x} - \omega t)\), the time-averaged energy flux is \(\langle \boldsymbol{\mathcal{F}} \rangle = \frac{1}{2}\rho \alpha \omega^2 A^2\, \hat{\mathbf{k}}\). The quantity \(Z_P = \rho\alpha\) is the acoustic impedance for P-waves, and similarly \(Z_S = \rho\beta\) for S-waves. Impedance contrasts at interfaces control the amplitude of reflected and transmitted waves, a point of central importance in exploration seismology.
1.4 Reflection and Transmission at a Planar Interface
When an elastic wave encounters an interface between two media with different elastic properties, part of the energy is reflected and part is transmitted. Unlike acoustic waves in fluids, elastic waves can undergo mode conversion: an incident P-wave can generate both reflected and transmitted P- and S-waves. This phenomenon, first studied systematically by Knott (1899) and Zoeppritz (1919), is fundamental to seismic exploration.
Consider a planar interface at \(z = 0\) separating medium 1 (above, with parameters \(\rho_1, \alpha_1, \beta_1\)) from medium 2 (below, with \(\rho_2, \alpha_2, \beta_2\)). An incident P-wave in medium 1 with incidence angle \(\theta_1\) generates reflected P and SV waves and transmitted P and SV waves, with angles determined by Snell’s law.
where \(\theta_1, \theta_1'\) are the incident and reflected P-wave angles, \(\phi_1, \phi_2\) are the reflected and transmitted S-wave angles, and \(\theta_2\) is the transmitted P-wave angle.
The amplitudes of the reflected and transmitted waves are determined by a system of four linear equations (the Zoeppritz equations) arising from the four boundary conditions (continuity of \(u_x, u_z, \sigma_{xz}, \sigma_{zz}\)).
where \(Z_k = \rho_k \alpha_k\) is the P-wave impedance of medium \(k\). Note that \(|R|^2 + (Z_2/Z_1)|T|^2 = 1\) (energy conservation). At the Mohorovicic discontinuity (crust-mantle boundary), the impedance contrast produces \(|R| \approx 0.1\), so about 1% of the energy is reflected.
When \(p > 1/\alpha_2\) (equivalently \(\sin\theta_1 > \alpha_1/\alpha_2\)), the transmitted P-wave angle becomes imaginary and the transmitted wave becomes evanescent: this is total internal reflection for P-waves, and the corresponding angle is the critical angle \(\theta_c = \sin^{-1}(\alpha_1/\alpha_2)\). Head waves (refracted waves) travel along the interface at the velocity of the faster medium and are observed as first arrivals at intermediate distances in refraction seismology.
Chapter 2: Layered Media and Surface Waves
2.1 SH Waves in Layered Half-Spaces
The Earth’s crust and upper mantle are well approximated as a stack of homogeneous layers, each with distinct elastic properties. Seismic waves propagating in such layered structures exhibit phenomena not present in homogeneous media: dispersion, waveguide effects, and surface waves. We begin with the simplest case, SH (horizontally polarized shear) waves, which decouple completely from the P-SV system in a horizontally layered medium.
In each homogeneous layer, the SH displacement for a monochromatic wave with horizontal wavenumber \(k_x = \omega p\) (where \(p\) is the ray parameter) takes the form \(u_y = [A\, e^{i\nu z} + B\, e^{-i\nu z}]\, e^{i(k_x x - \omega t)}\), where the vertical wavenumber is \(\nu = \sqrt{\omega^2/\beta^2 - k_x^2}\). The wave is propagating if \(\nu^2 > 0\) (i.e., \(p < 1/\beta\)) and evanescent if \(\nu^2 < 0\).
2.2 The Thomson-Haskell Transfer Matrix Method
To handle a stack of \(N\) layers, we need an efficient method for propagating the wavefield through the structure. The transfer matrix method, developed independently by Thomson (1950) and Haskell (1953), provides exactly this.
Note that \(\det(\mathbf{T}) = 1\).
For a stack of \(N\) layers overlying a half-space, the total transfer matrix is the product \(\mathbf{T}_{\mathrm{total}} = \mathbf{T}_N \cdots \mathbf{T}_2\, \mathbf{T}_1\). The boundary conditions are: free surface at the top (\(\tau = 0\) at \(z=0\)) and a radiation condition in the half-space (only downgoing waves, or evanescent decay). This approach reduces the problem of matching boundary conditions at every interface to a single matrix equation, an enormous computational simplification.
2.3 Love Waves and Their Dispersion Relation
Love waves, discovered theoretically by A.E.H. Love in 1911, are surface waves consisting of horizontally polarized shear motion that are trapped in a low-velocity layer overlying a higher-velocity half-space. They are among the most prominent arrivals on seismograms of shallow earthquakes.
where \(\mu_k = \rho_k \beta_k^2\). This transcendental equation has infinitely many solutions (modes), indexed by the overtone number \(n = 0, 1, 2, \ldots\).
Dividing the second equation by the first eliminates \(A\) and \(C\), yielding \(\mu_1 \nu_1 \tan(\nu_1 H) = \mu_2 \hat{\nu}_2\), which is the stated dispersion relation. \(\square\)
The dispersion of Love waves is a consequence of the waveguide geometry: longer-period waves penetrate deeper into the half-space and sample the faster material, so they travel faster. This normal dispersion is characteristic of continental crustal structures and is routinely observed on broadband seismograms.
At a period of 20 s (\(\omega = 2\pi/20\) rad/s), solving numerically gives \(c \approx 3.87\) km/s. At 100 s, \(c \approx 4.35\) km/s. The group velocity, computed from \(U = c^2 / (c - \omega\, dc/d\omega)\), reaches a minimum (the Airy phase) near 15–20 s for this model, producing a characteristic amplitude build-up on seismograms that is easily identified.
2.4 Rayleigh Waves and the Secular Equation
Rayleigh waves, predicted theoretically by Lord Rayleigh in 1885, are the most prominent surface waves on seismograms. Unlike Love waves, they involve coupled P-SV motion and exist even in a homogeneous half-space.
This equation always has exactly one real root satisfying \(0 < c_R < \beta\).
Setting the determinant to zero and simplifying using \(\lambda + 2\mu = \rho\alpha^2\) and \(\mu = \rho\beta^2\) gives the Rayleigh secular equation. The existence and uniqueness of a root was proven rigorously by Rayleigh; an elegant proof using the argument principle was given by Barnett and Lothe (1985). For a Poisson solid, \(c_R \approx 0.9194\beta\). \(\square\)
For a Poisson solid (\(\gamma = 1/3\)), this becomes \(\eta^3 - 8\eta^2 + \frac{56}{3}\eta - \frac{32}{3} = 0\), with the physically admissible root \(\eta \approx 0.8453\), giving \(c_R \approx 0.9194\beta\). In the Earth’s upper crust with \(\beta \approx 3.5\) km/s, the Rayleigh wave velocity is approximately 3.2 km/s. For granite with \(\nu = 0.25\), \(c_R/\beta \approx 0.9194\); for sandstone with \(\nu = 0.20\), \(c_R/\beta \approx 0.9112\). The ratio is remarkably insensitive to Poisson’s ratio, varying by less than 5% over the entire physically admissible range \(0 \le \nu \le 0.5\).
Rayleigh waves have retrograde elliptical particle motion at the surface, transitioning to prograde at a depth of about \(0.2\) wavelengths. Their amplitude decays exponentially with depth, with most energy confined within one wavelength of the surface. Because they spread geometrically only in two dimensions (along the surface) rather than three, they decay as \(r^{-1/2}\) rather than \(r^{-1}\) for body waves, making them the dominant arrivals at large epicentral distances for shallow earthquakes.
2.5 Surface Wave Dispersion and Group Velocity
In a layered medium, both Love and Rayleigh waves become dispersive: the phase velocity depends on frequency. This dispersion is one of the most powerful tools for determining Earth structure, as different frequencies sample different depth ranges.
A wave packet (superposition of nearby frequencies) travels at the group velocity, which determines the arrival time of energy at a given frequency.
Measuring the group velocity dispersion curve \(U(\omega)\) from seismograms (using, for example, the multiple-filter technique of Dziewonski, Bloch, and Landisman, 1969) provides constraints on the shear-velocity structure of the Earth. The inversion of surface wave dispersion data to obtain velocity-depth profiles is one of the oldest and most robust techniques in observational seismology.
Chapter 3: Ray Theory and the Eikonal Equation
3.1 The High-Frequency Approximation
Seismic waves in the Earth propagate through a medium whose properties vary continuously with position. When the wavelength is short compared to the scale of heterogeneity, the wavefield can be described by ray theory: waves propagate along ray paths that are governed by local velocities. This is the seismological analogue of geometrical optics and was the primary tool for determining Earth structure throughout the 20th century.
We seek an asymptotic solution of the elastic wave equation valid in the high-frequency limit. Consider a scalar wave equation \(\nabla^2 u = c(\mathbf{x})^{-2}\, \ddot{u}\) in a medium with spatially varying velocity \(c(\mathbf{x})\). We make the ray ansatz.
where \(T(\mathbf{x})\) is the travel time (phase function), \(A(\mathbf{x})\) is the amplitude, and the higher-order terms are corrections that become negligible at high frequency.
Substituting this ansatz into the wave equation and collecting powers of \(\omega\) yields a hierarchy of equations. The leading order gives the eikonal equation; the next order gives the transport equation.
3.2 The Eikonal and Transport Equations
This is a first-order nonlinear PDE of Hamilton-Jacobi type. The surfaces \(T(\mathbf{x}) = \text{const}\) are wavefronts, and the rays are curves perpendicular to the wavefronts.
equivalently written as \(\nabla \cdot (A^2 \nabla T) = 0\). This expresses conservation of energy flux along ray tubes.
The eikonal equation can be solved by the method of characteristics, which yields the ray equations. Let \(\mathbf{x}(s)\) be a ray parameterized by arc length \(s\). Defining the slowness vector \(\mathbf{p} = \nabla T\) (with \(|\mathbf{p}| = 1/c\)), the characteristics of the eikonal equation are the Hamiltonian system.
where \(s\) is arc length. Equivalently, parameterized by travel time \(T\),
\[ \frac{d\mathbf{x}}{dT} = c^2\, \mathbf{p}, \qquad \frac{d\mathbf{p}}{dT} = \frac{1}{c}\nabla c. \]These are the fundamental equations of seismic ray tracing. They show that rays bend toward regions of lower velocity (Fermat’s principle), which has profound consequences for wave propagation in the Earth, where velocity generally increases with depth.
The transport equation (Theorem 3.3) determines how amplitudes vary along rays. Its solution can be expressed in terms of the geometric spreading factor, which measures how a narrow tube of rays expands or contracts as it propagates.
where subscript 0 denotes values at the reference point. For a point source in a homogeneous medium, \(J \propto r^2\) in 3D (amplitude decays as \(1/r\)) and \(J \propto r\) in 2D (amplitude decays as \(1/\sqrt{r}\)). At a caustic, \(J = 0\) and the ray-theoretical amplitude diverges, signaling the breakdown of the approximation.
3.3 Snell’s Law and the Ray Parameter in Spherical Geometry
In a spherically symmetric Earth model where velocity depends only on radius, \(c = c(r)\), the ray equations possess a conserved quantity that enormously simplifies the problem.
is constant along any ray, where \(i\) is the angle between the ray and the radial direction. This is the ray parameter (horizontal slowness in flat geometry). It is the spherical analogue of Snell’s law.
The ray parameter \(p\) has dimensions of time per unit distance (s/rad for the spherical case, s/km for the flat case). It completely specifies the geometry of a ray in a spherically symmetric Earth. A ray with a given \(p\) descends to a turning point at radius \(r_t\) where \(\sin i = 1\) (the ray becomes horizontal), so \(r_t / c(r_t) = p\).
3.4 Travel Time Curves and the Wiechert-Herglotz Inversion
The travel time \(T(\Delta)\) as a function of epicentral distance \(\Delta\) for a given phase (P, S, etc.) is one of the most fundamental observables in seismology. The relationship between travel-time curves and Earth structure was first elucidated by Wiechert and his student Herglotz in 1910, providing one of the earliest successful inversions of geophysical data.
where \(\eta(r) = r/c(r)\) is the radial slowness, \(r_s\) is the source (surface) radius, and \(r_t\) is the turning point radius defined by \(\eta(r_t) = p\). Furthermore, \(dT/d\Delta = p\): the slope of the travel-time curve equals the ray parameter.
The relation \(dT/d\Delta = p\) is extremely important: it means that the ray parameter can be measured directly from the slope of the travel-time curve, without any knowledge of the velocity structure. This is the basis for the Wiechert-Herglotz inversion.
where \(p_0 = dT/d\Delta'|_{\Delta'}\) is the ray parameter for distance \(\Delta'\). This is an Abel-type integral inversion.
This remarkable formula was used by Gutenberg and others to construct the first detailed models of the Earth’s velocity structure, revealing the crust, mantle, outer core, and inner core. The condition that \(\eta(r)\) be monotonically decreasing fails when there is a low-velocity zone, leading to complications (triplications, shadow zones) that we discuss next.
3.5 Shadow Zones and Caustics
The most famous shadow zone in seismology is the P-wave shadow between approximately 104 and 140 degrees epicentral distance, caused by refraction into the low-velocity fluid outer core. Waves that enter the core are refracted downward (toward lower velocities), creating a gap in the direct P-wave field. Diffractions around the core-mantle boundary partially illuminate this shadow zone with greatly reduced amplitude.
For a typical crustal gradient \(g = 0.006\) s\(^{-1}\) and \(c_0 = 5.0\) km/s, a ray bottoming at 10 km depth is launched at \(i_0 = \cos^{-1}(c_0/(c_0+g\cdot 10)) \approx 83.2^\circ\) from vertical and travels a horizontal distance of approximately 116 km.
Chapter 4: Normal Modes of the Earth
4.1 Free Oscillations of a Spherical Body
At sufficiently low frequencies, the concept of propagating rays becomes inappropriate, and the natural description of the seismic wavefield is in terms of the normal modes (free oscillations) of the entire Earth. The existence of such oscillations was predicted theoretically in the 19th century, but they were not observed until the great Chilean earthquake of 1960, when Benioff, Press, and Smith detected spectral peaks corresponding to the gravest modes. Normal mode seismology provides the most precise constraints on the large-scale structure of the deep Earth.
where \(\boldsymbol{\sigma}\) is the stress associated with \(\mathbf{s}\), \(\mathbf{g}_0\) is the equilibrium gravitational field, \(\Phi_1\) is the perturbation in the gravitational potential, and \(\rho_1 = -\nabla \cdot (\rho\, \mathbf{s})\) is the Eulerian density perturbation. The boundary conditions are zero traction on the surface and regularity at the center.
The self-gravitation terms (involving \(\Phi_1\) and \(\rho_1\)) are essential for the gravest modes (periods longer than about 200 s) but are negligible for higher-frequency oscillations. The gravitational term modifies the restoring force and introduces buoyancy effects that are responsible for the very-long-period behavior.
4.2 Spheroidal and Toroidal Modes
The spherical symmetry of the Earth allows the normal modes to be classified into two independent families, analogous to the decomposition of the wavefield into P-SV and SH components in flat geometry.
(i) Spheroidal modes \({}_n S_l\), with displacement
\[ \mathbf{s} = U(r)\, Y_l^m(\theta, \phi)\, \hat{\mathbf{r}} + V(r)\, \nabla_1 Y_l^m(\theta, \phi), \]where \(U(r)\) and \(V(r)\) are radial eigenfunctions, \(Y_l^m\) are spherical harmonics, and \(\nabla_1\) is the surface gradient. These involve radial and horizontal motion (P-SV analogue) and are sensitive to compressional and shear velocity structure.
(ii) Toroidal modes \({}_n T_l\), with displacement
\[ \mathbf{s} = W(r)\, \hat{\mathbf{r}} \times \nabla_1 Y_l^m(\theta, \phi), \]where \(W(r)\) is a radial eigenfunction. These involve purely horizontal, divergence-free motion (SH analogue) and are sensitive only to shear velocity structure.
In both cases, \(n\) is the radial overtone number and \(l\) is the angular degree. The eigenfrequencies \({}_n\omega_l\) are \((2l+1)\)-fold degenerate (independent of order \(m\)).
The radial eigenfunctions satisfy coupled ordinary differential equations obtained by substituting the mode ansatz into the equation of motion. For spheroidal modes, this yields a fourth-order system in \((U, V)\) (or sixth-order including self-gravitation); for toroidal modes, a second-order equation in \(W\). These are solved numerically subject to boundary conditions at the center and surface.
The toroidal modes deserve special attention because they decouple entirely from the P-SV system and are confined to the solid parts of the Earth. Since the outer core is fluid (\(\mu = 0\)), toroidal modes have identically zero displacement in the outer core. This means that toroidal modes are sensitive only to the shear structure of the mantle and inner core, making them complementary to spheroidal modes, which sample both compressional and shear structure throughout the entire Earth.
4.3 Variational Principle and the Rayleigh Quotient
The eigenfrequencies of the Earth’s normal modes satisfy a variational principle that is both theoretically illuminating and computationally useful.
where the numerator is the potential energy functional (elastic strain energy plus gravitational potential energy) and the denominator is the kinetic energy functional. The eigenfrequencies are the successive minima of this quotient (min-max principle).
This variational characterization provides a powerful framework for perturbation theory. If the Earth’s structure is perturbed by small amounts \(\delta\rho, \delta\mu, \delta\kappa\), the resulting perturbation in eigenfrequency can be computed to first order using the unperturbed eigenfunctions, without solving the perturbed eigenvalue problem.
where \(K_\rho, K_\mu, K_\kappa\) are sensitivity kernels computed from the unperturbed eigenfunctions. This formula is the basis of normal mode tomography.
4.4 Mode Coupling, Splitting, and Lateral Heterogeneity
In a perfectly spherically symmetric, non-rotating Earth, each multiplet \({}_n S_l\) or \({}_n T_l\) is \((2l+1)\)-fold degenerate. Departures from this idealization lift the degeneracy and cause the spectral lines to split.
[
] where \(\beta_{nl}\) is the splitting parameter and \(m\) ranges from \(-l\) to \(l\). The splitting is proportional to the rotation rate \(\Omega\) of the Earth. This is the seismological analogue of the Zeeman effect in atomic physics.
Lateral heterogeneity produces more complex splitting patterns that depend on the spherical harmonic expansion of the three-dimensional structure. The measurement of mode splitting has provided some of the strongest constraints on the aspherical structure of the deep mantle and core.
where \(\gamma_{mm'}^{st}\) are geometric coupling coefficients (related to Wigner 3-\(j\) symbols) and \(c_{st}\) are structure coefficients that depend on the aspherical perturbations \(\delta\rho, \delta\mu, \delta\kappa\). Analysis of \({}_0 S_2\) splitting by Woodhouse and Dahlen (1978) and Masters, Jordan, Silver, and Gilbert (1982) provided the first evidence for large-scale pattern of velocity heterogeneity in the deep mantle, revealing the dominance of spherical harmonic degree 2 — consistent with the large-scale pattern of subducted slabs and mantle upwellings.
The computation of normal mode eigenfrequencies and eigenfunctions for realistic Earth models requires numerical integration of the radial equations of motion. For spheroidal modes, these form a system of four first-order ODEs (six when self-gravitation is included), which are integrated from the center of the Earth outward using, for example, a Runge-Kutta scheme. The eigenfrequencies are found as zeros of a secular function constructed from the boundary conditions. The Mineos code (Masters, Woodhouse, and Gilbert) and OBANI codes are standard tools for this purpose.
Chapter 5: Earthquake Source Mechanics
5.1 Point Source Representation and the Moment Tensor
An earthquake is a sudden release of accumulated tectonic stress, producing seismic waves that radiate outward from the source. The mathematical representation of the source is central to seismology: it determines the amplitude, polarity, and radiation pattern of the emitted waves. The modern theory, developed by Burridge and Knopoff (1964), Maruyama (1963), and Aki (1966), represents the earthquake source as a set of equivalent body forces.
where \(M_{ij}(t)\) is the seismic moment tensor, a symmetric second-order tensor whose time dependence describes the history of the source. The displacement field radiated by this source is
\[ u_i(\mathbf{x}, t) = M_{jk}(t) * G_{ij,k}(\mathbf{x}, t; \mathbf{x}_s), \]where \(G_{ij}\) is the elastodynamic Green’s function, the comma denotes differentiation with respect to the source coordinate, and \(*\) denotes temporal convolution.
The moment tensor provides a complete description of the source mechanism for a point source. It has six independent components (being symmetric), but for a pure shear dislocation on a planar fault, it has a special structure.
5.2 Double-Couple Source and Radiation Patterns
The vast majority of earthquakes are caused by shear faulting, in which rock slides along a pre-existing fault plane. The equivalent body force for such a source has a characteristic form known as the double couple.
where \(M_0(t) = \mu A \bar{D}(t)\) is the scalar seismic moment, \(\mu\) is the shear modulus at the source, \(A\) is the fault area, and \(\bar{D}(t)\) is the average slip. This tensor has eigenvalues \(+M_0, 0, -M_0\), corresponding to a double-couple (two orthogonal force couples with zero net torque). The moment tensor is invariant under the interchange \(\hat{\mathbf{n}} \leftrightarrow \hat{\mathbf{d}}\), expressing the fundamental ambiguity between the fault plane and the auxiliary plane.
The far-field P-wave displacement from a double-couple source has a characteristic four-lobed radiation pattern.
P-wave:
\[ \mathbf{u}^P(\mathbf{x}, t) = \frac{1}{4\pi\rho\alpha^3}\frac{\dot{M}_0(t - r/\alpha)}{r}\,(\hat{\mathbf{r}} \cdot \hat{\mathbf{d}})(\hat{\mathbf{r}} \cdot \hat{\mathbf{n}})\,\hat{\mathbf{r}}, \]S-wave:
\[ \mathbf{u}^S(\mathbf{x}, t) = \frac{1}{4\pi\rho\beta^3}\frac{\dot{M}_0(t - r/\beta)}{r}\left[(\hat{\mathbf{r}} \cdot \hat{\mathbf{d}})\hat{\mathbf{n}} + (\hat{\mathbf{r}} \cdot \hat{\mathbf{n}})\hat{\mathbf{d}} - 2(\hat{\mathbf{r}} \cdot \hat{\mathbf{d}})(\hat{\mathbf{r}} \cdot \hat{\mathbf{n}})\hat{\mathbf{r}}\right], \]where \(r = |\mathbf{x}|\) and \(\hat{\mathbf{r}} = \mathbf{x}/r\).
The relationship between the moment tensor and the Green’s function (Definition 5.1) shows that the seismogram at any station is a linear combination of the six independent moment tensor elements, weighted by the spatial derivatives of the Green’s function. This linearity enables the routine determination of moment tensors from observed seismograms through linear least-squares inversion, a procedure first implemented at the global scale by Dziewonski, Chou, and Woodhouse (1981) as the Harvard Centroid Moment Tensor (CMT) project. The CMT catalog, now maintained at Columbia University’s Lamont-Doherty Earth Observatory, contains solutions for every earthquake of magnitude 5 and above since 1976 and is one of the most heavily used datasets in seismology.
5.3 Kinematic Rupture Models and Directivity
Real earthquakes are not point sources: they rupture finite fault areas over finite time intervals. The Haskell model (1964) provides the simplest kinematic description of a finite rupture.
where \(\xi\) is the position along the fault measured from the hypocenter and \(H\) is the Heaviside function.
The finite rupture produces a directivity effect: the pulse is compressed (Doppler-shortened) in the direction of rupture propagation and elongated in the opposite direction.
where \(c = \alpha\) for P-waves and \(c = \beta\) for S-waves. The source spectrum at frequency \(f\) is modulated by the directivity factor
\[ D(f, \theta) = \frac{\sin(\pi f T(\theta))}{\pi f T(\theta)}. \]Directivity concentrates energy in the forward direction of rupture, with important implications for seismic hazard: ground motion can be amplified by a factor of 2-3 in the forward-directivity direction. This was dramatically observed in the 1994 Northridge and 1995 Kobe earthquakes.
5.4 Seismic Moment, Magnitude, and Stress Drop
The characterization of earthquake size has evolved from purely empirical magnitude scales (Richter, 1935) to physically based measures rooted in the moment tensor theory.
where \(M_0\) is the scalar seismic moment in N\(\cdot\)m. Unlike the Richter magnitude \(M_L\) and body-wave magnitude \(m_b\), which saturate for large earthquakes (due to spectral limitations), \(M_W\) does not saturate because it is based on the zero-frequency limit of the source spectrum.
where \(\bar{D}\) is the average slip, \(L_c\) is the characteristic fault dimension, \(\mu\) is the shear modulus, and \(C\) is a dimensionless geometric factor of order unity. For a circular crack of radius \(a\), \(\Delta\sigma = (7/16)\, M_0 / a^3\) (Eshelby, 1957). Observed stress drops range from about 1 to 100 MPa, with a median near 3-5 MPa, and show remarkably little dependence on earthquake magnitude (self-similarity).
The self-similarity of stress drops across many orders of magnitude in earthquake size (from \(M_W\) 1 to \(M_W\) 9) is one of the most remarkable empirical observations in seismology. It implies that the relationship \(M_0 \propto L_c^3\) holds on average, meaning that the ratio of slip to fault dimension is roughly constant regardless of earthquake size. This scaling was first noted by Aki (1967) and remains the subject of active research, as deviations from self-similarity could have important implications for seismic hazard assessment. The Brune (1970) source model, which assumes a circular crack with instantaneous stress release, predicts a far-field displacement spectrum that is flat below a corner frequency \(f_c \propto \beta / L_c\) and decays as \(\omega^{-2}\) above it:
\[ \hat{u}(\omega) = \frac{M_0}{1 + (\omega/\omega_c)^2}, \]where \(\omega_c = 2\pi f_c\). This spectral shape, known as the Brune spectrum, is the standard model used in seismic hazard analysis for predicting ground-motion amplitudes.
Chapter 6: Seismic Wave Propagation in 3D Heterogeneous Media
6.1 The Born Approximation and Single Scattering
Real Earth structure is far more complex than the layered or smoothly varying models assumed in ray theory and normal mode analysis. Three-dimensional heterogeneities at all scales — from crystalline grains to mantle plumes — scatter seismic waves, redistributing energy in ways that carry information about the heterogeneity. The mathematical framework for understanding scattering begins with perturbation theory.
Consider a reference medium with velocity \(c_0(\mathbf{x})\) and a perturbation \(\delta c(\mathbf{x})\). The total wavefield \(u = u_0 + u_s\) satisfies the wave equation in the perturbed medium, while the reference field \(u_0\) satisfies the wave equation in the reference medium.
with solution
\[ u_s(\mathbf{x}, \omega) = \int_V G(\mathbf{x}, \mathbf{x}'; \omega)\, \frac{2\omega^2\, \delta c(\mathbf{x}')}{c_0(\mathbf{x}')^3}\, u_0(\mathbf{x}', \omega)\, dV', \]where \(G\) is the Green’s function of the reference medium. This is the Born (single-scattering) approximation, valid when the scattered field is small compared to the incident field.
The Born approximation replaces the unknown total field in the scattering integral by the known incident field, linearizing the scattering problem. This linearization is the foundation of many tomographic methods and is valid for weak heterogeneities with \(\delta c / c_0 \ll 1\).
where \(q = 2k\sin(\theta/2)\) is the magnitude of the scattering vector and \(\theta\) is the scattering angle. In the long-wavelength limit (\(ka \ll 1\)), the scattering is nearly isotropic (Rayleigh scattering) and the scattered power scales as \(\omega^4 a^6\), a strong frequency dependence that explains why high-frequency seismic waves are much more strongly scattered by crustal heterogeneities than low-frequency waves. In the short-wavelength limit (\(ka \gg 1\)), the scattering is concentrated in the forward direction within a cone of half-angle \(\sim 1/(ka)\), and the total scattering cross-section approaches twice the geometric cross-section.
6.2 The Rytov Approximation
An alternative to the Born approximation that better handles the cumulative effects of propagation through weakly heterogeneous media is the Rytov approximation.
The Rytov approximation is better suited for forward-scattering problems (long-range propagation through weak heterogeneity) because it correctly handles the phase wrapping that occurs when the accumulated phase perturbation exceeds \(2\pi\). The Born approximation, by contrast, is better for backscattering problems. Both approximations are widely used in seismic imaging.
6.3 Attenuation and the Quality Factor Q
Real seismic waves lose amplitude not only through geometric spreading but also through anelastic processes: the conversion of elastic energy into heat. This intrinsic attenuation is quantified by the quality factor.
where \(\Delta E\) is the energy lost per cycle and \(E\) is the peak stored energy. In a medium with constant \(Q\), a plane wave at frequency \(\omega\) propagating a distance \(x\) has amplitude
\[ A(x) = A_0\, \exp\left(-\frac{\omega x}{2cQ}\right). \]In the Earth’s mantle, \(Q\) for shear waves ranges from about 80 in the asthenosphere to over 500 in the lower mantle.
Attenuation and dispersion are intimately linked through causality, expressed by the Kramers-Kronig relations.
where \(\omega_r\) is a reference frequency. For a nearly constant-\(Q\) medium, this yields the approximate dispersion relation (Liu, Anderson, and Kanamori, 1976):
\[ c(\omega) = c(\omega_r)\left[1 + \frac{1}{\pi Q}\ln\frac{\omega}{\omega_r}\right]. \]Higher frequencies travel faster: the velocity dispersion is positive (anomalous dispersion).
This dispersion is essential for comparing seismic velocities measured at different frequencies. Body-wave studies (at \(\sim 1\) Hz) and normal mode studies (at \(\sim 1\) mHz) measure velocities that differ by several percent due to physical dispersion, and this must be accounted for when constructing reference Earth models.
For the asthenosphere with \(Q_\mu = 80\), this gives \(\delta\beta/\beta \approx 2.7\%\), a correction comparable to the lateral velocity variations due to temperature and composition that tomography seeks to image. Ignoring this dispersion correction would introduce systematic biases into Earth models derived from multi-frequency data.
6.4 Scattering and Coda Waves
The tail of a local earthquake seismogram, after the direct P and S arrivals, is filled with a slowly decaying, irregular signal known as the coda. Aki (1969) made the seminal observation that the coda is generated by single scattering from random heterogeneities in the Earth’s crust and upper mantle.
where \(|S(\omega)|^2\) is the source spectral density and \(Q_c\) is the coda quality factor. The \(t^{-2}\) decay comes from the geometry of the single-scattering ellipsoid, and the exponential decay from intrinsic attenuation.
6.5 Multiple Scattering and Anderson Localization
When scattering is strong (mean free path comparable to the wavelength), the single-scattering approximation breaks down and we enter the regime of multiple scattering. The seismic energy then diffuses through the medium.
where \(D = cl^*/3\) is the energy diffusion coefficient, \(l^* = c/g_0\) is the transport mean free path, and \(Q_i^{-1}\) accounts for intrinsic absorption. The source term \(S\) represents the initial energy injection.
Chapter 7: Seismic Tomography
7.1 Travel-Time Tomography as a Linear Inverse Problem
Seismic tomography is the art and science of inferring the three-dimensional structure of the Earth’s interior from seismic observations. The basic idea, analogous to medical CT scanning, is that waves traveling along different paths sample different regions of the Earth, and their observed properties (travel times, amplitudes, waveforms) carry information about the material properties along those paths. The mathematical foundations draw on the theory of linear inverse problems, a subject of considerable richness.
where the integral is along the unperturbed ray path (Fermat’s principle guarantees that the travel-time perturbation is first-order even though the ray perturbation is first-order). This is a linear integral equation of the first kind.
After discretization (parameterizing \(\delta s\) on a grid of \(M\) blocks or as an expansion in basis functions), this becomes a linear system \(\mathbf{d} = \mathbf{G}\mathbf{m}\), where \(\mathbf{d} \in \mathbb{R}^N\) is the data vector of \(N\) travel-time residuals, \(\mathbf{m} \in \mathbb{R}^M\) is the model vector of slowness perturbations, and \(\mathbf{G}\) is the \(N \times M\) sensitivity matrix whose entries \(G_{ij}\) give the ray-path length of the \(i\)-th ray in the \(j\)-th cell.
7.2 Backus-Gilbert Theory
Before the advent of large-scale computational tomography, Backus and Gilbert (1968, 1970) developed an elegant framework for analyzing the resolving power of geophysical data. Their approach focuses not on finding a particular model, but on characterizing what linear functionals of the model can be reliably estimated from the data.
where \(R(\mathbf{x}_0, \mathbf{x}) = \sum_i a_i\, G_i(\mathbf{x})\) is the averaging kernel (resolution kernel). The coefficients \(a_i\) are chosen to make \(R\) as close to a delta function at \(\mathbf{x}_0\) as possible, subject to a constraint on the noise amplification. The trade-off between resolution (width of \(R\)) and variance (noise amplification \(\sigma^2 = \mathbf{a}^T \mathbf{C}_\epsilon \mathbf{a}\)) is the fundamental limitation of the inverse problem.
The Backus-Gilbert approach provides a coordinate-free characterization of resolution that does not depend on any particular parameterization. However, for large-scale problems with millions of unknowns, it is computationally prohibitive, and regularized inversion methods are preferred.
7.3 Regularization Methods
The travel-time tomography problem is typically underdetermined (more unknowns than data), mixed-determined (some parameters well-constrained, others not), and ill-conditioned (small data perturbations can cause large model changes). Regularization is essential.
where \(\lambda > 0\) is the regularization parameter and \(\mathbf{L}\) is a regularization operator (e.g., \(\mathbf{L} = \mathbf{I}\) for norm damping, or \(\mathbf{L} = \nabla\) for smoothness). The solution is
\[ \hat{\mathbf{m}} = (\mathbf{G}^T\mathbf{G} + \lambda^2 \mathbf{L}^T\mathbf{L})^{-1}\mathbf{G}^T\mathbf{d}. \]where \(\sigma_1 \ge \sigma_2 \ge \cdots\) are the singular values and \(\mathbf{u}_j, \mathbf{v}_j\) are the left and right singular vectors. The truncation level \(k\) plays the role of the regularization parameter.
and the model covariance matrix is
\[ \mathbf{C}_m = (\mathbf{G}^T\mathbf{G} + \lambda^2 \mathbf{L}^T\mathbf{L})^{-1}\mathbf{G}^T\mathbf{C}_\epsilon\,\mathbf{G}\,(\mathbf{G}^T\mathbf{G} + \lambda^2 \mathbf{L}^T\mathbf{L})^{-1}. \]As \(\lambda \to 0\), \(\mathbf{R} \to \mathbf{I}\) (perfect resolution) but \(\mathrm{tr}(\mathbf{C}_m) \to \infty\) (infinite variance). As \(\lambda \to \infty\), \(\mathbf{C}_m \to 0\) (zero variance) but \(\mathbf{R} \to 0\) (no resolution). The optimal \(\lambda\) is chosen by methods such as the L-curve criterion or generalized cross-validation.
7.4 Finite-Frequency Kernels
Classical ray-theoretical tomography assumes that travel times are sensitive only to structure along the infinitesimally thin geometric ray path. In reality, seismic waves have finite wavelength and the region of sensitivity has finite width. This was formalized by Dahlen, Hung, and Nolet (2000), who showed that the true sensitivity kernels have a surprising structure.
has the following properties: (i) it is zero on the geometric ray path (the “doughnut hole”), (ii) it has a hollow, banana-shaped region of maximum sensitivity surrounding the ray path (the first Fresnel zone), and (iii) it oscillates with decreasing amplitude at larger distances from the ray. The kernel width scales as \(\sqrt{\lambda L}\), where \(\lambda\) is the wavelength and \(L\) is the propagation distance.
where \(\lambda = cT\) is the wavelength. For a teleseismic P-wave at 1 Hz (\(T = 1\) s, \(c \approx 10\) km/s in the lower mantle) recorded at an epicentral distance of 60 degrees (\(L \approx 6700\) km), the Fresnel zone width is \(w \approx \sqrt{10 \cdot 6700 / 2} \approx 183\) km. This is the scale below which ray-based tomography cannot resolve structures, regardless of how many rays are used. At 0.05 Hz (20 s period), the Fresnel zone width increases to approximately 820 km. Finite-frequency kernels account for this frequency-dependent sensitivity and can resolve structures down to approximately half the Fresnel zone width, provided sufficient data coverage exists.
7.5 Full Waveform Inversion and Adjoint Methods
The ultimate goal of seismic tomography is to use the entire seismogram — not just travel times — to constrain Earth structure. Full waveform inversion (FWI) achieves this by minimizing the misfit between observed and synthetic seismograms with respect to the model parameters.
This is a nonlinear optimization problem, typically solved by iterative gradient descent:
\[ \mathbf{m}_{k+1} = \mathbf{m}_k - \alpha_k\, \mathbf{H}_k^{-1}\, \nabla_m \chi(\mathbf{m}_k), \]where \(\alpha_k\) is the step length and \(\mathbf{H}_k\) is an approximation to the Hessian.
The key computational challenge is computing the gradient \(\nabla_m \chi\), which, for \(M\) model parameters, would naively require \(M\) forward simulations. The adjoint method reduces this to just two simulations per source (one forward, one adjoint), regardless of the number of parameters.
where \(\mathcal{L}\) is the wave-equation operator and \(\mathbf{u}^\dagger\) is the solution of the adjoint wave equation with sources at the receivers, driven by the time-reversed data residuals:
\[ f_i^\dagger(\mathbf{x}, t) = \sum_r [d_i^{\text{obs}}(T-t) - d_i(\mathbf{m}; T-t)]\, \delta(\mathbf{x} - \mathbf{x}_r). \]For the self-adjoint elastic wave equation, this reduces to back-propagating the data residuals from the receiver locations.
The elegance of the adjoint method lies in its computational efficiency: computing the gradient with respect to millions of model parameters requires exactly two wave-propagation simulations per earthquake source, regardless of the number of parameters. The forward simulation propagates the wavefield from source to receivers, computing and storing the wavefield at each time step. The adjoint simulation injects the time-reversed data residuals at the receiver locations and propagates them backward in time. The interaction of the forward and adjoint fields at each grid point gives the local gradient, which indicates the direction in model space that most reduces the data misfit. This process, repeated for many earthquakes and summed, yields the total gradient — a three-dimensional map showing where and how the Earth model should be modified to better fit the observations.
The adjoint method, introduced to seismology by Tarantola (1984) and brought to computational maturity by Tromp, Tape, and Liu (2005), has revolutionized global and regional tomography. Modern adjoint tomography models (e.g., GLAD-M25, SPECFEM-based) achieve unprecedented resolution of mantle structure.
Chapter 8: Computational Seismology
8.1 Finite Difference Methods for the Elastic Wave Equation
The development of numerical methods for solving the elastic wave equation has transformed seismology from a discipline limited to analytical approximations to one capable of simulating wave propagation through arbitrarily complex three-dimensional structures. The finite difference (FD) method, owing to its conceptual simplicity and computational efficiency, remains one of the most widely used approaches.
In the staggered-grid scheme (Virieux, 1986), the velocity and stress components are defined at staggered grid positions, and spatial derivatives are approximated by centered finite differences. For a fourth-order scheme, the spatial derivative is
\[ \frac{\partial f}{\partial x}\bigg|_{i+1/2} \approx \frac{c_1(f_{i+1} - f_i) + c_2(f_{i+2} - f_{i-1})}{\Delta x}, \]with \(c_1 = 9/8\) and \(c_2 = -1/24\).
The staggered-grid approach avoids the need to explicitly impose continuity conditions at interfaces between different materials: the natural discretization automatically satisfies the correct interface conditions (to the order of the scheme). This was a key insight of Virieux (1984, 1986) that made FD methods practical for heterogeneous media.
where \(c_{\max}\) is the maximum P-wave velocity in the model and the sum is over the FD coefficients. For the standard fourth-order scheme in 2D, this gives \(\Delta t \lesssim 0.606\, \Delta x / c_{\max}\).
which shows that the numerical phase velocity is always less than the true velocity: the FD scheme is dispersive, with short wavelengths traveling too slowly. At 10 points per wavelength, the second-order scheme has a phase velocity error of about 0.4%, while the fourth-order scheme reduces this to approximately 0.01%. For a wave propagating 100 wavelengths (a typical regional simulation), even the 0.4% error of the second-order scheme accumulates to a significant time shift, motivating the use of high-order stencils.
8.2 Spectral Element Methods
The spectral element method (SEM) combines the flexibility of finite elements (unstructured meshes that conform to complex geometries) with the accuracy of spectral methods (high-order polynomial basis functions that converge exponentially for smooth solutions). Its application to seismology, pioneered by Komatitsch and Vilotte (1998) and implemented in the widely used SPECFEM codes (Komatitsch and Tromp, 2002), has become the gold standard for global and regional wave propagation simulations.
where \(\boldsymbol{\xi} = (\xi_1, \xi_2, \xi_3)\) are the local coordinates on the reference cube \([-1, 1]^3\), \(\ell_i\) are Lagrange interpolants based on the Gauss-Lobatto-Legendre (GLL) points, and \(N\) is the polynomial degree (typically 4 or 5).
where \(w_i\) are the GLL quadrature weights and \(J_e\) is the Jacobian of the element mapping. This diagonality enables explicit time stepping without the need to solve a linear system at each time step, making the SEM highly efficient for time-domain simulations.
The SEM naturally handles the free surface (it is a natural boundary condition in the weak formulation), fluid-solid interfaces (through appropriate coupling conditions), and complex geometries (through mesh deformation). The SPECFEM3D_GLOBE code, for example, uses a cubed-sphere mesh to avoid the polar singularity and deforms element boundaries to honor the major internal discontinuities (Moho, 410-km, 660-km, core-mantle boundary, inner-core boundary).
8.3 Perfectly Matched Layers
Numerical simulations of wave propagation are performed in finite computational domains, but the physical problem often involves unbounded domains (e.g., a half-space). Artificial reflections from the boundaries of the computational domain must be suppressed.
where \(d_x(x') \ge 0\) is a damping profile that is zero inside the physical domain and increases into the absorbing layer. In the frequency domain, this is equivalent to replacing \(\partial/\partial x\) by \((1 + d_x/(i\omega))^{-1}\, \partial/\partial x\).
In practice, the continuous PML achieves perfect absorption, but the discretized PML has small reflections that depend on the damping profile, the layer thickness, and the discretization. A polynomial damping profile \(d_x(x) = d_0 (x/L)^n\) with \(n = 2\) or \(3\) and a layer thickness of 5-10 elements typically reduces reflections to \(10^{-3}\) or less. The convolutional PML (C-PML) formulation, due to Roden and Gedney (2000), avoids the frequency-domain implementation and works directly in the time domain using recursive convolution.
8.4 Reciprocity and Green’s Functions
A fundamental principle that pervades computational seismology is reciprocity: the wavefield observed at point B due to a source at point A is related to the wavefield at A due to a source at B. This principle has profound computational implications.
In terms of the Green’s function, this is \(G_{ij}(\mathbf{x}_A, \mathbf{x}_B; t) = G_{ji}(\mathbf{x}_B, \mathbf{x}_A; t)\): the displacement at \(\mathbf{x}_A\) in direction \(i\) due to a force at \(\mathbf{x}_B\) in direction \(j\) equals the displacement at \(\mathbf{x}_B\) in direction \(j\) due to a force at \(\mathbf{x}_A\) in direction \(i\).
Integrating by parts in time (using zero initial conditions) and in space (using the divergence theorem and zero surface traction), and exploiting the symmetry \(C_{ijkl} = C_{klij}\), all terms cancel except \(\int_V (f_i u_i' - f_i' u_i)\, dV = 0\), giving the stated result. \(\square\)
Reciprocity allows the computation of synthetic seismograms for many sources and few receivers by interchanging the roles of source and receiver: one simulation with a source at the receiver location yields the Green’s function for all possible source locations. This is the computational basis of the adjoint method described in Chapter 7.
where the integration is over the fault surface \(\Sigma\). This formula is the theoretical foundation for computing synthetic seismograms from kinematic rupture models: one precomputes the Green’s function between each point on the fault and each receiver (using reciprocity, this requires one simulation per receiver rather than one per fault point), then integrates over the fault with the specified slip history. The approach was implemented by Graves (1996) and is the basis of most kinematic ground-motion simulation methods used in seismic hazard assessment.
8.5 Synthetic Seismograms
The computation of synthetic seismograms — theoretical predictions of what a seismometer would record for a given Earth model and earthquake source — is the fundamental forward problem of seismology. The choice of method depends on the frequency range, the complexity of the Earth model, and the available computational resources.
(i) Normal mode summation: Sum the contributions of individual normal modes. Efficient at long periods but requires precomputation of all modes up to the maximum frequency.
(ii) Reflectivity method: Compute the frequency-wavenumber response of a layered medium using the Thomson-Haskell matrices, then evaluate the inverse Hankel transform to obtain the time-domain response. Exact for 1D models; the standard tool for computing body-wave and surface-wave synthetics.
(iii) Generalized ray theory (Cagniard-de Hoop method): Deform the integration contour in the complex ray-parameter plane to obtain exact closed-form expressions for each ray arrival. Gives physically transparent results and exact waveforms for canonical problems.
(iv) Numerical methods (FD, SEM): Directly solve the wave equation in 3D heterogeneous media. No approximations beyond discretization; handles all wave types and effects simultaneously. Computationally the most expensive.
8.6 Ambient Noise Seismology
A remarkable development of the 21st century is the extraction of structural information from the ambient seismic noise field, without relying on earthquakes. The theoretical foundation rests on a connection between noise cross-correlations and Green’s functions.
where \(\langle \cdot \rangle\) denotes the long-time average and \(*\) denotes cross-correlation. The right-hand side is the causal Green’s function plus its time-reversed (acausal) counterpart.
These notes provide a mathematical foundation for the study of seismic wave propagation. The subject sits at the intersection of applied mathematics, physics, and Earth science, drawing on PDE theory, asymptotic analysis, spectral theory, inverse problems, and computational methods. For the student of applied mathematics, seismology offers a uniquely rich collection of problems where rigorous mathematical analysis has direct consequences for our understanding of the planet’s interior.