AMATH 883: Mathematical Epidemiology and Population Dynamics
Estimated study time: 2 hr 4 min
Table of contents
These notes synthesize material from F. Brauer, C. Castillo-Chávez, and Z. Feng’s Mathematical Models in Epidemiology, O. Diekmann, H. Heesterbeek, and T. Britton’s Mathematical Tools for Understanding Infectious Disease Dynamics, J.D. Murray’s Mathematical Biology (Volumes I and II), H.W. Hethcote’s The Mathematics of Infectious Diseases (SIAM Review 2000), and Oxford Mathematical Biology lecture notes and Cambridge DAMTP Mathematical Biology notes.
Chapter 1: Introduction to Mathematical Epidemiology
1.1 Historical Roots
The application of mathematics to the study of infectious disease has a remarkably long and distinguished history, stretching back well before the germ theory of disease was established. The central insight — that the spread of infection through a population is a dynamical process amenable to quantitative reasoning — has proved extraordinarily fruitful, and the mathematical frameworks developed in this field now inform public health policy worldwide.
The earliest mathematical contribution to epidemiology is generally attributed to Daniel Bernoulli, who in 1760 published a study analyzing the potential benefits of inoculation against smallpox. Bernoulli constructed a life-table model to compare the age-specific mortality in a population with and without variolation, concluding that universal inoculation would increase life expectancy at birth by roughly three years. His work was remarkable not only for its mathematical content but for its explicitly policy-oriented framing: Bernoulli was using mathematics to argue for a public health intervention.
The next major landmark came from Sir Ronald Ross, who in the early 1900s developed mathematical models for the transmission of malaria. Ross recognized that malaria transmission involves a vector (the Anopheles mosquito) and formulated differential equation models capturing the interaction between human and mosquito populations. His work led to a crucial insight: it is not necessary to eliminate every mosquito to eradicate malaria; rather, one need only reduce the mosquito population below a critical threshold. This was among the first appearances of the concept of a transmission threshold in epidemiology.
The foundational framework for most of modern mathematical epidemiology was laid by W.O. Kermack and A.G. McKendrick in their celebrated series of papers beginning in 1927. They introduced the compartmental modeling approach, in which individuals in a population are classified by their disease status — susceptible, infected, recovered — and the dynamics of transitions between these compartments are described by differential equations. Their 1927 paper established the epidemic threshold theorem, which states that an epidemic can only occur if the density of susceptibles exceeds a critical value.
1.2 Why Mathematical Models?
Mathematical models serve several distinct purposes in epidemiology. First, they provide a precise language for stating assumptions about disease transmission. The act of writing down equations forces the modeler to be explicit about what is assumed: how transmission occurs, what the timescales are, which factors are included and which are neglected. Second, mathematical analysis can reveal consequences of assumptions that are not intuitively obvious — the epidemic threshold theorem is a prime example. Third, models enable quantitative prediction and scenario analysis: given estimates of key parameters, what is the expected trajectory of an epidemic? What would happen if a vaccination campaign were implemented?
It is essential to recognize that all models are simplifications. The famous aphorism attributed to George Box — “all models are wrong, but some are useful” — applies with particular force in epidemiology, where the systems being modeled involve billions of interacting individuals with heterogeneous behavior, genetics, and social networks. The art of mathematical epidemiology lies in constructing models that capture the essential features of disease transmission while remaining tractable enough to yield insight.
1.3 Compartmental Modeling Philosophy
The compartmental approach rests on several simplifying assumptions. The population is assumed to be large enough that a deterministic (continuous) description is appropriate. Individuals within a compartment are assumed to be homogeneous — they share the same risk of infection, the same recovery rate, and so on. Mixing is typically assumed to be homogeneous as well: every susceptible individual is equally likely to encounter any infected individual. These assumptions can be relaxed in more refined models, as we shall see in later chapters.
Let us denote the total population size by \(N\) and the sizes of the various compartments by uppercase letters: \(S\) for susceptibles, \(I\) for infectives, \(R\) for recovered (or removed), and so on. In many models we work with proportions rather than absolute numbers, writing \(s = S/N\), \(i = I/N\), and \(r = R/N\). The choice between absolute numbers and proportions affects the form of the incidence term, as we discuss below.
- Mass-action incidence (or bilinear incidence): new infections occur at rate \(\beta S I\), where \(\beta\) is the transmission coefficient.
- Standard incidence (or frequency-dependent incidence): new infections occur at rate \(\beta S I / N\), reflecting the idea that the contact rate per individual is independent of population size.
The choice between mass-action and standard incidence has significant modeling implications. Mass-action incidence is appropriate when the contact rate between individuals increases with population density (as might occur for animal populations in a confined habitat). Standard incidence is more appropriate for human populations, where each individual has a roughly fixed number of contacts per unit time regardless of population size.
1.4 Timescales in Epidemiology
A crucial consideration in epidemic modeling is the separation of timescales. Infectious disease dynamics typically operate on two very different timescales: the epidemic timescale (the duration of an outbreak, often weeks to months) and the demographic timescale (the timescale of births, deaths, and population turnover, typically decades).
When the epidemic timescale is much shorter than the demographic timescale, we may treat the total population \(N\) as approximately constant and ignore births and deaths. This leads to the simplest and most classical models. When the two timescales are comparable, or when we wish to study the long-term endemic behavior of a disease, we must incorporate demographic processes into the model.
Chapter 2: The SIR Model and Basic Reproduction Number
2.1 The Kermack-McKendrick SIR Model
We now develop in detail the most fundamental model of mathematical epidemiology: the SIR model for a closed population. This model, introduced by Kermack and McKendrick in 1927, describes the spread of an infectious disease through a population that is divided into three compartments: Susceptible (S), Infected (I), and Recovered (R). The flow is unidirectional: \(S \to I \to R\).
The assumptions underlying this model are worth stating explicitly. First, the population is closed: \(S + I + R = N\) is constant (adding the three equations gives \(d(S+I+R)/dt = 0\)). Second, transmission follows mass-action kinetics with rate \(\beta\). Third, recovery occurs at a constant per-capita rate \(\gamma\), so the infectious period is exponentially distributed with mean \(1/\gamma\). Fourth, recovered individuals are permanently immune.
\[ \frac{dS}{dt} = -\beta S I, \qquad \frac{dI}{dt} = (\beta S - \gamma) I. \]From the second equation, we see immediately that \(dI/dt > 0\) if and only if \(S > \gamma/\beta\). This is the essence of the epidemic threshold: an epidemic (initial increase in infected individuals) occurs if and only if the initial number of susceptibles exceeds the critical value \(\gamma/\beta\).
2.2 The Basic Reproduction Number
The quantity \(\gamma/\beta\) arising from the threshold condition can be repackaged into what is arguably the single most important quantity in mathematical epidemiology. The idea is to ask: if a single infected individual is placed into a fully susceptible population, how many secondary infections will it produce during its entire infectious period? If this number exceeds 1, the infection can spread; if it is less than 1, the infection dies out. This quantity, denoted \(\mathcal{R}_0\), provides a dimensionless threshold parameter that summarizes the transmissibility of a disease in a single number.
The interpretation is straightforward: an infected individual makes \(\beta N\) potentially infectious contacts per unit time (under mass-action incidence in a fully susceptible population of size \(N\)) and remains infectious for an average duration of \(1/\gamma\), so the expected number of secondary cases is \(\beta N / \gamma\).
- If \(\mathcal{R}_0 \leq 1\), then \(I(t)\) is monotonically decreasing for all \(t > 0\): no epidemic occurs.
- If \(\mathcal{R}_0 > 1\), then \(I(t)\) initially increases, reaches a unique maximum, and then decreases to zero: an epidemic occurs.
If \(S_0 > \gamma/\beta\), i.e., \(\mathcal{R}_0 > 1\), then initially \(dI/dt > 0\) so \(I\) increases. Since \(S\) is decreasing, there exists a unique time \(t^*\) at which \(S(t^*) = \gamma/\beta\), and \(I\) achieves its maximum at \(t^*\).
To show \(I(t) \to 0\), note that \(S\) is decreasing and bounded below by 0, so \(S(\infty)\) exists. If \(I\) did not tend to 0, the integral \(\int_0^\infty I(t)\,dt\) would diverge. But from \(dS/dt = -\beta SI\), we have \(S_0 - S(\infty) = \beta \int_0^\infty S(t)I(t)\,dt \leq \beta S_0 \int_0^\infty I(t)\,dt\), and since the left side is finite, \(\int_0^\infty I(t)\,dt < \infty\). Combined with the fact that \(dI/dt\) is bounded, Barbalat’s lemma gives \(I(t) \to 0\).
Finally, \(S(\infty) > 0\) because the equation \(dS/dt = -\beta SI\) implies \(\ln S(t) - \ln S_0 = -\beta \int_0^t I(\tau)\,d\tau\), so \(S(t) = S_0 \exp(-\beta \int_0^t I(\tau)\,d\tau) > 0\) for all \(t\), including the limit. \(\square\)
2.3 Phase Portrait Analysis
The phase portrait of the SIR system in the \((S, I)\)-plane reveals the global dynamics with elegant clarity. We can eliminate time by dividing the two equations:
\[ \frac{dI}{dS} = \frac{\beta S I - \gamma I}{-\beta S I} = -1 + \frac{\gamma}{\beta S}. \]This separable ODE integrates to give the conserved quantity
\[ V(S, I) = S + I - \frac{\gamma}{\beta} \ln S = \text{constant}. \]The level curves of \(V\) are the phase trajectories. Each trajectory is a curve in the first quadrant of the \((S, I)\)-plane. Since \(S\) is always decreasing (when \(I > 0\)), trajectories move from right to left. When \(\mathcal{R}_0 > 1\), trajectories rise (as \(I\) increases) until \(S\) drops to \(\gamma/\beta\), then fall to the \(S\)-axis.
2.4 The Final Size Relation
A remarkable feature of the SIR model is that the total number of individuals ultimately infected — the final epidemic size — can be determined without solving the differential equations explicitly. This is the final size relation.
The final size relation is one of the most practically useful results in mathematical epidemiology. Given \(\mathcal{R}_0\), one can immediately determine what fraction of the population will ultimately be infected without simulating the full time course. For instance, if \(\mathcal{R}_0 = 2\), the final size relation predicts that approximately 80% of the population will eventually be infected.
\(\mathcal{R}_0 = 1.5\): attack rate \(\approx 58\%\). \(\mathcal{R}_0 = 2.0\): attack rate \(\approx 80\%\). \(\mathcal{R}_0 = 3.0\): attack rate \(\approx 94\%\). \(\mathcal{R}_0 = 5.0\): attack rate \(\approx 99.3\%\). \(\mathcal{R}_0 = 10.0\): attack rate \(\approx 99.99\%\).
These values illustrate the steep relationship between transmissibility and final epidemic size. Even a modest \(\mathcal{R}_0\) of 1.5 (typical of pandemic influenza) results in a majority of the population being infected. For highly transmissible diseases like measles (\(\mathcal{R}_0 \approx 12\text{--}18\)), essentially the entire susceptible population is eventually infected.
2.5 The SIS Model and Endemic Equilibria
Not all diseases confer lasting immunity. For diseases such as gonorrhea, chlamydia, or the common cold, individuals return to the susceptible class upon recovery, since infection does not generate durable protective immunity. This fundamental distinction between immunizing and non-immunizing infections leads to qualitatively different dynamical behavior. The simplest model for a non-immunizing infection is the SIS model.
This is a logistic-type equation. Unlike the SIR model, the SIS model admits a nontrivial equilibrium.
- The disease-free equilibrium (DFE): \(I^* = 0\), which is globally asymptotically stable if \(\mathcal{R}_0 \leq 1\).
- The endemic equilibrium: \(I^* = N(1 - 1/\mathcal{R}_0)\), which exists and is globally asymptotically stable when \(\mathcal{R}_0 > 1\).
The SIS model illustrates a key principle: when immunity is not permanent, the disease can persist indefinitely in the population as an endemic infection. The endemic prevalence \(I^*/N = 1 - 1/\mathcal{R}_0\) increases with \(\mathcal{R}_0\), approaching 100% as \(\mathcal{R}_0 \to \infty\).
Chapter 3: Extensions of Compartmental Models
The basic SIR and SIS models of Chapter 2, while enormously instructive, omit many important features of real infectious diseases. In this chapter, we systematically extend the compartmental framework to accommodate latent periods, waning immunity, vaccination, demographics, and vector-borne transmission. Each extension adds biological realism at the cost of mathematical complexity, but the fundamental concepts — \(\mathcal{R}_0\), threshold behavior, equilibrium analysis — continue to organize our understanding.
3.1 The SEIR Model
Many infectious diseases have a latent period: after becoming infected, an individual is not immediately infectious. During this exposed (or latent) period, the pathogen is replicating within the host but has not yet reached the level necessary for onward transmission. Diseases such as measles (latent period 8-13 days), influenza (1-3 days), and Ebola (2-21 days) all exhibit significant latent periods.
which is smaller than in the SIR case, reflecting the delay introduced by the latent period.
3.2 The SIRS Model
For some diseases, immunity wanes over time, so that recovered individuals eventually return to the susceptible class. This gives rise to the SIRS model.
When \(\delta > 0\), recovered individuals re-enter the susceptible pool, which can sustain endemic transmission even in a closed population. The SIRS model with \(\delta > 0\) can exhibit a stable endemic equilibrium analogous to that of the SIS model, with infected fraction at equilibrium given by appropriate algebraic expressions in \(\beta\), \(\gamma\), \(\delta\), and \(N\).
3.3 Models with Vaccination
Vaccination is the most powerful tool available for disease control. To model vaccination, we introduce a vaccinated compartment \(V\), or equivalently, we modify the inflow into the susceptible class.
The effect of vaccination is to reduce the effective reproduction number. If a fraction \(p\) of the population is vaccinated (with a perfectly effective vaccine), the effective reproduction number becomes \(\mathcal{R}_e = \mathcal{R}_0(1-p)\).
This result has profound public health implications. For measles, with \(\mathcal{R}_0 \approx 12\text{--}18\), the critical vaccination coverage is \(p_c \approx 92\text{--}95\%\). This explains why measles outbreaks can occur even in highly vaccinated populations: if coverage drops even slightly below the critical threshold, herd immunity is lost. The concept of herd immunity — the indirect protection of unvaccinated individuals when a sufficient fraction of the population is immune — is one of the most important insights of mathematical epidemiology, and it emerges naturally from the vaccination threshold formula.
3.4 Models with Demographics
When studying diseases that persist over long time periods, demographic processes — births and deaths — must be included. The standard SIR model with demographics assumes a constant birth rate equal to the death rate, maintaining constant population size.
The introduction of demographics qualitatively changes the long-term behavior. Unlike the closed-population SIR model (where \(I \to 0\) always), the model with demographics can sustain an endemic equilibrium because births continuously replenish the susceptible population. The key mechanism is that as the epidemic depletes susceptibles, the birth of new (uninfected) individuals eventually restores the susceptible pool sufficiently for a new wave of infection. In the long run, the system reaches a balance between the influx of new susceptibles through births and their depletion through infection.
- A disease-free equilibrium \((S^*, I^*, R^*) = (N, 0, 0)\), which is locally asymptotically stable if \(\mathcal{R}_0 < 1\) and unstable if \(\mathcal{R}_0 > 1\).
- A unique endemic equilibrium \[ S^* = \frac{\gamma + \mu}{\beta} = \frac{N}{\mathcal{R}_0}, \qquad I^* = \frac{\mu N}{\gamma + \mu}\left(1 - \frac{1}{\mathcal{R}_0}\right), \qquad R^* = \frac{\gamma}{\mu}I^*, \] which exists and is locally asymptotically stable when \(\mathcal{R}_0 > 1\).
3.5 Vector-Borne Diseases: The Ross-Macdonald Model
Many important infectious diseases — malaria, dengue, Zika, West Nile virus — are transmitted by arthropod vectors, typically mosquitoes. The Ross-Macdonald model, originating in the work of Ronald Ross (1911) and refined by George Macdonald (1957), captures the essential features of vector-borne transmission.
The basic reproduction number for the Ross-Macdonald model is
\[ \mathcal{R}_0 = \frac{m a^2 b c}{r \mu_V}, \]where \(m = N_V / N_H\) is the vector-to-host ratio. The quadratic dependence on the biting rate \(a\) reflects the fact that a mosquito must bite twice — once to acquire the infection, once to transmit it — and this is a key insight of the Ross-Macdonald theory. It implies that reducing biting rates (e.g., through bed nets) is doubly effective compared to what one might naively expect.
Chapter 4: Stability Analysis and Bifurcations
The previous chapters introduced several compartmental models and stated results about their equilibria and stability. In this chapter, we develop the general mathematical machinery needed to rigorously establish these results. The tools — linearization, Routh-Hurwitz criteria, Lyapunov functions, bifurcation theory — are drawn from the theory of ordinary differential equations and dynamical systems, but we present them here in the specific context of epidemiological models, emphasizing the biological interpretation at each step.
4.1 Disease-Free and Endemic Equilibria
The qualitative behavior of epidemic models is governed by their equilibria and the stability properties thereof. In this chapter, we develop the systematic tools for analyzing equilibria in compartmental models, drawing on the theory of ordinary differential equations and dynamical systems.
The central question of mathematical epidemiology is: under what conditions is the DFE stable (disease dies out) versus unstable (disease invades), and when does a stable EE exist? The answer, as we shall see, is organized by the basic reproduction number \(\mathcal{R}_0\).
4.2 Local Stability via Linearization
The local stability of an equilibrium is determined by linearization. Consider a general ODE system \(\dot{x} = f(x)\) with equilibrium \(x^*\) (so \(f(x^*) = 0\)). The Jacobian matrix at \(x^*\) is
\[ J = Df(x^*) = \left(\frac{\partial f_i}{\partial x_j}\right)\bigg|_{x = x^*}. \]For the SIR model with demographics, the Jacobian at the DFE \((N, 0, 0)\) (restricted to the \((S, I)\) subsystem since \(R = N - S - I\)) is
\[ J_{\text{DFE}} = \begin{pmatrix} -\mu & -\beta N \\ 0 & \beta N - \gamma - \mu \end{pmatrix}. \]The eigenvalues are \(\lambda_1 = -\mu < 0\) and \(\lambda_2 = \beta N - \gamma - \mu = (\gamma + \mu)(\mathcal{R}_0 - 1)\). Thus the DFE is locally asymptotically stable if and only if \(\mathcal{R}_0 < 1\), confirming the threshold result.
4.3 Routh-Hurwitz Criteria
For higher-dimensional systems, computing eigenvalues explicitly may be impractical. The Routh-Hurwitz criteria provide algebraic conditions on the coefficients of the characteristic polynomial that are equivalent to all roots having negative real part.
4.4 Global Stability via Lyapunov Functions
Local stability tells us about the behavior of solutions starting near an equilibrium. For many epidemiological questions, we need global stability: does every trajectory (with appropriate initial conditions) converge to the equilibrium? Lyapunov’s direct method provides the principal tool.
- \(V(x^*) = 0\) and \(V(x) > 0\) for \(x \neq x^*\) in \(D\).
- \(\dot{V}(x) = \nabla V \cdot f(x) \leq 0\) for all \(x \in D\).
The challenge, of course, is constructing an appropriate Lyapunov function. For epidemic models, a particularly successful class of Lyapunov functions was introduced by Korobeinikov and collaborators.
This Lyapunov function approach, based on the Volterra-type function \(x - x^* \ln x\), has become a standard technique in mathematical epidemiology and has been extended to a wide variety of compartmental models.
4.5 Bifurcations at \(\mathcal{R}_0 = 1\)
The transition from disease-free to endemic behavior as \(\mathcal{R}_0\) crosses 1 is a bifurcation. In the simplest models, this is a forward (transcritical) bifurcation: the endemic equilibrium emerges smoothly from the DFE at \(\mathcal{R}_0 = 1\) and is stable for \(\mathcal{R}_0 > 1\).
Backward bifurcation has profound implications for disease control. In models exhibiting backward bifurcation, reducing \(\mathcal{R}_0\) below 1 is necessary but not sufficient for disease eradication: the disease can persist at the stable endemic equilibrium even though \(\mathcal{R}_0 < 1\). To achieve eradication, one must reduce \(\mathcal{R}_0\) below the critical value \(\mathcal{R}_c < 1\), or reduce the infected population below the unstable equilibrium.
4.6 Hopf Bifurcations and Oscillatory Dynamics
Many infectious diseases exhibit periodic outbreaks: measles epidemics occur on a roughly biennial cycle, while pertussis (whooping cough) has a 3-4 year cycle. Can compartmental models explain such oscillations?
In the simplest SIR model with demographics, the endemic equilibrium is a globally stable node or spiral, and sustained oscillations do not occur (though damped oscillations toward the equilibrium are observed). However, more complex models can exhibit Hopf bifurcations, leading to stable limit cycles and sustained periodic behavior.
- The Jacobian \(Df(x^*(\mu_0); \mu_0)\) has a pair of purely imaginary eigenvalues \(\lambda = \pm i\omega_0\) with \(\omega_0 > 0\), and all other eigenvalues have nonzero real part.
- The transversality condition holds: \(\frac{d}{d\mu}\operatorname{Re}(\lambda(\mu))\big|_{\mu=\mu_0} \neq 0\).
Chapter 5: Age-Structured Models
The ODE compartmental models studied in previous chapters treat all individuals within each disease class as interchangeable. In this chapter, we introduce age structure into both demographic and epidemic models, reflecting the biological reality that disease risk, contact patterns, and immune status all vary with age. The mathematical framework shifts from systems of ODEs to partial differential equations (PDEs) and integral operators.
5.1 The McKendrick-von Foerster Equation
The compartmental ODE models of the preceding chapters treat all individuals within a compartment as identical. In reality, both demographic and epidemiological rates vary strongly with age: children and the elderly face different infection risks, disease severity varies with age, and vaccination programs often target specific age groups. Age-structured models provide the mathematical framework for incorporating these effects.
The McKendrick-von Foerster equation is the fundamental equation of age-structured population dynamics. The left-hand side represents the aging process: both \(t\) and \(a\) increase together along characteristics. The right-hand side represents mortality, and the boundary condition represents the influx of newborns.
\[ \frac{dn}{dt} = -\mu(a) n, \]\[ \ell(a) = \exp\left(-\int_0^a \mu(s)\, ds\right), \]the survivorship function.
The Euler-Lotka equation is the demographic analogue of the characteristic equation for a linear ODE: just as the eigenvalues of a matrix determine the long-term growth of a linear system, the Malthusian parameter \(r\) determines the long-term growth of an age-structured population. The stable age distribution \(\ell(a)e^{-ra}\) is the demographic analogue of an eigenvector.
5.2 Age-Structured Epidemic Models
To incorporate age into epidemic models, we introduce age-specific compartments. Let \(s(a, t)\), \(i(a, t)\), and \(r(a, t)\) denote the densities of susceptible, infected, and recovered individuals of age \(a\) at time \(t\).
The transmission kernel \(\beta(a, a')\) encodes the age-dependent contact pattern. Empirical data on contact patterns (such as the POLYMOD study by Mossong et al., 2008) reveal that contacts are strongly age-assortative: school-age children have frequent contacts with other children, adults contact other adults in the workplace, and intergenerational contacts occur primarily within households. The WAIFW (Who Acquires Infection From Whom) matrix, a discrete approximation to \(\beta(a, a')\), is a central object in practical applications of age-structured models.
5.3 The Next-Generation Operator and \(\mathcal{R}_0\)
For age-structured models, the basic reproduction number can no longer be computed as a simple ratio of parameters. Instead, it is defined as the spectral radius of an integral operator: the next-generation operator.
This result is a cornerstone of modern mathematical epidemiology. It provides a rigorous and general definition of \(\mathcal{R}_0\) that applies to models of arbitrary complexity, including age structure, spatial structure, and multiple host species. The definition reduces to the familiar ratio-of-parameters formula for simple ODE models. The power of the next-generation operator formulation is that it separates the computation of \(\mathcal{R}_0\) from the details of the model’s dynamics: one need only identify the next-generation operator (which describes one “generation” of disease transmission) and compute its spectral radius.
5.4 The Characteristic Equation
For the age-structured SIR model at the disease-free equilibrium, the linearized dynamics lead to a renewal equation, and the stability is determined by a characteristic equation.
Chapter 6: Spatial Spread and Traveling Waves
All of the models considered so far have been spatially homogeneous: they assume that all individuals in the population mix uniformly, with no spatial structure. In this chapter, we relax this assumption and study the spatial spread of infectious diseases, using both continuum (reaction-diffusion PDE) and discrete (network, metapopulation) approaches.
6.1 Reaction-Diffusion Models
The models studied so far assume spatial homogeneity: all individuals interact in a single well-mixed population. In reality, diseases spread geographically, and the spatial dynamics of this spread are of great interest both theoretically and practically. The simplest approach to incorporating spatial spread is through reaction-diffusion equations.
In many settings, it is reasonable to assume that infected individuals have reduced mobility compared to susceptibles (due to illness), or even that \(D_I = 0\) (infected individuals are immobile). However, the basic spatial spread phenomena can be captured with uniform diffusion.
The diffusion coefficients can be estimated from movement data. For human diseases in an urban setting, effective diffusion coefficients can be estimated from commuting data or travel surveys. For wildlife diseases, mark-recapture data or GPS tracking provide estimates of animal movement patterns that can be translated into diffusion coefficients.
6.2 The Fisher-KPP Equation
Before analyzing the full spatial SIR model, we study a simpler equation that captures the essential mechanism of spatial invasion: the Fisher-KPP (Fisher-Kolmogorov-Petrovsky-Piskunov) equation.
This equation models the spatial spread of a population (or a trait, or an infection front) combining local diffusion with logistic-type growth. The key question is: does the equation admit traveling wave solutions, and if so, what is the wave speed?
- Traveling wave solutions exist for all speeds \(c \geq c^* = 2\sqrt{Dr}\).
- No traveling wave exists for \(c < c^*\).
- For compactly supported initial data with \(u(x,0) \in [0,1]\), the solution approaches the traveling wave with the minimum speed \(c^*\) as \(t \to \infty\).
At \((1,0)\), one eigenvalue is positive and one is negative (a saddle point for all \(c > 0\)). The traveling wave corresponds to the unique trajectory (heteroclinic connection) leaving the saddle \((1,0)\) and entering the stable node \((0,0)\). A phase plane argument using the Wazewski principle or a shooting method establishes existence for all \(c \geq c^*\). \(\square\)
The minimum wave speed result has a beautiful interpretation: the spatial spread of an epidemic (or an invading population) proceeds at a speed determined by the diffusion rate and the growth rate at the leading edge, where the invader is rare.
6.3 Traveling Waves for the Spatial SIR Model
The full reaction-diffusion SIR model admits traveling wave solutions representing the spatial advance of an epidemic.
This result shows that the minimum speed of an epidemic wave increases with the diffusion coefficient (faster individual movement), the transmission rate, and \(\mathcal{R}_0\). It provides a quantitative prediction for the speed at which an epidemic front advances through a susceptible population. The formula is directly analogous to the Fisher-KPP minimum wave speed, with the linearized growth rate \(r\) replaced by the epidemic growth rate \(\beta N - \gamma = \gamma(\mathcal{R}_0 - 1)\).
Behind the epidemic wave front, the population transitions from the pre-epidemic state \((S, I) \approx (N, 0)\) to the post-epidemic state \((S, I) = (S_\infty, 0)\), where \(S_\infty\) is determined by the final size relation. The wave profile of \(I\) is a pulse: it rises from zero ahead of the front, peaks at the wave crest, and decays back to zero behind the front (as the local epidemic burns through the available susceptibles).
6.4 Network Epidemic Models
In human populations, disease transmission occurs through social contact networks rather than through spatial diffusion. Network epidemic models describe disease spread on graphs, where nodes represent individuals and edges represent contacts along which transmission can occur.
The formula \(\mathcal{R}_0 = (\beta/\gamma)(\langle k^2 \rangle - \langle k \rangle)/\langle k \rangle\) reveals a fundamental insight: the heterogeneity of the contact network, as measured by the ratio \(\langle k^2 \rangle / \langle k \rangle\), amplifies transmission. For scale-free networks with power-law degree distributions \(p_k \sim k^{-\alpha}\) with \(2 < \alpha \leq 3\), the second moment \(\langle k^2 \rangle\) diverges, implying that \(\mathcal{R}_0 \to \infty\) in the limit of large networks. This means that scale-free networks have no epidemic threshold: any disease with \(\beta > 0\) can spread.
6.5 Metapopulation Models
Between the extremes of well-mixed ODE models and fully spatial PDE models lie metapopulation models, in which the population is divided into discrete patches connected by movement.
Chapter 7: Stochastic Epidemic Models
The deterministic models of the preceding chapters describe the average behavior of an epidemic. In this chapter, we confront the reality that disease transmission is inherently stochastic: each infection event is the result of a chance encounter between a specific susceptible and a specific infected individual. Stochastic models capture the randomness in transmission, leading to probability distributions over epidemic outcomes rather than single deterministic trajectories. The stochastic perspective is essential for understanding the early and late phases of epidemics, where random fluctuations dominate.
7.1 Motivation for Stochasticity
The deterministic ODE models studied in previous chapters are continuous approximations to what is fundamentally a discrete, random process. Each transmission event is a chance encounter between a specific susceptible individual and a specific infected individual. When population sizes are large, the law of large numbers ensures that the deterministic approximation is accurate. However, stochastic effects become important in several situations: when the number of infected individuals is small (as at the beginning or end of an epidemic), when the total population is small, or when we wish to assess the variability in epidemic outcomes.
7.2 The Stochastic SIR Model
The transition from deterministic to stochastic modeling requires a fundamental shift in perspective. Instead of tracking the exact numbers \(S(t)\) and \(I(t)\) as continuous variables, we model them as integer-valued random variables whose evolution is governed by probabilistic rules. The appropriate mathematical framework is that of continuous-time Markov chains.
- Infection: \((s, i) \to (s-1, i+1)\) at rate \(\beta s i\).
- Recovery: \((s, i) \to (s, i-1)\) at rate \(\gamma i\).
The stochastic SIR model is a pure jump process. Unlike the deterministic model, different realizations of the process produce different epidemic trajectories. The state space is finite (since \(s + i \leq N\)), so the process is a finite-state continuous-time Markov chain. The state \((s, 0)\) (for any \(s\)) is absorbing: once all infected individuals have recovered, no further infections can occur. This means the process always terminates in finite time, unlike the deterministic model which approaches its limit only asymptotically. The model is analytically tractable in certain limits, and can always be simulated exactly using the Gillespie algorithm.
7.3 Probability of a Major Epidemic
The most important question about the stochastic SIR model is: given a small number of initial infectives in a large susceptible population, what is the probability that a major epidemic occurs?
- 0 if \(\mathcal{R}_0 \leq 1\),
- \(1 - 1/\mathcal{R}_0\) (approximately) if \(\mathcal{R}_0 > 1\), more precisely the unique solution in \((0,1)\) of \(q = e^{-\mathcal{R}_0(1-q)}\) gives the extinction probability \(q\), and the major epidemic probability is \(1-q\).
The extinction probability \(q\) of a branching process with offspring distribution \(Z\) satisfies \(q = G_Z(q)\), where \(G_Z\) is the probability generating function (pgf). For a geometric offspring distribution with parameter \(p = \gamma/(\beta N + \gamma)\) and mean \(\mathcal{R}_0\), the pgf is \(G_Z(s) = \gamma / (\beta N + \gamma - \beta N s) = 1/(\mathcal{R}_0(1-s) + 1)\). Setting \(q = G_Z(q)\) and solving gives the smallest nonnegative root. For \(\mathcal{R}_0 > 1\), this root satisfies \(q < 1\), and the major epidemic probability is \(1 - q\). For the Poisson offspring approximation, \(q = e^{-\mathcal{R}_0(1-q)}\), which is exactly the final size relation viewed from a different angle. \(\square\)
7.4 Bartlett’s Fadeout and Persistence
In finite populations, stochastic fluctuations can drive the infected population to zero even when \(\mathcal{R}_0 > 1\). This phenomenon, known as Bartlett’s fadeout (or stochastic extinction), is the primary mechanism by which infectious diseases disappear from small populations.
Bartlett (1957, 1960) observed that measles persisted continuously in large cities like London and New York but suffered repeated fadeouts in smaller cities, requiring reintroduction from larger population centers. The CCS depends on the disease parameters: diseases with higher \(\mathcal{R}_0\) and shorter infectious periods tend to have larger critical community sizes, because their inter-epidemic troughs are deeper (more susceptibles are depleted during an epidemic), making stochastic extinction more likely.
7.5 The Gillespie Algorithm
Exact simulation of the stochastic SIR model is straightforward using the Gillespie algorithm (also known as the stochastic simulation algorithm, SSA).
- Compute the total rate: \(a_0 = \beta S I + \gamma I\).
- If \(a_0 = 0\) or \(I = 0\), stop (absorbing state reached).
- Draw the time to next event: \(\tau \sim \mathrm{Exp}(a_0)\), i.e., \(\tau = -\ln(u_1)/a_0\) where \(u_1 \sim \mathrm{Uniform}(0,1)\).
- Update \(t \leftarrow t + \tau\).
- Determine the event type: with probability \(\beta SI / a_0\) the event is an infection (\(S \to S-1\), \(I \to I+1\)); with probability \(\gamma I / a_0\) it is a recovery (\(I \to I-1\)).
- Return to step 1.
The Gillespie algorithm generates exact sample paths of the continuous-time Markov chain. Each realization is a sequence of discrete events occurring at random times. The algorithm is computationally efficient for small to moderate population sizes but becomes expensive for very large populations, motivating approximate methods such as tau-leaping or hybrid deterministic-stochastic approaches.
7.6 Quasi-Stationary Distributions
In the stochastic SIR model with demographics (births replenishing susceptibles), the state \(I = 0\) is absorbing: once the disease disappears, it cannot return without external reintroduction. However, before reaching this absorbing state, the process may spend a very long time fluctuating around the deterministic endemic equilibrium. The statistical distribution of the process conditioned on non-extinction is the quasi-stationary distribution.
The exponential scaling of the mean extinction time with \(N\) explains why large populations can sustain endemic diseases almost indefinitely, while small populations experience frequent fadeouts. For the stochastic SIR model with demographics, the QSD is approximately Gaussian centered near the deterministic endemic equilibrium, with variance of order \(1/N\) (as predicted by the system-size expansion of van Kampen).
Chapter 8: Population Dynamics
The final chapter of this course broadens our perspective from disease dynamics to the dynamics of populations themselves. Many of the mathematical tools we have developed — phase plane analysis, stability theory, bifurcation analysis — apply equally well to models of population growth, predator-prey interactions, and interspecific competition. Moreover, understanding population dynamics is essential background for epidemiology, since the host population provides the substrate on which disease transmission occurs.
8.1 Single-Species Models
We now turn from the dynamics of disease to the broader field of population dynamics, which provides both the demographic backdrop for epidemic models and a rich source of mathematical structures in its own right. We begin with single-species models.
The logistic equation is the simplest model incorporating density dependence: the per-capita growth rate decreases linearly from \(r\) (at low density) to 0 (at carrying capacity). The solution is
\[ N(t) = \frac{K N_0}{N_0 + (K - N_0)e^{-rt}}, \]a sigmoid curve approaching \(K\) as \(t \to \infty\). The equilibrium \(N = K\) is globally asymptotically stable (for \(N_0 > 0\)), and \(N = 0\) is unstable.
While the logistic equation is a useful starting point, many populations exhibit more complex density-dependent dynamics. A particularly important phenomenon is the Allee effect.
The Allee effect model has three equilibria: \(N = 0\) (stable), \(N = A\) (unstable threshold), and \(N = K\) (stable). Populations above \(A\) grow to \(K\); populations below \(A\) decline to extinction. This bistability has important implications for conservation biology (endangered species may be driven below the Allee threshold by habitat fragmentation) and for biological invasions (an invading species must establish a population above the threshold to persist).
The strong Allee effect creates a “tipping point” in population dynamics that is mathematically analogous to the backward bifurcation in epidemiology (Section 4.5): in both cases, the system has two stable states separated by an unstable threshold, and perturbations beyond the threshold lead to qualitatively different outcomes. The connection is not merely analogical — both phenomena arise from the same mathematical structure (a saddle-node bifurcation creating bistability).
8.2 Predator-Prey Models
The interaction between predators and their prey is one of the most fundamental ecological relationships, and its mathematical description has been a central topic in mathematical biology since the pioneering work of Lotka (1925) and Volterra (1926).
The Lotka-Volterra model has a unique coexistence equilibrium at \(N^* = d/(ba)\), \(P^* = r/a\), and a family of neutrally stable periodic orbits surrounding it. The system has a conserved quantity
\[ H(N, P) = baN - d\ln N + aP - r\ln P, \]and the trajectories are closed curves (level sets of \(H\)) in the \((N, P)\)-plane. The period of the oscillations depends on the initial conditions (larger orbits have longer periods), and the time-averaged population sizes satisfy \(\langle N \rangle = N^*\) and \(\langle P \rangle = P^*\), where the averages are taken over one complete cycle.
8.3 Functional Responses
A key simplification in the Lotka-Volterra model is that the per-capita predation rate is linear in prey density: each predator consumes prey at rate \(aN\) regardless of how abundant the prey is. In reality, predators become satiated when prey is abundant, and they spend time handling prey, reducing their effective search rate. The concept of a functional response, introduced by Holling (1959), captures these effects.
- Type I: \(f(N) = aN\) (linear, unsaturating). This is the Lotka-Volterra assumption.
- Type II: \(f(N) = \frac{aN}{1 + ahN}\), where \(h\) is the handling time. This saturates at \(1/h\) as \(N \to \infty\).
- Type III: \(f(N) = \frac{aN^2}{1 + ahN^2}\). This is sigmoidal, with an accelerating phase at low \(N\) (due to prey switching or learning) followed by saturation.
The Type II functional response is the most commonly used and arises from a simple mechanistic argument. A predator alternates between searching for prey (at rate \(a\)) and handling prey (taking time \(h\) per item). If \(T\) is the total time available, the number of prey consumed is \(aN_{\text{search}}T_{\text{search}}\), where the search time satisfies \(T = T_{\text{search}} + h \cdot (\text{number consumed})\). Solving self-consistently gives the Holling Type II expression.
8.4 The Rosenzweig-MacArthur Model
Replacing the linear functional response in the Lotka-Volterra model with a Type II response and adding logistic prey growth gives the Rosenzweig-MacArthur model, which is one of the most important and well-studied models in theoretical ecology.
This model has a coexistence equilibrium at prey density \(N^* = d/(ba - dah) = d/(a(b - dh))\), provided \(b > dh\) (i.e., the conversion efficiency exceeds the death rate times the handling time). The predator equilibrium is found by setting the prey equation to zero.
The stability of the coexistence equilibrium depends critically on the carrying capacity \(K\). Intuitively, increasing \(K\) (enriching the environment with more resources) should benefit both species. However, the mathematical analysis reveals a surprising and counterintuitive outcome. This dependence leads to one of the most striking results in theoretical ecology.
- For small \(K\), the prey-only equilibrium \((K, 0)\) is the only positive equilibrium and is globally stable.
- At \(K = N^*\), a transcritical bifurcation occurs and the coexistence equilibrium appears.
- For \(K\) slightly above \(N^*\), the coexistence equilibrium is stable.
- At a critical value \(K = K_H\), a supercritical Hopf bifurcation occurs: the equilibrium loses stability and a stable limit cycle appears.
- For \(K > K_H\), the system exhibits sustained predator-prey oscillations with increasing amplitude as \(K\) increases.
The name “paradox of enrichment” was coined by Rosenzweig (1971) and has become one of the most widely discussed results in theoretical ecology. The paradox is the counterintuitive result that enriching the environment (increasing \(K\)) can destabilize the predator-prey interaction, leading to large-amplitude oscillations that bring both populations dangerously close to zero. In ecological terms, environmental enrichment can increase the risk of extinction through large population fluctuations — a cautionary tale for ecosystem management.
8.5 Competitive Exclusion
When two species compete for the same resource, a fundamental question is whether coexistence is possible. The competitive exclusion principle, formalized mathematically by the Lotka-Volterra competition model, provides a sharp answer for the simplest models.
- If \(K_1/\alpha_{12} > K_2\) and \(K_2/\alpha_{21} < K_1\): species 1 wins (globally attracts all trajectories with \(N_1(0) > 0\)).
- If \(K_1/\alpha_{12} < K_2\) and \(K_2/\alpha_{21} > K_1\): species 2 wins.
- If \(K_1/\alpha_{12} > K_2\) and \(K_2/\alpha_{21} > K_1\): stable coexistence at a unique interior equilibrium (intraspecific competition exceeds interspecific competition).
- If \(K_1/\alpha_{12} < K_2\) and \(K_2/\alpha_{21} < K_1\): bistability — the winner depends on initial conditions (interspecific competition exceeds intraspecific competition).
8.6 Trophic Cascades
The interactions studied above — predator-prey and competition — involve two trophic levels. In real ecosystems, multiple trophic levels interact, and perturbations can propagate up and down the food chain. A trophic cascade occurs when changes at one trophic level cause alternating effects at successively lower levels.
The mathematical analysis of three-level food chains reveals a rich dynamical repertoire. With Type II functional responses at both trophic links, the system can exhibit stable equilibria, limit cycles (via Hopf bifurcation as the carrying capacity increases), and even chaotic dynamics. The transition from regular to chaotic behavior in food chain models has been studied extensively by Hastings and Powell (1991) and is related to the interaction between the Hopf bifurcation at the resource-consumer level (paradox of enrichment) and the dynamics introduced by the top predator. The Hastings-Powell model demonstrates that chaos in ecology is not merely a theoretical curiosity but arises naturally in simple, biologically motivated food chain models with saturating functional responses.
- With an odd number of trophic levels, the basal resource is consumer-limited (suppressed below carrying capacity).
- With an even number of trophic levels, the basal resource is near carrying capacity (released from consumer control).
This elegant prediction has been broadly confirmed empirically: in systems with wolves, elk, and vegetation (three levels), the vegetation is suppressed; removing the wolves releases the elk from predator control and further suppresses vegetation. In four-level systems, enrichment would benefit the second and fourth levels. The mathematical formalization through food chain models provides rigorous conditions under which these predictions hold and identifies the parameter regimes where more complex dynamics (oscillations, chaos) can arise.
8.7 Synthesis: Epidemiology Meets Ecology
The mathematical frameworks developed in this course — compartmental models, stability analysis, bifurcation theory, spatial dynamics, stochastic processes, and population dynamics — are deeply interconnected. Epidemic models are, at their core, models of interacting populations: susceptible “prey” consumed by infectious “predators,” with the additional feature that new “predators” are produced from consumed “prey” rather than by independent reproduction.
The mathematical tools we have developed — phase plane analysis, Lyapunov functions, bifurcation theory, next-generation operators, traveling wave analysis, branching processes — form a coherent toolkit applicable across the biological sciences. The challenge for the applied mathematician is to deploy these tools judiciously, constructing models that capture the essential features of the biological system at hand while remaining mathematically tractable.
The interplay between mathematical theory and biological application has been a recurring theme throughout this course. The theorems we have proved are not sterile abstractions: each one — from the epidemic threshold theorem to the paradox of enrichment — illuminates a genuine biological phenomenon and provides quantitative tools for prediction and control. As the COVID-19 pandemic vividly demonstrated, mathematical epidemiology is not merely an academic exercise: the models and analyses developed in this course have direct and immediate relevance to public health decision-making and the welfare of human populations worldwide.
The frontier of mathematical epidemiology and population dynamics continues to expand. Current research areas include:
Multi-scale models coupling within-host pathogen dynamics to between-host transmission, where the immune response within an individual affects their infectiousness to others.
Models incorporating behavioral feedback, where human behavior changes in response to perceived disease risk, creating a coupled human-disease dynamical system.
Evolutionary epidemiology, studying the co-evolution of pathogens and hosts, including the evolution of virulence and drug resistance.
Phylodynamics, which integrates genomic sequence data with epidemiological models to reconstruct transmission histories and estimate key epidemiological parameters.
Each of these areas demands the same combination of biological insight and mathematical rigor that has characterized the field since the pioneering work of Bernoulli, Ross, and Kermack-McKendrick.