CO 759: Algorithmic Game Theory

Chaitanya Swamy


These notes synthesize material from three primary sources: Tim Roughgarden’s Twenty Lectures on Algorithmic Game Theory (Stanford CS364A, Fall 2013) and the accompanying YouTube lecture videos; the textbook Algorithmic Game Theory by Nisan, Roughgarden, Tardos, and Vazirani (Cambridge, 2007); and Chaitanya Swamy’s CO 759 course at the University of Waterloo. Research papers by Lavi–Swamy, Dughmi–Roughgarden, Christodoulou–Koutsoupias, and Bhawalkar–Roughgarden are cited where their results appear.


Chapter 1: Games and Solution Concepts

Algorithmic game theory sits at the intersection of two intellectual traditions. Economics has long studied how rational agents interact — what equilibria look like, when they exist, and whether they are efficient. Computer science brings a complementary set of questions: Can equilibria be computed efficiently? Can we design systems where strategic behavior leads to approximately optimal outcomes, even when exact optimization is intractable? The field was born from the realization that neither discipline alone could answer the questions raised by the Internet, online markets, and large-scale decentralized systems.

This chapter establishes the foundational language: games, strategies, equilibria, and the metrics that quantify how badly selfish behavior can damage social welfare.

1.1 Strategic-Form Games and Dominant Strategies

The basic object of study is a strategic-form game (also called a normal-form game). Such a game consists of three ingredients:

Definition 1.1 (Strategic-Form Game). A strategic-form game consists of:
  1. A finite set \(N = \{1, 2, \ldots, n\}\) of players.
  2. For each player \(i \in N\), a set \(S_i\) of strategies (or actions).
  3. For each player \(i \in N\), a cost function \(C_i : S_1 \times \cdots \times S_n \to \mathbb{R}\) (or, equivalently, a payoff/utility function \(u_i = -C_i\)).
A strategy profile (or outcome) is a tuple \(\mathbf{s} = (s_1, \ldots, s_n) \in S_1 \times \cdots \times S_n\). We write \(\mathbf{s}_{-i}\) for the strategies of all players except \(i\), so that \(\mathbf{s} = (s_i, \mathbf{s}_{-i})\).

When we say a player is “rational,” we mean the player chooses a strategy to minimize its own cost (or maximize its own utility), taking the other players’ behavior into account. The simplest — and strongest — form of rational behavior is playing a dominant strategy: one that is optimal regardless of what everyone else does.

Definition 1.2 (Dominant Strategy). A strategy \(s_i^* \in S_i\) is a dominant strategy for player \(i\) if for every strategy profile \(\mathbf{s}_{-i}\) of the other players and every alternative strategy \(s_i' \in S_i\), \[ C_i(s_i^*, \mathbf{s}_{-i}) \leq C_i(s_i', \mathbf{s}_{-i}). \] A strategy profile in which every player plays a dominant strategy is a dominant-strategy equilibrium.

Dominant-strategy equilibria are the gold standard: they require no assumptions about others’ rationality, no coordination, no beliefs about the game. Unfortunately, most interesting games do not possess them. The Prisoner’s Dilemma is a celebrated exception.

Example 1.3 (Prisoner's Dilemma). Two suspects are interrogated separately. Each can cooperate (stay silent) or defect (betray). The cost matrix (lower is better) is:
              Cooperate    Defect
  Cooperate   2, 2         5, 0
  Defect      0, 5         4, 4

Defecting is a dominant strategy for both players: regardless of the other’s choice, defecting yields a lower cost. The dominant-strategy equilibrium (Defect, Defect) costs each player 4 — yet both would prefer (Cooperate, Cooperate) at cost 2 each. This gap between individual rationality and collective welfare is the central tension of game theory.
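The dominance check in this example can be automated. Below is a minimal Python sketch; the cost-matrix encoding and the helper `is_dominant` are illustrative assumptions, not course code:

```python
# Illustrative helper (not from the notes): check whether `strategy` is a
# weakly dominant strategy for `player` in a two-player cost-matrix game.
# costs[(s1, s2)] = (cost to player 0, cost to player 1); lower is better.

def is_dominant(costs, player, strategy):
    row = {s for s, _ in costs}          # player 0's strategies
    col = {t for _, t in costs}          # player 1's strategies
    mine, theirs = (row, col) if player == 0 else (col, row)
    for opp in theirs:
        for alt in mine:
            prof = (strategy, opp) if player == 0 else (opp, strategy)
            dev = (alt, opp) if player == 0 else (opp, alt)
            if costs[prof][player] > costs[dev][player]:
                return False             # some deviation strictly helps
    return True

# Prisoner's Dilemma cost matrix of Example 1.3.
pd = {("C", "C"): (2, 2), ("C", "D"): (5, 0),
      ("D", "C"): (0, 5), ("D", "D"): (4, 4)}

assert is_dominant(pd, 0, "D") and is_dominant(pd, 1, "D")
assert not is_dominant(pd, 0, "C")       # cooperating is not dominant
```

The same brute-force check works for any finite two-player game, since dominance only requires comparing finitely many cost entries.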

[Figure: the Prisoner's Dilemma in normal form, with ★ marking the dominant-strategy Nash equilibrium (Defect, Defect).]

The Prisoner’s Dilemma teaches a lesson that will recur throughout these notes: rational behavior by individuals can produce outcomes that are socially suboptimal. The entire field of mechanism design (Chapters 2–6) is, in a sense, the science of closing this gap by carefully choosing the rules of the game.

[Figure: best-response correspondences for Matching Pennies, \(BR_1(q)\) and \(BR_2(p)\), intersecting at the unique mixed Nash equilibrium \((p,q) = (1/2, 1/2)\).]

1.2 Pure and Mixed Nash Equilibria

When dominant strategies do not exist, we need a weaker solution concept. The most fundamental is the Nash equilibrium, where each player’s strategy is a best response to the strategies of the others — no one can unilaterally improve.

Definition 1.4 (Pure Nash Equilibrium). A strategy profile \(\mathbf{s} = (s_1, \ldots, s_n)\) is a pure Nash equilibrium (PNE) if for every player \(i \in \{1, \ldots, n\}\) and every alternative strategy \(s_i' \in S_i\), \[ C_i(s_i, \mathbf{s}_{-i}) \leq C_i(s_i', \mathbf{s}_{-i}). \]

Every dominant-strategy equilibrium is a Nash equilibrium, but not conversely. In a Nash equilibrium, a player’s strategy need only be optimal given the specific strategies chosen by the others — not for all possible strategies of the others.

Not every game has a pure Nash equilibrium. The canonical example is Rock-Paper-Scissors. Whatever pure strategy profile is proposed, one player can switch and improve:

              Rock       Paper      Scissors
  Rock        0, 0       −1, 1      1, −1
  Paper       1, −1      0, 0       −1, 1
  Scissors    −1, 1      1, −1      0, 0

A crucial idea, developed largely by John von Neumann, is to allow players to randomize. A mixed strategy for player \(i\) is a probability distribution \(\sigma_i\) over \(S_i\). In Rock-Paper-Scissors, if both players randomize uniformly over the three strategies, neither can improve their expected payoff by deviating — this is a mixed-strategy Nash equilibrium.

Definition 1.5 (Mixed Nash Equilibrium). A profile of distributions \((\sigma_1, \ldots, \sigma_n)\) over strategy sets \(S_1, \ldots, S_n\) is a mixed Nash equilibrium (MNE) if for every player \(i\) and every alternative strategy \(s_i' \in S_i\), \[ \mathbf{E}_{\mathbf{s} \sim \sigma}[C_i(\mathbf{s})] \leq \mathbf{E}_{\mathbf{s} \sim \sigma}[C_i(s_i', \mathbf{s}_{-i})], \] where \(\sigma = \sigma_1 \times \cdots \times \sigma_n\) is the product distribution (players randomize independently).
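The definition can be checked numerically on Rock-Paper-Scissors. The sketch below uses an illustrative payoff encoding: against a uniformly random opponent, every pure strategy earns expected payoff 0, so no unilateral deviation improves on uniform play.

```python
# Zero-sum payoffs to the row player in Rock-Paper-Scissors.
payoff = {("R", "R"): 0, ("R", "P"): -1, ("R", "S"): 1,
          ("P", "R"): 1, ("P", "P"): 0, ("P", "S"): -1,
          ("S", "R"): -1, ("S", "P"): 1, ("S", "S"): 0}
strategies = ["R", "P", "S"]
uniform = {s: 1 / 3 for s in strategies}

def expected_payoff(pure):
    """Row player's expected payoff for a pure strategy vs uniform play."""
    return sum(uniform[t] * payoff[(pure, t)] for t in strategies)

# Every pure deviation yields expected payoff 0, so the uniform profile
# is a mixed Nash equilibrium (the game is symmetric, so the same
# computation applies to the column player).
assert all(abs(expected_payoff(s)) < 1e-12 for s in strategies)
```

Since a mixed deviation is a convex combination of pure deviations, checking the pure deviations suffices.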

Every PNE is a special case of an MNE (where each \(\sigma_i\) is a point mass). The remarkable guarantee is that mixed Nash equilibria always exist:

Theorem 1.6 (Nash, 1950). Every finite game — that is, every game with finitely many players and finite strategy sets — has at least one mixed Nash equilibrium.

Nash’s theorem is proved via Brouwer’s (or Kakutani’s) fixed-point theorem. It is an existence result, not an algorithmic one. As we will see in Chapter 12, computing a Nash equilibrium is PPAD-complete even for two-player games — a complexity class that captures problems where existence is guaranteed by a parity argument but efficient computation is believed to be impossible.

1.3 Correlated and Coarse Correlated Equilibria

The computational intractability of Nash equilibria motivates the search for more permissive equilibrium concepts. The key relaxation is to allow players’ strategies to be correlated — coordinated by a trusted mediator or public signal — rather than independently randomized.

The equilibrium concepts we study form a nested hierarchy:

\[ \text{PNE} \subseteq \text{MNE} \subseteq \text{CE} \subseteq \text{CCE}, \]

where each inclusion is strict in general. Moving outward enlarges the set of equilibria: existence and tractability guarantees become stronger, but the behavioral assumptions justifying the concept become weaker.

Definition 1.7 (Correlated Equilibrium). A distribution \(\sigma\) over outcomes \(S_1 \times \cdots \times S_n\) (not necessarily a product distribution) is a correlated equilibrium (CE) if for every player \(i\), every strategy \(s_i \in S_i\), and every alternative \(s_i' \in S_i\), \[ \mathbf{E}_{\mathbf{s} \sim \sigma}[C_i(\mathbf{s}) \mid s_i] \leq \mathbf{E}_{\mathbf{s} \sim \sigma}[C_i(s_i', \mathbf{s}_{-i}) \mid s_i]. \]

The interpretation is elegant. A trusted third party (the “mediator”) samples an outcome \(\mathbf{s}\) from \(\sigma\), then privately tells each player \(i\) its recommended strategy \(s_i\). Player \(i\) observes \(s_i\) but not the full outcome. Knowing the distribution \(\sigma\) and its own recommendation, player \(i\) conditions on \(s_i\) and checks: is following the recommendation at least as good as any deviation? If so for every player and every recommendation, \(\sigma\) is a correlated equilibrium.

Example 1.8 (Traffic Light as Correlated Equilibrium). Consider two drivers approaching an intersection, modeled as a game with the following payoff matrix (higher is better):
           Stop      Go
  Stop     0, 0      0, 1
  Go       1, 0      −5, −5

There are two pure Nash equilibria: (Stop, Go) and (Go, Stop). Randomizing 50/50 between these two outcomes — the traffic light alternating green and red — gives a correlated equilibrium that is not a product distribution and hence not a mixed Nash equilibrium. When told “go,” a driver infers the other was told “stop” and has no incentive to deviate.
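The recommendation-based check of Definition 1.7 can be carried out mechanically. Here is a sketch under illustrative assumptions (the utility encoding and helper name are mine):

```python
# Utilities (higher is better) for the intersection game of Example 1.8.
u = {("Stop", "Stop"): (0, 0), ("Stop", "Go"): (0, 1),
     ("Go", "Stop"): (1, 0), ("Go", "Go"): (-5, -5)}
# The traffic light: 50/50 over the two pure Nash equilibria.
sigma = {("Stop", "Go"): 0.5, ("Go", "Stop"): 0.5}
acts = ["Stop", "Go"]

def is_ce_for(player):
    """Check the correlated-equilibrium constraints for one player."""
    for rec in acts:
        # Outcomes consistent with recommendation `rec` to this player.
        cond = {s: p for s, p in sigma.items() if s[player] == rec}
        mass = sum(cond.values())
        if mass == 0:
            continue
        follow = sum(p * u[s][player] for s, p in cond.items()) / mass
        for dev in acts:
            # Replace this player's action by `dev` in each outcome.
            swap = ((lambda s: (dev, s[1])) if player == 0
                    else (lambda s: (s[0], dev)))
            deviate = sum(p * u[swap(s)][player]
                          for s, p in cond.items()) / mass
            if deviate > follow + 1e-12:
                return False
    return True

assert is_ce_for(0) and is_ce_for(1)   # the traffic light is a CE
```

For instance, a driver told "go" knows the other was told "stop"; deviating to "stop" drops its conditional expected utility from 1 to 0, so the constraint holds.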

An even more permissive concept drops the conditioning:

Definition 1.9 (Coarse Correlated Equilibrium). A distribution \(\sigma\) over outcomes is a coarse correlated equilibrium (CCE) if for every player \(i\) and every alternative \(s_i' \in S_i\), \[ \mathbf{E}_{\mathbf{s} \sim \sigma}[C_i(\mathbf{s})] \leq \mathbf{E}_{\mathbf{s} \sim \sigma}[C_i(s_i', \mathbf{s}_{-i})]. \]

The difference is subtle but important. In a CE, a player considers deviating after seeing its recommendation — a conditional guarantee. In a CCE, a player commits to a deviation before seeing its recommendation — an unconditional guarantee. Since the unconditional constraint is weaker, every CE is a CCE, but not conversely.

Why should we care about CCE? Two reasons. First, as we will prove in Chapter 10, natural learning dynamics (specifically, no-regret algorithms) converge to the set of CCE. This means that if players repeatedly play a game and adjust their strategies using simple, well-studied learning rules, the empirical distribution of play approaches a CCE. Second, for a large and important class of games — the smooth games of Chapter 9 — price-of-anarchy bounds extend automatically from PNE all the way to CCE. This means we can prove efficiency guarantees for the outcomes of learning dynamics, not just for idealized equilibria.

1.4 Price of Anarchy and Price of Stability

The central quantitative question of algorithmic game theory is: how much efficiency is lost due to selfish behavior? The two standard metrics compare equilibrium outcomes to the social optimum.

Definition 1.10 (Social Cost and Welfare). Given a strategy profile \(\mathbf{s}\), the social cost is \(C(\mathbf{s}) = \sum_{i=1}^n C_i(\mathbf{s})\) and the social welfare is \(W(\mathbf{s}) = \sum_{i=1}^n u_i(\mathbf{s})\), where \(u_i = -C_i\). An optimal outcome \(\mathbf{s}^*\) minimizes social cost (or maximizes social welfare).
Definition 1.11 (Price of Anarchy and Price of Stability). For a game with a set \(\mathcal{E}\) of equilibria (under a given equilibrium concept), the price of anarchy (PoA) and price of stability (PoS) are: \[ \text{PoA} = \frac{\max_{\mathbf{s} \in \mathcal{E}} C(\mathbf{s})}{\min_{\mathbf{s}} C(\mathbf{s})}, \qquad \text{PoS} = \frac{\min_{\mathbf{s} \in \mathcal{E}} C(\mathbf{s})}{\min_{\mathbf{s}} C(\mathbf{s})}. \] Both ratios are at least 1 (the optimum is always at least as good as any equilibrium). The PoA measures the damage of the worst equilibrium; the PoS measures the cost of the best.

When the PoA is close to 1, selfish behavior is essentially benign — there is little reason to intervene. When the PoA is large, mechanism design or regulation may be needed to improve outcomes.
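For a finite game, both ratios can be computed by enumeration. Here is a sketch on the Prisoner's Dilemma of Example 1.3, restricted to pure equilibria; the helper names are illustrative:

```python
import itertools

# Prisoner's Dilemma costs; social cost is the sum of players' costs.
costs = {("C", "C"): (2, 2), ("C", "D"): (5, 0),
         ("D", "C"): (0, 5), ("D", "D"): (4, 4)}
acts = ["C", "D"]

def is_pure_ne(s):
    """No player can strictly reduce its own cost by a unilateral move."""
    for i in (0, 1):
        for alt in acts:
            dev = (alt, s[1]) if i == 0 else (s[0], alt)
            if costs[dev][i] < costs[s][i]:
                return False
    return True

social = {s: sum(c) for s, c in costs.items()}
equilibria = [s for s in itertools.product(acts, acts) if is_pure_ne(s)]
optimum = min(social.values())

poa = max(social[s] for s in equilibria) / optimum
pos = min(social[s] for s in equilibria) / optimum
assert equilibria == [("D", "D")]
assert poa == pos == 2.0       # equilibrium cost 8 vs optimal cost 4
```

With a unique equilibrium the two metrics coincide; games with multiple equilibria can separate them.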

[Figure: comparison of social costs \(C^* \le C_{\text{NE}} \le C_{\max}\) for the social optimum, the best Nash equilibrium, and the worst Nash equilibrium; \(\text{PoS} = C_{\text{NE}}/C^*\) and \(\text{PoA} = C_{\max}/C^* \ge \text{PoS} \ge 1\).]

The most vivid illustration of PoA > 1 is Braess’s paradox, which shows that adding capacity to a network can make all users worse off.

Example 1.12 (Braess's Paradox). Consider a network with source \(s\), destination \(t\), and four nodes arranged in a diamond. There are two routes, each consisting of one "narrow" edge with cost \(c(x) = x\) (proportional to congestion) and one "wide" edge with cost \(c(x) = 1\) (constant):
         v
        / \
  c=x  /   \ c=1
      /     \
  s          t
      \     /
  c=1  \   / c=x
        \ /
         w

With one unit of traffic, the equilibrium splits traffic evenly: each route costs \(1 + \frac{1}{2} = \frac{3}{2}\). Now add a zero-cost shortcut from \(v\) to \(w\). The route \(s \to v \to w \to t\) is never worse than the original routes and is strictly better when any traffic avoids it. In equilibrium, all traffic uses the shortcut, and the cost rises to \(1 + 1 = 2\). Adding a free link increased every driver’s travel time from 1.5 to 2 hours.

The price of anarchy before the shortcut is \(\frac{3/2}{3/2} = 1\) (the equilibrium is optimal). After the shortcut, the optimal flow still achieves cost \(\frac{3}{2}\) (split evenly, ignoring the shortcut), but the equilibrium costs 2, giving \(\text{PoA} = \frac{2}{3/2} = \frac{4}{3}\).
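The arithmetic of the example can be written out as a short sketch; the route-cost function below just encodes the example's two-edge sums:

```python
# One unit of traffic; fraction f takes the top route s->v->t, the
# rest the bottom route s->w->t (Example 1.12, before the shortcut).

def route_costs(f):
    top = f + 1             # c(x) = x edge carrying f, then a c = 1 edge
    bottom = 1 + (1 - f)    # c = 1 edge, then c(x) = x edge carrying 1-f
    return top, bottom

# Equilibrium without the shortcut: a 50/50 split equalizes route costs.
assert route_costs(0.5) == (1.5, 1.5)

# With the zero-cost shortcut v->w, all traffic takes s->v->w->t, so
# both c(x) = x edges carry the full unit: each driver pays 1 + 0 + 1.
equilibrium_cost = 1 + 0 + 1
optimal_cost = 1.5          # the even split is still feasible
assert equilibrium_cost / optimal_cost == 4 / 3
```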

Braess’s paradox is not a mathematical curiosity. Real road networks exhibit this phenomenon: Seoul, South Korea, demolished a major highway through the city center in 2003 and found that traffic flow improved. The paradox arises in electrical networks too — Cohen and Horowitz (1991) demonstrated a mechanical analogue with strings and springs where severing a taut string causes a heavy weight to rise, in direct correspondence with removing the zero-cost link in Braess’s network.

The overarching theme of Part III (Chapters 7–9) will be to identify broad classes of games where the PoA is provably small — where selfish behavior, despite being suboptimal, is not too far from the social optimum.


Chapter 2: Mechanism Design Foundations

Chapter 1 took games as given and asked how players behave. Mechanism design inverts the question: given strategic players, how should we design the rules of the game so that self-interested behavior leads to a desirable outcome? As Hartline and Kleinberg memorably put it after the 2012 Olympic badminton scandal — in which teams deliberately tried to lose matches to secure favorable bracket positions — “the next time we bemoan people exploiting loopholes to subvert the intent of the rule makers, instead of asking ‘What’s wrong with these people?’ let’s instead ask ‘What’s wrong with the rules?’ and then adopt a scientifically principled approach to fixing them.”

Mechanism design is the science of rule-making. Its killer applications include Internet search auctions, wireless spectrum sales, matching medical residents to hospitals, school choice, and kidney exchange markets. This chapter builds the foundational framework: single-parameter environments, dominant-strategy incentive compatibility, Myerson’s Lemma, the Revelation Principle, and the VCG mechanism.

2.1 Single-Parameter Environments

The cleanest setting for mechanism design is one where each participant’s private information is captured by a single number. This abstraction covers a remarkable range of applications.

Definition 2.1 (Single-Parameter Environment). A single-parameter environment consists of:
  1. \(n\) bidders (or agents).
  2. Each bidder \(i\) has a private valuation \(v_i \geq 0\), representing its value per unit of "stuff" received.
  3. A feasible set \(X \subseteq \mathbb{R}^n\), where each \(\mathbf{x} = (x_1, \ldots, x_n) \in X\) specifies the amount of stuff allocated to each bidder.
Under quasilinear utility, bidder \(i\)'s utility from allocation \(\mathbf{x}\) and payment \(p_i\) is \(u_i = v_i \cdot x_i - p_i\).

The term “single-parameter” refers to the fact that each bidder’s private information is the single number \(v_i\). The feasible set \(X\) is public knowledge. Examples include:

  • Single-item auction: \(X = \{0,1\}^n\) with \(\sum_i x_i \leq 1\). Each \(x_i\) indicates whether bidder \(i\) wins the item.
  • \(k\) identical items: \(X = \{0,1\}^n\) with \(\sum_i x_i \leq k\).
  • Sponsored search: \(X\) consists of assignments of bidders to advertising slots, where \(x_i = \alpha_j\) if bidder \(i\) is assigned to slot \(j\) with click-through rate \(\alpha_j\), and \(x_i = 0\) if unassigned.

A sealed-bid mechanism in this setting operates in three steps: (1) collect bids \(\mathbf{b} = (b_1, \ldots, b_n)\); (2) choose an allocation \(\mathbf{x}(\mathbf{b}) \in X\) via an allocation rule; (3) choose payments \(\mathbf{p}(\mathbf{b}) \in \mathbb{R}^n\) via a payment rule.

2.2 Dominant-Strategy Incentive Compatibility

The designer’s goal is to construct mechanisms where bidders willingly reveal their true valuations, regardless of what others do.

Definition 2.2 (DSIC). A sealed-bid mechanism \((\mathbf{x}, \mathbf{p})\) is dominant-strategy incentive compatible (DSIC) if for every bidder \(i\), every valuation \(v_i\), and every vector of bids \(\mathbf{b}_{-i}\) by the other bidders:
  1. Truthfulness: Bidding \(b_i = v_i\) maximizes \(i\)'s utility, i.e., \(v_i \cdot x_i(v_i, \mathbf{b}_{-i}) - p_i(v_i, \mathbf{b}_{-i}) \geq v_i \cdot x_i(b_i, \mathbf{b}_{-i}) - p_i(b_i, \mathbf{b}_{-i})\) for all \(b_i\).
  2. Individual rationality: A truthful bidder never receives negative utility, i.e., \(v_i \cdot x_i(v_i, \mathbf{b}_{-i}) - p_i(v_i, \mathbf{b}_{-i}) \geq 0\).

DSIC mechanisms are desirable for two reasons. From a bidder’s perspective, participating is risk-free: bid truthfully and you can’t lose money. From the designer’s perspective, outcomes are predictable: since truthful bidding is a dominant strategy, the mechanism’s performance doesn’t depend on assumptions about bidder sophistication or beliefs about others.

The prototype of a DSIC mechanism is the Vickrey auction (or second-price auction) for a single item.

Theorem 2.3 (Vickrey, 1961). In a second-price sealed-bid auction — where the highest bidder wins and pays the second-highest bid — truthful bidding is a dominant strategy for every bidder.
Proof. Fix bidder \(i\) with valuation \(v_i\), and let \(B = \max_{j \neq i} b_j\) be the highest competing bid. Bidder \(i\)'s utility is 0 if it loses (\(b_i < B\)) and \(v_i - B\) if it wins (\(b_i \geq B\)). The optimal utility is \(\max\{0, v_i - B\}\). If \(v_i > B\), bidding \(b_i = v_i\) wins and achieves \(v_i - B > 0\). If \(v_i < B\), bidding \(b_i = v_i\) loses and achieves 0. In both cases, the bid \(b_i = v_i\) achieves the maximum possible utility.
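The case analysis in the proof can be double-checked exhaustively on a small grid of valuations and bids. A sketch, assuming one consistent tie-breaking rule (ties go to the bidder under consideration):

```python
import itertools

def utility(v, bid, competing):
    """Bidder's quasilinear utility in a second-price auction: it values
    the item at v, bids `bid`, wins iff bid >= max rival bid (one fixed
    tie-breaking rule), and pays the highest competing bid on winning."""
    B = max(competing)
    return v - B if bid >= B else 0

grid = range(6)
for v in grid:                               # true valuation
    for rivals in itertools.product(grid, repeat=2):
        truthful = utility(v, v, rivals)
        # No deviation on the grid beats bidding one's true value.
        assert all(utility(v, b, rivals) <= truthful for b in grid)
```

The exhaustive check mirrors the proof: whatever the rivals bid, the truthful bid attains \(\max\{0, v_i - B\}\).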

The Vickrey auction is remarkable: it simultaneously achieves DSIC, maximizes social surplus (the item goes to the bidder who values it most), and runs in linear time. Roughgarden calls such mechanisms “awesome” — they satisfy all three desiderata of incentive compatibility, welfare maximization, and computational efficiency. The central question of mechanism design is: for which problems beyond single-item auctions can we design “awesome” mechanisms?

2.3 Myerson’s Lemma

The Vickrey auction solves Step 2 of mechanism design (determining payments) for single-item auctions. But what about more complex allocation problems — sponsored search with multiple slots, knapsack auctions, scheduling? Myerson’s Lemma provides a universal answer for all single-parameter environments.

The key definitions concern the allocation rule alone:

Definition 2.4 (Implementable and Monotone Allocation Rules). An allocation rule \(\mathbf{x}\) for a single-parameter environment is:
  • Implementable if there exists a payment rule \(\mathbf{p}\) such that \((\mathbf{x}, \mathbf{p})\) is DSIC.
  • Monotone if for every bidder \(i\) and bids \(\mathbf{b}_{-i}\), the allocation \(x_i(z, \mathbf{b}_{-i})\) to bidder \(i\) is nondecreasing in its bid \(z\).

[Figure: Myerson’s Lemma: a piecewise-constant monotone allocation rule \(x_i(z)\) (left) and its uniquely determined payment \(p_i(z)\), the area to the left of the allocation curve (right).]

Monotonicity is a natural condition: bidding higher should only get you more stuff. The allocation rule that gives the item to the highest bidder is monotone. The rule that gives the item to the second-highest bidder is not — raising your bid enough causes you to go from winner to loser.

Theorem 2.5 (Myerson's Lemma). Fix a single-parameter environment.
  1. An allocation rule \(\mathbf{x}\) is implementable if and only if it is monotone.
  2. If \(\mathbf{x}\) is monotone, there is a unique payment rule \(\mathbf{p}\) such that \((\mathbf{x}, \mathbf{p})\) is DSIC (assuming the normalization \(p_i(\mathbf{b}) = 0\) whenever \(b_i = 0\)).
  3. This payment rule is given by an explicit formula (see below).

Myerson’s Lemma is the foundation of single-parameter mechanism design. Part 1 says that monotonicity completely characterizes which allocation rules can be part of a DSIC mechanism — a clean, checkable condition replaces the unwieldy DSIC definition. Part 2 says there is no design freedom left once the allocation rule is fixed: the payments are uniquely determined. Part 3 gives a constructive formula.

Proof sketch. The proof proceeds by deriving constraints on payments from the DSIC condition, then showing these constraints pin down a unique solution.

Fix bidder \(i\) and bids \(\mathbf{b}_{-i}\). Write \(x(z) = x_i(z, \mathbf{b}_{-i})\) and \(p(z) = p_i(z, \mathbf{b}_{-i})\). For any two bids \(0 \leq y < z\), DSIC requires both:

\[ z \cdot x(z) - p(z) \geq z \cdot x(y) - p(y), \qquad y \cdot x(y) - p(y) \geq y \cdot x(z) - p(z). \]

Rearranging yields the “payment difference sandwich”:

\[ z \cdot [x(y) - x(z)] \leq p(y) - p(z) \leq y \cdot [x(y) - x(z)]. \]

If \(x\) is not monotone — say \(x(y) > x(z)\) for some \(y < z\) — then the left side is positive (since \(z > 0\) and \(x(y) - x(z) > 0\)), while the right side carries the strictly smaller factor \(y < z\) and is therefore smaller, contradicting the ordering. So no payment rule can satisfy the sandwich, which proves the “only if” direction of part 1.

For the “if” direction, assume \(x\) is monotone and piecewise constant with breakpoints \(z_1 < z_2 < \cdots < z_\ell\) in the range \([0, b_i]\). Taking \(y \uparrow z\) in the sandwich forces every jump in \(p\) to equal \(z\) times the corresponding jump in \(x\). With the normalization \(p(0) = 0\), this gives the unique payment:

\[ p_i(b_i, \mathbf{b}_{-i}) = \sum_{j=1}^{\ell} z_j \cdot [\text{jump in } x_i(\cdot, \mathbf{b}_{-i}) \text{ at } z_j]. \]

For general monotone allocation rules (not necessarily piecewise constant), the payment generalizes to:

\[ p_i(b_i, \mathbf{b}_{-i}) = \int_0^{b_i} z \cdot \frac{d}{dz} x_i(z, \mathbf{b}_{-i})\, dz. \]

Myerson’s payment formula has a beautiful geometric interpretation: \(p_i\) equals the area to the left of the allocation curve \(x_i(z, \mathbf{b}_{-i})\) in the range \([0, b_i]\). The bidder’s utility — the area under the allocation curve — is the “consumer surplus.”

As a sanity check, consider the single-item auction with the highest-bidder-wins allocation rule. The allocation curve \(x_i(z, \mathbf{b}_{-i})\) jumps from 0 to 1 at \(z = B = \max_{j \neq i} b_j\). The payment formula gives \(p_i = B \cdot 1 = B\) — exactly the second-price rule. Myerson’s Lemma subsumes the Vickrey auction as a special case.
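The piecewise-constant payment formula is easy to implement directly. A sketch, where the breakpoint encoding is an assumption of mine:

```python
# Myerson payment for a piecewise-constant monotone allocation rule,
# encoded as a list of (z_j, jump_j) breakpoints of x_i(., b_-i).

def myerson_payment(breakpoints, bid):
    """p_i(bid) = sum of z_j * (jump in the allocation at z_j) over all
    breakpoints z_j <= bid, with the normalization p_i(0) = 0."""
    return sum(z * jump for z, jump in breakpoints if z <= bid)

# Single-item sanity check: against a highest competing bid of 7, the
# allocation curve jumps from 0 to 1 at z = 7.
assert myerson_payment([(7, 1)], bid=10) == 7   # winner pays second price
assert myerson_payment([(7, 1)], bid=5) == 0    # loser pays nothing
```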

Application to Sponsored Search. Consider \(k\) slots with click-through rates \(\alpha_1 \geq \alpha_2 \geq \cdots \geq \alpha_k\). The welfare-maximizing allocation assigns the \(j\)-th highest bidder to the \(j\)-th best slot. Re-index so \(b_1 \geq b_2 \geq \cdots \geq b_n\). For the bidder assigned to slot \(i\), the allocation \(x_i(z, \mathbf{b}_{-i})\) increases from 0 to \(\alpha_1\) as \(z\) rises from 0 to \(b_1\), with a jump of \(\alpha_j - \alpha_{j+1}\) at each breakpoint \(b_{j+1}\) (where \(\alpha_{k+1} = 0\)). Applying Myerson’s payment formula yields:

\[ p_i(\mathbf{b}) = \sum_{j=i}^{k} b_{j+1}(\alpha_j - \alpha_{j+1}). \]

This is the unique DSIC payment rule for sponsored search. The payment rule actually used by Google and Bing is the simpler Generalized Second Price (GSP) rule, under which each bidder pays, per click, the bid of the bidder in the slot below it. Since GSP differs from the Myerson payment, Myerson’s Lemma immediately tells us that GSP is not DSIC — though it has other nice equilibrium properties.
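The divergence between the Myerson and GSP rules is visible on any small instance; the bids and click-through rates below are illustrative assumptions:

```python
# Total (expected) payments in sponsored search: Myerson's formula
# p_i = sum_{j=i}^{k} b_{j+1} (alpha_j - alpha_{j+1}) versus GSP, where
# the bidder in slot i pays b_{i+1} per click, i.e. alpha_i * b_{i+1}
# in expectation.  Bids are sorted in decreasing order.

def myerson_payments(bids, alphas):
    k = len(alphas)
    a = list(alphas) + [0.0]                    # alpha_{k+1} = 0
    b = list(bids) + [0.0]                      # pad in case n = k
    return [sum(b[j + 1] * (a[j] - a[j + 1]) for j in range(i, k))
            for i in range(k)]

def gsp_payments(bids, alphas):
    b = list(bids) + [0.0]
    return [alphas[i] * b[i + 1] for i in range(len(alphas))]

bids = [10.0, 8.0, 5.0, 2.0]                    # illustrative bids
alphas = [0.5, 0.3]                             # two slots, CTRs

m, g = myerson_payments(bids, alphas), gsp_payments(bids, alphas)
# Slot 1: Myerson charges 8(0.5-0.3) + 5(0.3-0) = 3.1; GSP charges 4.0.
assert abs(m[0] - 3.1) < 1e-9 and abs(g[0] - 4.0) < 1e-9
assert m != g        # so GSP cannot be DSIC, by Myerson's uniqueness
```

Note that the two rules agree on the last slot but not in general, which is exactly why uniqueness of the Myerson payment rules out GSP being DSIC.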

2.4 Social Choice Functions and the Revelation Principle

Myerson’s Lemma is powerful but limited to single-parameter settings. For problems where agents have preferences over multiple items — combinatorial auctions, resource allocation, facility location — we need a more general framework.

Definition 2.6 (General Mechanism Design Environment). A general mechanism design environment consists of:
  • \(n\) strategic agents.
  • A finite set \(\Omega\) of outcomes.
  • For each agent \(i\), a private valuation function \(v_i : \Omega \to \mathbb{R}\), assigning a value to each outcome.
Agent \(i\)'s type is its entire valuation function \(v_i\). Under quasilinear utility, agent \(i\)'s utility from outcome \(\omega\) and payment \(p_i\) is \(v_i(\omega) - p_i\).

The outcome set \(\Omega\) can be very large. In a combinatorial auction with \(n\) bidders and \(m\) items, \(\Omega\) consists of all \((n+1)^m\) allocations of items to bidders (each item goes to one bidder or remains unallocated), and each bidder’s type is a valuation over all \(2^m\) bundles — an exponentially large object.

A social choice function \(f\) maps the agents’ types to an outcome: \(f(v_1, \ldots, v_n) \in \Omega\). The goal of mechanism design is to implement a desirable social choice function — that is, to design a game whose equilibrium outcome coincides with the output of \(f\), even though the agents’ types are private.

A direct mechanism asks each agent to report its type and then applies \(f\) and a payment rule. The Revelation Principle (whose proof extends to multi-parameter settings) says this is without loss of generality:

Theorem 2.7 (Revelation Principle). For every mechanism that implements a social choice function \(f\) in dominant strategies (possibly indirectly, via a complex game), there is a direct-revelation mechanism that implements \(f\) in dominant strategies, where truthful reporting is a dominant strategy for every agent.

The Revelation Principle vastly simplifies the design problem: instead of searching over all possible game forms, we need only consider direct mechanisms where agents report their types truthfully.

2.5 The VCG Mechanism

The most important social choice function is welfare maximization: choose the outcome \(\omega^*\) that maximizes the sum of agents’ values. In single-parameter environments, the Vickrey auction solves this. The Vickrey-Clarke-Groves (VCG) mechanism extends the idea to arbitrary multi-parameter environments.

Theorem 2.8 (The VCG Mechanism). In every general mechanism design environment, there exists a DSIC welfare-maximizing mechanism. It uses:
  • Allocation rule: Choose the welfare-maximizing outcome: \[ \mathbf{x}(\mathbf{b}) = \arg\max_{\omega \in \Omega} \sum_{i=1}^n b_i(\omega). \]
  • Payment rule (Clarke pivot): Each agent \(i\) pays the externality it imposes on the others: \[ p_i(\mathbf{b}) = \underbrace{\max_{\omega \in \Omega} \sum_{j \neq i} b_j(\omega)}_{\text{welfare of others without } i} - \underbrace{\sum_{j \neq i} b_j(\omega^*)}_{\text{welfare of others with } i}, \] where \(\omega^* = \mathbf{x}(\mathbf{b})\) is the chosen outcome.

The Clarke pivot payment has a beautiful interpretation: agent \(i\) pays the damage its presence inflicts on the other agents. In a single-item Vickrey auction, the winning bidder’s externality is \(b_2\) (the second-highest bid), since without the winner, the second-highest bidder would have won at zero cost. VCG thus generalizes the second-price rule.

Proof. Fix agent \(i\) and bids \(\mathbf{b}_{-i}\). Under truthful bidding \(\mathbf{b}_i = \mathbf{v}_i\), agent \(i\)'s utility from the chosen outcome \(\omega^*\) is: \[ v_i(\omega^*) - p_i(\mathbf{b}) = \left[v_i(\omega^*) + \sum_{j \neq i} b_j(\omega^*)\right] - \left[\max_{\omega \in \Omega} \sum_{j \neq i} b_j(\omega)\right]. \]

The second bracketed term is a constant, independent of \(i\)’s bid. So maximizing utility is equivalent to maximizing the first bracketed term, which, when \(b_i = v_i\), equals the total declared welfare \(\sum_{j} b_j(\omega^*)\). But the mechanism’s allocation rule already chooses \(\omega^*\) to maximize exactly this quantity. If \(i\) bids truthfully, the term it wants maximized is the term the mechanism is already maximizing; no other bid can improve the outcome for \(i\).

The VCG mechanism is the crown jewel of mechanism design: it works in any environment and always maximizes welfare. But it has three significant limitations that motivate the rest of our study:

  1. Computational intractability. The allocation rule requires solving \(\arg\max_{\omega \in \Omega} \sum_i b_i(\omega)\), which is NP-hard for combinatorial auctions and many other settings. You cannot simply plug in an approximation algorithm — swapping the exact welfare maximizer for an approximation can destroy the DSIC property.

  2. Communication complexity. In a combinatorial auction with \(m\) items, each bidder’s type has \(2^m\) parameters. No protocol can elicit these types with polynomial communication in the worst case.

  3. Revenue pathology. The VCG mechanism can generate zero revenue even in competitive environments. Consider two items \(A, B\) and three bidders: bidder 1 values \(\{A,B\}\) at 1 and everything else at 0; bidder 2 values \(\{A\}\) at 1 and everything else at 0; bidder 3 values \(\{B\}\) at 1 and everything else at 0. The welfare-maximizing allocation gives \(A\) to bidder 2 and \(B\) to bidder 3 (total welfare 2, versus 1 for giving both to bidder 1). But each winning bidder’s Clarke pivot payment is 0: removing bidder 2, the optimal welfare from the others is still 1 (give \(\{A,B\}\) to bidder 1); with bidder 2, the others get welfare 1 (bidder 3 gets \(B\)). No externality, no payment.
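The zero-revenue example can be verified by brute force over all allocations. A sketch, with the text's bidders 1, 2, 3 indexed as 0, 1, 2:

```python
import itertools

# Items A, B; bidder 0 values the pair at 1, bidder 1 values {A} at 1,
# bidder 2 values {B} at 1, and everything else at 0.

def value(i, bundle):
    bundle = set(bundle)
    if i == 0:
        return 1.0 if bundle == {"A", "B"} else 0.0
    want = "A" if i == 1 else "B"
    return 1.0 if want in bundle else 0.0

def welfare(assign, agents):
    """assign maps each item to a bidder (or None if unallocated)."""
    return sum(value(i, {it for it, w in assign.items() if w == i})
               for i in agents)

outcomes = [dict(zip(["A", "B"], w))
            for w in itertools.product([0, 1, 2, None], repeat=2)]
agents = [0, 1, 2]

chosen = max(outcomes, key=lambda o: welfare(o, agents))
assert welfare(chosen, agents) == 2.0     # A -> bidder 1, B -> bidder 2

# Clarke pivot: others' best welfare without i, minus their welfare at
# the chosen outcome.  Every payment is zero here.
for i in agents:
    others = [j for j in agents if j != i]
    pivot = max(welfare(o, others) for o in outcomes)
    assert pivot - welfare(chosen, others) == 0.0
```

Brute force over all \((n+1)^m\) allocations is only feasible for toy instances, which is precisely limitation 1 above.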

These limitations drive the three major research directions we pursue in the following chapters: characterizing implementability beyond VCG (Chapter 3), maximizing revenue rather than welfare (Chapter 4), and designing computationally efficient approximate mechanisms (Chapter 5).


Chapter 3: Characterizing Implementability

Myerson’s Lemma (Theorem 2.5) gave a complete characterization of DSIC mechanisms in single-parameter environments: an allocation rule is implementable if and only if it is monotone. But many important problems — combinatorial auctions, multi-facility location, scheduling with multiple job types — are inherently multi-parameter: each agent’s private type is a vector of valuations, not a single number. In such settings, what characterizes implementability? This chapter develops the answer, following Swamy’s treatment of weak monotonicity, Roberts’ theorem, and the role of randomization.

3.1 Weak Monotonicity (WMON)

The natural generalization of monotonicity to multi-parameter domains is weak monotonicity, sometimes called WMON or 2-cycle monotonicity.

Definition 3.1 (Weak Monotonicity). A social choice function \(f : V_1 \times \cdots \times V_n \to \Omega\) satisfies weak monotonicity (WMON) if for every agent \(i\), every pair of types \(\mathbf{v}_i, \mathbf{v}_i' \in V_i\), and every fixed types \(\mathbf{v}_{-i}\) of the other agents, \[ v_i(f(\mathbf{v}_i, \mathbf{v}_{-i})) - v_i(f(\mathbf{v}_i', \mathbf{v}_{-i})) \geq v_i'(f(\mathbf{v}_i, \mathbf{v}_{-i})) - v_i'(f(\mathbf{v}_i', \mathbf{v}_{-i})). \] Equivalently, rearranging: \[ (v_i - v_i')(f(\mathbf{v}_i, \mathbf{v}_{-i})) \geq (v_i - v_i')(f(\mathbf{v}_i', \mathbf{v}_{-i})). \]

The intuition is clean: if agent \(i\) changes its reported type from \(\mathbf{v}_i'\) to \(\mathbf{v}_i\), the change in the outcome should be “aligned” with the change in the agent’s preferences. When \(i\) reports a type that values outcome \(\omega\) more highly, the mechanism should be more likely to choose \(\omega\).

In single-parameter environments, WMON reduces to ordinary monotonicity: if there is only one dimension of private information (a scalar valuation \(v_i\)), then the condition says that increasing \(v_i\) should lead to a weakly larger allocation \(x_i\) — exactly the monotonicity condition of Myerson’s Lemma.

Theorem 3.2. If a social choice function \(f\) is DSIC implementable (i.e., there exist payments \(p_1, \ldots, p_n\) such that truthful reporting is a dominant strategy), then \(f\) satisfies WMON.
Proof. Suppose \((f, \mathbf{p})\) is DSIC. Fix agent \(i\) and types \(\mathbf{v}_i, \mathbf{v}_i'\), with \(\mathbf{v}_{-i}\) fixed. Let \(\omega = f(\mathbf{v}_i, \mathbf{v}_{-i})\) and \(\omega' = f(\mathbf{v}_i', \mathbf{v}_{-i})\). By DSIC, type \(\mathbf{v}_i\) prefers reporting truthfully: \[ v_i(\omega) - p_i(\mathbf{v}_i, \mathbf{v}_{-i}) \geq v_i(\omega') - p_i(\mathbf{v}_i', \mathbf{v}_{-i}). \] Similarly, type \(\mathbf{v}_i'\) prefers reporting truthfully: \[ v_i'(\omega') - p_i(\mathbf{v}_i', \mathbf{v}_{-i}) \geq v_i'(\omega) - p_i(\mathbf{v}_i, \mathbf{v}_{-i}). \] Adding these two inequalities, the payment terms cancel, yielding WMON.
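For small finite domains, WMON can be checked by brute force directly from Definition 3.1. The sketch below is illustrative (a made-up one-agent, two-outcome domain; `argmax_rule` and `argmin_rule` are hypothetical names): picking the reported type's favorite outcome satisfies WMON, while picking its least-favorite outcome does not.

```python
# Brute-force check of weak monotonicity (Definition 3.1) for one agent
# with the other agents' types held fixed. A type is a dict mapping each
# outcome to the agent's value for it; f maps a type to a chosen outcome.

def satisfies_wmon(types, f):
    """True iff v(f(v)) - v(f(v')) >= v'(f(v)) - v'(f(v')) for all v, v'."""
    for v in types:
        for v2 in types:
            if v[f(v)] - v[f(v2)] < v2[f(v)] - v2[f(v2)]:
                return False
    return True

# A toy domain with two outcomes "a", "b"; types enumerate small values.
types = [{"a": x, "b": y} for x in range(3) for y in range(3)]

# Picking the reported type's favorite outcome is a one-agent affine
# maximizer, hence WMON; picking its least-favorite outcome is not.
argmax_rule = lambda v: max(v, key=lambda w: v[w])
argmin_rule = lambda v: min(v, key=lambda w: v[w])
```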

WMON is a necessary condition for implementability. The deep question is: when is it sufficient?

3.2 Characterization in Single-Dimensional and Convex Domains

For single-parameter environments, Myerson’s Lemma tells us that WMON (which reduces to monotonicity) is both necessary and sufficient for implementability. This extends to certain multi-parameter settings.

A domain \(V_i\) for agent \(i\) is the set of possible types. The domain is convex if for any two types \(\mathbf{v}_i, \mathbf{v}_i' \in V_i\) and any \(\lambda \in [0,1]\), the type \(\lambda \mathbf{v}_i + (1-\lambda)\mathbf{v}_i'\) is also in \(V_i\). Many natural domains are convex: the set of all additive valuations, the set of all submodular valuations over a continuous relaxation, or the full type space \(\mathbb{R}^{|\Omega|}\).

Theorem 3.3 (Rochet, 1987; Bikhchandani et al., 2006). If the domain \(V_i\) is convex for every agent \(i\), then a social choice function \(f\) is DSIC implementable if and only if it satisfies WMON.

The proof, which we omit, proceeds by showing that WMON implies the absence of negative-weight cycles in a certain “type graph” whose nodes are types and whose edge weights encode the incentive constraints. The absence of negative cycles allows one to construct payments via shortest-path distances from a reference type — a beautiful connection to network optimization.
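The shortest-path construction can be made concrete on a toy domain. The sketch below is a minimal illustration (assumptions: a single agent, a finite type set, and the convention that edge \(v' \to v\) carries weight \(w(v', v) = v(f(v)) - v(f(v'))\), so feasible payments are exactly the solutions of the difference constraints \(p(v) \leq p(v') + w(v', v)\)). Bellman-Ford either returns shortest-path payments or detects a negative cycle.

```python
# Payments from shortest paths in the type graph. Edge v' -> v carries
# weight w(v', v) = v(f(v)) - v(f(v')); truthful payments must satisfy
# the difference constraints p(v) <= p(v') + w(v', v), which Bellman-Ford
# solves iff the graph has no negative cycle (cycle monotonicity).

def shortest_path_payments(types, f):
    """Return {type: payment}, or None if a negative cycle exists."""
    w = {(vp, v): v[f(v)] - v[f(vp)] for v in types for vp in types if v != vp}
    dist = {v: 0.0 for v in types}      # virtual source at distance 0 to all
    for _ in range(len(types)):
        updated = False
        for (vp, v), weight in w.items():
            if dist[vp] + weight < dist[v]:
                dist[v] = dist[vp] + weight
                updated = True
        if not updated:
            return dist
    return None                          # still relaxing: negative cycle

# Two outcomes; a type is (value for outcome 0, value for outcome 1).
types = [(0, 2), (1, 1), (2, 0)]
favorite = lambda v: 0 if v[0] >= v[1] else 1       # welfare maximizer
least_favorite = lambda v: 0 if v[0] <= v[1] else 1  # violates WMON
payments = shortest_path_payments(types, favorite)
```

For the welfare-maximizing rule all edge weights are nonnegative, so the zero payments returned are feasible; the least-favorite rule creates a negative 2-cycle and is not implementable.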

For non-convex domains, WMON is generally not sufficient. One needs the stronger condition of cycle monotonicity (no negative-weight cycles of any length in the type graph, not just length 2). Cycle monotonicity is always necessary and sufficient, but it is much harder to verify in practice than WMON.

3.3 Roberts’ Theorem and Affine Maximizers

What happens when the domain is completely unrestricted — when agents can have arbitrary valuations over all outcomes? Roberts’ theorem gives a striking and restrictive answer.

Definition 3.4 (Affine Maximizer). A social choice function \(f\) is an affine maximizer if there exist nonnegative weights \(\lambda_1, \ldots, \lambda_n \geq 0\) (not all zero) and outcome bonuses \(\gamma_\omega \in \mathbb{R}\) for each \(\omega \in \Omega\) such that \[ f(\mathbf{v}) = \arg\max_{\omega \in \Omega} \left[\sum_{i=1}^n \lambda_i v_i(\omega) + \gamma_\omega\right]. \]

Affine maximizers are a slight generalization of welfare maximization: instead of maximizing the unweighted sum of values, they maximize a weighted sum plus outcome-specific bonuses. The VCG mechanism is the special case where all \(\lambda_i = 1\) and all \(\gamma_\omega = 0\).
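A one-line implementation makes the definition concrete. The sketch below is illustrative (the bids and outcome names are made up): with all \(\lambda_i = 1\) and \(\gamma_\omega = 0\) it recovers the welfare-maximizing (VCG) choice, and an outcome bonus tips the decision.

```python
# An affine maximizer (Definition 3.4): choose the outcome maximizing the
# lambda-weighted sum of reported values plus outcome bonuses gamma.

def affine_maximizer(bids, lam, gamma):
    """bids[i][w] is agent i's reported value for outcome w."""
    return max(gamma, key=lambda w: sum(l * b[w] for l, b in zip(lam, bids))
                                    + gamma[w])

bids = [{"a": 3, "b": 1}, {"a": 0, "b": 5}]
# lambda_i = 1, gamma = 0: plain welfare maximization (the VCG choice).
welfare_choice = affine_maximizer(bids, [1, 1], {"a": 0, "b": 0})
# A bonus of 4 on outcome "a" tips the decision.
biased_choice = affine_maximizer(bids, [1, 1], {"a": 4, "b": 0})
```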

Theorem 3.5 (Roberts, 1979). If the domain is unrestricted (\(V_i = \mathbb{R}^{|\Omega|}\) for all \(i\)) and \(|\Omega| \geq 3\), then a social choice function \(f\) is DSIC implementable if and only if \(f\) is an affine maximizer.

Roberts’ theorem is the multi-parameter analogue of the Gibbard-Satterthwaite theorem (which concerns settings without money). Its message is both elegant and discouraging: in unrestricted domains, the only truthful mechanisms are minor variants of VCG. There is essentially no room for creative mechanism design. Revenue maximization, approximate welfare maximization, fairness objectives — none of these can be achieved truthfully in unrestricted domains.

This impossibility is what makes restricted domains so important. The positive results in algorithmic mechanism design (Chapter 5) work precisely because the domains are restricted — single-minded bidders, submodular valuations, single-parameter environments — where Roberts' characterization no longer applies and mechanisms beyond affine maximizers become implementable.

3.4 Truthful-in-Expectation Mechanisms

Another way to escape the straitjacket of Roberts’ theorem is to relax the solution concept from deterministic DSIC to randomized truthfulness.

Definition 3.6 (Truthful-in-Expectation). A randomized mechanism is truthful-in-expectation (TIE) if for every agent \(i\), every type \(\mathbf{v}_i\), every alternative report \(\mathbf{v}_i'\), and every \(\mathbf{v}_{-i}\), \[ \mathbf{E}[v_i(f(\mathbf{v}_i, \mathbf{v}_{-i})) - p_i(\mathbf{v}_i, \mathbf{v}_{-i})] \geq \mathbf{E}[v_i(f(\mathbf{v}_i', \mathbf{v}_{-i})) - p_i(\mathbf{v}_i', \mathbf{v}_{-i})], \] where the expectation is over the mechanism's internal randomness.

This is weaker than universal truthfulness, which requires that truthfulness hold for every realization of the mechanism’s randomness (i.e., every deterministic mechanism in the support is DSIC). In a universally truthful mechanism, a risk-averse agent has no incentive to lie; in a TIE mechanism, an agent who dislikes variance might prefer to misreport.

Despite this weakness, TIE mechanisms are tremendously useful in algorithmic mechanism design because they are much easier to construct. The key insight, developed by Lavi and Swamy (2011), is that any LP-based approximation algorithm can be converted into a TIE mechanism with the same approximation guarantee — a black-box reduction that we will develop in detail in Chapter 5.

The gap between deterministic DSIC and TIE can be dramatic. For combinatorial auctions with submodular valuations, the best deterministic DSIC mechanism achieves an \(O(\sqrt{m})\)-approximation to welfare, while TIE mechanisms can achieve a \((1 - 1/e)\)-approximation — a gap that grows polynomially with the number of items.


Chapter 4: Revenue-Maximizing Auctions

Chapters 2–3 focused on welfare maximization: choosing the outcome that maximizes the sum of agents’ values. But in many settings — a seller on eBay, a government auctioning spectrum licenses, a search engine selling keyword ads — the designer’s goal is to maximize revenue: the total payment collected from the agents. This chapter develops the theory of revenue-optimal auctions, from Myerson’s foundational characterization through the prophet inequality and the Bulow-Klemperer theorem.

4.1 The Revenue Maximization Challenge

Revenue maximization is fundamentally different from welfare maximization in one crucial respect: there is no input-independent optimal mechanism. For welfare, the VCG mechanism is optimal regardless of agents’ valuations. For revenue, the best mechanism depends on what the valuations are — which are, of course, private information.

Consider a single bidder and a single item. The only DSIC mechanisms are posted prices: set a price \(r\), and the bidder buys if and only if \(v \geq r\). If you knew \(v\), you would set \(r = v\) and extract the full surplus. Without knowing \(v\), different prices perform differently on different inputs. A price of 20 is excellent when \(v = 25\) but terrible when \(v = 5\).

This input-dependence forces a Bayesian approach: assume bidders’ valuations are drawn from known distributions and optimize expected revenue. The distributions are typically estimated from historical data (past auction outcomes, bidding histories, market research).

Definition 4.1 (Bayesian Auction Setting). A Bayesian single-parameter environment consists of:
  • A single-parameter environment with bidders \(1, \ldots, n\) and feasible set \(X\).
  • For each bidder \(i\), a distribution \(F_i\) (with density \(f_i\)) from which \(v_i\) is drawn, independently of the others.
  • The distributions \(F_1, \ldots, F_n\) are known to the mechanism designer.
The expected revenue of a DSIC mechanism \((\mathbf{x}, \mathbf{p})\) is \(\mathbf{E}_{\mathbf{v} \sim F}[\sum_i p_i(\mathbf{v})]\), and the goal is to find the DSIC mechanism maximizing this quantity.

[Figure: Vickrey auction with reserve price — expected revenue vs. reserve price $r$ for 2 bidders Uniform$[0,1]$ (left); virtual valuation $\varphi(v)=2v-1$, with $v<1/2$ excluded from allocation (right).]

4.2 Virtual Valuations and Myerson’s Optimal Auction

The key insight of Myerson (1981) is that expected revenue can be expressed purely in terms of the allocation rule — no reference to payments — via a change of variables called virtual valuations.

Definition 4.2 (Virtual Valuation). The virtual valuation of bidder \(i\) with valuation \(v_i\) drawn from distribution \(F_i\) is \[ \varphi_i(v_i) = v_i - \frac{1 - F_i(v_i)}{f_i(v_i)}. \]

The virtual valuation has a compelling economic interpretation. The first term, \(v_i\), is what the mechanism designer would ideally charge bidder \(i\). The second term, \(\frac{1-F_i(v_i)}{f_i(v_i)}\), is the information rent — the inevitable revenue loss due to not knowing \(v_i\) in advance. A bidder with high valuation earns a surplus because the mechanism cannot distinguish it from a lower-valuation bidder without giving up revenue. The virtual valuation accounts for this trade-off.

For a bidder drawn from the uniform distribution on \([0,1]\): \(F(v) = v\), \(f(v) = 1\), so \(\varphi(v) = 2v - 1\). The virtual valuation is negative for \(v < 1/2\), meaning that selling to a low-valuation bidder actually decreases expected revenue — the information rent exceeds the payment.

Theorem 4.3 (Expected Revenue = Expected Virtual Welfare). For every DSIC mechanism \((\mathbf{x}, \mathbf{p})\) in a Bayesian single-parameter environment, \[ \mathbf{E}_{\mathbf{v}}\left[\sum_{i=1}^n p_i(\mathbf{v})\right] = \mathbf{E}_{\mathbf{v}}\left[\sum_{i=1}^n \varphi_i(v_i) \cdot x_i(\mathbf{v})\right]. \] That is, expected revenue equals expected virtual welfare.
Proof sketch. Start from Myerson's payment formula \(p_i(b_i, \mathbf{b}_{-i}) = \int_0^{b_i} z \cdot x_i'(z, \mathbf{b}_{-i})\,dz\). Take expectations over \(v_i \sim F_i\) and apply integration by parts (exchanging the order of integration and using the identity \(\int_z^{v_{\max}} f_i(v_i)\,dv_i = 1 - F_i(z)\)): \[ \mathbf{E}_{v_i}[p_i(\mathbf{v})] = \mathbf{E}_{v_i}\left[\left(v_i - \frac{1-F_i(v_i)}{f_i(v_i)}\right) x_i(\mathbf{v})\right] = \mathbf{E}_{v_i}[\varphi_i(v_i) \cdot x_i(\mathbf{v})]. \]

Summing over \(i\) and taking expectations over all \(\mathbf{v}_{-i}\) gives the result.

This identity transforms the revenue-maximization problem into a virtual-welfare-maximization problem. Instead of searching over the complex space of DSIC mechanisms, we simply choose the allocation that maximizes the sum of virtual valuations — then apply Myerson’s payment formula.

Definition 4.4 (Regular Distribution). A distribution \(F\) is regular if its virtual valuation function \(\varphi(v) = v - \frac{1-F(v)}{f(v)}\) is strictly increasing.

Regular distributions include uniform, exponential, normal, and lognormal distributions. Irregular distributions (e.g., bimodal distributions with sufficiently heavy tails) have non-monotone virtual valuation functions.

Theorem 4.5 (Myerson's Optimal Auction, 1981). In a Bayesian single-parameter environment where every distribution \(F_i\) is regular, the revenue-maximizing DSIC mechanism is:
  1. Allocation rule: For each input \(\mathbf{v}\), choose the feasible allocation maximizing virtual welfare \(\sum_i \varphi_i(v_i) x_i\), subject to \(\mathbf{x} \in X\). If all virtual valuations are negative, allocate nothing.
  2. Payment rule: Apply Myerson's payment formula from Theorem 2.5.

When distributions are regular, the virtual-welfare-maximizing allocation rule is monotone (because \(\varphi_i\) is increasing, a higher \(v_i\) means a higher \(\varphi_i(v_i)\), which leads to a weakly larger allocation), so Myerson’s Lemma guarantees that the corresponding payments yield a DSIC mechanism.

For a single-item auction with i.i.d. regular bidders, the optimal auction has a beautifully simple form: run a Vickrey auction with a reserve price of \(\varphi^{-1}(0)\). Since all bidders share the same increasing virtual valuation function, the bidder with the highest real valuation also has the highest virtual valuation — so the item goes to the highest bidder, just as in a Vickrey auction. The reserve price ensures that bidders with negative virtual valuations (those below \(\varphi^{-1}(0)\)) are excluded. For uniform \([0,1]\) valuations, the reserve price is \(\varphi^{-1}(0) = 1/2\).
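A quick simulation (illustrative) confirms the recipe for two Uniform\([0,1]\) bidders, where the optimal expected revenue is known to be \(5/12 \approx 0.4167\).

```python
import random

# Myerson's optimal auction for two i.i.d. Uniform[0,1] bidders: Vickrey
# with reserve phi^{-1}(0) = 1/2. The highest bidder wins if it clears
# the reserve, paying max(second-highest bid, 1/2). Optimal expected
# revenue here is 5/12.

random.seed(1)
n = 200_000
total = 0.0
for _ in range(n):
    v1, v2 = random.random(), random.random()
    hi, lo = max(v1, v2), min(v1, v2)
    if hi >= 0.5:                    # reserve met: sell to the top bidder
        total += max(lo, 0.5)
avg_revenue = total / n
```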

For irregular distributions, the virtual valuation function is not monotone, and the virtual-welfare-maximizing allocation rule may not be monotone either. Myerson’s solution is to iron the virtual valuation function — replace non-monotone portions with their ironed (flattened) versions — to recover monotonicity while preserving expected virtual welfare. We refer to Hartline’s book for the details.

4.3 The Prophet Inequality

The theory of Bayesian optimal auctions is powerful but requires detailed knowledge of the distributions \(F_1, \ldots, F_n\). The prophet inequality, from optimal stopping theory, leads to simpler mechanisms that are approximately optimal without requiring identical distributions.

Theorem 4.6 (Prophet Inequality; Samuel-Cahn, 1984). Let \(\pi_1, \ldots, \pi_n\) be independent nonnegative random variables with known distributions \(G_1, \ldots, G_n\). There exists a threshold \(t\) such that the strategy "accept prize \(\pi_i\) if and only if \(\pi_i \geq t\)" guarantees expected reward at least \(\frac{1}{2}\mathbf{E}[\max_i \pi_i]\). The threshold satisfies \(\Pr[\pi_j < t \text{ for all } j] = \frac{1}{2}\).
Proof. Let \(z^+ = \max\{z, 0\}\). For a threshold \(t\), let \(q(t) = \Pr[\pi_j < t \text{ for all } j]\) be the probability of accepting no prize. The expected payoff of the \(t\)-threshold strategy is at least: \[ (1 - q(t)) \cdot t + q(t) \sum_{i=1}^n \mathbf{E}[(\pi_i - t)^+]. \]

The first term accounts for the guaranteed value \(t\) when at least one prize exceeds the threshold. The second term credits extra value when exactly one prize exceeds the threshold (conservatively ignoring the case where multiple prizes exceed \(t\)).

For the prophet’s expected reward:

\[ \mathbf{E}\left[\max_i \pi_i\right] = \mathbf{E}\left[t + \max_i(\pi_i - t)\right] \leq t + \mathbf{E}\left[\max_i (\pi_i - t)^+\right] \leq t + \sum_{i=1}^n \mathbf{E}[(\pi_i - t)^+]. \]

Setting \(q(t) = \frac{1}{2}\) makes the payoff lower bound at least \(\frac{1}{2}\) times the prophet upper bound.
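The threshold with \(q(t) = \frac{1}{2}\) can be found by bisection, and the guarantee checked by simulation. The sketch below uses a made-up instance (three independent uniform prizes of different ranges); for these parameters the threshold works out to \(t = 2\), and the threshold strategy comfortably clears half the prophet's reward.

```python
import random

# Prophet inequality on a toy instance: pi_i ~ Uniform[0, b_i],
# independent. Pr[all pi_i < t] = prod(min(t/b_i, 1)) is nondecreasing
# in t, so bisection finds the threshold with q(t) = 1/2 (here t = 2).

bounds = [1.0, 2.0, 4.0]

def q(t):
    p = 1.0
    for b in bounds:
        p *= min(t / b, 1.0)
    return p

lo, hi = 0.0, max(bounds)
for _ in range(60):                  # bisect q(t) = 1/2
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if q(mid) < 0.5 else (lo, mid)
t = (lo + hi) / 2

random.seed(2)
n = 100_000
reward = prophet = 0.0
for _ in range(n):
    prizes = [random.uniform(0, b) for b in bounds]
    reward += next((p for p in prizes if p >= t), 0.0)  # threshold strategy
    prophet += max(prizes)                              # the prophet
reward /= n
prophet /= n
```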

The connection to mechanism design is elegant. In a single-item auction with non-i.i.d. regular bidders, regard the nonnegative virtual valuation \(\varphi_i(v_i)^+\) as the “prize” \(\pi_i\). The optimal auction’s expected revenue is \(\mathbf{E}[\max_i \varphi_i(v_i)^+]\). By the prophet inequality, the following simple auction achieves at least half this revenue:

  1. Set a bidder-specific reserve price \(r_i = \varphi_i^{-1}(t)\) for each bidder \(i\), where \(t\) is the prophet threshold.
  2. Give the item to the highest bidder among those meeting their reserve.

This auction is qualitatively simpler than the optimal auction: the highest real bidder wins (not the highest virtual bidder), and the reserve prices depend only on a single threshold rather than on the full distributions.

4.4 Simple Near-Optimal Auctions and the Bulow-Klemperer Theorem

The most celebrated result on simple auctions is the Bulow-Klemperer theorem, which shows that competition is more valuable than optimization: a simple Vickrey auction with one extra bidder generates more revenue than the optimal auction.

Theorem 4.7 (Bulow-Klemperer, 1996). Let \(F\) be a regular distribution and \(n\) a positive integer. Then \[ \mathbf{E}_{v_1, \ldots, v_{n+1} \sim F}[\text{Rev}(\text{Vickrey with } n+1 \text{ bidders})] \geq \mathbf{E}_{v_1, \ldots, v_n \sim F}[\text{Rev}(\text{OPT}_F \text{ with } n \text{ bidders})], \] where \(\text{OPT}_F\) is the Myerson optimal auction for \(F\).

The left-hand side is a Vickrey auction — the simplest possible format, with no reserve price — in an environment with \(n+1\) i.i.d. bidders. The right-hand side is the best possible auction in an environment with only \(n\) bidders but with full knowledge of \(F\). The Vickrey auction is prior-independent: its description doesn’t depend on \(F\). Yet it outperforms the tailored optimal auction, simply by having one more participant.

Proof. Define a fictitious auction \(\mathcal{A}\) for \(n+1\) bidders: (1) run \(\text{OPT}_F\) on bidders \(1, \ldots, n\); (2) if the item is unsold, give it to bidder \(n+1\) for free. The revenue of \(\mathcal{A}\) equals that of \(\text{OPT}_F\) (bidder \(n+1\) never pays), and \(\mathcal{A}\) always allocates the item.

The Vickrey auction with \(n+1\) i.i.d. regular bidders maximizes expected revenue among all auctions that always allocate the item. To see this: by the revenue = virtual welfare identity, revenue-maximization reduces to virtual-welfare-maximization. Among auctions that always allocate, the one maximizing virtual welfare gives the item to the bidder with the highest virtual valuation. Since \(F\) is regular (so \(\varphi\) is increasing), this is the bidder with the highest real valuation — which is exactly the Vickrey auction. Since \(\mathcal{A}\) always allocates, the Vickrey auction’s revenue is at least that of \(\mathcal{A}\), which equals that of \(\text{OPT}_F\).
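The theorem can be sanity-checked by simulation in the smallest case \(n = 1\) with \(F =\) Uniform\([0,1]\) (illustrative): Vickrey with two bidders earns \(\mathbf{E}[\min(v_1, v_2)] = 1/3\), while the optimal single-bidder auction posts the price \(1/2\) and earns \(1/4\).

```python
import random

# Bulow-Klemperer for F = Uniform[0,1], n = 1: Vickrey with two bidders
# (revenue min(v1, v2), expectation 1/3) vs. the optimal one-bidder
# auction (posted price 1/2, expected revenue 1/4).

random.seed(3)
n = 200_000
vickrey = optimal = 0.0
for _ in range(n):
    v1, v2, v3 = random.random(), random.random(), random.random()
    vickrey += min(v1, v2)                 # 2-bidder Vickrey revenue
    optimal += 0.5 if v3 >= 0.5 else 0.0   # 1-bidder Myerson revenue
vickrey /= n
optimal /= n
```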

The practical lesson of Bulow-Klemperer is profound: invest in attracting more bidders rather than in optimizing the auction format. This insight applies far beyond formal auctions — it suggests that, in many market settings, encouraging competition is more effective than fine-tuning pricing.

Case study: Yahoo! keyword auctions. Ostrovsky and Schwarz (2009) tested Myerson’s theory on Yahoo!’s sponsored search auctions. They fitted lognormal distributions to half a million keywords and computed theoretically optimal reserve prices. Yahoo! had been using a uniform reserve of $0.10 across all keywords — far below the optimal reserve for valuable keywords like “divorce lawyer” (optimal reserve: $0.30–$0.40) and above it for cheap keywords. Implementing keyword-specific optimal reserve prices increased auction revenue by several percent — a substantial gain given Yahoo!’s multi-billion-dollar search revenue.


Chapter 5: Combinatorial Auctions and Approximation

The VCG mechanism (Theorem 2.8) provides a DSIC welfare-maximizing mechanism for any multi-parameter environment. But in combinatorial auctions — where \(n\) bidders compete for bundles of \(m\) items — the VCG mechanism requires solving an NP-hard welfare-maximization problem as a subroutine. Worse, one cannot simply substitute an approximation algorithm: replacing the exact welfare maximizer with an approximation can destroy the DSIC property. This chapter develops the theory of computationally efficient truthful approximation mechanisms, following Swamy’s treatment of the key results by Lavi–Swamy and Dughmi–Roughgarden.

5.1 Valuations: Subadditive, Submodular, XOS, Single-Minded

In a combinatorial auction, each bidder \(i\) has a private valuation function \(v_i : 2^M \to \mathbb{R}_{\geq 0}\), assigning a value to every bundle of items \(S \subseteq M = \{1, \ldots, m\}\). We normalize \(v_i(\emptyset) = 0\) and assume free disposal: \(v_i(S) \leq v_i(T)\) whenever \(S \subseteq T\). The structure of valuations dramatically affects both the computational complexity of welfare maximization and the possibility of truthful approximation.

Definition 5.1 (Valuation Hierarchy). Valuations are classified by how bundle values interact:
  • Additive: \(v(S) = \sum_{j \in S} v(\{j\})\). Items are independent.
  • Submodular: \(v(S \cup \{j\}) - v(S) \geq v(T \cup \{j\}) - v(T)\) for all \(S \subseteq T\) and \(j \notin T\). Diminishing marginal returns.
  • XOS (fractionally subadditive): \(v(S) = \max_{\ell} \sum_{j \in S} a_j^\ell\) for some collection of additive functions \(\{a^\ell\}\). Each \(a^\ell\) is a "supporting price."
  • Subadditive: \(v(S \cup T) \leq v(S) + v(T)\) for all \(S, T\). No complementarities.
  • General: no restrictions beyond free disposal.
These classes are strictly nested: \[ \text{additive} \subsetneq \text{submodular} \subsetneq \text{XOS} \subsetneq \text{subadditive} \subsetneq \text{general}. \]

Items are substitutes when \(v(S \cup T) \leq v(S) + v(T)\) (subadditivity — having one item doesn’t make others more valuable) and complements when \(v(S \cup T) > v(S) + v(T)\) (synergies). In spectrum auctions, adjacent frequency licenses are often complements (a contiguous block is more useful than scattered frequencies). Submodular valuations model pure diminishing returns, common in advertising (additional ad impressions have decreasing marginal value).
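For small ground sets, membership in these classes can be checked by brute force. The sketch below (toy valuations, made up for illustration) tests the submodularity inequality of Definition 5.1 directly: a capped "unit-demand-style" valuation passes, while a valuation with complements fails.

```python
from itertools import combinations

# Brute-force test of the submodularity inequality in Definition 5.1:
# v(S + j) - v(S) >= v(T + j) - v(T) for all S subset of T and j not in T.

def powerset(items):
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_submodular(v, items):
    for S in powerset(items):
        for T in powerset(items):
            if S <= T:
                for j in items - T:
                    if v[S | {j}] - v[S] < v[T | {j}] - v[T]:
                        return False
    return True

items = frozenset({1, 2})
# Worth 1 as soon as any item is owned: diminishing returns, submodular.
capped = {S: min(len(S), 1) for S in powerset(items)}
# Complements: the pair is worth more than the sum of the singletons.
synergy = {frozenset(): 0, frozenset({1}): 1,
           frozenset({2}): 1, frozenset({1, 2}): 3}
```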

A particularly tractable special case is single-minded bidders: each bidder \(i\) desires a specific bundle \(S_i^*\) and has valuation \(v_i(S) = v_i^*\) if \(S_i^* \subseteq S\) and 0 otherwise.

5.2 VCG Limitations and the Need for Approximation

The fundamental barrier is computational. For general valuations, the welfare-maximization problem \(\max \sum_i v_i(S_i)\) subject to the bundles \(\{S_i\}\) being a partition of items is NP-hard. Even for single-minded bidders, it is equivalent to weighted set packing, which is NP-hard to approximate within a factor of \(m^{1/2-\varepsilon}\) for any \(\varepsilon > 0\). For submodular valuations, the problem is NP-hard to approximate within a factor of \(1 - 1/e + \varepsilon\).

The crucial observation, emphasized by Nisan and Ronen (2001), is that you cannot simply plug an approximation algorithm into VCG. The VCG payment formula \(p_i = \max_{\omega \in \Omega} \sum_{j \neq i} b_j(\omega) - \sum_{j \neq i} b_j(\omega^*)\) requires exactly solving the welfare-maximization problem both with and without agent \(i\). If you replace the exact solution with an approximation, the resulting mechanism is generally not DSIC — agents can gain by manipulating their reports to steer the approximation algorithm.

This motivates the search for new mechanisms — not based on VCG — that are simultaneously truthful, computationally efficient, and approximately welfare-maximizing.

5.3 Maximal-in-Range and MIDR Mechanisms

The most natural approach to combining truthfulness with approximation is to restrict the set of outcomes that the mechanism considers, then optimize exactly over this restricted set.

Definition 5.2 (Maximal-in-Range). A mechanism is maximal-in-range (MIR) if there is a fixed subset \(R \subseteq \Omega\) of outcomes (the "range") such that the mechanism always chooses the outcome in \(R\) maximizing the declared welfare: \[ \omega^*(\mathbf{b}) = \arg\max_{\omega \in R} \sum_{i=1}^n b_i(\omega). \] VCG payments are then computed with respect to this restricted optimization.

MIR mechanisms are automatically DSIC (they are VCG mechanisms on a restricted outcome space). The challenge is choosing \(R\) so that (a) optimizing over \(R\) is computationally tractable and (b) the welfare of the best outcome in \(R\) is close to the unconstrained optimum.

For certain valuation classes, good approximations are achievable: for single-minded bidders, the greedy mechanism of Lehmann, O'Callaghan, and Shoham achieves an \(O(\sqrt{m})\)-approximation, with truthfulness following from monotonicity and critical payments rather than from the MIR template.
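Generic MIR machinery is only a few lines of code: optimize declared welfare over a fixed range \(R\) and charge VCG payments computed over \(R\). The sketch below is illustrative (a hypothetical two-outcome "bundle everything" range), where the mechanism reduces to a second-price bundle auction.

```python
# A generic maximal-in-range mechanism (Definition 5.2): optimize the
# declared welfare over a fixed range R of outcomes and charge VCG
# payments computed over R. DSIC holds because this is just VCG on R.

def mir_mechanism(bids, R):
    """bids[i][w] is bidder i's reported value for outcome w in R."""
    def welfare(w, exclude=None):
        return sum(b[w] for i, b in enumerate(bids) if i != exclude)
    best = max(R, key=welfare)
    payments = [max(welfare(w, i) for w in R) - welfare(best, i)
                for i in range(len(bids))]
    return best, payments

# A "bundling" range: give every item to a single bidder.
R = ["all-to-0", "all-to-1"]
bids = [{"all-to-0": 10, "all-to-1": 0}, {"all-to-0": 0, "all-to-1": 7}]
outcome, payments = mir_mechanism(bids, R)   # a second-price bundle auction
```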

A more powerful generalization is maximal-in-distributional-range (MIDR): instead of a fixed range \(R\), the mechanism uses a distribution over outcomes, and the allocation rule maximizes the expected declared welfare over this distribution. MIDR mechanisms are truthful-in-expectation (TIE) rather than deterministically DSIC, but they can achieve significantly better approximation ratios.

5.4 LP-Based Black-Box Reductions (Lavi-Swamy)

The breakthrough result of Lavi and Swamy (2011) shows that any LP-based approximation algorithm can be converted into a truthful-in-expectation mechanism with the same approximation guarantee — a black-box reduction that requires no mechanism-specific analysis.

The setting is a general allocation problem where the welfare-maximization problem can be written as an integer program (IP), and we have access to its LP relaxation. The LP relaxation, being a convex program, can be solved in polynomial time (using the ellipsoid method, given a separation oracle).

Theorem 5.3 (Lavi-Swamy, 2011). Suppose there exists an \(\alpha\)-approximation algorithm for welfare maximization that works by (1) solving an LP relaxation and (2) rounding the fractional solution. Then there exists a truthful-in-expectation mechanism that achieves an \(\alpha\)-approximation to welfare and runs in polynomial time.

The key idea is elegant. Rather than rounding a fixed LP solution (which could destroy monotonicity), the mechanism randomizes over integral solutions. Scaling the fractional LP optimum down by the integrality gap \(\alpha\) yields a point inside the convex hull of the integer solutions, so it can be written as a convex combination of them. The mechanism samples an integer solution from this decomposition and applies VCG payments to the sampled outcome. Because the expected allocation equals the scaled LP optimum, the expected welfare is within a factor \(\alpha\) of the LP bound; and because the allocation rule maximizes expected declared welfare over a fixed distributional range (it is MIDR), the mechanism is truthful-in-expectation.

More precisely, the construction proceeds as follows:

  1. Solve the LP relaxation of the welfare-maximization problem using reported valuations as the objective coefficients.
  2. Scale the fractional LP solution down by \(\alpha\) and decompose the scaled point into a convex combination of integral solutions.
  3. Sample an integral solution from this decomposition.
  4. Charge VCG payments for the sampled outcome, computed over the restricted range of outcomes in the decomposition.

The decomposition step uses Carathéodory's theorem: any point in a \(d\)-dimensional polytope can be written as a convex combination of at most \(d+1\) of its vertices. Lavi and Swamy show how to compute such a decomposition in polynomial time by running the ellipsoid method with the approximation algorithm playing the role of a separation oracle.
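On the simplest instance the whole pipeline is transparent. The sketch below is illustrative (single item, integrality gap \(\alpha = 1\), so no scaling is needed): the LP polytope's vertices are the unit vectors plus the origin, a fractional solution decomposes trivially, and sampling from the decomposition reproduces the fractional allocation in expectation.

```python
import random

# Lavi-Swamy on the simplest instance: a single-item auction. The LP
# polytope {x >= 0 : sum(x) <= 1} has vertices e_1, ..., e_n and the
# origin, so a fractional x decomposes as sum_i x_i * e_i + (1 - sum(x))
# times the origin. Sampling bidder i with probability x_i reproduces x
# as the expected allocation.

def sample_winner(x):
    """Sample a vertex of the decomposition: bidder i w.p. x_i, else None."""
    r, acc = random.random(), 0.0
    for i, xi in enumerate(x):
        acc += xi
        if r < acc:
            return i
    return None                       # the "allocate nothing" vertex

random.seed(4)
x = [0.5, 0.3, 0.2]                   # a fractional LP solution
n = 100_000
counts = [0] * len(x)
for _ in range(n):
    w = sample_winner(x)
    if w is not None:
        counts[w] += 1
freqs = [c / n for c in counts]       # empirical allocation frequencies
```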

5.5 FPTAS to Truthful FPTAS (Dughmi-Roughgarden)

For problems where an FPTAS (fully polynomial-time approximation scheme) exists, Dughmi and Roughgarden (2010) showed that this can be converted into a truthful FPTAS — a \((1-\varepsilon)\)-approximation truthful-in-expectation mechanism for any \(\varepsilon > 0\).

The key technical challenge is that standard FPTAS constructions (based on dynamic programming with rounded values) are not monotone. A bidder who increases its reported valuation might get less allocation due to rounding artifacts.

Theorem 5.4 (Dughmi-Roughgarden, 2010). For packing problems that admit an FPTAS via the standard dynamic-programming approach, there exists a truthful-in-expectation FPTAS. Specifically, for any \(\varepsilon > 0\), there is a TIE mechanism running in time \(\text{poly}(n, m, 1/\varepsilon)\) that achieves a \((1-\varepsilon)\)-approximation to welfare.

The proof randomly perturbs the reported valuations before running the dynamic program — a technique in the spirit of smoothed analysis. Dughmi and Roughgarden show that the resulting randomized allocation rule is maximal-in-distributional-range (MIDR), hence truthful-in-expectation, and that the perturbation costs only an \(\varepsilon\) fraction of the welfare, so the mechanism still achieves a \((1-\varepsilon)\)-approximation.

5.6 Applications

These black-box reduction frameworks have been applied to several concrete problems from Swamy’s course:

Set cover. In a set cover auction, the items are elements to be covered and the bidders own sets with private costs. The greedy algorithm — repeatedly pick the set covering new elements at the best cost-per-element ratio — achieves an \(O(\log n)\)-approximation to the optimal cost. Greedy is inherently monotone: lowering a set's reported cost only improves its ratio, so a winning set keeps winning. It therefore directly yields a DSIC \(O(\log n)\)-approximation mechanism without needing the Lavi-Swamy reduction.

Facility location (Mettu-Plaxton). In metric facility location, the goal is to open facilities and assign clients to them, minimizing the total connection cost plus facility opening cost. The Mettu-Plaxton algorithm gives a constant-factor approximation. Combined with the Lavi-Swamy framework, this yields a constant-factor TIE mechanism for facility location with strategic clients.

Makespan minimization on related machines. Jobs must be assigned to machines with different speeds, and the goal is to minimize the maximum load (makespan). This is a single-parameter environment (each machine's private information is its speed), so Myerson's Lemma applies. The key insight is that the natural greedy algorithm (assign each job to the machine that increases the makespan least) is monotone: a faster machine is always assigned at least as much work. Combined with the approximation guarantee of the greedy algorithm, this yields a DSIC mechanism with a small constant-factor approximation to the optimal makespan.
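A minimal sketch of greedy assignment on related machines (illustrative; this variant sends each job to the machine whose finishing time after receiving it is smallest, which is one natural reading of the greedy rule above):

```python
# Greedy assignment on related machines: each job goes to the machine
# whose finishing time after taking the job is smallest, where a machine
# with speed s and total load L finishes at time L / s.

def greedy_makespan(jobs, speeds):
    loads = [0.0] * len(speeds)
    assignment = [[] for _ in speeds]
    for job in jobs:
        i = min(range(len(speeds)),
                key=lambda m: (loads[m] + job) / speeds[m])
        loads[i] += job
        assignment[i].append(job)
    return max(l / s for l, s in zip(loads, speeds)), assignment

# Machine 0 is twice as fast and ends up with twice the work.
makespan, assignment = greedy_makespan([4, 3, 2, 2, 1], [2.0, 1.0])
```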


Chapter 6: Advanced Auction Design

This chapter covers three important settings that extend mechanism design beyond the standard quasi-linear framework: spectrum auctions (where the stakes are billions of dollars and the item space is combinatorial), auctions with budget constraints (where agents cannot pay more than a fixed amount), and matching markets (where monetary transfers are forbidden entirely).

6.1 Spectrum Auctions

Wireless spectrum is among the most valuable resources auctioned by governments. Since the mid-1990s, spectrum auctions have raised hundreds of billions of dollars worldwide. The design challenges are formidable: items (spectrum licenses) are non-identical, bidders have complex combinatorial preferences (adjacent frequencies in the same geographic area are complements), and the stakes are enormous.

Simultaneous ascending auctions (SAAs) are the dominant format. Conceptually, an SAA runs parallel English auctions for all items simultaneously. Each round, bidders can place new bids on any items, subject to an activity rule that forces them to stay engaged (the number of items a bidder bids on can only decrease as prices rise). The auction ends when a round passes with no new bids.

The main advantage of SAAs over sequential or sealed-bid formats is price discovery: as the auction progresses, bidders learn about likely selling prices and can adjust their strategies accordingly — abandoning licenses whose prices are too high, snapping up bargains, and rethinking which packages to assemble. This is dramatically better than sealed-bid formats, which produced notorious disasters. In the 1990 New Zealand spectrum auction (simultaneous sealed-bid Vickrey), one license received a high bid of $100,000 and a second-highest bid of $6; another had bids of $7 million and $5,000. The revenue was a fraction of expectations.

SAAs have two well-known vulnerabilities:

  1. Demand reduction. A bidder may bid for fewer items than it truly wants, to reduce competition and lower prices. Consider two identical items and two bidders: bidder 1 values each item at 10 (and both at 20), while bidder 2 wants only one item, which it values at 8. In the VCG outcome, bidder 1 gets both items and pays 8. But in an SAA, bidder 2 bids up to 8 on one of the items. If bidder 1 insists on both, prices escalate to 8 and it pays roughly 16 for value 20 (utility 4); if it targets just one item, the auction ends immediately at price zero and it wins that item for almost nothing (utility 10). The temptation to reduce demand is strong.

  2. The exposure problem. When items are complements, a bidder who needs a package of items faces risk: it might win some items in the package at high prices but lose others, leaving it with items it doesn’t value. Consider two non-identical items that bidder 1 values at 100 together but 0 separately. Bidder 2 will pay up to 75 for either item. Bidder 1 faces a no-win situation: to get both items, it must outbid bidder 2 on each, paying 150+ for items worth 100 together. This often leads to cautious, tentative bidding by bidders with complementary preferences.

Modern spectrum auctions address these issues through package bidding (allowing bids on bundles of items, not just individual items) and proxy rounds (an extra round after the SAA phase where bidders can submit package bids that compete with individual-item bids from the main phase). The 2017 FCC Incentive Auction — a groundbreaking “double auction” that used a reverse auction to buy back TV broadcasting licenses and a forward auction to resell the freed spectrum to wireless carriers — represented the cutting edge of spectrum auction design, incorporating greedy approximate welfare-maximization algorithms with DSIC-compatible payment rules.

6.2 Budget Constraints and the Clinching Auction

All mechanism design theory so far has assumed quasi-linear utility: bidder \(i\)’s utility is \(v_i(\omega) - p_i\), with no upper bound on payments. In practice, agents have budgets that cap their total spending. Budget constraints fundamentally change the landscape: the VCG mechanism can violate budgets, and no DSIC mechanism can simultaneously maximize welfare and respect budgets.

Definition 6.1 (Budget-Constrained Setting). Each bidder \(i\) has a per-unit valuation \(v_i\), a budget \(B_i\), and utility: \[ u_i = \begin{cases} v_i \cdot k - p_i & \text{if } p_i \leq B_i, \\ -\infty & \text{if } p_i > B_i, \end{cases} \] where \(k\) is the number of goods received and \(p_i\) is the total payment.

The clinching auction of Dobzinski, Lavi, and Nisan (2012), which adapts Ausubel's ascending "clinching" design to budgeted bidders, elegantly handles budget constraints for identical goods. It is an ascending-price auction where goods are sold incrementally as the price rises:

Clinching Auction for Budgeted Bidders:

  1. Initialize price \(p = 0\), supply \(s = m\), residual budget \(\hat{B}_i = B_i\) for each bidder.
  2. While \(s > 0\): increase \(p\) until some bidder \(i\) can clinch goods — meaning the remaining supply exceeds the total demand of the other bidders: \(k := s - \sum_{j \neq i} \hat{D}_j(p) > 0\). Give bidder \(i\) those \(k\) goods at price \(p\) each. Decrease \(s\) by \(k\) and \(\hat{B}_i\) by \(p \cdot k\).
  3. Here \(\hat{D}_i(p) = \min\{\lfloor \hat{B}_i / p \rfloor, s\}\) if \(p < v_i\), and 0 otherwise.

The clinching auction respects budgets by construction (goods are sold at or below a bidder’s value, and total spending never exceeds the budget). Different goods may be clinched at different prices, with prices increasing over time.
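The ascending procedure can be simulated directly; in this minimal sketch, the continuous price clock is approximated by a small increment `dp`, and the function name and the example values in the usage note are illustrative, not from the original.

```python
from math import floor

def clinching_auction(values, budgets, m, dp=0.01):
    """Ascending clinching auction for m identical goods (sketch).

    values[i]: per-unit value v_i; budgets[i]: budget B_i.
    The continuous price clock is approximated by increments of dp.
    Returns (allocation, payments)."""
    n = len(values)
    s = m                           # remaining supply
    B = list(budgets)               # residual budgets
    alloc = [0] * n
    pay = [0.0] * n
    p = dp
    while s > 0 and p < max(values):
        def demand(i):
            # residual demand: min(floor(B_i / p), s) if p < v_i, else 0
            if p >= values[i]:
                return 0
            return min(floor(B[i] / p), s)
        for i in range(n):
            k = s - sum(demand(j) for j in range(n) if j != i)
            k = min(k, demand(i))   # clinch at most bidder i's own demand
            if k > 0:               # bidder i clinches k goods at price p
                alloc[i] += k
                pay[i] += p * k
                B[i] -= p * k
                s -= k
        p += dp
    return alloc, pay
```

With two goods, values (5, 3), and budgets (10, 2), bidder 2's residual demand drops as the price passes 1 and again past 2, so bidder 1 clinches one good at each of those prices (total payment about 3) and wins both goods.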

Theorem 6.2 (Dobzinski-Lavi-Nisan). The clinching auction with public budgets is DSIC. Moreover, it is the unique deterministic DSIC auction that always computes a Pareto-optimal allocation.

The proof of DSIC compares the utility from truthful bidding to that from any deviation. Underbidding (\(b_i < v_i\)) causes the bidder to miss goods clinched in the price interval \([b_i, v_i]\), all of which would have generated positive utility. Overbidding (\(b_i > v_i\)) causes the bidder to acquire goods in the price interval \([v_i, b_i]\), generating negative utility per good. In both cases, the deviation can only lose utility, so truthful bidding is a dominant strategy.

6.3 Matching Without Money

Some of the most important allocation problems prohibit monetary transfers. Kidney exchange, school choice, and organ donation are governed by legal and ethical constraints that rule out payments. This is mechanism design without money — and it is, in many ways, harder than mechanism design with money. Without payments to incentivize truthfulness, the designer must rely entirely on the structure of the allocation rule.

The housing allocation problem. Each of \(n\) agents initially owns one house, and agents have ordinal preferences (rankings) over all houses. The Top Trading Cycle Algorithm (TTCA), credited to Gale by Shapley and Scarf (1974), works as follows:

  1. Each remaining agent points to the owner of its favorite remaining house.
  2. The resulting directed graph has at least one cycle (self-loops count).
  3. Execute all cycles: each agent on a cycle gives its house to its predecessor on the cycle.
  4. Remove those agents and houses. Repeat.
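The four steps above can be sketched compactly; this version (a hypothetical implementation, with houses identified by their initial owners) executes one cycle per pass, which yields the same final allocation as executing all cycles each round since cycles are disjoint and persist until executed.

```python
def top_trading_cycles(prefs):
    """Top Trading Cycle Algorithm (sketch).

    Agent i initially owns house i; prefs[i] ranks all houses,
    most preferred first. Returns alloc, where alloc[i] is the
    house agent i receives."""
    n = len(prefs)
    remaining = set(range(n))
    alloc = {}
    while remaining:
        # Step 1: each remaining agent points to the owner of its
        # favorite remaining house.
        point = {i: next(h for h in prefs[i] if h in remaining)
                 for i in remaining}
        # Step 2: walk pointers until a node repeats; in a functional
        # graph this always finds a cycle (self-loops included).
        seen, i = [], min(remaining)
        while i not in seen:
            seen.append(i)
            i = point[i]
        cycle = seen[seen.index(i):]
        # Step 3: each agent on the cycle receives the house it points to.
        for a in cycle:
            alloc[a] = point[a]
        # Step 4: remove the cycle's agents and houses.
        remaining -= set(cycle)
    return alloc
```

For example, with preferences [[1,0,2], [0,1,2], [0,1,2]], agents 0 and 1 trade houses in the first round and agent 2 keeps its own house.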
Theorem 6.3 (Shapley-Scarf, 1974; Roth, 1982). The TTCA is DSIC: no agent can obtain a better house by misreporting preferences. Moreover, the resulting allocation is the unique core allocation — no coalition of agents can rearrange their houses to make every member strictly better off.

Kidney exchange is a celebrated real-world application. Each incompatible patient-donor pair is an “agent” who initially “owns” an incompatible kidney. The TTCA finds beneficial exchange cycles. However, long cycles are impractical in kidney exchange because all surgeries in a cycle must be performed simultaneously (otherwise, after receiving a kidney, a donor pair might renege on donating). This motivates restricting to short cycles (typically pairwise or three-way exchanges), modeled as maximum matching in a compatibility graph.

Stable matching. The canonical example of mechanism design without money in two-sided markets is the Gale-Shapley proposal algorithm for stable matching. Given \(n\) men and \(n\) women, each with ranked preferences over the other side, the algorithm works as follows:

  1. Each unmatched man proposes to his highest-ranked woman who hasn’t yet rejected him.
  2. Each woman entertains the best offer she has received so far, rejecting all others.
  3. Repeat until all men are matched.
Theorem 6.4 (Gale-Shapley, 1962). The proposal algorithm terminates in at most \(n^2\) iterations with a stable matching — a perfect matching with no "blocking pair" \((u, v)\) where \(u\) and \(v\) both prefer each other to their current partners.
Proof sketch. Termination: each man makes at most \(n\) proposals (never reproposing to the same woman), so at most \(n^2\) iterations. Completeness: if some man is unmatched at the end, he has been rejected by all \(n\) women, each of whom must be matched (a woman only rejects when she holds a better offer). So all \(n\) women are matched, implying all \(n\) men are too — a contradiction. Stability: consider a man \(u\) and woman \(v\) not matched to each other. Either \(u\) never proposed to \(v\) (so \(u\) prefers his partner to \(v\)) or \(v\) rejected \(u\) in favor of a man she liked more (and her offers only improve over time). In either case, \((u, v)\) is not a blocking pair.
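A compact implementation of the proposal algorithm; this is a sketch, and the queue-based bookkeeping and function name are one standard choice among several.

```python
from collections import deque

def gale_shapley(men_prefs, women_prefs):
    """Gale-Shapley proposal algorithm (sketch).

    men_prefs[m] lists women in man m's preference order;
    women_prefs[w] lists men in woman w's order. Returns the
    man-optimal stable matching as a dict wife[m] = w."""
    n = len(men_prefs)
    # rank[w][m] = position of man m in woman w's list (lower = better)
    rank = [{m: r for r, m in enumerate(ms)} for ms in women_prefs]
    next_prop = [0] * n            # next woman each man will propose to
    husband = [None] * n           # current provisional partner of each woman
    free = deque(range(n))
    while free:
        m = free.popleft()
        w = men_prefs[m][next_prop[m]]
        next_prop[m] += 1
        if husband[w] is None:                    # w accepts her first offer
            husband[w] = m
        elif rank[w][m] < rank[w][husband[w]]:    # w trades up to m
            free.append(husband[w])
            husband[w] = m
        else:                                     # w rejects m
            free.append(m)
    return {husband[w]: w for w in range(n)}
```

With men's preferences [[0,1], [0,1]] and women's preferences [[1,0], [0,1]], both men propose to woman 0; she keeps man 1 (whom she prefers), and man 0 ends up with woman 1.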

The proposal algorithm, remarkably, was essentially the same algorithm that had been used since the 1950s to assign medical residents to hospitals through the National Resident Matching Program. Gale and Shapley formalized it independently in 1962. Alvin Roth’s work connecting stable matching theory to practical market design — including the redesign of the NRMP and the creation of kidney exchange clearinghouses — earned him (jointly with Shapley) the 2012 Nobel Memorial Prize in Economics.

A deep property of the proposal algorithm is that it computes the man-optimal stable matching: each man is matched to the highest-ranked woman with whom he could be matched in any stable matching. This is simultaneously the woman-pessimal stable matching. Truthful reporting is a dominant strategy for the proposing side (men), but women can sometimes benefit by misreporting preferences — an inherent asymmetry of the algorithm.


Chapter 7: Selfish Routing

We now shift from designing games to analyzing games “in the wild.” In many settings — road traffic, Internet packet routing, bandwidth sharing — the game already exists and we cannot redesign its rules. The question becomes: how much efficiency is lost because players act selfishly rather than cooperatively? This chapter develops the theory of selfish routing in networks, culminating in tight bounds on the price of anarchy that depend only on the class of edge cost functions, not on the network topology.

7.1 Nonatomic Routing and Wardrop Equilibrium

In a nonatomic selfish routing game, traffic consists of a continuum of infinitesimally small players, each choosing a path through a network to minimize its own travel time. No single player has a measurable effect on congestion — but their collective choices determine edge costs.

Definition 7.1 (Selfish Routing Network). A selfish routing network consists of:
  • A directed graph \(G = (V, E)\) with source \(s\) and sink \(t\).
  • A traffic rate \(r > 0\) (total flow from \(s\) to \(t\)).
  • For each edge \(e \in E\), a cost function \(c_e : \mathbb{R}_{\geq 0} \to \mathbb{R}_{\geq 0}\), nonnegative, continuous, and nondecreasing, describing the per-unit travel time as a function of congestion.
A flow is a vector \(\{f_P\}_{P \in \mathcal{P}}\) over \(s\)-\(t\) paths \(\mathcal{P}\), with \(f_P \geq 0\) and \(\sum_P f_P = r\). The edge load is \(f_e = \sum_{P \ni e} f_P\).
Definition 7.2 (Wardrop Equilibrium). A flow \(f\) is at equilibrium (or Wardrop equilibrium) if traffic only travels on shortest paths: for every path \(P\) with \(f_P > 0\), \[ c_P(f) = \sum_{e \in P} c_e(f_e) \leq c_{P'}(f) \quad \text{for all } P' \in \mathcal{P}. \] Equivalently, all paths carrying positive flow have the same cost \(L\), and all other paths have cost at least \(L\).

The Wardrop condition captures the idea that no infinitesimal driver can improve its travel time by switching routes. It is the continuous analogue of Nash equilibrium: at equilibrium, all used routes have equal cost, and no unused route is cheaper.

The social cost of a flow \(f\) is its total travel time: \[ C(f) = \sum_{e \in E} f_e \cdot c_e(f_e). \]

7.2 Existence and Uniqueness via Convex Optimization

A remarkable feature of nonatomic routing is that equilibria can be characterized as solutions to a convex optimization problem, guaranteeing both existence and uniqueness.

Definition 7.3 (Beckmann Potential). The Beckmann potential of a flow \(f\) is \[ \Phi(f) = \sum_{e \in E} \int_0^{f_e} c_e(x)\,dx. \]

Since cost functions are nonnegative and nondecreasing, \(\Phi\) is a convex function of the flow. The space of feasible flows (nonnegative path-flow vectors summing to \(r\)) is a compact convex set, so the continuous function \(\Phi\) attains a global minimum.

Theorem 7.4 (Beckmann, McGuire, Winsten, 1956). A flow \(f\) is a Wardrop equilibrium if and only if it minimizes the Beckmann potential \(\Phi\) over all feasible flows. In particular:
  1. Every selfish routing network has at least one equilibrium flow.
  2. All equilibrium flows have the same edge loads \(\{f_e\}_{e \in E}\) and hence the same total cost.

The uniqueness of edge loads (though not necessarily of path flows) follows from the strict convexity of \(\Phi\) in the edge-load variables when cost functions are strictly increasing. The first-order optimality (KKT) conditions for minimizing \(\Phi\) are precisely the Wardrop equilibrium conditions — the cost of every used path equals the common “shortest path” distance from \(s\) to \(t\).

This convex optimization characterization means equilibria can be computed in polynomial time, in sharp contrast to Nash equilibria in general games (which are PPAD-hard).
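To make the characterization concrete, here is a sketch that finds the equilibrium of a simple two-link network by grid-minimizing the Beckmann potential; the network (link costs \(c_1(x) = x\) and \(c_2(x) = 1\), rate \(r = 1\)) and the function name are illustrative choices.

```python
def wardrop_by_beckmann(grid=10000):
    """Wardrop equilibrium of a two-link network via the Beckmann
    potential (sketch). Link 1 has cost c1(x) = x, link 2 has
    c2(x) = 1, and the traffic rate is r = 1. Routing x units on
    link 1 gives Phi(x) = x^2/2 + (1 - x)."""
    best_x, best_phi = 0.0, float("inf")
    for k in range(grid + 1):
        x = k / grid
        # integral of c1 from 0 to x, plus integral of c2 from 0 to 1-x
        phi = x * x / 2 + (1 - x)
        if phi < best_phi:
            best_x, best_phi = x, phi
    return best_x
```

The minimizer is \(x = 1\): all traffic uses the congestible link, whose cost \(c_1(1) = 1\) ties the constant link, exactly as the Wardrop condition requires.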

7.3 Price of Anarchy for Nonatomic Routing

We now prove the central result: for nonatomic routing with cost functions from a class \(\mathcal{C}\), the worst-case PoA is completely determined by simple two-link “Pigou networks” — no matter how complex the actual network is.

Example 7.5 (Pigou's Example). The simplest selfish routing network has two parallel links from \(s\) to \(t\): one with cost \(c(x) = x\) (congestion-dependent) and one with cost \(c(x) = 1\) (constant). With one unit of traffic, every driver takes the lower link in equilibrium (even fully congested, it ties the constant link). Equilibrium cost: \(1 \cdot 1 = 1\). Optimal cost: split traffic \(1/2\) and \(1/2\), giving total cost \(\frac{1}{2} \cdot \frac{1}{2} + \frac{1}{2} \cdot 1 = \frac{3}{4}\). Price of anarchy: \(\frac{4}{3}\).
Definition 7.6 (Pigou Bound). For a class \(\mathcal{C}\) of cost functions, the Pigou bound is the worst-case PoA across all Pigou-like networks (two parallel links, one with cost \(c(\cdot) \in \mathcal{C}\), the other with constant cost \(c(r)\)): \[ \alpha(\mathcal{C}) := \sup_{c \in \mathcal{C}} \sup_{r \geq 0} \sup_{x \geq 0} \frac{r \cdot c(r)}{x \cdot c(x) + (r-x) \cdot c(r)}. \]
Theorem 7.7 (Roughgarden-Tardos, 2002; Roughgarden, 2003). For every class \(\mathcal{C}\) of cost functions and every selfish routing network with cost functions in \(\mathcal{C}\), the price of anarchy is at most \(\alpha(\mathcal{C})\). This bound is tight: it is achieved by a Pigou-like network.

The striking message: the only obstacle to near-optimal selfish routing is nonlinear cost functions, not complex network structure. For affine (linear) costs \(c(x) = ax + b\), the Pigou bound evaluates to \(\frac{4}{3}\). For polynomial costs of degree \(d\), the PoA grows as \(\Theta(d/\ln d)\) — quartic costs give PoA ≈ 2.2, still quite modest.

  Cost function class            | PoA
  Linear: \(ax + b\)             | \(4/3\)
  Quadratic: \(ax^2 + bx + c\)   | \(\approx 1.63\)
  Degree-\(d\) polynomials       | \(\Theta(d / \ln d)\)
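These values can be checked numerically: for \(c(x) = x^d\) and rate \(r = 1\), the Pigou bound reduces to a one-dimensional maximization, estimated here by grid search (the function name and grid resolution are illustrative).

```python
def pigou_bound_poly(d, grid=100000):
    """Numerically estimate the Pigou bound alpha for c(x) = x^d
    (sketch). In the two-link Pigou network with rate r = 1, the
    equilibrium cost is r * c(r) = 1, so
    alpha = sup_x 1 / (x * c(x) + (1 - x) * c(1)).
    For x > 1 the denominator exceeds 1, so [0, 1] suffices."""
    best = 1.0
    for k in range(grid + 1):
        x = k / grid
        denom = x ** (d + 1) + (1 - x)   # x*c(x) + (r - x)*c(r), r = 1
        if denom > 1e-12:
            best = max(best, 1.0 / denom)
    return best
```

For \(d = 1\) the maximum is at \(x = 1/2\), recovering \(4/3\); for \(d = 2\) the estimate is about \(1.626\), matching the quadratic row above.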
Proof of Theorem 7.7. Let \(f\) be the equilibrium flow and \(f^*\) the optimal flow. The proof has two parts.

Part 1 (a variational inequality). At equilibrium, \(f\) routes all traffic on paths that are shortest with respect to the fixed edge costs \(c_e(f_e)\); any other feasible flow, in particular \(f^*\), routes the same \(r\) units on paths that are at least as long. Hence \[ \sum_{e \in E} (f_e^* - f_e) c_e(f_e) \geq 0. \tag{*} \]

Part 2 (the Pigou bound, edge by edge). Fix an edge \(e\) and apply Definition 7.6 with \(c = c_e\), \(r = f_e\), and \(x = f_e^*\); rearranging gives \[ f_e^* \cdot c_e(f_e^*) \geq \frac{1}{\alpha(\mathcal{C})} \cdot f_e \cdot c_e(f_e) + (f_e^* - f_e) \cdot c_e(f_e). \] Summing over all edges, \[ C(f^*) \geq \frac{1}{\alpha(\mathcal{C})} \cdot C(f) + \underbrace{\sum_e (f_e^* - f_e) c_e(f_e)}_{\geq 0 \text{ by } (*)}. \]

Thus \(C(f)/C(f^*) \leq \alpha(\mathcal{C})\).

7.4 Braess’s Paradox and Network Over-Provisioning

Braess’s paradox (Example 1.12) shows that adding capacity to a network can increase equilibrium cost. This raises a natural question: can we quantify how bad the paradox can be?

For nonatomic routing with affine costs, Roughgarden (2006) showed that removing edges can improve the equilibrium cost by at most a factor of \(\frac{4}{3}\) — exactly the PoA bound. More generally, for cost functions of degree \(d\), removing edges can improve costs by at most the Pigou bound for degree-\(d\) polynomials.

The practical implication is network over-provisioning: if we build a network with \(\alpha(\mathcal{C})\) times the optimal capacity, selfish routing achieves at most the optimal cost. This is a much cheaper intervention than trying to control individual routing decisions.


Chapter 8: Congestion Games and Load Balancing

Nonatomic routing models a continuum of infinitesimal players. In many settings — packet routing on the Internet, job scheduling on parallel machines, players choosing resources in a shared system — players are large enough that each individual’s choice measurably affects congestion. These are atomic games, and their theory differs significantly from the nonatomic case.

8.1 Atomic Routing and Rosenthal’s Potential

In an atomic selfish routing game, there are \(k\) players, each choosing an \(s_i\)-\(t_i\) path. Player \(i\)’s cost is the sum of edge costs along its chosen path, where each edge’s cost depends on the total number of players using it.

Theorem 8.1 (Rosenthal, 1973). Every atomic selfish routing game with arbitrary real-valued cost functions has at least one pure Nash equilibrium.
Proof. Define the Rosenthal potential function: \[ \Phi(f) = \sum_{e \in E} \sum_{i=1}^{f_e} c_e(i), \] where \(f_e\) is the number of players using edge \(e\). This is the "area under the cost curve" — the sum of the first \(f_e\) values of \(c_e\), not \(f_e \cdot c_e(f_e)\). Suppose player \(i\) switches from path \(P_i\) to \(\hat{P}_i\), changing the flow from \(f\) to \(\hat{f}\). The terms for edges used by both paths or by neither cancel, and the change in potential is exactly the change in player \(i\)'s cost: \[ \Phi(\hat{f}) - \Phi(f) = \sum_{e \in \hat{P}_i} c_e(\hat{f}_e) - \sum_{e \in P_i} c_e(f_e) = C_i(\hat{P}_i, f_{-i}) - C_i(P_i, f_{-i}). \]

The single function \(\Phi\) simultaneously tracks every player’s incentive to deviate. A flow minimizing \(\Phi\) (which exists since there are finitely many flows) is a PNE: no player can decrease \(\Phi\), so no player can decrease its cost.

This proof extends beyond routing to congestion games: any game where players choose subsets of resources, each resource has a cost depending on its load, and players pay the sum of costs on their chosen resources.

Definition 8.2 (Congestion Game). A congestion game consists of:
  • \(k\) players, a set \(E\) of resources, and for each resource \(e\), a cost function \(c_e : \{1, \ldots, k\} \to \mathbb{R}\).
  • Each player \(i\) has a strategy set \(\mathcal{S}_i \subseteq 2^E\), where each strategy is a subset of resources.
  • Player \(i\)'s cost is \(C_i(\mathbf{s}) = \sum_{e \in s_i} c_e(f_e)\), where \(f_e = |\{j : e \in s_j\}|\).

Rosenthal’s theorem applies to all congestion games: the potential \(\Phi(\mathbf{s}) = \sum_e \sum_{i=1}^{f_e} c_e(i)\) guarantees PNE existence. Moreover, better-response dynamics — repeatedly letting some unhappy player switch to a better strategy — always converge to a PNE in finite steps, since \(\Phi\) strictly decreases at each step and can only take finitely many values. (The convergence time, however, can be exponential — see Chapter 12.)
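Better-response dynamics in a finite congestion game can be sketched as follows; the representation (strategies as tuples of resource names, costs as per-load lists) and all names are illustrative assumptions.

```python
def better_response_dynamics(strategies, costs, profile):
    """Better-response dynamics in a congestion game (sketch).

    strategies[i]: the allowed resource subsets (tuples) for player i.
    costs[e]: list where costs[e][k] is resource e's cost with k+1 users.
    Runs until no player has a profitable deviation, i.e. a PNE.
    Returns the final profile and its Rosenthal potential."""
    def loads(prof):
        f = {}
        for strat in prof:
            for e in strat:
                f[e] = f.get(e, 0) + 1
        return f

    def player_cost(i, prof):
        f = loads(prof)
        return sum(costs[e][f[e] - 1] for e in prof[i])

    def potential(prof):
        # Rosenthal potential: sum_e sum_{i=1}^{f_e} c_e(i)
        return sum(sum(costs[e][i] for i in range(fe))
                   for e, fe in loads(prof).items())

    improved = True
    while improved:
        improved = False
        for i in range(len(profile)):
            cur = player_cost(i, profile)
            for s in strategies[i]:
                cand = profile[:i] + [s] + profile[i + 1:]
                if player_cost(i, cand) < cur:   # beneficial switch
                    profile = cand               # potential strictly drops
                    improved = True
                    break
    return profile, potential(profile)
```

With two players, resources a (costs 1 then 2) and b (costs 1.5 then 3), and both players starting on a, one player switches to b and the dynamics stop at the equilibrium with one player per resource.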

8.2 PoA for Atomic Routing

The PoA for atomic routing with affine cost functions \(c_e(x) = a_e x + b_e\) was pinned down by Christodoulou and Koutsoupias (2005).

Theorem 8.3 (Christodoulou-Koutsoupias, 2005). In every atomic selfish routing game with affine cost functions, the PoA (for pure Nash equilibria) is at most \(\frac{5}{2}\). This bound is tight.

The bound of \(\frac{5}{2}\) is significantly worse than the nonatomic bound of \(\frac{4}{3}\) for the same cost functions. The reason is the “lumpiness” of atomic players: when a large player switches routes, it causes a discrete jump in congestion that affects many other players. The proof follows the canonical four-step PoA recipe: (1) invoke the PNE condition for each player, (2) sum over players, (3) disentangle the cross-terms using algebraic bounds specific to affine costs (\(\sum_i C_i(s_i^*, \mathbf{s}_{-i}) \leq \frac{5}{3} \text{cost}(\mathbf{s}^*) + \frac{1}{3} \text{cost}(\mathbf{s})\)), (4) solve for the PoA.
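For small finite games, the pure-Nash PoA can be computed by brute force. The sketch below enumerates all profiles of a hypothetical two-player congestion game in which player 1 is restricted to a single resource; the bad equilibrium rests on a weak (tie) best response, a small-scale instance of the "lumpiness" just described.

```python
from itertools import product

def pure_nash_poa(strategies, costs):
    """Brute-force the pure-Nash PoA of a finite congestion game (sketch).

    strategies[i]: allowed resource subsets (tuples) for player i.
    costs[e](k): cost of resource e when k players use it."""
    def cost_of(i, s):
        load = {}
        for strat in s:
            for e in strat:
                load[e] = load.get(e, 0) + 1
        return sum(costs[e](load[e]) for e in s[i])

    def social(s):
        return sum(cost_of(i, s) for i in range(len(s)))

    def is_pne(s):
        # no player can strictly decrease its cost by a unilateral move
        return all(cost_of(i, s) <= cost_of(i, s[:i] + (d,) + s[i + 1:])
                   for i in range(len(s)) for d in strategies[i])

    profiles = list(product(*strategies))
    opt = min(social(s) for s in profiles)
    worst_eq = max(social(s) for s in profiles if is_pne(s))
    return worst_eq / opt
```

With \(c_a(x) = x\), \(c_b(x) = 2\), player 1 forced onto \(a\) and player 2 free to choose, the profile where both sit on \(a\) is a (weak) equilibrium of social cost 4 while the optimum is 3, giving PoA \(4/3\), comfortably below the \(5/2\) bound.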

8.3 Load Balancing Games

A fundamental special case of congestion games is load balancing: \(k\) jobs must be assigned to \(m\) machines, where machine \(j\) has speed \(s_j\). Each job \(i\) has a weight \(w_i\), and the load on machine \(j\) is \(\ell_j = \sum_{i : \text{job } i \to j} w_i / s_j\). Each job’s cost is the load on its machine.

For identical machines (\(s_j = 1\) for all \(j\)), this is a congestion game with one resource per machine, and Rosenthal’s theorem guarantees PNE existence. The PoA measures how much worse the equilibrium makespan is compared to optimal scheduling.

Theorem 8.4. For load balancing on \(m\) identical machines:
  • The PoA for the makespan objective is at most \(2 - \frac{2}{m+1}\) for pure Nash equilibria, and \(\Theta(\frac{\log m}{\log \log m})\) for mixed Nash equilibria.
  • The PoS is 1: some optimal assignment is a pure Nash equilibrium (an optimal assignment can be converted into one without increasing the makespan).

For related machines (machines with different speeds), the situation is more nuanced. As discussed in Section 5.6, the makespan-minimization problem on related machines is closely connected to mechanism design: each machine’s speed is private information, and the natural greedy allocation rule is monotone, enabling a DSIC mechanism.

8.4 Cost-Sharing and Network Design Games

In the games studied so far — routing and load balancing — players impose negative externalities on each other: more congestion means higher costs. Some games exhibit positive externalities, where additional players reduce costs. The canonical example is network cost-sharing.

Definition 8.5 (Network Cost-Sharing Game). A network cost-sharing game has a graph \(G = (V,E)\) with edge costs \(\gamma_e \geq 0\). Player \(i\) needs to connect \(s_i\) to \(t_i\) by choosing a path \(P_i\). The cost of each used edge is split equally among its users: \[ C_i(\mathbf{P}) = \sum_{e \in P_i} \frac{\gamma_e}{f_e}, \] where \(f_e = |\{j : e \in P_j\}|\). The social cost is the total cost of the formed network: \(\text{cost}(\mathbf{P}) = \sum_{e : f_e \geq 1} \gamma_e\).

This is a congestion game (with decreasing per-player costs), so pure Nash equilibria exist. However, the PoA can be terrible: in the “VHS vs. Betamax” example, two technologies (edges) with costs \(1 + \varepsilon\) and \(k\) lead to a bad equilibrium where everyone picks the expensive option, giving PoA = \(k\) — linear in the number of players. The PoA is unbounded even for games with a unique Nash equilibrium.

This motivates studying the price of stability (PoS): how good is the best equilibrium?

Theorem 8.6 (Anshelevich et al., 2008). In every network cost-sharing game with \(k\) players, there exists a pure Nash equilibrium with cost at most \(\mathcal{H}_k = \sum_{i=1}^k \frac{1}{i} = \Theta(\ln k)\) times the optimal cost. This bound is tight.
Proof. Network cost-sharing games are potential games with Rosenthal potential: \[ \Phi(\mathbf{P}) = \sum_{e \in E} \gamma_e \sum_{i=1}^{f_e} \frac{1}{i} = \sum_{e \in E} \gamma_e \cdot \mathcal{H}_{f_e}. \] Let \(\mathbf{P}\) minimize \(\Phi\) (a PNE) and \(\mathbf{P}^*\) minimize cost. Writing \(\Phi_e = \gamma_e \cdot \mathcal{H}_{f_e}\) for the edge terms of \(\Phi\), every used edge satisfies \(\gamma_e \leq \Phi_e \leq \gamma_e \cdot \mathcal{H}_k\), so: \[ \text{cost}(\mathbf{P}) \leq \Phi(\mathbf{P}) \leq \Phi(\mathbf{P}^*) \leq \mathcal{H}_k \cdot \text{cost}(\mathbf{P}^*). \] The first and last inequalities relate cost to potential; the middle one uses the fact that \(\mathbf{P}\) minimizes the potential.

The “opt-out” example shows this is tight: \(k\) players share a common destination, with a shared edge of cost \(1 + \varepsilon\) and individual direct edges of cost \(1/i\) for player \(i\). The unique equilibrium (everyone opts out) has cost \(\mathcal{H}_k\), while the optimum (everyone shares) has cost \(1 + \varepsilon\).
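The cascade of deviations in the opt-out example is easy to simulate; the sketch below (hypothetical names) runs better responses from the all-share profile and confirms that everyone ends up on their own edge, at total cost \(\mathcal{H}_k\).

```python
def opt_out_equilibrium(k, eps=0.01):
    """Better-response cascade in the 'opt-out' example (sketch).

    Player i = 1..k chooses its own edge of cost 1/i ('out') or a
    shared edge of cost 1 + eps split equally among those sharing.
    Returns (equilibrium cost, optimal cost)."""
    inside = set(range(1, k + 1))          # start with everyone sharing
    changed = True
    while changed:
        changed = False
        for i in sorted(inside, reverse=True):
            if (1 + eps) / len(inside) > 1 / i:   # own edge is cheaper
                inside.remove(i)                   # player i opts out
                changed = True
    eq_cost = sum(1 / i for i in range(1, k + 1) if i not in inside)
    if inside:
        eq_cost += 1 + eps
    opt_cost = 1 + eps                      # everyone shares
    return eq_cost, opt_cost
```

Player \(k\) deviates first (its share \((1+\varepsilon)/k\) exceeds \(1/k\)), which raises everyone else's share and triggers the next deviation, and so on down to player 1.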


Chapter 9: The Smooth Games Framework

The PoA proofs in Chapters 7–8 share a common four-step structure: (1) invoke the equilibrium condition per player with a hypothetical deviation to the optimal strategy, (2) sum over players, (3) disentangle the cross-term \(\sum_i C_i(s_i^*, \mathbf{s}_{-i})\) and bound it in terms of \(\text{cost}(\mathbf{s})\) and \(\text{cost}(\mathbf{s}^*)\), (4) solve for the PoA. Roughgarden (2009) observed that step 3 — the “disentanglement” step — never actually uses the fact that \(\mathbf{s}\) is a Nash equilibrium. This seemingly minor observation has a powerful consequence: any PoA bound proved via this recipe automatically extends to coarse correlated equilibria, approximate equilibria, and no-regret learning outcomes.

9.1 Smoothness Definition

Definition 9.1 (Smooth Game). A cost-minimization game is \((\lambda, \mu)\)-smooth (with \(\lambda > 0\) and \(\mu < 1\)) if for some optimal outcome \(\mathbf{s}^*\) and all strategy profiles \(\mathbf{s}\): \[ \sum_{i=1}^k C_i(s_i^*, \mathbf{s}_{-i}) \leq \lambda \cdot \text{cost}(\mathbf{s}^*) + \mu \cdot \text{cost}(\mathbf{s}). \] For payoff-maximization games, smoothness requires: \[ \sum_{i=1}^k \pi_i(s_i^*, \mathbf{s}_{-i}) \geq \lambda \cdot V(\mathbf{s}^*) - \mu \cdot V(\mathbf{s}). \]

The smoothness condition captures precisely the “disentanglement” step. For atomic routing with affine costs, the algebraic bound \(\sum_i C_i(s_i^*, \mathbf{s}_{-i}) \leq \frac{5}{3}\text{cost}(\mathbf{s}^*) + \frac{1}{3}\text{cost}(\mathbf{s})\) establishes \((\frac{5}{3}, \frac{1}{3})\)-smoothness. For location games (a class of payoff-maximization games studied by Vetta), the bound establishes \((1,1)\)-smoothness.

9.2 Robust Price of Anarchy

The payoff of the smooth games framework is that PoA bounds extend automatically to much larger equilibrium sets.

Theorem 9.2 (Roughgarden, 2009). In every \((\lambda, \mu)\)-smooth cost-minimization game, the PoA of coarse correlated equilibria is at most \(\frac{\lambda}{1 - \mu}\). Similarly, in every \((\lambda, \mu)\)-smooth payoff-maximization game, every coarse correlated equilibrium achieves at least a \(\frac{\lambda}{1 + \mu}\) fraction of the optimal welfare.
Proof. Let \(\sigma\) be a CCE and \(\mathbf{s}^*\) an optimal outcome. Starting from the CCE condition — for each player \(i\), the unconditional expected cost of following \(\sigma\) is at most that of deviating to \(s_i^*\): \[ \mathbf{E}_{\mathbf{s} \sim \sigma}[\text{cost}(\mathbf{s})] \leq \mathbf{E}_{\mathbf{s} \sim \sigma}\left[\sum_i C_i(\mathbf{s})\right] \leq \sum_i \mathbf{E}_{\mathbf{s} \sim \sigma}[C_i(s_i^*, \mathbf{s}_{-i})]. \] The first inequality uses \(\text{cost}(\mathbf{s}) \leq \sum_i C_i(\mathbf{s})\) and the second uses the CCE condition for each \(i\). Now apply the smoothness condition (which holds for *all* \(\mathbf{s}\), hence under expectation): \[ \leq \mathbf{E}_{\mathbf{s} \sim \sigma}[\lambda \cdot \text{cost}(\mathbf{s}^*) + \mu \cdot \text{cost}(\mathbf{s})] = \lambda \cdot \text{cost}(\mathbf{s}^*) + \mu \cdot \mathbf{E}[\text{cost}(\mathbf{s})]. \] Rearranging: \((1 - \mu) \cdot \mathbf{E}[\text{cost}(\mathbf{s})] \leq \lambda \cdot \text{cost}(\mathbf{s}^*)\).

Since the equilibrium hierarchy satisfies PNE ⊂ MNE ⊂ CE ⊂ CCE, the PoA for PNE is at most the PoA for CCE. But Theorem 9.2 shows the bound is the same for CCE as we proved for PNE! The PoA bounds of \(\frac{5}{2}\) for atomic routing and \(\frac{4}{3}\) for nonatomic routing with affine costs extend unchanged to CCE. This is remarkable: CCE is a much larger set, yet the worst case is no worse.

The theorem also extends to \(\varepsilon\)-approximate Nash equilibria (where no player can reduce cost by more than a \((1+\varepsilon)\) factor): the PoA becomes \(\frac{(1+\varepsilon)\lambda}{1 - \mu(1+\varepsilon)}\) for \(\varepsilon < \frac{1}{\mu} - 1\).

9.3 Non-Truthful Mechanisms and Simultaneous Item-Bidding

The smooth games framework applies beyond routing to non-truthful mechanisms — auction formats where bidders play strategically rather than truthfully.

Bhawalkar and Roughgarden (2011) studied simultaneous second-price item-bidding auctions: \(m\) items are sold in parallel second-price auctions, and bidders with subadditive valuations bid independently on each item. This is not DSIC (a bidder’s optimal bid on one item depends on whether it wins others), but the resulting game has Nash equilibria.

Theorem 9.3 (Bhawalkar-Roughgarden, 2011). In simultaneous second-price auctions with subadditive bidders, under the standard no-overbidding assumption, the PoA for coarse correlated equilibria is at most 2.

This bound of 2 applies to the welfare (not revenue) of equilibrium outcomes. It says that even though simultaneous item-bidding is not DSIC, the equilibrium welfare is at least half of optimal — achieved through a smoothness argument. For the more restrictive class of submodular bidders, a pure NE always exists and can be computed by a greedy algorithm (assigning items to the bidder with the highest marginal value).

9.4 Unifying Theme

The smooth games framework reveals a deep unity across seemingly disparate PoA results. The following table summarizes the key instances:

  Game                                          | Smoothness                             | PoA bound                     | Applies to
  Nonatomic routing, affine costs               | Not directly smooth (continuous game)  | \(4/3\)                       | Wardrop equilibria
  Atomic routing, affine costs                  | \((5/3, 1/3)\)-smooth                  | \(5/2\)                       | PNE, MNE, CE, CCE
  Location games                                | \((1,1)\)-smooth                       | \(1/2\) (welfare)             | PNE, MNE, CE, CCE
  Simultaneous 2nd-price auctions (subadditive) | \((1,1)\)-smooth                       | \(1/2\) (welfare)             | PNE, MNE, CE, CCE
  Network cost-sharing                          | Not smooth (PoA is \(\Theta(k)\))      | \(\mathcal{H}_k\) (PoS only)  | Best PNE

The last row illustrates a limitation: smoothness gives PoA bounds, but network cost-sharing has unbounded PoA. The logarithmic price of stability bound in Theorem 8.6 requires the potential function argument, which is not captured by the smoothness framework.


Chapter 10: Best-Response and No-Regret Dynamics

Equilibrium concepts are useful for analyzing the steady states of strategic interaction, but they leave open a basic question: how do players reach an equilibrium? This chapter studies natural learning dynamics — algorithms that individual players can use to adapt their strategies over time — and connects these dynamics to the equilibrium concepts of Chapter 1. The central result is that no-regret learning converges to the set of coarse correlated equilibria, closing the loop with the smooth games framework: if players learn via no-regret algorithms in a smooth game, the long-run outcome is approximately as efficient as any Nash equilibrium.

10.1 Best-Response Dynamics in Potential Games

In a potential game (Section 8.1), the simplest learning dynamic is better-response dynamics: in each round, some player with a profitable deviation switches to a better strategy.

Theorem 10.1. In every finite potential game, better-response dynamics converge to a pure Nash equilibrium in finite steps.
Proof. The Rosenthal potential \(\Phi\) strictly decreases whenever a player makes a beneficial switch. Since the game is finite, \(\Phi\) takes only finitely many values, and the dynamics must terminate at a local minimum of \(\Phi\) — a PNE.

The convergence time, however, can be exponential. For congestion games with linear cost functions, Ackermann, Röglin, and Vöcking (2008) showed that better-response dynamics can take an exponential number of steps even for simple network structures. This exponential convergence time motivates the search for faster dynamics that converge to approximate equilibria.

For symmetric congestion games with \(n\) players and \(m\) resources, better-response dynamics find a \((1+\varepsilon)\)-approximate PNE in time \(\text{poly}(n, m, 1/\varepsilon)\) via a careful potential-based argument. In smooth potential games, the approximate PNE found by such dynamics is automatically near-optimal by the smooth games framework (Theorem 9.2 extended to \(\varepsilon\)-PNE).

10.2 The Multiplicative Weights Algorithm

We now introduce a far more general approach that does not require a potential function. The multiplicative weights algorithm (also known as Hedge, or the randomized weighted majority algorithm) is an online learning algorithm that works in any repeated game.

The setup: a player faces \(T\) rounds. In each round \(t\), the player chooses an action \(i_t \in \{1, \ldots, n\}\) and incurs a cost \(c_t(i_t) \in [0,1]\). The costs are revealed after the player’s choice. The player’s goal is to minimize regret: the difference between its total cost and the total cost of the best single action in hindsight.

Definition 10.2 (External Regret). The external regret of a sequence of actions \(i_1, \ldots, i_T\) is \[ R_T = \sum_{t=1}^T c_t(i_t) - \min_{i \in \{1,\ldots,n\}} \sum_{t=1}^T c_t(i). \] An algorithm has no regret if \(R_T / T \to 0\) as \(T \to \infty\).
Definition 10.3 (Multiplicative Weights). Fix a learning rate \(\varepsilon \in (0, 1/2)\). Initialize weights \(w_1(i) = 1\) for all actions \(i\). In each round \(t\):
  1. Play action \(i_t\) drawn from the distribution \(p_t(i) = w_t(i) / \sum_j w_t(j)\).
  2. Observe costs \(c_t(i)\) for all \(i\).
  3. Update weights: \(w_{t+1}(i) = w_t(i) \cdot (1 - \varepsilon)^{c_t(i)}\).

The update rule penalizes actions that incurred high cost: their weights decrease multiplicatively, making them less likely to be chosen in the future. Actions with low cost retain their weights.

Theorem 10.4 (Regret Bound). With learning rate \(\varepsilon \in (0, 1/2)\) and costs in \([0,1]\), the Multiplicative Weights algorithm guarantees expected regret at most \[ R_T \leq \varepsilon T + \frac{\ln n}{\varepsilon}. \] Choosing \(\varepsilon = \sqrt{\ln n / T}\) gives \(R_T \leq 2\sqrt{T \ln n}\), so the per-round regret \(R_T / T = O(\sqrt{\ln n / T}) \to 0\).
Proof sketch. Track the total weight \(W_t = \sum_i w_t(i)\). Initially \(W_1 = n\). After round \(t\): \[ W_{t+1} = \sum_i w_t(i)(1-\varepsilon)^{c_t(i)} \leq W_t \cdot \sum_i p_t(i)(1 - \varepsilon \cdot c_t(i)) = W_t(1 - \varepsilon \cdot \mathbf{E}[c_t]). \] After \(T\) rounds: \(W_{T+1} \leq n \cdot \prod_t (1 - \varepsilon \cdot \mathbf{E}[c_t])\). For the best action \(i^*\): \(W_{T+1} \geq w_{T+1}(i^*) = (1-\varepsilon)^{\sum_t c_t(i^*)}\). Taking logarithms and using \(\ln(1-x) \leq -x\) and \(\ln(1-\varepsilon) \geq -\varepsilon - \varepsilon^2\) gives the bound.
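The algorithm of Definition 10.3 is a few lines of code. The sketch below tracks the expected cost of the sampling distribution \(p_t\) rather than drawing actions, which makes the regret computation deterministic; the function name is an assumption.

```python
import math

def multiplicative_weights(cost_rows, eps):
    """The Multiplicative Weights / Hedge algorithm (sketch).

    cost_rows[t][i] is the cost of action i in round t, in [0, 1].
    Returns (expected_cost, best_fixed): the algorithm's total expected
    cost and the hindsight cost of the best single action."""
    n = len(cost_rows[0])
    w = [1.0] * n
    expected = 0.0
    for costs in cost_rows:
        total = sum(w)
        p = [wi / total for wi in w]                      # play from p_t
        expected += sum(pi * ci for pi, ci in zip(p, costs))
        # multiplicative update: penalize high-cost actions
        w = [wi * (1 - eps) ** ci for wi, ci in zip(w, costs)]
    best_fixed = min(sum(row[i] for row in cost_rows) for i in range(n))
    return expected, best_fixed
```

On any cost sequence, Theorem 10.4 guarantees that the gap `expected - best_fixed` is at most \(\varepsilon T + \ln n / \varepsilon\), even against an adversarial sequence such as alternating costs \((1,0), (0,1), \ldots\).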

The Multiplicative Weights algorithm is remarkably versatile. It is the algorithmic backbone of boosting in machine learning (Freund and Schapire’s AdaBoost), approximate solutions to linear programs and semidefinite programs, and — as we see next — convergence to equilibria in games.

10.3 No-Regret Implies Convergence to CCE

The connection between no-regret learning and equilibria is one of the most beautiful results in algorithmic game theory.

Theorem 10.5. In a repeated game, if every player uses a no-regret algorithm (such as Multiplicative Weights), then the empirical distribution of play converges to the set of coarse correlated equilibria.
Proof. Let \(\sigma^T\) be the empirical distribution: the uniform distribution over the \(T\) strategy profiles actually played in rounds \(1, \ldots, T\). For each player \(i\), the no-regret guarantee says that for every fixed action \(s_i'\): \[ \frac{1}{T} \sum_{t=1}^T C_i(\mathbf{s}^t) \leq \frac{1}{T} \sum_{t=1}^T C_i(s_i', \mathbf{s}_{-i}^t) + \frac{R_T^i}{T}. \] The left side is \(\mathbf{E}_{\mathbf{s} \sim \sigma^T}[C_i(\mathbf{s})]\); the right side approaches \(\mathbf{E}_{\mathbf{s} \sim \sigma^T}[C_i(s_i', \mathbf{s}_{-i})]\) as \(R_T^i / T \to 0\). In the limit, \(\sigma^T\) satisfies the CCE condition (Definition 1.9) for every player.

Combined with the smooth games framework (Theorem 9.2), this yields a powerful corollary: in any \((\lambda, \mu)\)-smooth game, if all players use no-regret algorithms, the time-averaged social cost is at most \(\frac{\lambda}{1-\mu}\) times optimal — the same bound as for pure Nash equilibria. Players don’t need to know the game structure, the number of players, or even that they are in a game. They just need to minimize their own regret.

10.4 Swap Regret and Correlated Equilibria

External regret compares the player’s performance to the best fixed action. A stronger notion is swap regret, which compares to the best function mapping actions to actions.

Definition 10.6 (Swap Regret). The swap regret of a sequence of actions is \[ R_T^{\text{swap}} = \max_{\pi : S_i \to S_i} \sum_{t=1}^T \left[c_t(i_t) - c_t(\pi(i_t))\right], \] where the maximum is over all functions \(\pi\) from actions to actions.

Swap regret is strictly stronger than external regret: choosing \(\pi\) to be a constant function \(\pi(\cdot) = i^*\) recovers external regret. No-swap-regret learning converges to the set of correlated equilibria (not just coarse correlated):

Theorem 10.7 (Blum-Mansour, 2007). If every player uses a no-swap-regret algorithm, the empirical distribution of play converges to the set of correlated equilibria.

The Blum-Mansour construction converts any no-external-regret algorithm into a no-swap-regret algorithm using a black-box reduction. The idea: maintain \(n\) copies of the external-regret algorithm (one for each action), and when action \(i\) is played, feed the costs to the \(i\)-th copy. The overall strategy is the stationary distribution of a Markov chain defined by the \(n\) copies’ recommendations. The swap regret of the combined algorithm is bounded by the sum of external regrets of the copies, giving per-round swap regret \(O(n \sqrt{\ln n / T})\).
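To make the gap between the two regret notions concrete, here is a small sketch that computes both regrets for a given play history. It uses the fact that the maximizing switching function \(\pi\) in Definition 10.6 can be chosen independently for each action; the helper names and the example data are ours.

```python
def external_regret(plays, cost_rounds):
    """Cost incurred minus the cost of the best fixed action in hindsight."""
    n = len(cost_rounds[0])
    incurred = sum(c[a] for a, c in zip(plays, cost_rounds))
    best_fixed = min(sum(c[i] for c in cost_rounds) for i in range(n))
    return incurred - best_fixed

def swap_regret(plays, cost_rounds):
    """Cost incurred minus the cost of the best switching function pi."""
    n = len(cost_rounds[0])
    # totals[a][j]: total cost of action j over the rounds in which a was played.
    totals = [[0.0] * n for _ in range(n)]
    incurred_per_action = [0.0] * n
    for a, c in zip(plays, cost_rounds):
        incurred_per_action[a] += c[a]
        for j in range(n):
            totals[a][j] += c[j]
    # The optimal pi picks, independently for each action a, the replacement
    # pi(a) minimizing the total cost on a's rounds.
    return sum(incurred_per_action[a] - min(totals[a]) for a in range(n))
```

On any history, swap regret is at least external regret, since constant functions \(\pi(\cdot) = i^*\) are among the candidates.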


Chapter 11: Zero-Sum Games and the Minimax Theorem

The theory of no-regret learning yields a striking algorithmic proof of the most fundamental theorem in game theory: von Neumann’s minimax theorem for two-player zero-sum games.

11.1 Two-Player Zero-Sum Games

Definition 11.1 (Zero-Sum Game). A two-player zero-sum game is defined by an \(m \times n\) payoff matrix \(A\), where entry \(a_{ij}\) is the payoff to the row player (and \(-a_{ij}\) to the column player) when the row player plays \(i\) and the column player plays \(j\).

The row player wants to maximize \(a_{ij}\); the column player wants to minimize it. Each player can randomize: the row player chooses a distribution \(\mathbf{p}\) over rows, the column player chooses \(\mathbf{q}\) over columns, and the expected payoff is \(\mathbf{p}^T A \mathbf{q}\).

The maximin value is \(\max_{\mathbf{p}} \min_{\mathbf{q}} \mathbf{p}^T A \mathbf{q}\): the best guarantee the row player can secure, assuming the column player responds optimally. The minimax value is \(\min_{\mathbf{q}} \max_{\mathbf{p}} \mathbf{p}^T A \mathbf{q}\): the least the column player can be forced to concede.

A simple inequality always holds: \(\max_{\mathbf{p}} \min_{\mathbf{q}} \mathbf{p}^T A \mathbf{q} \leq \min_{\mathbf{q}} \max_{\mathbf{p}} \mathbf{p}^T A \mathbf{q}\) — having to commit to a strategy first can only hurt. Von Neumann’s theorem says these two values are equal.

11.2 Von Neumann’s Minimax Theorem

Theorem 11.2 (Von Neumann, 1928). For every two-player zero-sum game with payoff matrix \(A\): \[ \max_{\mathbf{p} \in \Delta_m} \min_{\mathbf{q} \in \Delta_n} \mathbf{p}^T A \mathbf{q} = \min_{\mathbf{q} \in \Delta_n} \max_{\mathbf{p} \in \Delta_m} \mathbf{p}^T A \mathbf{q}, \] where \(\Delta_k\) denotes the probability simplex in \(\mathbb{R}^k\).

There are many proofs: von Neumann’s original used Brouwer’s fixed-point theorem; later proofs use LP duality (the maximin and minimax problems are dual linear programs). We give an algorithmic proof via no-regret learning.

Proof via Multiplicative Weights. Run the repeated game for \(T\) rounds. In each round \(t\), the row player uses Multiplicative Weights to produce a mixed strategy \(\mathbf{p}_t\), and the column player best-responds with the pure strategy \(j_t\) minimizing \(\mathbf{p}_t^T A \mathbf{e}_{j}\). Let \(\bar{\mathbf{p}} = \frac{1}{T}\sum_t \mathbf{p}_t\) and \(\bar{\mathbf{q}} = \frac{1}{T}\sum_t \mathbf{e}_{j_t}\) denote the time-averaged strategies. The no-regret guarantee for the row player gives \[ \frac{1}{T}\sum_t \mathbf{p}_t^T A \mathbf{e}_{j_t} \geq \max_{\mathbf{p}} \frac{1}{T} \sum_t \mathbf{p}^T A \mathbf{e}_{j_t} - \frac{R_T}{T} = \max_{\mathbf{p}} \mathbf{p}^T A \bar{\mathbf{q}} - \frac{R_T}{T}. \] Because the column player best-responds in every round, \[ \frac{1}{T}\sum_t \mathbf{p}_t^T A \mathbf{e}_{j_t} = \frac{1}{T}\sum_t \min_{\mathbf{q}} \mathbf{p}_t^T A \mathbf{q} \leq \min_{\mathbf{q}} \bar{\mathbf{p}}^T A \mathbf{q}. \] Chaining the two displays yields \[ \max_{\mathbf{p}} \mathbf{p}^T A \bar{\mathbf{q}} \leq \min_{\mathbf{q}} \bar{\mathbf{p}}^T A \mathbf{q} + \frac{R_T}{T}. \]

As \(T \to \infty\), \(R_T/T \to 0\), so \(\max_{\mathbf{p}} \min_{\mathbf{q}} \leq \min_{\mathbf{q}} \max_{\mathbf{p}}\). Combined with the opposite inequality (which always holds), we get equality.

This proof is constructive: Multiplicative Weights computes an approximate minimax equilibrium in \(O(\frac{\ln n}{\varepsilon^2})\) rounds for an \(\varepsilon\)-approximate solution — polynomial in the game size and \(1/\varepsilon\). This is in dramatic contrast to computing Nash equilibria of general games, which is PPAD-hard.
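The construction can be sketched in a few lines. This is a minimal illustration of the proof, assuming payoffs in \([-1, 1]\) (rescaled to costs in \([0, 1]\) for the weight update); the function name and interface are ours.

```python
import math

def approx_minimax(A, T):
    """Row player runs Multiplicative Weights; column player best-responds.

    A[i][j] is the row player's payoff in [-1, 1]. Returns the time-averaged
    strategies (p_bar, q_bar), an approximate minimax pair.
    """
    m, n = len(A), len(A[0])
    eps = math.sqrt(math.log(m) / T)
    weights = [1.0] * m
    p_bar, q_bar = [0.0] * m, [0.0] * n
    for _ in range(T):
        total = sum(weights)
        p = [w / total for w in weights]
        # Column player best-responds: minimize the row player's expected payoff.
        j = min(range(n), key=lambda col: sum(p[i] * A[i][col] for i in range(m)))
        q_bar[j] += 1 / T
        for i in range(m):
            p_bar[i] += p[i] / T
            cost = (1 - A[i][j]) / 2       # payoff in [-1,1] -> cost in [0,1]
            weights[i] *= (1 - eps) ** cost
    return p_bar, q_bar
```

On Rock-Paper-Scissors (value 0), the duality gap of \((\bar{\mathbf{p}}, \bar{\mathbf{q}})\) shrinks as \(O(\sqrt{\ln m / T})\).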

The minimax theorem also implies the existence of a mixed Nash equilibrium in any finite two-player zero-sum game (the maximin strategy for each player), predating Nash’s general existence theorem (1950) by 22 years.


Chapter 12: Computing Equilibria

We have seen that Nash equilibria exist in all finite games (Theorem 1.6), that pure Nash equilibria exist in all congestion games (Theorem 8.1), and that correlated equilibria can be computed efficiently via linear programming. This chapter addresses the computational complexity of finding equilibria: which equilibrium concepts are tractable, and which are fundamentally hard?

12.1 PLS-Completeness and Pure Nash Equilibria

Better-response dynamics converge to a pure Nash equilibrium in potential games — but the convergence can take exponentially many steps. This exponential behavior is not an artifact of a particular algorithm but reflects a deep computational barrier.

Definition 12.1 (PLS). The complexity class PLS (Polynomial Local Search) consists of optimization problems where:
  1. Feasible solutions can be verified in polynomial time.
  2. The objective function can be evaluated in polynomial time.
  3. Neighboring solutions (local improvements) can be found in polynomial time.
  4. The goal is to find a local optimum — a solution with no improving neighbor.

Every PLS problem has a solution (since the number of feasible solutions is finite and the objective has a global optimum, which is also a local optimum). But finding a local optimum may require following an exponentially long sequence of improving moves. A problem is PLS-complete if every PLS problem can be reduced to it — implying that no polynomial-time algorithm exists unless PLS = P.

Theorem 12.2 (Fabrikant, Papadimitriou, Talwar, 2004). Computing a pure Nash equilibrium of a congestion game is PLS-complete, even when each player has two strategies.

This result explains why better-response dynamics can take exponential time in congestion games: finding a PNE is, in general, as hard as any local search problem. However, for special cases — symmetric congestion games, singleton congestion games (each strategy is a single resource), or games with a polynomial number of strategies — PNE can be computed efficiently.
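For intuition, here is a sketch of better-response dynamics on a congestion game. On PLS-hard instances the loop below can run for exponentially many iterations, but it always terminates (the Rosenthal potential strictly decreases with each switch), and on small games it converges quickly. The representation — strategies as sets of resources, load-indexed cost tables — is our own.

```python
def better_response_dynamics(strategies, cost_fns, profile):
    """Iterate better responses in a congestion game until a PNE is reached.

    strategies[i]: list of strategies for player i, each a frozenset of resources.
    cost_fns[r][k-1]: per-player cost of resource r when k players use it.
    profile: initial profile, as a list of chosen strategy indices.
    """
    def player_cost(prof, i):
        load = {}
        for p, s in enumerate(prof):
            for r in strategies[p][s]:
                load[r] = load.get(r, 0) + 1
        return sum(cost_fns[r][load[r] - 1] for r in strategies[i][prof[i]])

    improved = True
    while improved:
        improved = False
        for i in range(len(strategies)):
            current = player_cost(profile, i)
            for s in range(len(strategies[i])):
                candidate = profile[:i] + [s] + profile[i + 1:]
                if player_cost(candidate, i) < current:
                    profile = candidate        # strict improvement: switch
                    improved = True
                    break
    return profile                             # no improving move: a PNE
```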

12.2 PPAD and Brouwer’s Fixed Point

For games without a potential function, even the existence of pure Nash equilibria is not guaranteed. Nash’s theorem ensures a mixed Nash equilibrium exists, but this guarantee comes via Brouwer’s fixed-point theorem — a non-constructive existence result. The complexity of finding a Nash equilibrium is captured by a different class.

Definition 12.3 (PPAD). The complexity class PPAD (Polynomial Parity Argument, Directed version) consists of total search problems whose solutions are guaranteed to exist by a parity argument on a directed graph: given an exponentially large directed graph (defined implicitly by polynomial-time successor and predecessor functions) in which every vertex has in-degree and out-degree at most 1, together with a known source vertex (in-degree 0, out-degree 1), find either a sink (out-degree 0) or another source.

The defining problem of PPAD is finding a second “end of a line”: the directed graph has exponentially many nodes, each with in-degree and out-degree at most 1, so its edges decompose into simple paths and cycles. Each path has exactly one source and one sink, so sources and sinks come in pairs; given a known source, another endpoint must exist. But finding it may require following the path, which can be exponentially long.
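The path-following algorithm implicit in this argument is trivial to state — the hardness lies entirely in the path length. A sketch, with a toy successor function standing in for the implicitly defined exponential graph:

```python
def find_sink(source, successor):
    """Follow the directed path from a known source until a sink is reached.

    successor(v) returns the unique out-neighbor of v, or None at a sink.
    Correct but brute-force: in a real PPAD instance the path from the
    source may visit exponentially many vertices before ending.
    """
    v = source
    while successor(v) is not None:
        v = successor(v)
    return v
```

Here the successor function is given explicitly; in a PPAD instance it is a polynomial-size circuit, and the graph it defines is never written down.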

Brouwer’s fixed-point theorem — “every continuous function from a compact convex set to itself has a fixed point” — can be formulated as a PPAD problem. The discrete analogue is Sperner’s lemma: in any properly colored triangulation of a simplex, there exists a “trichromatic” cell. Finding this cell is PPAD-complete.

12.3 PPAD-Completeness of Nash Equilibrium

Theorem 12.4 (Daskalakis-Goldberg-Papadimitriou, 2009; Chen-Deng, 2009). Computing a mixed Nash equilibrium of a two-player game is PPAD-complete.

This is one of the landmark results in algorithmic game theory. Its message is that Nash’s existence theorem, despite being a beautiful mathematical result, does not yield an efficient algorithm. Even for two-player games (the simplest case where Nash equilibria may require randomization), no polynomial-time algorithm is known or believed to exist.

The proof reduces from the Brouwer fixed-point problem. Given any Brouwer instance (a continuous function \(f\) on a simplex), one constructs a two-player game whose Nash equilibria correspond to approximate fixed points of \(f\). The construction uses a sequence of gadgets that encode arithmetic operations, comparisons, and function evaluations as incentive constraints in the game.

The PPAD-completeness of Nash equilibrium stands in sharp contrast to several related problems:

| Problem | Complexity |
| --- | --- |
| Mixed Nash equilibrium (2-player) | PPAD-complete |
| Correlated equilibrium | Polynomial (LP) |
| Coarse correlated equilibrium | Polynomial (no-regret learning) |
| Nash in zero-sum games | Polynomial (LP / multiplicative weights) |
| Pure NE in congestion games | PLS-complete |

12.4 Computing Correlated Equilibria

The tractability of correlated equilibria provides a remarkable contrast. A CE of a game with \(k\) players and strategy sets \(S_1, \ldots, S_k\) is a distribution \(\sigma\) over \(S_1 \times \cdots \times S_k\) satisfying the linear constraints of Definition 1.7. This is a linear feasibility problem with \(\sum_i |S_i|^2\) constraints and \(\prod_i |S_i|\) variables.
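A quick sketch of this size calculation (counting one incentive constraint per ordered pair of distinct strategies, which is slightly tighter than the \(\sum_i |S_i|^2\) bound above; the function name is ours):

```python
from math import prod

def ce_lp_size(strategy_counts):
    """Variables and incentive constraints of the CE feasibility LP."""
    variables = prod(strategy_counts)                        # one per profile
    constraints = sum(s * (s - 1) for s in strategy_counts)  # pairs (s_i, s_i')
    return variables, constraints
```

For ten players with two strategies each this gives \(2^{10} = 1024\) variables but only 20 constraints — the variable count, not the constraint count, is the obstacle.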

While the number of variables can be exponential in \(k\), there are two efficient approaches:

  1. LP with separation oracle. For games where the constraint set has polynomial-size representation, the ellipsoid method can find a CE in polynomial time using a separation oracle for the CE polytope.

  2. No-regret dynamics. As shown in Theorem 10.5, if all players run no-regret algorithms (such as Multiplicative Weights), the empirical distribution converges to a CCE. For swap-regret algorithms (Theorem 10.7), convergence is to CE. This approach is fully distributed — each player only needs to know its own costs — and runs in time polynomial in the game size and \(1/\varepsilon\) for an \(\varepsilon\)-approximate CE.
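A sketch of approach 2 for a two-player game with costs in \([0,1]\), using the fractional (expected-cost) form of Multiplicative Weights; we return the time-average of the product distributions \(p_t \otimes q_t\), which inherits the approximate-CCE property from the players' regret bounds. The function name and the example game in the test are ours.

```python
import math

def mw_dynamics_cce(R, C, T):
    """Both players of a 2x2 game run fractional Multiplicative Weights.

    R[i][j], C[i][j]: costs in [0, 1] for the row and column player.
    Returns sigma, the time-average of the product distributions p_t x q_t,
    which is an approximate coarse correlated equilibrium.
    """
    eps = math.sqrt(math.log(2) / T)
    wr, wc = [1.0, 1.0], [1.0, 1.0]
    sigma = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(T):
        p = [w / sum(wr) for w in wr]
        q = [w / sum(wc) for w in wc]
        for i in range(2):
            for j in range(2):
                sigma[i][j] += p[i] * q[j] / T
        # expected cost of each pure action against the opponent's current mix
        row_costs = [sum(q[j] * R[i][j] for j in range(2)) for i in range(2)]
        col_costs = [sum(p[i] * C[i][j] for i in range(2)) for j in range(2)]
        wr = [w * (1 - eps) ** c for w, c in zip(wr, row_costs)]
        wc = [w * (1 - eps) ** c for w, c in zip(wc, col_costs)]
    return sigma
```

Each player's CCE condition holds for `sigma` up to its per-round regret, roughly \(2\sqrt{\ln 2 / T}\).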

The gap between polynomial CE computation and PPAD-hard Nash computation justifies the focus on CCE in the smooth games framework: not only do CCE yield the same PoA bounds as PNE in smooth games, they are also computationally tractable.


Chapter 13: Market Equilibria

We conclude with a different model of strategic interaction: markets where agents trade goods at prices determined by supply and demand. Unlike the games of Chapters 1–12, markets are governed by the price mechanism rather than by explicit game rules. The central question is whether competitive prices exist and can be computed efficiently.

13.1 Exchange Economies and Walrasian Equilibrium

Definition 13.1 (Fisher Market). A Fisher market consists of:
  • \(n\) buyers, each with a budget \(B_i > 0\) of money.
  • \(m\) divisible goods, each with supply 1.
  • Buyer \(i\) has a utility function \(u_i : \mathbb{R}_{\geq 0}^m \to \mathbb{R}\), where \(u_i(x_i)\) is \(i\)'s utility from receiving bundle \(x_i = (x_{i1}, \ldots, x_{im})\).
Definition 13.2 (Walrasian / Competitive Equilibrium). A Walrasian equilibrium consists of prices \(\mathbf{p} = (p_1, \ldots, p_m)\) and an allocation \(\mathbf{x} = (x_1, \ldots, x_n)\) such that:
  1. Utility maximization: Each buyer \(i\) receives a utility-maximizing bundle subject to its budget constraint: \(x_i \in \arg\max \{u_i(x) : \mathbf{p} \cdot x \leq B_i\}\).
  2. Market clearing: All goods are sold: \(\sum_i x_{ij} = 1\) for all \(j\) (or \(\leq 1\) with \(p_j = 0\) if there is excess supply).

The existence of Walrasian equilibria is a foundational result in mathematical economics, proved by Arrow and Debreu (1954) using Kakutani’s fixed-point theorem — the same tool that underlies Nash’s equilibrium existence theorem. The Arrow-Debreu theorem applies to the more general exchange economy model, where each agent starts with an endowment of goods (rather than money) and trades.

13.2 Computing Market Equilibria

For Fisher markets with linear utilities — \(u_i(x_i) = \sum_j u_{ij} x_{ij}\) — the equilibrium computation problem has a beautiful convex programming formulation.

Theorem 13.3 (Eisenberg-Gale, 1959). In a Fisher market with linear utilities, the equilibrium allocation maximizes the Nash social welfare: \[ \max_{\mathbf{x} \geq 0} \sum_{i=1}^n B_i \ln\left(\sum_j u_{ij} x_{ij}\right) \quad \text{subject to} \quad \sum_i x_{ij} \leq 1 \text{ for all } j. \] This is a convex program (the objective is concave) and can be solved in polynomial time. The equilibrium prices are the dual variables of the supply constraints.

The Eisenberg-Gale program is a remarkable result: it reduces the economic equilibrium problem to a tractable optimization problem. The objective — maximizing the weighted sum of log-utilities — is a Nash bargaining solution, and the dual variables automatically yield market-clearing prices. This provides both an existence proof and an efficient algorithm.
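For linear utilities, the equilibrium conditions of Definition 13.2 are easy to verify directly: a bundle maximizes \(u_i\) subject to the budget iff the buyer exhausts its budget on goods of maximum bang-per-buck \(u_{ij}/p_j\). A sketch of this check, with a hypothetical two-buyer market as the example (all prices assumed positive):

```python
def is_walrasian_equilibrium(utils, budgets, prices, alloc, tol=1e-9):
    """Check Walrasian equilibrium conditions for a linear Fisher market.

    utils[i][j]: buyer i's utility per unit of good j; budgets[i]: money.
    prices[j] > 0: price of good j; alloc[i][j]: amount of good j given to i.
    """
    n, m = len(utils), len(prices)
    for i in range(n):
        spent = sum(prices[j] * alloc[i][j] for j in range(m))
        if abs(spent - budgets[i]) > tol:
            return False                       # budget not exhausted
        best_bpb = max(utils[i][j] / prices[j] for j in range(m))
        for j in range(m):
            if alloc[i][j] > tol and utils[i][j] / prices[j] < best_bpb - tol:
                return False                   # money spent on a sub-optimal good
    for j in range(m):                         # market clearing: unit supply sold
        if abs(sum(alloc[i][j] for i in range(n)) - 1) > tol:
            return False
    return True
```

In the test, buyer 1 values good 1 twice as much as good 2 and buyer 2 is the mirror image, so at prices \((1,1)\) each buys its favorite good outright.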

For more complex utility functions (e.g., CES utilities, Leontief utilities), the equilibrium computation landscape is more varied:

| Utility class | Complexity |
| --- | --- |
| Linear | Polynomial (convex program) |
| Spending-constraint | Polynomial (convex program) |
| CES with \(\rho < 1\) | Polynomial |
| Leontief | PPAD-hard |
| General | PPAD-hard (Arrow-Debreu model) |

13.3 Welfare Theorems

The classical welfare theorems connect competitive equilibria to social efficiency.

Theorem 13.4 (First Welfare Theorem). Every Walrasian equilibrium allocation is Pareto optimal: no reallocation of goods can make some buyer better off without making another worse off.
Proof. Suppose allocation \(\mathbf{x}'\) Pareto-dominates the equilibrium allocation \(\mathbf{x}\): \(u_i(x_i') \geq u_i(x_i)\) for all \(i\), with strict inequality for some \(i\). Since \(x_i\) is a utility-maximizing bundle under the budget constraint \(\mathbf{p} \cdot x_i \leq B_i\), any strictly preferred bundle must be unaffordable: \(\mathbf{p} \cdot x_i' > B_i\) for each strictly improved buyer. For the remaining buyers, non-satiation (which holds, e.g., for linear utilities with positive coefficients) implies that every bundle at least as good as \(x_i\) costs at least the full budget: \(\mathbf{p} \cdot x_i' \geq B_i\). Summing over buyers: \(\sum_i \mathbf{p} \cdot x_i' > \sum_i B_i = \sum_j p_j\), where the last equality uses budget exhaustion and market clearing. But \(\mathbf{x}'\) allocates at most the unit supply of each good, so \(\sum_i \mathbf{p} \cdot x_i' \leq \sum_j p_j\) — a contradiction.
Theorem 13.5 (Second Welfare Theorem). Under mild convexity assumptions, every Pareto-optimal allocation can be supported as a Walrasian equilibrium with appropriate redistribution of budgets/endowments.

The welfare theorems provide the deepest connection between mechanism design and market theory. The First Welfare Theorem says that the price mechanism automatically achieves Pareto optimality — a form of efficiency without any central planner. This is Adam Smith’s “invisible hand” made precise. The Second Welfare Theorem says that any desired efficient outcome can be achieved through the market, given the right initial distribution of resources.

From the perspective of algorithmic game theory, markets are games where the “mechanism” is the price system. The First Welfare Theorem is a PoA = 1 result: equilibria are always optimal (in the Pareto sense). The computational challenge is finding the equilibrium prices — which, as we have seen, ranges from polynomial (for linear Fisher markets) to PPAD-hard (for general exchange economies). The close parallel between the complexity of Nash equilibria in games and the complexity of Walrasian equilibria in markets reflects a deep structural connection: both are fixed-point problems, and both can be PPAD-hard when the underlying structure is sufficiently complex.
