CO 743: Discrete and Computational Geometry

These notes synthesize material from J. Matousek’s Lectures on Discrete Geometry, J. Pach and P.K. Agarwal’s Combinatorial Geometry, M. de Berg et al.’s Computational Geometry: Algorithms and Applications, and J. Matousek’s Using the Borsuk-Ulam Theorem, enriched with material from MIT 18.319 and L. Guth’s Polynomial Methods in Combinatorics.

1. Convex Sets and Polytopes

Convexity is one of the most fundamental notions in geometry, and its interplay with combinatorics gives rise to a rich and beautiful theory. In this chapter we develop the foundational results on convex sets, beginning with the classical theorems of Carathéodory, Radon, and Helly, and proceeding to the combinatorial study of polytopes.

1.1 Convex Sets and Convex Hulls

A set \( C \subseteq \mathbb{R}^d \) is convex if for every pair of points \( x, y \in C \) and every \( \lambda \in [0,1] \), the point \( \lambda x + (1-\lambda) y \) belongs to \( C \). Equivalently, \( C \) contains the entire line segment between any two of its points.

The intersection of any family of convex sets is convex, which motivates the following definition.

The convex hull of a set \( S \subseteq \mathbb{R}^d \), denoted \( \operatorname{conv}(S) \), is the intersection of all convex sets containing \( S \). Equivalently, \[ \operatorname{conv}(S) = \left\{ \sum_{i=1}^k \lambda_i x_i : k \geq 1, \; x_i \in S, \; \lambda_i \geq 0, \; \sum_{i=1}^k \lambda_i = 1 \right\}. \]

A point of the form \( \sum_{i=1}^k \lambda_i x_i \) with \( \lambda_i \geq 0 \) and \( \sum \lambda_i = 1 \) is called a convex combination of the points \( x_1, \ldots, x_k \). The equivalence of the two definitions is a standard exercise: the set of all convex combinations is itself convex and is contained in every convex set containing \( S \).

In \( \mathbb{R}^2 \), the convex hull of the four points \( (0,0), (1,0), (0,1), (1,1) \) is the unit square \( [0,1]^2 \). The convex hull of three non-collinear points is a triangle. The convex hull of a circle is the closed disk it bounds.

1.2 Carathéodory’s Theorem

A natural question about convex hulls is: how many points from \( S \) are needed to express any point of \( \operatorname{conv}(S) \) as a convex combination? Carathéodory’s theorem, proved by Constantin Carathéodory in 1907, gives a sharp answer.

Carathéodory's Theorem. Let \( S \subseteq \mathbb{R}^d \). Every point \( p \in \operatorname{conv}(S) \) can be expressed as a convex combination of at most \( d+1 \) points of \( S \).
Since \( p \in \operatorname{conv}(S) \), there exist points \( x_1, \ldots, x_k \in S \) and non-negative reals \( \lambda_1, \ldots, \lambda_k \) with \( \sum_{i=1}^k \lambda_i = 1 \) such that \( p = \sum_{i=1}^k \lambda_i x_i \). We wish to show that we can take \( k \leq d+1 \). Suppose \( k \geq d+2 \). Then the \( k-1 \geq d+1 \) vectors \( x_2 - x_1, \ldots, x_k - x_1 \) in \( \mathbb{R}^d \) are linearly dependent: there exist reals \( \mu_2, \ldots, \mu_k \), not all zero, with \[ \sum_{i=2}^k \mu_i (x_i - x_1) = 0. \] Setting \( \mu_1 = -\sum_{i=2}^k \mu_i \), we obtain \[ \sum_{i=1}^k \mu_i x_i = 0 \quad \text{and} \quad \sum_{i=1}^k \mu_i = 0, \]

with not all \( \mu_i \) zero. This is called an affine dependence.

Since \( \sum_{i=1}^k \mu_i = 0 \) and not all \( \mu_i \) vanish, at least one \( \mu_i \) is positive. For every real \( t \), \[ p = \sum_{i=1}^k (\lambda_i - t\mu_i) x_i, \quad \text{and} \quad \sum_{i=1}^k (\lambda_i - t\mu_i) = 1. \] Now set \[ t^* = \min \left\{ \frac{\lambda_i}{\mu_i} : \mu_i > 0 \right\}. \]

At \( t = t^* \), all coefficients \( \lambda_i - t^* \mu_i \) remain non-negative (by the choice of minimum), they still sum to 1, and at least one coefficient is zero. Thus we have expressed \( p \) as a convex combination of at most \( k-1 \) points of \( S \).

Repeating this process, we reduce the number of points until \( k \leq d+1 \).

The bound \( d+1 \) is tight: the origin in \( \mathbb{R}^d \) lies in the convex hull of the \( d+1 \) vertices of a regular simplex centered at the origin, but not in the convex hull of any \( d \) of them.
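The reduction step in the proof is effectively an algorithm. The following sketch (our own helper names, exact rational arithmetic via Python's `fractions`) finds an affine dependence by Gaussian elimination and eliminates one point at a time until at most \( d+1 \) remain.

```python
from fractions import Fraction

def affine_dependence(pts):
    """For k > d+1 points in R^d, return mu_1..mu_k, not all zero, with
    sum(mu_i * x_i) = 0 and sum(mu_i) = 0 (Gaussian elimination over Q)."""
    k, d = len(pts), len(pts[0])
    rows = [[Fraction(p[j]) for p in pts] for j in range(d)]
    rows.append([Fraction(1)] * k)          # the equation sum(mu_i) = 0
    pivots, r = [], 0
    for c in range(k):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        pv = rows[r][c]
        rows[r] = [v / pv for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free = next(c for c in range(k) if c not in pivots)  # exists since k > d+1
    mu = [Fraction(0)] * k
    mu[free] = Fraction(1)
    for row, c in zip(rows, pivots):
        mu[c] = -row[free]
    return mu

def caratheodory_reduce(pts, lam):
    """Rewrite p = sum(lam_i * pts_i) as a convex combination of <= d+1 points."""
    pts, lam = list(pts), [Fraction(l) for l in lam]
    d = len(pts[0])
    while len(pts) > d + 1:
        mu = affine_dependence(pts)
        # sum(mu) = 0 and mu != 0, so some mu_i is positive
        t = min(l / m for l, m in zip(lam, mu) if m > 0)
        lam = [l - t * m for l, m in zip(lam, mu)]
        drop = lam.index(Fraction(0))       # the minimizing coefficient vanishes
        pts.pop(drop); lam.pop(drop)
    return pts, lam

# The center of the unit square, written with all 4 corners, reduces to 3 points.
pts, lam = caratheodory_reduce([(0, 0), (1, 0), (0, 1), (1, 1)], [Fraction(1, 4)] * 4)
assert len(pts) <= 3 and sum(lam) == 1
```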

1.3 Radon’s Theorem

Johann Radon proved the following partition result in 1921. It is a direct consequence of the same linear algebra argument that underlies Carathéodory’s theorem.

Radon's Theorem. Any set of \( d+2 \) points in \( \mathbb{R}^d \) can be partitioned into two disjoint sets whose convex hulls intersect.
Let \( x_1, \ldots, x_{d+2} \in \mathbb{R}^d \). Consider the \( d+1 \) vectors \( x_2 - x_1, \ldots, x_{d+2} - x_1 \) in \( \mathbb{R}^d \). Since there are \( d+1 \) vectors in \( \mathbb{R}^d \), they are linearly dependent: there exist reals \( \mu_2, \ldots, \mu_{d+2} \), not all zero, with \[ \sum_{i=2}^{d+2} \mu_i (x_i - x_1) = 0. \] Setting \( \mu_1 = -\sum_{i=2}^{d+2} \mu_i \), we get \( \sum_{i=1}^{d+2} \mu_i x_i = 0 \) and \( \sum_{i=1}^{d+2} \mu_i = 0 \), with not all \( \mu_i = 0 \). Let \( I = \{ i : \mu_i > 0 \} \) and \( J = \{ j : \mu_j \leq 0 \} \); since the \( \mu_i \) sum to zero and are not all zero, both \( I \) and the set of indices with \( \mu_j < 0 \) are non-empty. From \( \sum_i \mu_i x_i = 0 \), \[ \sum_{i \in I} \mu_i x_i = -\sum_{j \in J} \mu_j x_j = \sum_{j \in J} |\mu_j| x_j. \] Setting \( \sigma = \sum_{i \in I} \mu_i = \sum_{j \in J} |\mu_j| > 0 \) and dividing by \( \sigma \), \[ p = \sum_{i \in I} \frac{\mu_i}{\sigma} x_i = \sum_{j \in J} \frac{|\mu_j|}{\sigma} x_j \]

is a point that lies in both \( \operatorname{conv}(\{x_i : i \in I\}) \) and \( \operatorname{conv}(\{x_j : j \in J\}) \). Since \( I \) and \( J \) partition \( \{1, \ldots, d+2\} \), we are done.

The partition in Radon's theorem is called a Radon partition, and the point \( p \) in the intersection of the two hulls is a Radon point. In the plane (\( d=2 \)), any 4 points can be Radon-partitioned: if three form a triangle containing the fourth, the partition is \( \{p_4\} \) vs. \( \{p_1,p_2,p_3\} \); otherwise two crossing diagonals of the convex quadrilateral give a partition into two pairs.
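The planar case analysis just described can be carried out with orientation tests alone. A minimal sketch (function names ours; assumes the four points are in general position):

```python
def orient(a, b, c):
    """Twice the signed area of triangle abc; positive iff counter-clockwise."""
    return (b[0]-a[0])*(c[1]-a[1]) - (b[1]-a[1])*(c[0]-a[0])

def in_triangle(p, a, b, c):
    s = [orient(a, b, p), orient(b, c, p), orient(c, a, p)]
    return all(v > 0 for v in s) or all(v < 0 for v in s)

def segments_cross(p, q, r, s):
    return (orient(p, q, r) * orient(p, q, s) < 0 and
            orient(r, s, p) * orient(r, s, q) < 0)

def radon_partition(pts):
    """Radon partition of 4 points in general position in the plane."""
    # Case 1: one point lies inside the triangle spanned by the other three.
    for i in range(4):
        rest = [pts[j] for j in range(4) if j != i]
        if in_triangle(pts[i], *rest):
            return [pts[i]], rest
    # Case 2: convex position -- the two pairs whose segments cross.
    a, b, c, d = pts
    for (p, q), (r, s) in ([(a, b), (c, d)], [(a, c), (b, d)], [(a, d), (b, c)]):
        if segments_cross(p, q, r, s):
            return [p, q], [r, s]

# Point inside a triangle, and the two crossing diagonals of a square:
assert radon_partition([(0, 0), (2, 0), (0, 2), (0.5, 0.5)])[0] == [(0.5, 0.5)]
```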

1.4 Helly’s Theorem

Eduard Helly discovered his celebrated intersection theorem in 1913, though the first proof to appear in print was Radon's, in 1921 (Helly's own proof was published in 1923). It is one of the cornerstones of combinatorial convexity.

Helly's Theorem. Let \( C_1, \ldots, C_n \) be convex sets in \( \mathbb{R}^d \), with \( n \geq d+1 \). If every \( d+1 \) of these sets have a common point, then all \( n \) sets have a common point.
We proceed by induction on \( n \). The base case \( n = d+1 \) is trivial. For the inductive step, assume the theorem holds for \( n-1 \) sets and consider \( C_1, \ldots, C_n \) with \( n \geq d+2 \).

For each \( i \in \{1, \ldots, n\} \), the family \( \{C_j : j \neq i\} \) has \( n-1 \) members, and every \( d+1 \) of them have a common point (since every \( d+1 \) of the original \( n \) sets do). By the induction hypothesis, there exists a point \( x_i \in \bigcap_{j \neq i} C_j \).

We now have \( n \geq d+2 \) points \( x_1, \ldots, x_n \). By Radon’s theorem, these can be partitioned into two sets \( I \) and \( J \) with \( \operatorname{conv}(\{x_i : i \in I\}) \cap \operatorname{conv}(\{x_j : j \in J\}) \neq \emptyset \). Let \( p \) be a point in this intersection.

We claim \( p \in \bigcap_{k=1}^n C_k \). Fix any \( k \in \{1, \ldots, n\} \). Since \( I \) and \( J \) partition \( \{1, \ldots, n\} \), either \( k \notin I \) or \( k \notin J \).

If \( k \notin I \), then for every \( i \in I \), we have \( x_i \in C_k \) (since \( x_i \in \bigcap_{j \neq i} C_j \) and \( k \neq i \)). So \( \{x_i : i \in I\} \subseteq C_k \), and since \( C_k \) is convex, \( \operatorname{conv}(\{x_i : i \in I\}) \subseteq C_k \), hence \( p \in C_k \).

Similarly, if \( k \notin J \), then \( p \in \operatorname{conv}(\{x_j : j \in J\}) \subseteq C_k \).

In either case \( p \in C_k \), so \( p \in \bigcap_{k=1}^n C_k \).

In the plane (\( d=2 \)), Helly's theorem states that if every three convex sets in a family have a common point, then the whole family has a common point. Note that pairwise intersection is not enough: three convex sets can intersect pairwise while having no common point (picture three long thin rectangles forming the sides of a triangle). But if every triple among \( n \geq 3 \) convex sets shares a point, then there is a single point common to all of them.

1.5 The Fractional Helly Theorem

Helly’s theorem requires that every \( (d+1) \)-tuple of sets has a common point. What if only a positive fraction of the \( (d+1) \)-tuples intersect? The fractional Helly theorem, due to Katchalski and Liu (1979), gives a surprisingly strong conclusion.

Fractional Helly Theorem. For every \( d \geq 1 \) and every \( \alpha \in (0,1] \), there exists \( \beta = \beta(d, \alpha) > 0 \) with the following property. If \( C_1, \ldots, C_n \) are convex sets in \( \mathbb{R}^d \) and at least \( \alpha \binom{n}{d+1} \) of the \( (d+1) \)-element subfamilies have a common point, then there exists a point contained in at least \( \beta n \) of the sets. Specifically, one may take \( \beta = 1 - (1-\alpha)^{1/(d+1)} \).

This result tells us that even a “fractional” Helly condition forces a large common intersection. It has found numerous applications in combinatorial geometry and optimization.

1.6 Colorful Carathéodory Theorem

Imre Bárány proved the following remarkable generalization of Carathéodory’s theorem in 1982. It introduces the idea of “colorful” configurations that has become a powerful paradigm.

Colorful Carathéodory Theorem (Bárány, 1982). Let \( P_0, P_1, \ldots, P_d \subseteq \mathbb{R}^d \) be \( d+1 \) point sets such that the origin lies in the convex hull of each \( P_i \). Then there exist points \( p_0 \in P_0, p_1 \in P_1, \ldots, p_d \in P_d \) such that the origin lies in \( \operatorname{conv}(\{p_0, p_1, \ldots, p_d\}) \).
The name "colorful" comes from thinking of each set \( P_i \) as points of color \( i \). The theorem guarantees a "rainbow simplex" (one point of each color) containing the origin. Setting all \( P_i = S \) recovers Carathéodory's theorem.

1.7 Polytopes and Face Lattices

We now turn to the combinatorial theory of polytopes, which are the higher-dimensional analogues of polygons and polyhedra.

A convex polytope (or simply polytope) is the convex hull of a finite set of points in \( \mathbb{R}^d \). Equivalently, a polytope is a bounded set that can be written as the intersection of finitely many closed half-spaces.

The equivalence of these two descriptions (the “V-representation” and “H-representation”) is a fundamental result in polyhedral theory, sometimes called the Minkowski-Weyl theorem.

A face of a polytope \( P \) is a set \( F \subseteq P \) such that \( F = P \cap H \) for some supporting hyperplane \( H \) (i.e., \( H \) is a hyperplane such that \( P \) lies entirely in one of the closed half-spaces determined by \( H \)). By convention, \( \emptyset \) and \( P \) itself are also considered faces. Faces of dimension 0 are vertices, faces of dimension 1 are edges, and faces of codimension 1 (dimension \( d-1 \)) are facets.

The set of all faces of a polytope \( P \), ordered by inclusion, forms a lattice called the face lattice of \( P \). Two polytopes are combinatorially equivalent if their face lattices are isomorphic. The study of face lattices is central to combinatorial convexity.

Let \( P \) be a \( d \)-dimensional polytope. The \( f \)-vector of \( P \) is \( (f_0, f_1, \ldots, f_{d-1}) \), where \( f_i \) denotes the number of \( i \)-dimensional faces. The \( f \)-polynomial is \( f(t) = \sum_{i=0}^{d-1} f_i t^i \).
A cube in \( \mathbb{R}^3 \) has \( f \)-vector \( (f_0, f_1, f_2) = (8, 12, 6) \): 8 vertices, 12 edges, and 6 facets (square faces). A regular tetrahedron has \( f \)-vector \( (4, 6, 4) \).

1.8 Euler’s Formula and the Dehn-Sommerville Relations

Leonhard Euler discovered the following remarkable relation in 1752, one of the founding results of topology.

Euler's Formula. For any convex polytope \( P \) in \( \mathbb{R}^3 \) (or any connected planar graph), \[ f_0 - f_1 + f_2 = 2. \] More generally, for a \( d \)-dimensional convex polytope, \[ \sum_{i=0}^{d-1} (-1)^i f_i = 1 - (-1)^d. \]

This is the Euler-Poincaré relation for polytopes. It provides one linear constraint on the \( f \)-vector. For simplicial polytopes (those where every proper face is a simplex), much more can be said.
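As a quick sanity check, the relation can be verified mechanically from an \( f \)-vector; the vectors below are the standard ones for the cube, the tetrahedron, and the 4-dimensional hypercube.

```python
def euler_ok(f, d):
    """Check the Euler-Poincare relation sum_i (-1)^i f_i = 1 - (-1)^d."""
    return sum((-1)**i * fi for i, fi in enumerate(f)) == 1 - (-1)**d

assert euler_ok((8, 12, 6), 3)        # cube: 8 - 12 + 6 = 2
assert euler_ok((4, 6, 4), 3)         # tetrahedron: 4 - 6 + 4 = 2
assert euler_ok((16, 32, 24, 8), 4)   # 4-cube: 16 - 32 + 24 - 8 = 0
```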

A polytope is simplicial if every facet is a simplex (i.e., every facet has exactly \( d \) vertices). A polytope is simple if every vertex belongs to exactly \( d \) facets.
Dehn-Sommerville Relations. For a simplicial \( d \)-polytope \( P \), the \( f \)-vector satisfies the linear relations \[ f_{k-1} = \sum_{i=k}^{d} (-1)^{d-i} \binom{i}{k} f_{i-1}, \quad k = 0, 1, \ldots, d, \] with the convention \( f_{-1} = 1 \). The case \( k = 0 \) is the Euler relation and the case \( k = d \) is trivial; together the independent relations reduce the number of free parameters in the \( f \)-vector of a simplicial \( d \)-polytope to \( \lfloor d/2 \rfloor \).

These relations, first discovered by Max Dehn in 1905 for \( d = 4,5 \) and generalized by Duncan Sommerville in 1927, show that the \( f \)-vector of a simplicial polytope has a hidden symmetry captured by the so-called \( h \)-vector.
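One convenient form of the relations, with the convention \( f_{-1} = 1 \), is \( f_{k-1} = \sum_{i=k}^{d} (-1)^{d-i} \binom{i}{k} f_{i-1} \) for \( k = 0, \ldots, d \). A sketch that checks it on known \( f \)-vectors:

```python
from math import comb

def dehn_sommerville_ok(f, d):
    """Check f_{k-1} = sum_{i=k..d} (-1)^(d-i) C(i,k) f_{i-1}, with f_{-1} = 1."""
    g = [1] + list(f)                  # g[i] stands for f_{i-1}
    return all(g[k] == sum((-1)**(d - i) * comb(i, k) * g[i]
                           for i in range(k, d + 1))
               for k in range(d + 1))

assert dehn_sommerville_ok((6, 12, 8), 3)      # octahedron (simplicial)
assert dehn_sommerville_ok((12, 30, 20), 3)    # icosahedron (simplicial)
assert not dehn_sommerville_ok((8, 12, 6), 3)  # cube is not simplicial
```

For \( d = 3 \) the non-trivial relation is \( 2f_1 = 3f_2 \), reflecting that every facet is a triangle with three edges, each shared by two facets.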

1.9 The Upper Bound Theorem

How many faces can a polytope with \( n \) vertices have? This fundamental question was answered by Peter McMullen in 1970.

The cyclic polytope \( C(n,d) \) is the convex hull of \( n \) distinct points on the moment curve \( \gamma(t) = (t, t^2, \ldots, t^d) \) in \( \mathbb{R}^d \). By Gale's evenness condition, the cyclic polytope is simplicial and its combinatorial type is independent of the choice of points.
Upper Bound Theorem (McMullen, 1970). Among all convex polytopes with \( n \) vertices in \( \mathbb{R}^d \), the cyclic polytope \( C(n,d) \) has the maximum number of \( i \)-dimensional faces for every \( i \). In particular, the number of facets of any \( d \)-polytope with \( n \) vertices is at most that of \( C(n,d) \).
Explicitly, any \( d \)-polytope with \( n \) vertices satisfies \[ f_{d-1} \leq \binom{n - \lceil d/2 \rceil}{n - d} + \binom{n - \lfloor d/2 \rfloor - 1}{n - d}, \] with equality for \( C(n,d) \).
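A quick computational check of this facet count (a sketch; note \( \lceil d/2 \rceil = \lfloor (d+1)/2 \rfloor \) in integer arithmetic). For \( d = 3 \) it reduces to the familiar \( 2n - 4 \), and for \( d = 4 \) to \( n(n-3)/2 \):

```python
from math import comb

def cyclic_facets(n, d):
    """Facets of the cyclic polytope C(n, d): the Upper Bound Theorem maximum."""
    return comb(n - (d + 1) // 2, n - d) + comb(n - d // 2 - 1, n - d)

assert all(cyclic_facets(n, 3) == 2 * n - 4 for n in range(4, 20))
assert all(cyclic_facets(n, 4) == n * (n - 3) // 2 for n in range(5, 20))
```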

1.10 Neighborly Polytopes

A \( d \)-dimensional polytope \( P \) is \( k \)-neighborly if every set of \( k \) vertices forms a face. It is neighborly if it is \( \lfloor d/2 \rfloor \)-neighborly.
The cyclic polytope \( C(n,d) \) is neighborly: every \( \lfloor d/2 \rfloor \) vertices form a face. Moreover, no \( d \)-polytope with \( n > d+1 \) vertices can be \( (\lfloor d/2 \rfloor + 1) \)-neighborly.

Neighborly polytopes are extremal objects in the theory of polytopes. Their existence demonstrates that high-dimensional polytopes can have far richer combinatorial structure than their low-dimensional counterparts suggest. In dimension 4, for example, a neighborly polytope with \( n \) vertices has every pair of vertices forming an edge — it is a “complete graph” polytope, with \( \binom{n}{2} \) edges.


2. Incidence Geometry

Incidence geometry studies the combinatorial relationships between geometric objects — how many times can points and lines (or other geometric primitives) meet? The central results here have deep connections to additive combinatorics, harmonic analysis, and theoretical computer science.

2.1 Point-Line Incidences

Given a set \( P \) of points and a set \( L \) of lines in \( \mathbb{R}^2 \), an incidence is a pair \( (p, \ell) \in P \times L \) such that \( p \in \ell \). The number of incidences is denoted \( I(P, L) \).

A trivial upper bound is \( I(P, L) \leq |P| \cdot |L| \). Can we do better? If \( |P| = m \) and \( |L| = n \), a simple double counting shows \( I(P,L) \leq m + \binom{n}{2} \) (each pair of lines meets in at most one point) and \( I(P,L) \leq n + \binom{m}{2} \) (each pair of points determines at most one line). But the true answer is far stronger.

2.2 The Crossing Lemma

Before proving the Szemerédi-Trotter theorem, we establish a key tool from topological graph theory.

A drawing of a graph \( G \) in the plane represents each vertex as a point and each edge as a continuous arc connecting its endpoints, where arcs may cross. The crossing number \( \operatorname{cr}(G) \) is the minimum number of edge crossings over all drawings of \( G \).
Crossing Lemma (Ajtai-Chvátal-Newborn-Szemerédi, 1982; Leighton, 1983). Let \( G \) be a simple graph with \( n \) vertices and \( m \) edges, where \( m \geq 4n \). Then \[ \operatorname{cr}(G) \geq \frac{m^3}{64n^2}. \]
We first need the following easy fact: in any drawing of \( G \), deleting one edge from each crossing leaves a planar graph, and a planar graph on \( n \geq 3 \) vertices has at most \( 3n - 6 \) edges by Euler's formula. Hence \( m - \operatorname{cr}(G) \leq 3n - 6 \), so \[ \operatorname{cr}(G) \geq m - 3n + 6 \geq m - 3n. \]

Now consider a random subgraph \( G' \) obtained by independently keeping each vertex with probability \( p \in (0,1] \) (and keeping an edge if both its endpoints are kept). The expected number of vertices of \( G' \) is \( pn \), the expected number of edges is \( p^2 m \), and the expected number of crossings is \( p^4 \operatorname{cr}(G) \) (a crossing requires all four endpoints of the two crossing edges to be kept).

Applying the bound \( \operatorname{cr}(G') \geq m' - 3n' \) to \( G' \) and taking expectations (in a fixed optimal drawing of \( G \), the expected number of surviving crossings is \( p^4 \operatorname{cr}(G) \)), we get \[ p^4 \operatorname{cr}(G) \geq p^2 m - 3pn, \] that is, \[ \operatorname{cr}(G) \geq \frac{m}{p^2} - \frac{3n}{p^3}. \] Choosing \( p = 4n/m \), which lies in \( (0,1] \) because \( m \geq 4n \), yields \[ \operatorname{cr}(G) \geq \frac{m^3}{16n^2} - \frac{3m^3}{64n^2} = \frac{m^3}{64n^2}. \]
The crossing lemma is one of the most useful results in combinatorial geometry. Its proof, using the "probabilistic method" on a deterministic bound, is a beautiful illustration of the power of randomness in combinatorics. The constant \( 1/64 \) can be improved; the best known constant (by Ackerman, 2019) replaces the threshold \( 4n \) by \( (103/16)n \) and obtains a constant approaching \( 1/29 \).

2.3 The Szemerédi-Trotter Theorem

The following theorem, proved by Endre Szemerédi and William Trotter in 1983, is one of the central results in combinatorial geometry. The original proof was technically involved; we present the elegant proof via the crossing lemma due to Székely (1997).

Szemerédi-Trotter Theorem (1983). Let \( P \) be a set of \( m \) points and \( L \) a set of \( n \) lines in \( \mathbb{R}^2 \). Then \[ I(P, L) = O\left( m^{2/3} n^{2/3} + m + n \right). \]
We construct a graph \( G \) as follows. The vertices of \( G \) are the points of \( P \). For each line \( \ell \in L \), order the points of \( P \cap \ell \) along \( \ell \), and connect consecutive points by an edge. Thus \( G \) has \( m \) vertices and at least \( I(P,L) - n \) edges: a line containing \( k \geq 1 \) points of \( P \) contributes \( k - 1 \) edges, and lines containing no point of \( P \) contribute no incidences.

Draw \( G \) in the plane by placing each vertex at the corresponding point of \( P \) and drawing each edge as the line segment between consecutive points on its line. Two edges from different lines can cross in at most one point, so the number of crossings is at most \( \binom{n}{2} < n^2/2 \).

If \( I(P,L) - n \leq 4m \), then \( I(P,L) \leq 4m + n = O(m+n) \) and we are done.

Otherwise \( G \) has more than \( 4m \) edges, and the crossing lemma yields \[ \frac{n^2}{2} \geq \operatorname{cr}(G) \geq \frac{(I(P,L)-n)^3}{64m^2}. \] Rearranging, \[ (I(P,L)-n)^3 \leq 32 m^2 n^2, \]

giving \( I(P,L) - n = O(m^{2/3} n^{2/3}) \), and hence \( I(P,L) = O(m^{2/3} n^{2/3} + m + n) \).

The Szemerédi-Trotter bound is tight: there exist configurations of \( m \) points and \( n \) lines with \( \Omega(m^{2/3} n^{2/3} + m + n) \) incidences. The lower bound construction uses a suitable grid of points and lines of various slopes.
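The grid construction can be checked by brute force. The sketch below uses one standard instantiation (parameters chosen for concreteness): points \( \{1,\dots,k\} \times \{1,\dots,4k^2\} \) and lines \( y = ax + b \) with \( a \in \{1,\dots,2k\} \), \( b \in \{1,\dots,2k^2\} \). Every line meets the grid in exactly \( k \) points, giving \( 4k^4 = \Theta(m^{2/3}n^{2/3}) \) incidences for \( m = n = 4k^3 \).

```python
def incidences(points, lines):
    """Count point-line incidences; a line is given as (a, b) for y = a*x + b."""
    return sum(y == a * x + b for (x, y) in points for (a, b) in lines)

k = 3
points = [(x, y) for x in range(1, k + 1) for y in range(1, 4 * k * k + 1)]
lines = [(a, b) for a in range(1, 2 * k + 1) for b in range(1, 2 * k * k + 1)]
assert len(points) == len(lines) == 4 * k**3
assert incidences(points, lines) == 4 * k**4   # each of the 4k^3 lines carries k points
```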

2.4 Applications: Distinct Distances and Unit Distances

One of the oldest problems in combinatorial geometry is due to Paul Erdős (1946): what is the minimum number of distinct distances determined by \( n \) points in the plane?

For a set \( P \) of \( n \) points in \( \mathbb{R}^2 \), define \( D(P) = \{ \|p - q\| : p, q \in P, p \neq q \} \). The distinct distances problem asks for the minimum of \( |D(P)| \) over all \( n \)-point sets.

Erdős conjectured that the answer is \( \Omega(n/\sqrt{\log n}) \), achieved by the \( \sqrt{n} \times \sqrt{n} \) integer grid. Using the Szemerédi-Trotter theorem, one can prove a lower bound of \( \Omega(n^{4/5}) \). The Elekes-Sharir framework (2010) transformed the problem into one about incidences, ultimately leading to the resolution by Guth and Katz.
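The grid's economy of distances is easy to observe experimentally; the brute-force sketch below counts distinct squared distances, which is equivalent.

```python
from itertools import combinations

def distinct_distances(points):
    """Number of distinct (squared) distances determined by a point set."""
    return len({(p[0]-q[0])**2 + (p[1]-q[1])**2
                for p, q in combinations(points, 2)})

grid = [(x, y) for x in range(4) for y in range(4)]
assert len(grid) == 16
assert distinct_distances(grid) == 9   # far fewer than the C(16,2) = 120 pairs
```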

Guth-Katz Theorem (2015). Any set of \( n \) points in the plane determines at least \( \Omega(n / \log n) \) distinct distances.

This nearly resolves Erdős’s conjecture and represents one of the greatest achievements of the polynomial method in combinatorial geometry. The proof combines the Elekes-Sharir reduction to a point-line incidence problem in three dimensions with the polynomial partitioning technique.

The unit distances problem, also posed by Erdős, asks for the maximum number \( u(n) \) of pairs at distance exactly 1 among \( n \) points. The Szemerédi-Trotter theorem implies \( u(n) = O(n^{4/3}) \) (by applying it to the \( n \) circles of radius 1 centered at the points). The \( \sqrt{n} \times \sqrt{n} \) grid shows \( u(n) = \Omega(n^{1+c/\log\log n}) \). Closing this gap remains a major open problem.
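The dual statistic, how often a single distance can repeat, can be probed the same way (after rescaling, the most popular grid distance plays the role of the unit distance). A sketch: on the \( 5 \times 5 \) grid the most popular squared distance is 5, realized by the difference vectors \( (\pm 1, \pm 2), (\pm 2, \pm 1) \).

```python
from collections import Counter
from itertools import combinations

def max_distance_multiplicity(points):
    """Number of pairs realizing the most popular (squared) distance."""
    counts = Counter((p[0]-q[0])**2 + (p[1]-q[1])**2
                     for p, q in combinations(points, 2))
    return max(counts.values())

grid = [(x, y) for x in range(5) for y in range(5)]
assert max_distance_multiplicity(grid) == 48   # 4 directions x 12 placements each
```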

2.5 Beck’s Theorem

Beck's Theorem (1983). For any set of \( n \) points in the plane, one of the following holds:
(i) There is a line containing at least \( n/100 \) of the points, or
(ii) The points determine at least \( \Omega(n^2) \) distinct lines.

Beck’s theorem can be proved using the Szemerédi-Trotter theorem. It captures the fundamental dichotomy that a point set is either “nearly collinear” or determines many lines.

2.6 Incidences in Higher Dimensions

The study of point-hyperplane and point-variety incidences in \( \mathbb{R}^d \) for \( d \geq 3 \) requires substantially different techniques. The polynomial partitioning method of Guth and Katz (discussed in Chapter 7) has led to major progress. A key result is the following.

Point-plane incidence bound (Guth-Katz framework). In \( \mathbb{R}^3 \), the number of incidences between \( m \) points and \( n \) planes can be \( \Theta(mn) \) in the worst case (e.g., all points on one line). However, if no line contains more than \( s \) points, stronger bounds are obtainable via polynomial partitioning.

3. Arrangements and Geometric Transversals

Arrangements of hyperplanes and other geometric objects are fundamental structures in computational and combinatorial geometry. This chapter develops the theory of arrangements, proves the ham-sandwich theorem using Borsuk-Ulam, and introduces the powerful framework of VC dimension and epsilon-nets.

3.1 Hyperplane Arrangements

An arrangement of hyperplanes in \( \mathbb{R}^d \) is a finite collection \( \mathcal{H} = \{H_1, \ldots, H_n\} \) of hyperplanes. The arrangement partitions \( \mathbb{R}^d \) into faces of various dimensions: connected components of intersections of some hyperplanes minus others. The faces of dimension \( d \) are called cells (or regions).
An arrangement of \( n \) hyperplanes in \( \mathbb{R}^d \) has at most \( \sum_{i=0}^{d} \binom{n}{i} \) cells.

For \( d = 2 \), this gives at most \( \binom{n}{0} + \binom{n}{1} + \binom{n}{2} = 1 + n + \binom{n}{2} \) regions, with equality when the lines are in general position (no two parallel, no three concurrent). This is the classical result on the number of regions determined by \( n \) lines in the plane.

Three lines in general position in \( \mathbb{R}^2 \) create \( 1 + 3 + 3 = 7 \) regions. Four lines in general position create \( 1 + 4 + 6 = 11 \) regions.
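A one-line sketch reproduces these counts from the cell bound above:

```python
from math import comb

def max_cells(n, d):
    """Maximum number of cells in an arrangement of n hyperplanes in R^d."""
    return sum(comb(n, i) for i in range(d + 1))

assert max_cells(3, 2) == 7
assert max_cells(4, 2) == 11
assert max_cells(3, 3) == 8   # three generic planes cut space into 8 cells
```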

3.2 The Zone Theorem

The zone of a hyperplane \( H \) in an arrangement \( \mathcal{H} \) is the set of all faces of the arrangement that have a non-empty intersection with \( H \).
Zone Theorem. In an arrangement of \( n \) hyperplanes in \( \mathbb{R}^d \), the zone of any hyperplane has total combinatorial complexity \( O(n^{d-1}) \). In the plane (\( d = 2 \)), the zone of a line in an arrangement of \( n \) lines has at most \( 5n - 1 \) edges.

The zone theorem is a crucial tool in the analysis of algorithms for constructing and manipulating arrangements. It implies that inserting a new hyperplane into an existing arrangement takes time proportional to the zone complexity.

3.3 The Ham-Sandwich Theorem

The ham-sandwich theorem is a beautiful result at the intersection of topology and geometry. Its name comes from the following vivid interpretation: given a ham sandwich consisting of two slices of bread and a slice of ham, it is possible to cut all three pieces exactly in half with a single straight cut.

Ham-Sandwich Theorem (Stone-Tukey, 1942). Given \( d \) measurable sets \( A_1, \ldots, A_d \) in \( \mathbb{R}^d \), each with finite positive measure, there exists a hyperplane that simultaneously bisects each \( A_i \) (i.e., each open half-space determined by the hyperplane contains exactly half the measure of each set).
We use the Borsuk-Ulam theorem (proved in full in Chapter 6). The space of oriented hyperplanes in \( \mathbb{R}^d \) can be parametrized by \( S^d \) as follows. For a unit vector \( u \in S^{d-1} \) and a real number \( a \), the hyperplane is \( \{ x : u \cdot x = a \} \). Lifting to \( S^d \): for \( v = (v_0, v_1, \ldots, v_d) \in S^d \), consider the hyperplane \[ H_v = \{ x \in \mathbb{R}^d : v_1 x_1 + \cdots + v_d x_d = v_0 \} \] with the positive side \( H_v^+ = \{ x : v_1 x_1 + \cdots + v_d x_d > v_0 \} \). Define \( f : S^d \to \mathbb{R}^d \) by \[ f_i(v) = \mu(A_i \cap H_v^+) - \frac{1}{2}\mu(A_i), \quad i = 1, \ldots, d. \]

Then \( f \) is continuous (under suitable regularity assumptions on the sets) and satisfies \( f(-v) = -f(v) \) since reversing the orientation of the hyperplane swaps the two sides. By the Borsuk-Ulam theorem, there exists \( v^* \in S^d \) with \( f(v^*) = 0 \), meaning \( H_{v^*} \) bisects each \( A_i \).
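For finite point sets there is a discrete analogue that can be found by brute force: it is a standard fact that when \( |A| \) and \( |B| \) are odd, any line bisecting both (with at most half the points strictly on each side) must pass through a point of each set, so scanning all candidate lines through one point of \( A \) and one of \( B \) suffices. A sketch under a general-position assumption (helper names ours):

```python
def side(a, b, p):
    """Sign of p relative to the directed line through a and b."""
    return (b[0]-a[0])*(p[1]-a[1]) - (b[1]-a[1])*(p[0]-a[0])

def bisects(a, b, pts):
    pos = sum(side(a, b, p) > 0 for p in pts)
    neg = sum(side(a, b, p) < 0 for p in pts)
    return pos <= len(pts) // 2 and neg <= len(pts) // 2

def ham_sandwich_line(A, B):
    """A line through one point of A and one of B bisecting both (odd |A|, |B|)."""
    for a in A:
        for b in B:
            if bisects(a, b, A) and bisects(a, b, B):
                return a, b

A = [(0, 0), (1, 0), (2, 0)]
B = [(0, 1), (1, 1), (2, 1)]
a, b = ham_sandwich_line(A, B)
assert bisects(a, b, A) and bisects(a, b, B)
```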

3.4 Geometric Transversals

A transversal to a family \( \mathcal{F} \) of convex sets in \( \mathbb{R}^d \) is a flat (affine subspace) that intersects every member of \( \mathcal{F} \). A line transversal is a transversal that is a line; a point transversal (or piercing point) is a single point common to all members.

Helly’s theorem characterizes when a point transversal exists. For line transversals, the situation is more complex and leads to Hadwiger-type transversal theorems.

3.5 VC Dimension and Epsilon-Nets

The theory of VC dimension, originating in statistical learning theory (Vapnik-Chervonenkis, 1971), has become a cornerstone of computational and combinatorial geometry.

Let \( X \) be a set and \( \mathcal{R} \subseteq 2^X \) a set system (or range space). A subset \( A \subseteq X \) is shattered by \( \mathcal{R} \) if for every \( B \subseteq A \), there exists \( R \in \mathcal{R} \) with \( R \cap A = B \). The VC dimension of \( (X, \mathcal{R}) \) is the maximum size of a shattered subset.
Consider the range space where \( X = \mathbb{R}^2 \) and \( \mathcal{R} \) is the set of all open half-planes. Three non-collinear points can be shattered (for each subset, some half-plane captures exactly that subset), but no set of four points can be shattered. Thus the VC dimension is 3.

More generally, the range space of half-spaces in \( \mathbb{R}^d \) has VC dimension \( d+1 \).
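Shattering can be tested exhaustively for small ground sets. Half-plane realizability requires a separability test, so the sketch below uses axis-parallel rectangles instead, a standard range space of VC dimension 4 in which realizability is just a bounding-box check (helper names ours):

```python
from itertools import combinations

def shattered_by_rectangles(A):
    """Is A shattered by axis-parallel rectangles? A subset B is realizable
    iff no point of A outside B lies in the bounding box of B."""
    def realizable(B):
        if not B:
            return True
        x0, x1 = min(p[0] for p in B), max(p[0] for p in B)
        y0, y1 = min(p[1] for p in B), max(p[1] for p in B)
        return not any(x0 <= x <= x1 and y0 <= y <= y1
                       for (x, y) in A if (x, y) not in B)
    return all(realizable(set(B))
               for r in range(len(A) + 1) for B in combinations(A, r))

diamond = [(0, 1), (1, 0), (0, -1), (-1, 0)]
assert shattered_by_rectangles(diamond)                  # 4 extreme points: shattered
assert not shattered_by_rectangles(diamond + [(0, 0)])   # no 5 points are shattered
```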

Sauer-Shelah Lemma (Sauer, 1972; Shelah, 1972). If \( (X, \mathcal{R}) \) has VC dimension \( d \) and \( |X| = n \), then \( |\{ R \cap X : R \in \mathcal{R} \}| \leq \sum_{i=0}^d \binom{n}{i} = O(n^d) \).

The Sauer-Shelah lemma shows that bounded VC dimension implies polynomial growth of the number of distinct “traces” of ranges on finite sets. This is crucial for the epsilon-net theorem.

3.6 Epsilon-Nets and the Epsilon-Net Theorem

Let \( (X, \mathcal{R}) \) be a range space and \( \mu \) a probability measure on \( X \). A set \( N \subseteq X \) is an \( \varepsilon \)-net for \( (X, \mathcal{R}) \) with respect to \( \mu \) if for every \( R \in \mathcal{R} \) with \( \mu(R) \geq \varepsilon \), we have \( N \cap R \neq \emptyset \).

In other words, an \( \varepsilon \)-net “hits” every range of measure at least \( \varepsilon \). The \( \varepsilon \)-net theorem guarantees that small \( \varepsilon \)-nets exist for range spaces of bounded VC dimension.

\( \varepsilon \)-Net Theorem (Haussler-Welzl, 1987). Let \( (X, \mathcal{R}) \) be a range space of VC dimension \( d \), and let \( \mu \) be a probability measure on \( X \). For every \( \varepsilon > 0 \), there exists an \( \varepsilon \)-net of size at most \( C \frac{d}{\varepsilon} \log \frac{d}{\varepsilon} \), where \( C \) is an absolute constant.
The proof uses the probabilistic method combined with the Sauer-Shelah lemma. Let \( r = \lceil \frac{c d}{\varepsilon} \log \frac{d}{\varepsilon} \rceil \) for a suitable constant \( c \). Draw \( r \) points \( x_1, \ldots, x_r \) independently from \( \mu \), forming a random set \( N \). For any single fixed range \( R \) with \( \mu(R) \geq \varepsilon \), \[ \Pr[N \cap R = \emptyset] = (1 - \mu(R))^r \leq (1-\varepsilon)^r \leq e^{-\varepsilon r}. \]

To apply a union bound, we need to bound the number of distinct ranges restricted to a finite sample. Draw an additional “test sample” \( T \) of size \( t \) (to be chosen). The key observation is: if \( R \) has \( \mu(R) \geq \varepsilon \) and \( N \cap R = \emptyset \), then with reasonable probability, \( R \) contains at least \( \varepsilon t / 2 \) points of \( T \). This can be formalized via a double-sampling argument.

More precisely, consider a random sample \( S \) of size \( r + t \). We bound the probability that there exists a range \( R \) that is “heavy” (captures at least \( \varepsilon t / 2 \) points of the last \( t \) points of \( S \)) yet “missed” by the first \( r \) points of \( S \). The number of distinct subsets induced by ranges on \( S \) is at most \( \sum_{i=0}^d \binom{r+t}{i} \leq (r+t)^d \) by the Sauer-Shelah lemma.

For a fixed heavy range, the probability that all of its at least \( \varepsilon t/2 \) sample points fall outside the first \( r \) positions — i.e., that the first \( r \) points all miss it — is at most \[ \frac{\binom{r+t - \varepsilon t/2}{r}}{\binom{r+t}{r}} \leq \left(1 - \frac{\varepsilon t/2}{r+t}\right)^r. \]

Taking \( t = 2r \) and choosing \( c \) large enough, the union bound over all \( O((3r)^d) \) ranges gives a total failure probability less than 1. Thus an \( \varepsilon \)-net of size \( r = O\left(\frac{d}{\varepsilon}\log\frac{d}{\varepsilon}\right) \) exists.
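To make the definitions concrete, consider perhaps the simplest range space: \( X = [0,1] \) with the uniform measure and ranges all subintervals (VC dimension 2). A sample is an \( \varepsilon \)-net exactly when no gap it leaves, including at the two ends, can hide an interval of measure \( \varepsilon \); here the \( 1/\varepsilon \) evenly spaced points already form a net, with no \( \log \) factor needed in this easy case. A sketch (exact arithmetic via `Fraction` to avoid boundary rounding):

```python
from fractions import Fraction

def is_interval_eps_net(sample, eps):
    """Check the eps-net property for subintervals of [0,1]: every interval
    of measure >= eps must contain a sample point."""
    s = sorted(sample)
    return (s[0] <= eps and 1 - s[-1] <= eps
            and all(b - a <= eps for a, b in zip(s, s[1:])))

# 5 evenly spaced points 1/10, 3/10, ..., 9/10 form a (1/5)-net:
grid = [Fraction(1, 10) + Fraction(1, 5) * i for i in range(5)]
assert is_interval_eps_net(grid, Fraction(1, 5))
assert not is_interval_eps_net([Fraction(1, 2)], Fraction(1, 5))
```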

The epsilon-net theorem has profound applications in computational geometry (e.g., fast algorithms for range searching and geometric set cover), combinatorial geometry (e.g., weak epsilon-nets and the first selection lemma), and learning theory (PAC learning). The \( \log(1/\varepsilon) \) factor cannot be removed in general, as shown by Pach and Tardos (2013).

3.7 Epsilon-Approximations

A set \( A \subseteq X \) is an \( \varepsilon \)-approximation for \( (X, \mathcal{R}) \) with respect to a probability measure \( \mu \) if for every \( R \in \mathcal{R} \), \[ \left| \frac{|A \cap R|}{|A|} - \mu(R) \right| \leq \varepsilon. \]

An \( \varepsilon \)-approximation provides a uniform approximation of the measure of all ranges simultaneously. Every \( \varepsilon \)-approximation is a \( 2\varepsilon \)-net, but the converse is far from true. For range spaces of VC dimension \( d \), \( \varepsilon \)-approximations of size \( O(d/\varepsilon^2 \cdot \log(d/\varepsilon)) \) exist, by a more refined probabilistic argument.


4. Geometric Algorithms

Computational geometry concerns the design and analysis of algorithms for geometric problems. This chapter covers the foundational algorithms: convex hull computation, Voronoi diagrams, Delaunay triangulations, range searching, point location, and line segment intersection.

4.1 Convex Hull Algorithms

Computing the convex hull of a set of points is one of the most fundamental problems in computational geometry. We present several classical algorithms.

4.1.1 Graham Scan

Graham Scan (Graham, 1972). The convex hull of \( n \) points in \( \mathbb{R}^2 \) can be computed in \( O(n \log n) \) time.

The algorithm works as follows:

  1. Find the point \( p_0 \) with the lowest \( y \)-coordinate (breaking ties by \( x \)-coordinate).
  2. Sort the remaining points by polar angle with respect to \( p_0 \).
  3. Process points in sorted order, maintaining a stack representing the vertices of the current convex hull. For each new point, pop points from the stack while the new point makes a right turn (or goes straight) with the top two stack elements, then push the new point.

The correctness follows from the invariant that the stack always forms a convex chain. The running time is dominated by the sorting step: \( O(n \log n) \). The stack operations take \( O(n) \) amortized time (each point is pushed and popped at most once).

Consider the points \( (0,0), (1,0), (2,1), (1,2), (0,2), (-1,1) \). The Graham scan starting from \( (0,0) \) sorts by angle, then processes: it outputs the convex hull \( (0,0), (1,0), (2,1), (1,2), (0,2), (-1,1) \) — all six points are extreme, forming a hexagon.
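A compact implementation of the three steps above (a sketch; collinear points and right turns are handled by popping on non-left turns), checked on the hexagon example:

```python
from math import atan2

def cross(o, a, b):
    """Positive iff o -> a -> b is a left (counter-clockwise) turn."""
    return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])

def graham_scan(points):
    """Convex hull of >= 3 points, returned in counter-clockwise order."""
    p0 = min(points, key=lambda p: (p[1], p[0]))   # lowest point, ties by x
    rest = sorted((p for p in points if p != p0),
                  key=lambda p: (atan2(p[1]-p0[1], p[0]-p0[0]),
                                 (p[0]-p0[0])**2 + (p[1]-p0[1])**2))
    hull = [p0]
    for p in rest:
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()                             # right turn or collinear: discard
        hull.append(p)
    return hull

pts = [(1, 2), (0, 0), (2, 1), (-1, 1), (1, 0), (0, 2)]
assert graham_scan(pts) == [(0, 0), (1, 0), (2, 1), (1, 2), (0, 2), (-1, 1)]
```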

4.1.2 Gift Wrapping (Jarvis March)

Gift Wrapping (Jarvis, 1973). The convex hull of \( n \) points in \( \mathbb{R}^2 \) can be computed in \( O(nh) \) time, where \( h \) is the number of hull vertices.

Starting from the lowest point, the algorithm repeatedly selects the point that makes the smallest counter-clockwise angle with the current edge direction. Each step takes \( O(n) \) time, and there are \( h \) steps. This is output-sensitive: when \( h \) is small, it outperforms Graham scan.
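A sketch of the march, again with an exact integer cross product (assumes no duplicate points):

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); negative means b is right of ray o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def jarvis_march(pts):
    hull = []
    p = min(pts, key=lambda q: (q[1], q[0]))  # lowest point is surely on the hull
    while True:
        hull.append(p)
        # Wrap: pick q so that no point lies strictly right of the ray p -> q,
        # preferring the farther point among collinear candidates.
        q = next(r for r in pts if r != p)
        for r in pts:
            if r == p:
                continue
            c = cross(p, q, r)
            far = ((r[0] - p[0])**2 + (r[1] - p[1])**2 >
                   (q[0] - p[0])**2 + (q[1] - p[1])**2)
            if c < 0 or (c == 0 and far):
                q = r
        p = q
        if p == hull[0]:          # wrapped all the way around
            return hull           # counter-clockwise order
```

Each pass over all \( n \) points advances the hull by one vertex, so the total work is \( O(nh) \).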

4.1.3 Divide and Conquer

The convex hull of \( n \) points in \( \mathbb{R}^2 \) can be computed in \( O(n \log n) \) time by a divide-and-conquer algorithm.

Sort the points by \( x \)-coordinate. Recursively compute the convex hulls of the left and right halves. Merge the two hulls by finding the upper and lower tangent lines. The merge step takes \( O(n) \) time, giving the recurrence \( T(n) = 2T(n/2) + O(n) \), which solves to \( O(n \log n) \).

In \( \mathbb{R}^d \) for \( d \geq 2 \), the convex hull of \( n \) points can have \( \Theta(n^{\lfloor d/2 \rfloor}) \) faces (by the upper bound theorem). Algorithms achieving \( O(n \log n + n^{\lfloor d/2 \rfloor}) \) time are known.

4.2 Voronoi Diagrams

Given a set \( S = \{p_1, \ldots, p_n\} \) of sites in \( \mathbb{R}^2 \), the Voronoi diagram of \( S \) is the subdivision of the plane into Voronoi cells \[ \operatorname{Vor}(p_i) = \{ x \in \mathbb{R}^2 : \|x - p_i\| \leq \|x - p_j\| \text{ for all } j \neq i \}. \] Each cell is a convex polygon (possibly unbounded); the cells cover \( \mathbb{R}^2 \) and have pairwise disjoint interiors.

The Voronoi diagram encodes proximity information: the cell of \( p_i \) consists of all points closer to \( p_i \) than to any other site. It is a fundamental data structure with applications in nearest-neighbor queries, facility location, and many other domains.

The Voronoi diagram of \( n \) sites in \( \mathbb{R}^2 \) has at most \( 2n - 5 \) vertices and at most \( 3n - 6 \) edges (for \( n \geq 3 \)), and can be computed in \( O(n \log n) \) time.

The complexity bounds follow from Euler’s formula (the Voronoi diagram is a planar graph). The \( O(n \log n) \) algorithm can be obtained by Fortune’s sweep (Fortune, 1987) or by divide and conquer (Shamos and Hoey, 1975).
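These bounds can be sanity-checked on small instances by brute force: every Voronoi vertex is the circumcenter of three sites whose circumcircle contains no site strictly inside. The \( O(n^4) \) check below is purely illustrative beside the \( O(n \log n) \) algorithms:

```python
from itertools import combinations
import math

def voronoi_vertices(sites, eps=1e-9):
    """Brute-force Voronoi vertices: circumcenters of site triples whose
    circumcircle has no site strictly inside it."""
    verts = set()
    for (ax, ay), (bx, by), (cx, cy) in combinations(sites, 3):
        d = 2 * (ax*(by-cy) + bx*(cy-ay) + cx*(ay-by))
        if abs(d) < eps:
            continue  # collinear triple: no circumcenter
        ux = ((ax*ax+ay*ay)*(by-cy) + (bx*bx+by*by)*(cy-ay) + (cx*cx+cy*cy)*(ay-by)) / d
        uy = ((ax*ax+ay*ay)*(cx-bx) + (bx*bx+by*by)*(ax-cx) + (cx*cx+cy*cy)*(bx-ax)) / d
        r = math.hypot(ux - ax, uy - ay)
        if all(math.hypot(ux - sx, uy - sy) >= r - eps for sx, sy in sites):
            verts.add((round(ux, 6), round(uy, 6)))
    return verts

sites = [(0, 0), (4, 0), (0, 4), (4, 4), (2, 2)]
V = voronoi_vertices(sites)
assert len(V) == 4 and len(V) <= 2 * len(sites) - 5   # bound: 2n - 5 = 5
```

For the four corners of a square plus its center, the four Voronoi vertices found lie at the midpoints of the square's sides, within the \( 2n - 5 \) bound.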

4.3 Delaunay Triangulations

The Delaunay triangulation of a set \( S \) of points in \( \mathbb{R}^2 \) is the dual of the Voronoi diagram: two sites \( p_i \) and \( p_j \) are connected by a Delaunay edge if and only if their Voronoi cells share an edge. If the sites are in general position (no four points are cocircular), the Delaunay triangulation is a triangulation of \( \operatorname{conv}(S) \).
Empty circle property. Three points \( p_i, p_j, p_k \in S \) form a Delaunay triangle if and only if the circumscribed circle of the triangle \( p_i p_j p_k \) contains no other point of \( S \) in its interior.

This characterization gives an equivalent definition: a triangulation of \( S \) is Delaunay if and only if every triangle has an empty circumscribed circle. The Delaunay triangulation maximizes the minimum angle among all triangulations of the same point set, making it ideal for numerical methods.
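With integer coordinates, the empty-circle test is a single exact sign computation, a 3×3 determinant after translating the query point to the origin; this is the standard in-circle predicate used by Delaunay algorithms:

```python
def in_circle(a, b, c, d):
    """Sign test for the empty-circle property: positive iff d lies strictly
    inside the circumcircle of the counter-clockwise triangle abc, zero iff
    the four points are cocircular. Equivalent to a determinant of the points
    lifted to the paraboloid z = x^2 + y^2, after translating d to the origin."""
    m = [[p[0] - d[0], p[1] - d[1], (p[0] - d[0])**2 + (p[1] - d[1])**2]
         for p in (a, b, c)]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

assert in_circle((0, 0), (2, 0), (0, 2), (1, 1)) > 0   # circumcenter: inside
assert in_circle((0, 0), (2, 0), (0, 2), (3, 3)) < 0   # outside
assert in_circle((0, 0), (2, 0), (0, 2), (2, 2)) == 0  # cocircular
```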

There is a beautiful connection between Delaunay triangulations and convex hulls.

Lifting map duality. Lift each point \( (x_i, y_i) \in \mathbb{R}^2 \) to the point \( (x_i, y_i, x_i^2 + y_i^2) \) on the paraboloid \( z = x^2 + y^2 \) in \( \mathbb{R}^3 \). The Delaunay triangulation is the projection back down to the plane of the lower convex hull of the lifted points.

This duality provides a conceptually clean way to compute Delaunay triangulations via three-dimensional convex hull algorithms.

4.4 Range Searching

Range searching is a fundamental problem in computational geometry: preprocess a set of \( n \) points so that for a given query range \( R \) (e.g., a rectangle, half-plane, or simplex), one can efficiently count or report the points in \( R \).

Orthogonal range searching. Using a \( d \)-dimensional range tree, one can answer orthogonal range queries (axis-aligned boxes) in \( O(\log^d n + k) \) time using \( O(n \log^{d-1} n) \) space, where \( k \) is the output size. With fractional cascading, the query time improves to \( O(\log^{d-1} n + k) \).
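A minimal counting range tree for \( d = 2 \) (no fractional cascading, so queries take \( O(\log^2 n) \); the build re-sorts at each level for brevity, costing \( O(n \log^2 n) \) rather than the optimal \( O(n \log n) \)):

```python
import bisect

def build(pts):
    """Node: (xmin, xmax, sorted y-list of subtree, left child, right child)."""
    if not pts:
        return None
    pts = sorted(pts)                       # by x, ties by y
    ys = sorted(y for _, y in pts)
    if len(pts) == 1:
        return (pts[0][0], pts[0][0], ys, None, None)
    mid = len(pts) // 2
    return (pts[0][0], pts[-1][0], ys, build(pts[:mid]), build(pts[mid:]))

def count(node, x1, x2, y1, y2):
    """Number of stored points in the box [x1,x2] x [y1,y2]."""
    if node is None or node[1] < x1 or x2 < node[0]:
        return 0                            # subtree disjoint from the x-range
    if x1 <= node[0] and node[1] <= x2:     # subtree inside: binary-search on y
        return bisect.bisect_right(node[2], y2) - bisect.bisect_left(node[2], y1)
    return (count(node[3], x1, x2, y1, y2) +
            count(node[4], x1, x2, y1, y2))

tree = build([(0, 0), (1, 2), (2, 1), (3, 3), (4, 0)])
assert count(tree, 1, 3, 1, 3) == 3
assert count(tree, 0, 4, 0, 0) == 2
```

Each query decomposes the \( x \)-range into \( O(\log n) \) canonical subtrees and binary-searches each subtree's \( y \)-list, matching the \( O(\log^2 n) \) bound for \( d = 2 \).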

For non-orthogonal ranges (e.g., half-planes or simplices), different techniques are needed, often involving partition trees and cuttings.

4.5 Point Location

The point location problem asks: given a planar subdivision with \( n \) vertices, preprocess it so that for a query point \( q \), one can quickly determine which face of the subdivision contains \( q \).
Kirkpatrick's algorithm (1983). A planar subdivision with \( n \) vertices can be preprocessed in \( O(n) \) time and space to support point location queries in \( O(\log n) \) time.

Kirkpatrick’s algorithm builds a hierarchy of progressively coarser triangulations. At each level, an independent set of vertices is removed and the resulting hole is re-triangulated. The hierarchy has \( O(\log n) \) levels, and a query descends from the coarsest level to the finest.

4.6 Line Segment Intersection: The Bentley-Ottmann Algorithm

Bentley-Ottmann Algorithm (1979). Given \( n \) line segments in the plane, all \( k \) intersection points can be reported in \( O((n+k) \log n) \) time using \( O(n) \) space.

The algorithm uses the plane sweep paradigm: a vertical line sweeps from left to right, maintaining a balanced binary search tree of segments ordered by their \( y \)-coordinate at the sweep line. Event points (segment endpoints and intersection points) are processed in left-to-right order. When two segments become adjacent in the sweep structure, their intersection (if any) is predicted and added to the event queue.

The key insight is that two segments can intersect only if they are adjacent in the sweep line order at some point during the sweep. This keeps the event queue manageable: at any time, it contains at most \( O(n+k) \) events.
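The primitive invoked when two segments become adjacent — do they intersect at all? — reduces to a pair of orientation tests. A sketch with exact integer arithmetic, including the collinear-overlap cases:

```python
def orient(a, b, c):
    """Sign (-1, 0, +1) of the signed area of triangle abc."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)

def segments_intersect(p1, p2, p3, p4):
    """True iff the closed segments p1p2 and p3p4 share at least one point."""
    d1, d2 = orient(p3, p4, p1), orient(p3, p4, p2)
    d3, d4 = orient(p1, p2, p3), orient(p1, p2, p4)
    if d1 != d2 and d3 != d4:
        return True               # proper crossing
    def on_seg(a, b, c):          # c is collinear with ab: inside the bounding box?
        return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0]) and
                min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))
    return ((d1 == 0 and on_seg(p3, p4, p1)) or (d2 == 0 and on_seg(p3, p4, p2)) or
            (d3 == 0 and on_seg(p1, p2, p3)) or (d4 == 0 and on_seg(p1, p2, p4)))
```

In a full Bentley-Ottmann implementation this test is applied only to sweep-line neighbors, which is what keeps the number of candidate pairs small.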

4.7 Randomized Incremental Construction

Many geometric algorithms benefit from randomization. The paradigm of randomized incremental construction builds a geometric structure by inserting objects in random order and updating the structure after each insertion.

The Delaunay triangulation of \( n \) points in the plane can be computed in \( O(n \log n) \) expected time by randomized incremental insertion.

The algorithm inserts points one at a time in random order. After inserting a point \( p \), it finds the triangle containing \( p \), splits it, and restores the Delaunay property by edge flips. Using the history graph (or conflict graph) for point location, each insertion takes \( O(\log n) \) expected amortized time.

Randomized incremental construction extends naturally to many other problems: computing convex hulls, trapezoidal decompositions, and Voronoi diagrams in higher dimensions. The backward analysis technique, due to Clarkson and Shor (1989), provides a unified framework for analyzing these algorithms.

5. Packing, Covering, and Tiling

The study of packing, covering, and tiling problems has ancient roots — Kepler speculated about sphere packing in 1611 — and connects geometry to number theory, coding theory, and optimization. This chapter surveys the main results and open problems.

5.1 Sphere Packing

A sphere packing in \( \mathbb{R}^d \) is a collection of non-overlapping (interior-disjoint) balls of equal radius. The packing density \( \Delta \) is the fraction of space covered by the balls: \[ \Delta = \lim_{R \to \infty} \frac{\operatorname{vol}(\text{balls within } B(0,R))}{\operatorname{vol}(B(0,R))}, \] when this limit exists.
A lattice packing is one where the centers of the balls form a lattice \( \Lambda \subseteq \mathbb{R}^d \) (a discrete subgroup of \( \mathbb{R}^d \) with \( d \) linearly independent generators). The packing density of a lattice packing with balls of radius \( r \) and lattice of determinant \( \det(\Lambda) \) is \[ \Delta = \frac{V_d \cdot r^d}{\det(\Lambda)}, \] where \( V_d \) is the volume of the unit ball in \( \mathbb{R}^d \).
In \( \mathbb{R}^2 \), the densest packing is the hexagonal lattice packing, with density \( \pi / \sqrt{12} \approx 0.9069 \). This was proved by Thue (1910) and L. Fejes Tóth (1940). In \( \mathbb{R}^3 \), the face-centered cubic (FCC) and hexagonal close-packed (HCP) packings both achieve density \( \pi / \sqrt{18} \approx 0.7405 \).
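Both values follow from the lattice-density formula above, \( \Delta = V_d r^d / \det(\Lambda) \); a quick numerical check:

```python
import math

def det2(b):
    return b[0][0]*b[1][1] - b[0][1]*b[1][0]

def det3(b):
    return (b[0][0]*(b[1][1]*b[2][2] - b[1][2]*b[2][1])
          - b[0][1]*(b[1][0]*b[2][2] - b[1][2]*b[2][0])
          + b[0][2]*(b[1][0]*b[2][1] - b[1][1]*b[2][0]))

# Hexagonal lattice in R^2: shortest vector has length 1, so r = 1/2.
hex_basis = [(1, 0), (0.5, math.sqrt(3) / 2)]
hex_density = (math.pi * 0.5**2) / abs(det2(hex_basis))
assert abs(hex_density - math.pi / math.sqrt(12)) < 1e-12

# FCC lattice in R^3: basis (1,1,0),(1,0,1),(0,1,1); shortest vector sqrt(2).
fcc_basis = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]
V3 = 4 * math.pi / 3                       # volume of the unit ball in R^3
fcc_density = V3 * (math.sqrt(2) / 2)**3 / abs(det3(fcc_basis))
assert abs(fcc_density - math.pi / math.sqrt(18)) < 1e-12
```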

5.2 The Kepler Conjecture

Kepler Conjecture (Hales, 2005). The maximum density of a sphere packing in \( \mathbb{R}^3 \) is \( \pi / \sqrt{18} \approx 0.7405 \), achieved by the FCC lattice packing.

This conjecture, posed by Johannes Kepler in 1611, was one of the oldest open problems in geometry. Thomas Hales announced a proof in 1998, which was published in 2005 after extensive peer review. The proof is a monumental achievement, combining theoretical reductions with extensive computer calculations. It was later formally verified by the Flyspeck project (completed 2014), carried out in the HOL Light and Isabelle proof assistants.

5.3 The Minkowski-Hlawka Theorem

While finding the densest packing in a given dimension is extremely hard, we can prove the existence of reasonably dense packings.

Minkowski-Hlawka Theorem. For every \( d \geq 1 \), there exists a lattice packing of unit balls in \( \mathbb{R}^d \) with density at least \[ \Delta \geq \frac{\zeta(d)}{2^{d-1}}, \] where \( \zeta(d) = \sum_{k=1}^\infty k^{-d} \) is the Riemann zeta function. In particular, \( \Delta \geq 2^{-d+1} \) for \( d \geq 2 \).

The proof uses Minkowski’s theorem on lattice points in convex bodies (or, in its modern formulation, a probabilistic averaging argument over random lattices due to Siegel). Despite its simplicity, the \( 2^{-d} \) order has not been substantially improved for general \( d \) — this is one of the major open problems in the geometry of numbers.
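For a sense of scale, the guaranteed density \( \zeta(d)/2^{d-1} \) can be compared numerically with the optimal densities quoted elsewhere in these notes (hexagonal, FCC, and the \( E_8 \) value \( \pi^4/384 \)):

```python
import math

def zeta(s, terms=10_000):
    """Partial sum of the Riemann zeta function; ample accuracy for s >= 2."""
    return sum(k ** -s for k in range(1, terms + 1))

def mh_lower_bound(d):
    """Minkowski-Hlawka guarantee: a lattice packing of density >= zeta(d)/2^(d-1)."""
    return zeta(d) / 2 ** (d - 1)

optimal = {2: math.pi / math.sqrt(12),     # hexagonal lattice
           3: math.pi / math.sqrt(18),     # FCC (Kepler/Hales)
           8: math.pi ** 4 / 384}          # E8 (Viazovska)
for d, opt in optimal.items():
    assert mh_lower_bound(d) < opt         # the guarantee is far from tight
```

Already at \( d = 8 \) the guaranteed density (about 0.0078) is more than thirty times smaller than the true optimum (about 0.2537), illustrating the wide gap discussed below.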

5.4 Rogers’ Bound and High-Dimensional Packing

Rogers' Bound (1958). The density of any sphere packing in \( \mathbb{R}^d \) satisfies \[ \Delta \leq \sigma_d, \] where \( \sigma_d \) is the "simplex bound": the fraction of a regular \( d \)-simplex of edge length 2 covered by unit balls centered at its vertices. Asymptotically, this gives \( \Delta \leq 2^{-d/2 + o(d)} \).
Kabatiansky-Levenshtein Bound (1978). The density of any sphere packing in \( \mathbb{R}^d \) satisfies \[ \Delta \leq 2^{-0.599d + o(d)}. \] This improves exponentially on Rogers' bound and remains the best known asymptotic upper bound.

The gap between the lower bound (Minkowski-Hlawka, \( \Delta \geq 2^{-d+1} \)) and the upper bound (Kabatiansky-Levenshtein, \( \Delta \leq 2^{-0.599d} \)) remains wide. Understanding the true exponential rate of the optimal packing density is a deep open problem.

5.5 Connections to Coding Theory

There is a profound analogy between sphere packings and error-correcting codes. A binary code of length \( n \), minimum distance \( d \), and size \( M \) corresponds to a packing of \( M \) Hamming balls of radius \( \lfloor (d-1)/2 \rfloor \) in \( \{0,1\}^n \). The rate of the code, \( \frac{1}{n}\log_2 M \), is analogous to the packing density.

The Gilbert-Varshamov bound (existence of codes by greedy/probabilistic arguments) is the coding-theoretic analogue of the Minkowski-Hlawka theorem, while the Hamming bound (sphere-packing bound for codes) is analogous to Rogers’ bound.

In recent breakthroughs, Viazovska (2017) proved that the \( E_8 \) lattice gives the densest sphere packing in \( \mathbb{R}^8 \), with density \( \pi^4/384 \), and Cohn, Kumar, Miller, Radchenko, and Viazovska (2017) proved the analogous result for the Leech lattice in \( \mathbb{R}^{24} \). These are the only dimensions beyond 1, 2, and 3 where the optimal packing density is known.

5.6 Sphere Covering

A sphere covering of \( \mathbb{R}^d \) is a collection of balls (of equal radius) whose union is all of \( \mathbb{R}^d \). The covering density \( \theta \) is the average number of balls covering a point.

Covering is the dual problem to packing: we want to cover space with as few balls as possible. The covering density satisfies \( \theta \geq 1 \) with equality only in trivial cases. Rogers (1959) showed that the covering density of the thinnest lattice covering in \( \mathbb{R}^d \) is at most \( d \ln d + d \ln \ln d + 5d \) — growing only slightly faster than linearly in the dimension, in sharp contrast to the exponentially small densities attainable in the packing problem.

5.7 Tiling Problems

A tiling (or tessellation) of \( \mathbb{R}^d \) by a body \( K \) is a collection of congruent copies of \( K \) that cover \( \mathbb{R}^d \) with interiors pairwise disjoint.

A central question is: which convex bodies tile \( \mathbb{R}^d \)? In \( \mathbb{R}^2 \), the complete list of convex tiles is known: all triangles and quadrilaterals tile (by suitable arrangements), along with exactly 15 families of convex pentagons (the list was completed in 2017) and exactly 3 families of convex hexagons. No convex polygon with 7 or more sides tiles the plane.

5.8 The Honeycomb Conjecture

Honeycomb Conjecture (Hales, 1999). The regular hexagonal tiling is the partition of the plane into regions of equal area that minimizes the total perimeter per unit area.

This result, conjectured since antiquity (Pappus attributed it to the wisdom of bees), was proved by Thomas Hales in 1999. The proof uses the isoperimetric inequality and careful analysis of the possible local structures.


6. Topological Methods in Combinatorial Geometry

Topology provides surprisingly powerful tools for combinatorial and geometric problems. This chapter centers on the Borsuk-Ulam theorem and its many applications, including the ham-sandwich theorem, Lovász’s proof of Kneser’s conjecture, and the topological Tverberg theorem.

6.1 The Borsuk-Ulam Theorem

The Borsuk-Ulam theorem, first proved by Karol Borsuk in 1933, is one of the most beautiful and applicable results in topology. It has an extraordinary range of consequences in combinatorics, geometry, and beyond.

Borsuk-Ulam Theorem. For every continuous map \( f: S^n \to \mathbb{R}^n \), there exists a point \( x \in S^n \) such that \( f(x) = f(-x) \).

Equivalently: every continuous odd map \( g: S^n \to \mathbb{R}^n \) (satisfying \( g(-x) = -g(x) \)) has a zero.

We prove the theorem by induction on \( n \). The proof uses the degree theory of maps.

Base case \( n = 1 \). Let \( f: S^1 \to \mathbb{R} \) be continuous. Define \( g: S^1 \to \mathbb{R} \) by \( g(x) = f(x) - f(-x) \). Then \( g(-x) = f(-x) - f(x) = -g(x) \). Since \( g \) is continuous and odd, and \( S^1 \) is connected, the intermediate value theorem applies: if \( g(x_0) > 0 \) for some \( x_0 \), then \( g(-x_0) < 0 \), so \( g \) has a zero on any path from \( x_0 \) to \( -x_0 \). At this zero, \( f(x) = f(-x) \).
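This one-dimensional argument is effective: the sign change of \( g \) locates a coincidence point by bisection. A sketch (the test function is an arbitrary choice):

```python
import math

def antipodal_pair(f, tol=1e-12):
    """Find t with f(t) = f(t + pi), for continuous f on S^1 given as a
    2*pi-periodic function of the angle t. Uses g(t) = f(t) - f(t + pi),
    which satisfies g(t + pi) = -g(t), so g changes sign on [0, pi]."""
    g = lambda t: f(t) - f(t + math.pi)
    a, b = 0.0, math.pi
    if g(a) == 0.0:
        return a
    while b - a > tol:                 # bisection: keep a sign change in [a, b]
        m = (a + b) / 2
        if g(a) * g(m) <= 0:
            b = m
        else:
            a = m
    return (a + b) / 2

f = lambda t: math.sin(t + 1.0) + 0.25 * math.cos(t)   # any continuous function
t = antipodal_pair(f)
assert abs(f(t) - f(t + math.pi)) < 1e-9
```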

Inductive step. Assume the theorem holds for \( n-1 \). Let \( f: S^n \to \mathbb{R}^n \) be continuous. Define \( g: S^n \to \mathbb{R}^n \) by \( g(x) = f(x) - f(-x) \), which is continuous and odd. Suppose for contradiction that \( g(x) \neq 0 \) for all \( x \). Then \( h(x) = g(x)/\|g(x)\| \) is a continuous odd map \( h: S^n \to S^{n-1} \).

Restrict \( h \) to the equator \( S^{n-1} = \{ x \in S^n : x_{n+1} = 0 \} \). This gives a continuous odd map \( h|_{S^{n-1}}: S^{n-1} \to S^{n-1} \). Any continuous odd map from \( S^{n-1} \) to \( S^{n-1} \) has odd degree (this is the key topological fact, provable via homology or by the theory of the degree of equivariant maps); in particular, \( h|_{S^{n-1}} \) is not null-homotopic.

But \( h \) extends to the upper hemisphere \( S^n_+ = \{x \in S^n : x_{n+1} \geq 0\} \), which is contractible (homeomorphic to the closed \( n \)-disk). So \( h|_{S^{n-1}} \) extends to a map from the disk \( D^n \to S^{n-1} \), meaning \( h|_{S^{n-1}} \) is null-homotopic — a contradiction.

Thus \( g \) has a zero, meaning \( f(x) = f(-x) \) for some \( x \in S^n \).

Alternative proofs of the Borsuk-Ulam theorem exist using the fundamental group (for \( n = 1 \)), homology theory, the Lusternik-Schnirelmann category, or Tucker's lemma (a combinatorial analogue). Each approach illuminates different aspects of the theorem.

An immediate and beautiful consequence:

Borsuk-Ulam, antipodal version. There is no continuous odd map \( f: S^n \to S^{n-1} \). Equivalently, there is no continuous map \( f: S^n \to S^{n-1} \) with \( f(-x) = -f(x) \) for all \( x \).

Another classical consequence:

Lusternik-Schnirelmann Theorem. If \( S^n \) is covered by \( n+1 \) closed sets \( A_1, \ldots, A_{n+1} \), then at least one \( A_i \) contains a pair of antipodal points.

6.2 The Ham-Sandwich Theorem Revisited

We already proved the ham-sandwich theorem in Chapter 3 using Borsuk-Ulam. Let us note some generalizations.

Polynomial Ham-Sandwich Theorem (Stone-Tukey generalization). Let \( \mu_1, \ldots, \mu_m \) be finite Borel measures on \( \mathbb{R}^d \) that are absolutely continuous with respect to Lebesgue measure. If \( m \leq \binom{d+k}{k} - 1 \), then there exists a polynomial of degree at most \( k \) whose zero set bisects all \( m \) measures simultaneously.

For \( k = 1 \), this recovers the classical ham-sandwich theorem (a hyperplane bisects \( d \) measures).

6.3 Lovász’s Proof of Kneser’s Conjecture

One of the most celebrated applications of the Borsuk-Ulam theorem in combinatorics is Lovász’s 1978 proof of Kneser’s conjecture, which determined the chromatic number of the Kneser graph.

The Kneser graph \( KG(n,k) \) has as vertices all \( k \)-element subsets of \( \{1, 2, \ldots, n\} \) (where \( n \geq 2k \)), with two vertices adjacent if and only if the corresponding sets are disjoint.
The Kneser graph \( KG(5,2) \) is the Petersen graph: vertices are the \( \binom{5}{2} = 10 \) pairs from \( \{1,2,3,4,5\} \), and two pairs are adjacent iff they are disjoint. The Petersen graph has chromatic number 3.
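Both claims are small enough to confirm by exhaustive search (the chromatic-number routine below tries all colorings, which is fine at this size):

```python
from itertools import combinations, product

verts = list(combinations(range(1, 6), 2))    # the 10 pairs from {1,...,5}
edges = [(u, v) for u, v in combinations(verts, 2) if not set(u) & set(v)]
assert len(verts) == 10 and len(edges) == 15  # Petersen: 10 vertices, 15 edges

def chromatic_number(verts, edges):
    idx = {v: i for i, v in enumerate(verts)}
    for k in range(1, len(verts) + 1):        # smallest k admitting a proper coloring
        for col in product(range(k), repeat=len(verts)):
            if all(col[idx[u]] != col[idx[v]] for u, v in edges):
                return k

assert chromatic_number(verts, edges) == 3    # matches n - 2k + 2 = 5 - 4 + 2
```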

Martin Kneser conjectured in 1955 that \( \chi(KG(n,k)) = n - 2k + 2 \). The upper bound is easy: color a \( k \)-set \( A \) by \( \min(A) \) if \( \min(A) \leq n - 2k + 1 \), and by \( n - 2k + 2 \) otherwise. This uses \( n - 2k + 2 \) colors and is a proper coloring.
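The greedy coloring just described can be checked mechanically for small parameters (`kneser_coloring` implements exactly the rule above):

```python
from itertools import combinations

def kneser_coloring(n, k):
    """Color a k-set A by min(A) when min(A) <= n-2k+1, else by n-2k+2."""
    return {A: min(A) if min(A) <= n - 2 * k + 1 else n - 2 * k + 2
            for A in combinations(range(1, n + 1), k)}

def is_proper(coloring):
    """No two disjoint k-sets (i.e., adjacent Kneser vertices) share a color."""
    return all(coloring[A] != coloring[B]
               for A, B in combinations(list(coloring), 2) if not set(A) & set(B))

for n, k in [(5, 2), (7, 2), (8, 3)]:
    c = kneser_coloring(n, k)
    assert is_proper(c) and len(set(c.values())) == n - 2 * k + 2
```

Two sets sharing a color \( i \leq n-2k+1 \) both contain \( i \), and two sets with the last color both sit inside a \( (2k-1) \)-element tail, so in either case they intersect — which is exactly what the assertions verify.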

Kneser's Conjecture (Lovász, 1978). The chromatic number of the Kneser graph \( KG(n,k) \) is \( n - 2k + 2 \).
We prove the lower bound \( \chi(KG(n,k)) \geq n - 2k + 2 \). Suppose for contradiction that there is a proper coloring \( c \) of \( KG(n,k) \) using \( n - 2k + 1 \) colors. We will derive a contradiction using the Borsuk-Ulam theorem.

Identify the ground set with \( [n] = \{1, 2, \ldots, n\} \), and place points \( v_1, \ldots, v_n \in S^{n-2} \) in general position, chosen so that the origin lies in the interior of their convex hull; then every open hemisphere of \( S^{n-2} \) contains at least one of the \( v_i \). (This is possible: \( n \) points suffice to capture the origin in the interior of their hull in \( \mathbb{R}^{n-1} \).)

For each point \( x \in S^{n-2} \), consider the open hemisphere \( H_x = \{ y \in S^{n-2} : x \cdot y > 0 \} \). By our general position assumption, \( H_x \) contains at least one of the \( v_i \). Define \( A(x) \subseteq [n] \) to be the indices of points in \( H_x \), i.e., \( A(x) = \{ i \in [n] : x \cdot v_i > 0 \} \). By general position, \( |A(x)| \geq 1 \) for all \( x \).

When \( |A(x)| \geq k \), let \( B(x) \subseteq A(x) \) be the set of \( k \) indices \( i \) with the largest inner products \( x \cdot v_i \) (unique for generic \( x \)), and tentatively set

\[ f(x) = e_{c(B(x))}, \]

the \( c(B(x)) \)-th standard basis vector in \( \mathbb{R}^{n-2k+1} \); when \( |A(x)| < k \), set \( f(x) = 0 \). This tentative map need not be continuous where \( B(x) \) changes, so one smooths it coordinate-wise: for each color \( j \), take

\[ f_j(x) = \sum_{B} \max\left(0, \; \min_{i \in B} (x \cdot v_i)\right) [\![c(B) = j]\!], \]

the sum running over \( k \)-element subsets \( B \subseteq A(x) \). The precise technical details (using a partition of unity based on the inner products \( x \cdot v_i \)) show that one can construct a continuous map \( f: S^{n-2} \to \mathbb{R}^{n-2k+1} \) with the property that if \( f(x) = f(-x) \), then the \( k \)-sets associated to \( x \) and \( -x \) receive the same color yet are disjoint.

By the Borsuk-Ulam theorem, there exists \( x^* \) with \( f(x^*) = f(-x^*) \). The sets \( A(x^*) \) and \( A(-x^*) \) are disjoint (since \( H_{x^*} \) and \( H_{-x^*} \) are complementary open hemispheres), so the \( k \)-subsets \( B(x^*) \subseteq A(x^*) \) and \( B(-x^*) \subseteq A(-x^*) \) are disjoint. But \( f(x^*) = f(-x^*) \) forces \( c(B(x^*)) = c(B(-x^*)) \), contradicting the properness of the coloring.

Hence \( \chi(KG(n,k)) \geq n - 2k + 2 \).

Lovász's proof was revolutionary: it introduced topological methods into combinatorics and spawned the entire field of topological combinatorics. Greene (2002) later gave a shorter proof using the Borsuk-Ulam theorem more directly, and Matoušek (2004) gave an elegant proof using Tucker's lemma.

6.4 The Topological Tverberg Theorem

Tverberg’s theorem generalizes Radon’s theorem: while Radon partitions \( d+2 \) points into 2 parts with intersecting hulls, Tverberg considers \( r \) parts.

Tverberg's Theorem (1966). Any set of \( (r-1)(d+1)+1 \) points in \( \mathbb{R}^d \) can be partitioned into \( r \) disjoint sets whose convex hulls have a common point.

The topological version, conjectured by Bárány, Shlosman, and Szücs, strengthens this by replacing “points” with “images of vertices of a simplex.”

Topological Tverberg Theorem (Bárány-Shlosman-Szücs, 1981; Özaydin, Volovikov, Sarkaria). When \( r \) is a prime power, for any continuous map \( f: \Delta_{(r-1)(d+1)} \to \mathbb{R}^d \), there exist \( r \) pairwise disjoint faces of the simplex whose images under \( f \) have a common point.

The proof uses equivariant topology — specifically, the theory of \( \mathbb{Z}/p \)-equivariant maps and the Borsuk-Ulam theorem for cyclic groups. For non-prime-power \( r \), counterexamples were found by Mabillard and Wagner (2015) for \( d \geq 3r + 1 \), and subsequently by Frick, Avvakumov-Karasev-Skopenkov, and Blagojević-Frick-Ziegler for all non-prime-power \( r \geq 6 \).

6.5 The Necklace Splitting Problem

Necklace Splitting Theorem (Alon, 1987; the case \( k = 2 \) is due to Alon and West, 1986). Let an open necklace have \( k \cdot a_i \) beads of color \( i \), for each of \( t \) colors (with beads arranged on an interval). Then the necklace can be fairly divided among \( k \) thieves using at most \( (k-1)t \) cuts.

The case \( k = 2 \) follows directly from the Borsuk-Ulam theorem: we need at most \( t \) cuts to divide a necklace with \( t \) colors fairly between two thieves. The proof maps cutting positions to points on \( S^t \) and applies Borsuk-Ulam to show a fair partition exists.
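The discrete \( k = 2 \) statement is easy to brute-force on small necklaces. A sketch (`fair_split` tries every set of at most \( t \) cuts and every assignment of the resulting pieces; it assumes each color count is even):

```python
from itertools import combinations, product

def fair_split(beads):
    """Brute-force two-thief split of a bead list using at most t cuts,
    where t is the number of colors. Returns (cut positions, piece owners)
    or None if no fair split with <= t cuts exists (the theorem says one does)."""
    t = len(set(beads))
    n = len(beads)
    need = {c: beads.count(c) // 2 for c in set(beads)}
    for ncuts in range(t + 1):                       # fewest cuts first
        for cuts in combinations(range(1, n), ncuts):
            bounds = [0, *cuts, n]
            pieces = [beads[bounds[i]:bounds[i+1]] for i in range(len(bounds) - 1)]
            for owner in product((0, 1), repeat=len(pieces)):
                mine = [b for p, o in zip(pieces, owner) if o == 0 for b in p]
                if all(mine.count(c) == need[c] for c in need):
                    return cuts, owner
    return None

cuts, owner = fair_split(list("AABB"))    # this necklace needs the full t = 2 cuts
assert len(cuts) == 2
assert fair_split(list("ABABBABA")) is not None
```

The necklace `AABB` shows the bound \( t \) is tight for two thieves: no single cut separates it fairly.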

6.6 Borsuk’s Conjecture and Its Disproof

In 1933, Borsuk asked whether every bounded set in \( \mathbb{R}^d \) can be partitioned into \( d+1 \) sets of smaller diameter. This became known as Borsuk’s conjecture.

Disproof of Borsuk's Conjecture (Kahn-Kalai, 1993). For sufficiently large \( d \), there exist bounded sets in \( \mathbb{R}^d \) that cannot be partitioned into \( d+1 \) sets of smaller diameter. In fact, the minimum number of pieces needed can be at least \( (1.2)^{\sqrt{d}} \), which is exponential in \( \sqrt{d} \) and far exceeds \( d+1 \).

The construction uses the vertex set of a cleverly chosen subset of \( \{0,1\}^n \), exploiting the “concentration of measure” phenomenon on the Boolean cube. Specifically, they use a certain union of binary codes and show, using Frankl-Wilson-type intersection theorems, that any partition into pieces of smaller diameter must use exponentially many pieces.

Borsuk's conjecture is true for \( d \leq 3 \) (proved by Borsuk, Grünbaum, Heppes, and Straszewicz) and for smooth convex bodies in all dimensions. The smallest dimension in which a counterexample is currently known is \( d = 64 \) (Jenrich, 2014, refining Bondarenko's 65-dimensional example). The gap between where the conjecture is known to hold (small \( d \)) and where it fails (large \( d \)) remains an active area of research.

6.7 Equivariant Topology

The common thread running through the applications in this chapter is equivariant topology: the study of topological spaces with group actions and maps that respect those actions.

Let \( G \) be a group. A \( G \)-space is a topological space \( X \) equipped with a continuous action of \( G \). A continuous map \( f: X \to Y \) between \( G \)-spaces is \( G \)-equivariant (or simply equivariant) if \( f(g \cdot x) = g \cdot f(x) \) for all \( g \in G, x \in X \).

The Borsuk-Ulam theorem can be restated as: there is no \( \mathbb{Z}/2 \)-equivariant map \( S^n \to S^{n-1} \), where \( \mathbb{Z}/2 \) acts by the antipodal map. This viewpoint generalizes to actions of \( \mathbb{Z}/p \), the symmetric group \( S_r \), and other groups, leading to deeper results like the topological Tverberg theorem.

The index (or genus) of a \( \mathbb{Z}/2 \)-space \( X \) is the smallest \( n \) such that there exists a \( \mathbb{Z}/2 \)-equivariant map \( X \to S^n \). The Borsuk-Ulam theorem says the index of \( S^n \) is \( n \). Computing or bounding the index of specific spaces is the key technique for deriving combinatorial applications.


7. Algebraic Methods and Modern Developments

The last two decades have seen a revolution in combinatorial geometry driven by algebraic methods — particularly the polynomial method. This chapter presents the key results: the joints problem, the Guth-Katz polynomial partitioning, Dvir’s proof of the finite field Kakeya conjecture, and connections to additive combinatorics.

7.1 The Polynomial Method

The polynomial method, in its modern incarnation, exploits the fact that low-degree polynomials cannot vanish on too many “independent” points. The key algebraic ingredients are:

Parameter counting. The space of polynomials of degree at most \( D \) in \( d \) variables has dimension \( \binom{D+d}{d} \). Given \( m \) points in \( \mathbb{R}^d \), if \( \binom{D+d}{d} > m \), then there exists a non-zero polynomial of degree at most \( D \) vanishing on all \( m \) points.
Bézout's Theorem (affine version). Let \( f_1, \ldots, f_d \) be polynomials in \( d \) variables over an algebraically closed field, of degrees \( D_1, \ldots, D_d \). If their zero sets have only finitely many common points, then the number of common zeros (counted with multiplicity) is at most \( D_1 \cdot D_2 \cdots D_d \).
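The counting argument is effective: one linear constraint per point, and any null-space vector of the evaluation matrix yields a vanishing polynomial. A sketch in exact rational arithmetic for \( d = 2 \) (`vanishing_poly` and its helpers are illustrative names):

```python
from fractions import Fraction
from itertools import product

def vanishing_poly(points, D):
    """Nonzero polynomial of degree <= D in 2 variables vanishing on `points`,
    as a dict {(i, j): coeff} for monomials x^i y^j. Exists when C(D+2,2) > #points."""
    monos = [(i, j) for i, j in product(range(D + 1), repeat=2) if i + j <= D]
    rows = [[Fraction(x)**i * Fraction(y)**j for (i, j) in monos] for x, y in points]
    m, n = len(rows), len(monos)
    pivots, r = [], 0
    for c in range(n):                       # Gauss-Jordan elimination over Q
        piv = next((i for i in range(r, m) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        rows[r] = [v / rows[r][c] for v in rows[r]]
        for i in range(m):
            if i != r and rows[i][c] != 0:
                rows[i] = [a - rows[i][c] * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    free = next(c for c in range(n) if c not in pivots)   # exists since n > rank
    coeffs = {monos[free]: Fraction(1)}                   # set one free variable to 1
    for i, c in enumerate(pivots):
        coeffs[monos[c]] = -rows[i][free]
    return coeffs

def eval_poly(coeffs, x, y):
    return sum(v * Fraction(x)**i * Fraction(y)**j for (i, j), v in coeffs.items())

pts = [(0, 0), (1, 0), (0, 1), (2, 3), (-1, 2)]
f = vanishing_poly(pts, 2)                  # C(4,2) = 6 coefficients > 5 points
assert any(v != 0 for v in f.values())
assert all(eval_poly(f, x, y) == 0 for x, y in pts)
```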

7.2 The Joints Problem

A joint formed by a set of lines in \( \mathbb{R}^3 \) is a point where three or more non-coplanar lines meet. The joints problem, raised by Chazelle et al. in 1992, asks: given \( n \) lines in \( \mathbb{R}^3 \), how many joints can they form?

Joints Theorem (Guth-Katz, 2010; Quilodrán, 2010; Kaplan-Sharir-Shustin, 2010). A set of \( n \) lines in \( \mathbb{R}^3 \) determines at most \( O(n^{3/2}) \) joints.
We present the elegant algebraic proof due to Guth and Katz (and independently Quilodrán). Let \( \mathcal{L} \) be a set of \( n \) lines in \( \mathbb{R}^3 \) and \( J \) the set of joints. We prove \( |J| = O(n^{3/2}) \).

We argue that some line of \( \mathcal{L} \) contains few joints, and then remove lines one at a time.

Claim: some line \( \ell \in \mathcal{L} \) contains at most \( 3|J|^{1/3} \) joints.

To prove the claim, let \( D = \lfloor 3|J|^{1/3} \rfloor \), and note that \( \binom{D+3}{3} > (D+1)^3/6 \geq |J| \), so by parameter counting there is a non-zero polynomial vanishing on all of \( J \); among all such polynomials, let \( f \) be one of minimal degree, so \( \deg f \leq D \). Suppose for contradiction that every line of \( \mathcal{L} \) contains more than \( D \) joints. Restricted to a line \( \ell \), \( f \) is a univariate polynomial of degree at most \( D \) vanishing at more than \( D \) points, hence identically; thus every line of \( \mathcal{L} \) is contained in \( Z(f) = \{x : f(x) = 0\} \).

Now consider any joint \( p \in J \). At least three non-coplanar lines of \( \mathcal{L} \) pass through \( p \) and lie in \( Z(f) \), so the directional derivative of \( f \) at \( p \) vanishes in three linearly independent directions; hence \( \nabla f(p) = 0 \). Therefore each partial derivative \( \partial_i f \) vanishes on all of \( J \). Since \( f \) is non-constant (it vanishes somewhere but is not identically zero... more precisely, if \( \nabla f \equiv 0 \) then \( f \) is constant, and a constant vanishing at a point is zero), some \( \partial_i f \) is a non-zero polynomial of degree at most \( \deg f - 1 \) vanishing on \( J \) — contradicting the minimality of \( f \). This proves the claim.

Now remove from \( \mathcal{L} \) a line containing at most \( 3|J|^{1/3} \) joints, discarding the at most \( 3|J|^{1/3} \) joints that may thereby stop being joints; joints not on the removed line are unaffected. Repeating this \( n \) times removes all lines, and hence all joints, so \[ |J| \leq n \cdot 3|J|^{1/3}, \] which rearranges to \( |J| \leq \sqrt{27}\, n^{3/2} = O(n^{3/2}) \).

The joints theorem was one of the first major successes of the polynomial method in incidence geometry. The proof is remarkable for its simplicity: it uses only basic facts from algebra (polynomial vanishing and degree bounds) yet solves a problem that resisted all earlier combinatorial approaches.
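The \( n^{3/2} \) exponent is tight up to constants: a standard example (classical, though not discussed above) takes the \( 3k^2 \) axis-parallel lines of a \( k \times k \times k \) grid, every one of whose \( k^3 \) grid points is a joint. A brute-force count confirms it:

```python
from itertools import product

def grid_lines(k):
    """The 3*k^2 axis-parallel lines of the k x k x k integer grid,
    each given as (base point, direction vector)."""
    lines = []
    for b, c in product(range(k), repeat=2):
        lines += [((0, b, c), (1, 0, 0)),
                  ((b, 0, c), (0, 1, 0)),
                  ((b, c, 0), (0, 0, 1))]
    return lines

def on_line(p, base, d):
    # For an axis direction, p lies on the line iff it matches the base point
    # in every coordinate where the direction vanishes.
    return all(p[i] == base[i] for i in range(3) if d[i] == 0)

def count_joints(lines, k):
    """Grid points on >= 3 lines with distinct (hence non-coplanar) axis directions."""
    return sum(1 for p in product(range(k), repeat=3)
               if len({d for base, d in lines if on_line(p, base, d)}) >= 3)

k = 3
lines = grid_lines(k)
n, j = len(lines), count_joints(lines, k)
assert (n, j) == (3 * k * k, k ** 3)       # n = 27 lines, 27 joints
assert j == round((n / 3) ** 1.5)          # joints = (n/3)^{3/2}
```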

7.3 Polynomial Partitioning

The polynomial partitioning theorem is the main technical engine behind the Guth-Katz proof of the distinct distances theorem.

Polynomial Partitioning Theorem (Guth-Katz, 2015). For any finite set \( P \) of points in \( \mathbb{R}^d \) and any integer \( D \geq 1 \), there exists a non-zero polynomial \( f \) of degree at most \( D \) such that each connected component of \( \mathbb{R}^d \setminus Z(f) \) contains at most \( O(|P|/D^d) \) points of \( P \).

The proof applies the Stone-Tukey "polynomial ham-sandwich theorem" (itself a consequence of Borsuk-Ulam) iteratively. In round \( j \), the \( 2^{j-1} \) classes of points produced so far are all bisected simultaneously by the zero set of a single polynomial of degree \( O(2^{j/d}) \); after \( k \) rounds there are \( 2^k \) classes, each containing at most \( |P|/2^k \) of the points. The product \( f \) of the polynomials from all rounds has degree \( O(2^{k/d}) \) (a geometric series dominated by its last term), so choosing \( k \) with \( 2^k = \Theta(D^d) \) gives \( \deg f \leq D \) and cells containing \( O(|P|/D^d) \) points each. The number of cells of \( \mathbb{R}^d \setminus Z(f) \) is \( O(D^d) \) by a result of Warren (or the Milnor-Thom bound on real algebraic varieties).

Milnor-Thom Bound. For any polynomial \( f \) of degree \( D \) in \( d \) variables, the complement \( \mathbb{R}^d \setminus Z(f) \) has at most \( O(D)^d \) connected components.

7.4 Dvir’s Proof of the Finite Field Kakeya Conjecture

The Kakeya problem asks about the minimum size of a set containing a unit line segment in every direction. In finite fields, the analogous question was resolved by Zeev Dvir in 2009 with a stunningly short proof.

A Kakeya set (or Besicovitch set) in \( \mathbb{F}_q^d \) is a set \( K \subseteq \mathbb{F}_q^d \) that contains a line in every direction: for every direction \( v \in \mathbb{F}_q^d \setminus \{0\} \), there exists \( a \in \mathbb{F}_q^d \) such that \( \{a + tv : t \in \mathbb{F}_q\} \subseteq K \).
Finite Field Kakeya Theorem (Dvir, 2009). Every Kakeya set in \( \mathbb{F}_q^d \) has size at least \( c_d q^d \) for a constant \( c_d \) depending only on \( d \); Dvir's argument gives \( c_d = 1/d! \). This was conjectured by Wolff (1999), and the proof follows.
Suppose \( K \subseteq \mathbb{F}_q^d \) is a Kakeya set with \( |K| < \binom{q-1+d}{d} \). The dimension of the space of polynomials in \( d \) variables of degree at most \( q-1 \) over \( \mathbb{F}_q \) is \( \binom{q-1+d}{d} \). Since \( |K| < \binom{q-1+d}{d} \), there exists a non-zero polynomial \( f \in \mathbb{F}_q[x_1, \ldots, x_d] \) of degree at most \( q-1 \) that vanishes on all of \( K \).

Let \( D = \deg(f) \leq q - 1 \), and let \( f_D \) denote the homogeneous component of \( f \) of degree \( D \) (the leading form). We show \( f_D \) vanishes on all of \( \mathbb{F}_q^d \setminus \{0\} \).

Fix any direction \( v \in \mathbb{F}_q^d \setminus \{0\} \). Since \( K \) is a Kakeya set, there exists \( a \in \mathbb{F}_q^d \) such that \( f(a + tv) = 0 \) for all \( t \in \mathbb{F}_q \). But \( g(t) = f(a + tv) \) is a univariate polynomial of degree at most \( D \leq q-1 \) that vanishes at all \( q \) elements of \( \mathbb{F}_q \); since a non-zero univariate polynomial over a field has no more roots than its degree, \( g \equiv 0 \).

The coefficient of \( t^D \) in \( g(t) = f(a + tv) \) is \( f_D(v) \), the top-degree homogeneous part of \( f \) evaluated at \( v \). Since \( g \equiv 0 \), this coefficient vanishes: \( f_D(v) = 0 \).

Since this holds for every \( v \in \mathbb{F}_q^d \setminus \{0\} \), the polynomial \( f_D \) vanishes on all of \( \mathbb{F}_q^d \setminus \{0\} \). But \( f_D \) is homogeneous and vanishes at \( 0 \) as well (since \( f_D(0) = 0 \) for any homogeneous polynomial of positive degree). So \( f_D \) vanishes on all of \( \mathbb{F}_q^d \).

Now, the Schwartz-Zippel lemma states that a non-zero polynomial of degree \( D \leq q-1 \) can vanish on at most \( D \cdot q^{d-1} < q^d \) points of \( \mathbb{F}_q^d \). Since \( f_D \) vanishes on all \( q^d \) points, \( f_D \equiv 0 \), contradicting the assumption that \( f_D \) is the leading homogeneous part of \( f \) (which is non-zero by definition of degree).

Therefore \( |K| \geq \binom{q-1+d}{d} \geq q^d / d! \).

Dvir's proof is a paradigmatic example of the polynomial method: the key idea is that if a set is too small, a low-degree polynomial vanishes on it, and the algebraic properties of this polynomial lead to a contradiction. The proof is barely one page yet resolved a conjecture that had resisted Fourier-analytic methods for years. It inspired much of the subsequent work on the polynomial method in combinatorial geometry.
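The existence step of the proof — a non-zero polynomial of degree at most \( q-1 \) vanishing on any set smaller than \( \binom{q-1+d}{d} \) — is plain linear algebra over \( \mathbb{F}_q \). A minimal sketch for prime \( q \) (helper names are mine):

```python
from itertools import product

def eval_mono(point, expo, q):
    """Value of the monomial x^expo at `point`, reduced mod q."""
    v = 1
    for x, e in zip(point, expo):
        v = v * pow(x, e, q) % q
    return v

def nullspace_vector(M, q, ncols):
    """A non-zero nullspace vector of M over F_q (q prime), found by
    Gaussian elimination; assumes fewer rows than columns."""
    M = [row[:] for row in M]
    pivots, r = [], 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], q - 2, q)          # Fermat inverse mod q
        M[r] = [x * inv % q for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(a - f * b) % q for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    free = next(c for c in range(ncols) if c not in pivots)
    v = [0] * ncols
    v[free] = 1                               # set one free variable
    for row, pc in zip(M, pivots):
        v[pc] = -row[free] % q                # back-substitute from RREF
    return v

def vanishing_polynomial(S, q, d):
    """Non-zero polynomial of degree <= q-1 over F_q (q prime) vanishing
    on S, as {exponent tuple: coefficient}; needs |S| < C(q-1+d, d)."""
    monos = [e for e in product(range(q), repeat=d) if sum(e) <= q - 1]
    rows = [[eval_mono(p, e, q) for e in monos] for p in S]
    v = nullspace_vector(rows, q, len(monos))
    return {e: c for e, c in zip(monos, v) if c}

q, d = 5, 2
S = [(x, x * x % q) for x in range(q)]        # five points on a parabola
f = vanishing_polynomial(S, q, d)
# f is non-zero and vanishes at every point of S
```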

7.5 The Polynomial Ham-Sandwich Theorem

Polynomial Ham-Sandwich Theorem (Stone-Tukey). Given \( m \) finite point sets \( P_1, \ldots, P_m \) in \( \mathbb{R}^d \), with \( m = \binom{D+d}{d} - 1 \), there exists a non-zero polynomial \( f \) of degree at most \( D \) such that each of the open sets \( \{f > 0\} \) and \( \{f < 0\} \) contains at most \( |P_i|/2 \) points of \( P_i \), for every \( i \).

This result follows from the Borsuk-Ulam theorem: lift the points to \( \mathbb{R}^m \) via the Veronese map sending \( x \) to the vector of its non-constant monomials of degree at most \( D \), apply the classical ham-sandwich theorem there, and read the bisecting hyperplane back as a polynomial. The relevant \( \mathbb{Z}/2 \)-action on the space of polynomials of degree at most \( D \) is \( f \mapsto -f \).
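The case \( D = 1 \), \( d = 2 \) is the classical ham-sandwich theorem: \( m = \binom{3}{2} - 1 = 2 \) planar point sets admit a common bisecting line. A brute-force sketch using the standard fact that, in general position, some bisecting line passes through one point of each set (names are mine):

```python
from itertools import product

def side(p, q, r):
    """Side of the directed line p->q that r lies on: sign of the
    cross product (q - p) x (r - p); 0 means r is on the line."""
    v = (q[0]-p[0]) * (r[1]-p[1]) - (q[1]-p[1]) * (r[0]-p[0])
    return (v > 0) - (v < 0)

def ham_sandwich_line(A, B):
    """A line through one point of A and one of B with at most |A|/2
    points of A and |B|/2 points of B strictly on each side; brute
    force over all candidate point pairs (general position assumed)."""
    for a, b in product(A, B):
        if all(sum(side(a, b, p) == s for p in S) <= len(S) // 2
               for S in (A, B) for s in (1, -1)):
            return a, b
    return None

A = [(0, 0), (2, 1), (4, 0), (1, 3), (3, 3)]
B = [(0, 2), (2, 5), (4, 2)]
line = ham_sandwich_line(A, B)   # a bisecting line exists for these sets
```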

7.6 The Elekes-Rónyai Problem

Elekes-Rónyai Theorem (Raz-Sharir-Solymosi, 2016). Let \( f: \mathbb{R}^2 \to \mathbb{R} \) be a polynomial that is not of the form \( f(x,y) = g(h(x) + k(y)) \) or \( f(x,y) = g(h(x) \cdot k(y)) \) for univariate polynomials \( g, h, k \). Then for any sets \( A, B \subseteq \mathbb{R} \) with \( |A| = |B| = n \), we have \[ |f(A \times B)| = \Omega(n^{4/3}). \]

The original conjecture of Elekes and Rónyai (2000) predicts a near-quadratic bound \( n^{2-o(1)} \) (the trivial upper bound is \( |A \times B| = n^2 \)); the \( n^{4/3} \) bound of Raz, Sharir, and Solymosi, whose proof rests on the Szemerédi-Trotter theorem, is the current state of the art.
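The special forms really do have small images, which is easy to see experimentally: \( x + y \) and \( xy \) are the degenerate models, while a mixed polynomial has a visibly larger image even at small \( n \). A quick illustration (the third polynomial is assumed to be of neither special form; this is a sanity check, not evidence for the exponent):

```python
def image_size(f, A, B):
    """Number of distinct values f(a, b) over the grid A x B."""
    return len({f(a, b) for a in A for b in B})

n = 50
A = list(range(n))                                      # arithmetic progression

add = image_size(lambda x, y: x + y, A, A)              # form g(h(x)+k(y))
mul = image_size(lambda x, y: (x + 1) * (y + 1), A, A)  # form g(h(x)*k(y))
mix = image_size(lambda x, y: x * x + x * y + y, A, A)  # neither form (assumed)

print(add, mul, mix)   # add == 2n - 1 = 99; mul and mix are much larger
```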

7.7 Sum-Product Estimates via Geometry

The sum-product problem, first studied by Erdős and Szemerédi (1983), lies at the heart of additive combinatorics and has deep connections to incidence geometry.

Erdős-Szemerédi Conjecture. For any finite set \( A \subseteq \mathbb{R} \), \[ \max(|A+A|, |A \cdot A|) \geq c_\varepsilon |A|^{2-\varepsilon} \] for every \( \varepsilon > 0 \).
Elekes (1997) applied the Szemerédi-Trotter theorem to the point set \( P = (A+A) \times (A \cdot A) \) and the \( |A|^2 \) lines \( y = a(x - b) \) with \( a, b \in A \), each of which contains the \( |A| \) points \( (b + c,\, ac) \), \( c \in A \). The incidence bound yields \[ |A+A| \cdot |A \cdot A| \geq c\,|A|^{5/2}, \]

which implies \( \max(|A+A|, |A \cdot A|) = \Omega(|A|^{5/4}) \). This was the first super-linear bound and demonstrated the power of geometric methods in additive combinatorics.

The best current bound is \( \max(|A+A|, |A \cdot A|) = \Omega(|A|^{4/3 + c}) \) for a small constant \( c > 0 \), due to Rudnev, Shkredov, and Stevens (2020). The full conjecture remains open and is one of the central problems in combinatorics.
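The two extremes behind the conjecture are easy to see numerically: an arithmetic progression has the smallest possible sumset but a large product set, and a geometric progression the reverse. A small illustration (the exact product-set size of the AP is not asserted, only its growth):

```python
def sumset(A):
    """A + A = {a + b : a, b in A}."""
    return {a + b for a in A for b in A}

def prodset(A):
    """A * A = {a * b : a, b in A}."""
    return {a * b for a in A for b in A}

n = 40
ap = list(range(1, n + 1))          # arithmetic progression 1..n
gp = [2 ** i for i in range(n)]     # geometric progression 1, 2, 4, ...

# AP: |A+A| is as small as possible, 2n - 1 = 79.
# GP: |A*A| = |{2^(i+j)}| = 2n - 1 = 79, while a sum of two powers of 2
# is determined by its binary digits, so |A+A| = n(n+1)/2 = 820.
print(len(sumset(ap)), len(prodset(ap)))
print(len(sumset(gp)), len(prodset(gp)))
```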

7.8 Geometric Ramsey Theory

Ramsey theory in a geometric setting asks: what geometric configurations must appear in sufficiently large point sets?

Erdős-Szekeres Theorem (1935). For every integer \( k \geq 3 \), there exists a smallest integer \( \operatorname{ES}(k) \) such that any set of \( \operatorname{ES}(k) \) points in general position in the plane contains \( k \) points in convex position (the vertices of a convex \( k \)-gon). The Erdős-Szekeres conjecture states that \( \operatorname{ES}(k) = 2^{k-2} + 1 \).

The original proof by Erdős and Szekeres gave \( \operatorname{ES}(k) \leq \binom{2k-4}{k-2} + 1 = O(4^k / \sqrt{k}) \). This was improved by Suk (2017) to \( \operatorname{ES}(k) \leq 2^{k + o(k)} \), nearly matching the conjectured lower bound of \( 2^{k-2} + 1 \).
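The first non-trivial case, observed by Esther Klein, is \( \operatorname{ES}(4) = 5 \): any five points in general position contain four in convex position. A brute-force check over random instances using orientation tests (names are mine):

```python
from itertools import combinations
from random import seed, uniform

def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p)."""
    v = (q[0]-p[0]) * (r[1]-p[1]) - (q[1]-p[1]) * (r[0]-p[0])
    return (v > 0) - (v < 0)

def in_convex_position(quad):
    """Four points are in convex position iff none lies strictly inside
    the triangle spanned by the other three."""
    for i in range(4):
        t = [quad[j] for j in range(4) if j != i]
        signs = {orient(t[a], t[b], quad[i])
                 for a, b in ((0, 1), (1, 2), (2, 0))}
        if len(signs) == 1 and 0 not in signs:
            return False      # quad[i] sees all edges on the same side
    return True

def has_convex_quad(points):
    return any(in_convex_position(q) for q in combinations(points, 4))

seed(1)
trials = [[(uniform(0, 1), uniform(0, 1)) for _ in range(5)]
          for _ in range(200)]
# Klein: every 5 points in general position contain a convex quadrilateral.
print(all(has_convex_quad(t) for t in trials))   # True
```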

Another important result in geometric Ramsey theory concerns monochromatic distances:

Monochromatic distance theorem. For any 2-coloring of the plane, there exist two points of the same color at distance exactly 1. More generally, for any \( k \)-coloring of \( \mathbb{R}^d \) (with \( d \geq k \)), there exist two points of the same color at distance 1: a regular simplex with \( k+1 \) vertices at mutual distance 1 embeds in \( \mathbb{R}^k \subseteq \mathbb{R}^d \), and by pigeonhole two of its vertices receive the same color.

The chromatic number of the plane — the minimum number of colors needed to color \( \mathbb{R}^2 \) so that no two points at distance 1 share a color — is known to satisfy \( 5 \leq \chi(\mathbb{R}^2) \leq 7 \). The lower bound of 5 was proved by de Grey (2018), improving the previous bound of 4, and the upper bound of 7 comes from a hexagonal tiling coloring. Determining the exact value remains a famous open problem.
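The earlier lower bound \( \chi(\mathbb{R}^2) \geq 4 \) already follows from a small unit-distance graph, the 7-vertex Moser spindle: two unit rhombi hinged at a common vertex with their far tips a unit apart. Its chromatic number is 4, which a brute force confirms (the vertex labelling is mine):

```python
from itertools import product

# Moser spindle: unit rhombi (0,1,2,3) and (0,4,5,6) hinged at vertex 0,
# with tips 3 and 6 also at unit distance.  All 11 edges can be realized
# as unit distances in the plane.
EDGES = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
         (0, 4), (0, 5), (4, 5), (4, 6), (5, 6), (3, 6)]

def chromatic_number(n_vertices, edges):
    """Smallest k admitting a proper k-coloring, by brute force."""
    for k in range(1, n_vertices + 1):
        for col in product(range(k), repeat=n_vertices):
            if all(col[u] != col[v] for u, v in edges):
                return k

print(chromatic_number(7, EDGES))   # 4, hence chi(R^2) >= 4
```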

7.9 Further Directions

The polynomial method and its interactions with topology, combinatorics, and analysis continue to drive rapid progress. Notable recent developments include:

Restriction estimates and decoupling. Bourgain and Demeter (2015) proved the \( \ell^2 \)-decoupling conjecture for the paraboloid, and Bourgain, Demeter, and Guth (2016) used decoupling to settle Vinogradov's mean value theorem, with further applications in number theory and PDE (Strichartz estimates).
Capsets and the polynomial method over finite fields. Croot, Lev, Pach (2017) and Ellenberg-Gijswijt (2017) proved that a cap set in \( \mathbb{F}_3^n \) (a set with no three elements in arithmetic progression) has size at most \( O(2.756^n) \), dramatically improving the previous bounds. The proof uses the polynomial method (specifically, the slice rank method).
The sunflower conjecture. Alweiss, Lovett, Wu, and Zhang (2020) made a major breakthrough on the sunflower lemma via the notion of spread set systems, improving the Erdős-Rado bound on the maximum size of a sunflower-free family of \( w \)-element sets from roughly \( w^w \) to \( (\log w)^{O(w)} \) (for sunflowers with a fixed number of petals). While not directly a polynomial method result, it shares the same combinatorial-algebraic spirit.
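The cap-set problem above can be checked by hand at small scales: in \( \mathbb{F}_3^2 \), a largest set with no three distinct elements in arithmetic progression (equivalently, no three summing to zero in every coordinate mod 3) has 4 elements. A brute-force sketch (exhaustive, so only feasible for tiny \( n \)):

```python
from itertools import combinations, product

def is_cap(S):
    """True if no three distinct points of S lie on a line of F_3^n,
    i.e. no triple sums to zero in every coordinate mod 3."""
    return not any(all((a + b + c) % 3 == 0 for a, b, c in zip(x, y, z))
                   for x, y, z in combinations(S, 3))

def max_cap(n):
    """Size of a largest cap set in F_3^n, by exhaustive search."""
    pts = list(product(range(3), repeat=n))
    for size in range(len(pts), 0, -1):
        if any(is_cap(S) for S in combinations(pts, size)):
            return size

print(max_cap(1), max_cap(2))   # 2 4
```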

The interplay between algebra, topology, probability, and geometry exemplified throughout these notes is one of the most vibrant areas of modern mathematics, and the tools developed here — from Helly’s theorem to polynomial partitioning — form an essential foundation for further study.
