CO 353: Computational Discrete Optimization
Chaitanya Swamy
Estimated study time: 39 minutes
Based on lecture notes by Sibelius Peng — PDF source
Sources and References
Primary textbook
- Kleinberg, J. and Tardos, E. Algorithm Design. Addison-Wesley, 2005.

Supplementary texts
- Vazirani, V.V. Approximation Algorithms. Springer-Verlag, 2001.
- Williamson, D.P. and Shmoys, D.B. The Design of Approximation Algorithms. Cambridge University Press, 2011.

Online resources
- MIT 6.854 Advanced Algorithms (Karger).
- Stanford CS 261 Optimisation and Algorithmic Paradigms (Roughgarden).
- CMU 15-854 Algorithms and Complexity.
- David Williamson’s Algorithm Design lecture notes (freely available).

Lecture notes
- Peng, S. CO 353 Lecture Notes (Winter 2022). https://pdf.sibeliusp.com/1221/co353.pdf
Chapter 1: Efficient Graph Algorithms
Combinatorial optimisation begins with exact algorithms on graphs. Several fundamental problems — shortest paths, minimum spanning trees, arborescences — admit polynomial-time exact solutions with elegant correctness proofs and efficient data structures.
Shortest Paths and Dijkstra’s Algorithm
The shortest path problem: given a directed weighted graph \(G = (V, E, w)\) with \(w : E \to \mathbb{R}_{\geq 0}\) and source \(s\), find the shortest path from \(s\) to every other vertex.
Dijkstra’s algorithm maintains a set \(S\) of finalised vertices and a tentative distance \(d[v]\) for every vertex, with \(d[s] = 0\). Repeatedly, the vertex \(u \notin S\) with smallest \(d[u]\) is added to \(S\) and its outgoing edges are relaxed: \[ d[v] \leftarrow \min(d[v],\; d[u] + w(u,v)) \quad \text{for all } (u,v) \in E \text{ leaving } u. \] Correctness: by induction on \(|S|\). When \(u\) is added to \(S\), its tentative distance \(d[u]\) equals the true shortest-path distance. Key invariant: no path through a not-yet-finalised vertex can be shorter (this requires non-negative weights).
Time complexity depends on the priority queue:
- Binary heap: \(O((|V| + |E|) \log |V|)\)
- Fibonacci heap: \(O(|E| + |V| \log |V|)\) — asymptotically optimal for dense graphs
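A minimal binary-heap sketch of Dijkstra in Python (the adjacency-list format and function name are illustrative, not from the notes):

```python
import heapq

def dijkstra(adj, s):
    """Shortest-path distances from s with non-negative edge weights.

    adj: dict mapping every vertex (including sinks) to a list of
    (neighbour, weight) pairs.  Returns d with d[v] = shortest s-v distance.
    Binary-heap implementation: O((|V| + |E|) log |V|).
    """
    d = {v: float("inf") for v in adj}
    d[s] = 0.0
    heap = [(0.0, s)]            # (tentative distance, vertex)
    finalised = set()            # the set S from the invariant
    while heap:
        du, u = heapq.heappop(heap)
        if u in finalised:
            continue             # stale heap entry, skip it
        finalised.add(u)
        for v, w in adj[u]:
            if du + w < d[v]:    # relax edge (u, v)
                d[v] = du + w
                heapq.heappush(heap, (d[v], v))
    return d

# Example: dijkstra({'s': [('a', 2), ('b', 5)], 'a': [('b', 1)], 'b': []}, 's')
# -> {'s': 0.0, 'a': 2.0, 'b': 3.0}
```

Stale heap entries are skipped rather than decreased in place, which keeps the code short at the cost of a slightly larger heap.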
For negative edge weights, Bellman–Ford (relaxing all edges \(|V| - 1\) times) runs in \(O(|V||E|)\) and detects negative cycles.
Minimum Spanning Trees
A minimum spanning tree (MST) of an undirected weighted graph \(G = (V, E, w)\) is a spanning tree of minimum total weight. The problem has two key structural properties:
Theorem (Cut property). Let \(\emptyset \neq S \subsetneq V\), and let \(e\) be the minimum-weight edge crossing the cut \((S, V \setminus S)\). Then \(e\) belongs to every MST (assuming distinct weights, so that \(e\) is unique).
Theorem (Cycle property). Let \(C\) be any cycle. The maximum-weight edge of \(C\) is not in any MST (assuming distinct weights).
Both Prim’s algorithm (greedily grows a tree, always adding the cheapest edge connecting the current tree to the rest) and Kruskal’s algorithm (greedily adds the cheapest edge not creating a cycle) are justified by the cut property.
Kruskal’s algorithm: sort edges by weight; for each edge \((u,v)\) in order, add it to the forest if \(u\) and \(v\) are in different components. Uses a Union-Find (disjoint set union) data structure. Running time: \(O(|E| \log |E|)\) dominated by sorting.
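A short sketch of Kruskal with a union-find structure (path compression plus union by size); the edge-list format and names are illustrative:

```python
def kruskal(n, edges):
    """Minimum spanning forest of an undirected graph.

    n: number of vertices, labelled 0..n-1.
    edges: list of (weight, u, v) tuples.
    Returns (total weight, chosen edges).  O(|E| log |E|) from sorting.
    """
    parent = list(range(n))
    size = [1] * n

    def find(x):                      # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):                  # union by size; False if already joined
        rx, ry = find(x), find(y)
        if rx == ry:
            return False
        if size[rx] < size[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        size[rx] += size[ry]
        return True

    total, forest = 0, []
    for w, u, v in sorted(edges):     # cheapest edges first
        if union(u, v):               # endpoints in different components: keep
            total += w
            forest.append((u, v, w))
    return total, forest

# Example: kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)])
# -> (7, [(0, 1, 1), (1, 2, 2), (2, 3, 4)])
```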
Application to clustering: single-linkage hierarchical clustering builds the same structure as Kruskal’s MST. The \(k\)-clustering maximising the minimum inter-cluster distance is obtained by deleting the \(k-1\) most expensive MST edges.
Minimum-Cost Arborescences: Edmonds’ Algorithm
A directed analogue of the MST is the minimum-cost arborescence (MCA): a spanning arborescence (a directed spanning tree in which all edges point away from a root \(r\), i.e., there is exactly one directed path from \(r\) to every vertex). The MCA problem is still polynomial-time solvable, but it is harder than MST in the sense that the greedy cut-property arguments do not carry over directly.
Edmonds’ algorithm (1967, also known as Chu–Liu/Edmonds’ algorithm) solves MCA optimally:
- For each non-root vertex \(v\), select the cheapest incoming edge \(e(v) = (u^*(v), v)\).
- If the chosen edges form an arborescence, output it.
- Otherwise, there is a directed cycle \(C\) in the chosen edges. Contract \(C\) to a supernode; adjust edge costs to reflect the saving from replacing the cycle’s entry edge.
- Recursively find the MCA in the contracted graph; expand the supernode.
The algorithm runs in \(O(|E| \cdot |V|)\) time. The key insight in the contraction step: if the arborescence enters \(C\) via some edge \(e' = (x, v)\) with \(v \in C\), then the selected edge \(e(v)\) is dropped, so the effective cost of entering via \(e'\) is \(w(e') - w(e(v))\).
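A cost-only sketch of the contraction idea (it returns the optimal cost but not the edge set; recovering the edges needs extra bookkeeping during expansion). The function name and edge-list format are illustrative:

```python
from math import inf

def min_arborescence_cost(n, edges, root):
    """Cost of a minimum-cost arborescence rooted at `root` (Chu-Liu/Edmonds).

    n: number of vertices, labelled 0..n-1; edges: list of directed (u, v, w)
    triples.  Returns None if some vertex is unreachable from the root.
    Each round either finishes or contracts a cycle, so it runs in O(|V||E|).
    """
    total = 0
    while True:
        # 1. Cheapest incoming edge for every non-root vertex.
        in_cost = [inf] * n
        in_from = [-1] * n
        for u, v, w in edges:
            if u != v and v != root and w < in_cost[v]:
                in_cost[v], in_from[v] = w, u
        if any(v != root and in_cost[v] == inf for v in range(n)):
            return None                     # no arborescence exists
        in_cost[root] = 0

        # 2. Look for a directed cycle among the chosen edges.
        comp = [-1] * n                     # supernode id after contraction
        seen = [-1] * n                     # which walk first visited v
        num = 0                             # number of cycles found so far
        for s in range(n):
            total += in_cost[s]             # pay for the chosen edge into s
            v = s
            while v != root and seen[v] == -1 and comp[v] == -1:
                seen[v] = s
                v = in_from[v]
            if v != root and seen[v] == s and comp[v] == -1:
                u = v                       # the walk closed a new cycle at v
                while comp[u] == -1:
                    comp[u] = num
                    u = in_from[u]
                num += 1
        if num == 0:
            return total                    # chosen edges form an arborescence

        # 3. Contract cycles; entering a cycle at v now costs w - in_cost[v].
        for v in range(n):
            if comp[v] == -1:
                comp[v] = num
                num += 1
        edges = [(comp[u], comp[v], w - in_cost[v])
                 for u, v, w in edges if comp[u] != comp[v]]
        n, root = num, comp[root]

# Example: min_arborescence_cost(3, [(0, 1, 5), (0, 2, 1), (2, 1, 1)], 0) -> 2
```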
Chapter 2: Matroids and the Greedy Algorithm
Matroids abstract the common combinatorial structure behind greedy algorithms. They explain precisely when a greedy strategy finds the global optimum.
Matroid Axioms
Definition (Matroid). A matroid \(M = (E, \mathcal{I})\) consists of a ground set \(E\) and a collection of independent sets \(\mathcal{I} \subseteq 2^E\) satisfying:
- \(\emptyset \in \mathcal{I}\).
- (Hereditary) If \(I \in \mathcal{I}\) and \(J \subseteq I\), then \(J \in \mathcal{I}\).
- (Augmentation) If \(I, J \in \mathcal{I}\) with \(|I| < |J|\), then there exists \(e \in J \setminus I\) such that \(I \cup \{e\} \in \mathcal{I}\).
The augmentation axiom encapsulates the matroid “exchange” property: a smaller independent set can always be extended using an element of a larger one. Maximal independent sets (those not extendable) are called bases and all have the same cardinality (the rank of the matroid).
Examples:
- Graphic matroid \(M(G)\): \(E\) = edges of \(G\), \(\mathcal{I}\) = acyclic subgraphs (forests). Bases are spanning forests.
- Linear matroid (or representable matroid) over a field \(\mathbb{F}\): \(E\) = columns of a matrix \(A \in \mathbb{F}^{m \times n}\), \(\mathcal{I}\) = linearly independent subsets.
- Uniform matroid \(U_{k,n}\): \(E = \{1,\ldots,n\}\), \(\mathcal{I} = \{S : |S| \leq k\}\). Bases are all \(k\)-subsets.
- Transversal matroid: \(E\) = elements, \(\mathcal{I}\) = subsets that can be matched to distinct elements of a family of sets (Hall’s theorem).
- Partition matroid: partition \(E = E_1 \cup \cdots \cup E_t\), \(\mathcal{I} = \{S : |S \cap E_i| \leq k_i\}\).
The Greedy Algorithm for Matroids
The maximum-weight independent set (MWIS) problem asks for the maximum-weight basis (or independent set) given weights \(w : E \to \mathbb{R}_{\geq 0}\).
Theorem (Greedy algorithm is optimal on matroids). The greedy algorithm — add elements in decreasing weight order, skipping any element that would create a dependent set — solves MWIS optimally for any matroid.
The converse also holds: if the greedy algorithm is optimal for every weight function, then the independence system is a matroid.
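A sketch of the greedy algorithm against a generic independence oracle, instantiated here with the graphic matroid (an edge set is independent iff it is a forest), so on a connected graph it returns a maximum-weight spanning tree. The oracle interface and names are illustrative:

```python
def greedy_max_weight(elements, weight, independent):
    """Greedy for a maximum-weight independent set.

    elements: ground set E; weight: dict e -> non-negative weight;
    independent: oracle returning True iff a set of elements is independent.
    Optimal for every weight function exactly when (E, I) is a matroid.
    """
    chosen = set()
    for e in sorted(elements, key=weight.get, reverse=True):  # heaviest first
        if independent(chosen | {e}):
            chosen.add(e)
    return chosen

def forest_oracle(edges):
    """Graphic-matroid oracle: is this edge set acyclic?  (Union-find check.)"""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False          # the edge closes a cycle: dependent
        parent[ru] = rv
    return True

# Maximum-weight spanning tree of a triangle plus a pendant edge:
E = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
w = {("a", "b"): 3, ("b", "c"): 2, ("a", "c"): 1, ("c", "d"): 5}
print(greedy_max_weight(E, w, forest_oracle))
# -> the edges (a,b), (b,c), (c,d); the cheapest edge (a,c) closes a cycle
```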
The matroid (independent-set) polytope is \[ P(M) = \{x \in [0,1]^E : x(A) \leq r(A) \;\forall A \subseteq E\}, \]where \(r(A)\) is the rank of set \(A\). Edmonds showed that this polytope is integral — its vertices are exactly the incidence vectors of independent sets; the proof uses the greedy algorithm and LP duality (together with submodularity of the rank function), rather than total unimodularity of the constraint matrix.
Matroid intersection: the family \(\mathcal{I}_1 \cap \mathcal{I}_2\) of sets independent in two given matroids is not in general the independent-set family of a matroid, but a maximum-weight common independent set can be found in polynomial time by the matroid intersection algorithm (Lawler 1975). Key applications: bipartite matching (intersection of two partition matroids, one per side of the bipartition), spanning arborescences (graphic matroid ∩ partition matroid bounding in-degrees), colourful spanning trees.
Chapter 3: NP-Hardness and Reduction Theory
Matroids and network flows give polynomial algorithms for special problems. For most combinatorial optimisation problems, however, we cannot expect polynomial-time exact algorithms unless P = NP. Understanding the hierarchy of computational complexity is essential for deciding when to seek exact algorithms, heuristics, or approximation algorithms.
The Complexity Classes P and NP
A decision problem asks for a yes/no answer. The class \(\mathbf{P}\) contains problems solvable in polynomial time by a deterministic algorithm. The class \(\mathbf{NP}\) contains problems where a proposed solution can be verified in polynomial time.
For example, TSP (is there a tour of cost \(\leq k\)?) is in NP: given a proposed tour, we can check its cost and that it visits every vertex in polynomial time. But finding the tour may require exponential search.
Polynomial-time reductions: problem \(A\) reduces to problem \(B\) (\(A \leq_P B\)) if there is a polynomial-time algorithm that converts any instance of \(A\) into an instance of \(B\) with the same answer. If \(B \in \mathbf{P}\), then \(A \in \mathbf{P}\); contrapositively, if \(A \notin \mathbf{P}\), then \(B \notin \mathbf{P}\).
NP-completeness: a problem \(B\) is NP-hard if every problem in NP reduces to \(B\); it is NP-complete if additionally \(B \in \mathbf{NP}\).
Theorem (Cook–Levin, 1971). Boolean satisfiability (SAT) is NP-complete. More specifically, 3-SAT (SAT restricted to clauses of size exactly 3) is NP-complete.
The Cook–Levin theorem is the foundational result: it establishes the first NP-complete problem, enabling all subsequent NP-hardness proofs via reduction from 3-SAT (or from another NP-complete problem).
The reduction hierarchy for combinatorial optimisation includes:
- 3-SAT \(\leq_P\) Independent Set \(\leq_P\) Clique \(\leq_P\) Vertex Cover
- 3-SAT \(\leq_P\) Hamiltonian Cycle \(\leq_P\) TSP
- 3-SAT \(\leq_P\) 3-Coloring \(\leq_P\) Graph Coloring
- Steiner Tree (on general graphs) is NP-hard by reduction from 3-SAT or exact cover
Steiner tree problem: given a graph \(G = (V, E)\) with edge costs and a set of terminal vertices \(T \subseteq V\), find the minimum-cost connected subgraph spanning \(T\). Unlike MST (the case \(T = V\)) and shortest path (the case \(|T| = 2\)), both of which are polynomial, the general problem is NP-hard — the difficulty is choosing which non-terminal vertices (Steiner vertices) to include.
Chapter 4: Approximation Algorithms via LP Rounding
When exact optimisation is NP-hard, we seek approximation algorithms: polynomial-time algorithms that output a feasible solution whose cost is within a guaranteed factor of the optimal. An algorithm is an \(\alpha\)-approximation if its output has objective value at most \(\alpha \cdot \text{OPT}\) for minimisation (or at least \(\text{OPT}/\alpha\) for maximisation).
LP Rounding for Set Cover
The set cover problem: given a universe \(\mathcal{U}\) of \(n\) elements and \(m\) sets \(S_1, \ldots, S_m \subseteq \mathcal{U}\) with costs \(c_i \geq 0\), find the minimum-cost subcollection covering \(\mathcal{U}\).
The LP relaxation is \[ \min\;\sum_i c_i x_i \quad \text{s.t.}\quad \sum_{i : e \in S_i} x_i \geq 1 \;\forall e \in \mathcal{U},\quad x_i \geq 0. \] The LP optimal value \(z^*_\text{LP}\) lower-bounds the IP optimum \(z^*\).
LP rounding: round each fractional \(x_i^* \in [0,1]\) to 1 if \(x_i^* \geq 1/f\), where \(f = \max_e |\{i : e \in S_i\}|\) is the maximum frequency of any element. This gives an \(f\)-approximation:
- Every element is covered: since \(\sum_{i : e \in S_i} x_i^* \geq 1\) and at most \(f\) sets cover \(e\), at least one has \(x_i^* \geq 1/f\) and is rounded to 1.
- Cost: the rounded solution \(y\) (with \(y_i = 1\) iff \(x_i^* \geq 1/f\)) satisfies \(\sum_i c_i y_i \leq f \sum_i c_i x_i^* = f \cdot z^*_\text{LP} \leq f \cdot z^*\).
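A small sketch of this frequency-based rounding, assuming scipy is available to solve the LP relaxation; the instance and names are made up for illustration:

```python
import numpy as np
from scipy.optimize import linprog

def lp_round_set_cover(universe, sets, costs):
    """f-approximation for set cover via LP rounding.

    universe: list of elements; sets: list of element lists; costs: set costs.
    Solve the LP relaxation, then keep every set whose LP value is >= 1/f,
    where f is the maximum element frequency.
    """
    m = len(sets)
    # Covering constraints sum_{i: e in S_i} x_i >= 1, written as -A x <= -1.
    A = np.array([[-1.0 if e in s else 0.0 for s in sets] for e in universe])
    b = -np.ones(len(universe))
    res = linprog(c=costs, A_ub=A, b_ub=b, bounds=(0, 1), method="highs")
    f = max(int(-A[row].sum()) for row in range(len(universe)))  # max frequency
    chosen = [i for i in range(m) if res.x[i] >= 1.0 / f - 1e-9]
    return chosen, res.fun            # rounded sets and the LP lower bound

# Example: every element appears in at most 2 sets, so f = 2.
chosen, lp_val = lp_round_set_cover(
    universe=[1, 2, 3, 4],
    sets=[[1, 2], [2, 3, 4], [1, 4]],
    costs=[1.0, 1.0, 1.0])
print(chosen, lp_val)
```

The chosen sets cover every element by the argument above, and their total cost is at most \(f\) times the LP value printed alongside them.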
For general set cover, the greedy algorithm gives an \(H_n \approx \ln n\)-approximation (Chvátal 1979, Johnson 1974), and this is essentially optimal: a \((1 - \varepsilon)\ln n\)-approximation is impossible unless NP ⊆ DTIME(\(n^{O(\log \log n)}\)) (Feige 1998).
Primal-Dual Algorithms for Network Connectivity
The primal-dual method is a general LP-based technique for combinatorial optimisation. Instead of solving the LP and rounding, it simultaneously constructs a primal solution and a dual solution (certificate of near-optimality).
\(f\)-connectivity problem: given a graph \(G = (V, E, w)\) and connectivity requirements \(f : V \times V \to \{0, 1\}\), find the minimum-weight subgraph \(H\) such that for all \((u,v)\) with \(f(u,v) = 1\), \(H\) contains a path from \(u\) to \(v\). This generalises Steiner forest (which generalises Steiner tree and MST).
The primal-dual algorithm for \(f\)-connectivity:
- Start with the empty solution \(F = \emptyset\).
- Let \(\mathcal{C}\) be the set of violated components — connected components of \((V, F)\) that need more connections.
- Raise the dual variable \(y_C\) for each violated component \(C\) at the same rate until some edge \(e = (u,v)\) crossing the cut for some violated component becomes tight (its dual packing constraint is met).
- Add all tight edges to \(F\); update the violated components; repeat until none are violated. (A final pruning pass removes edges of \(F\) not needed for feasibility; the 2-approximation analysis relies on it.)
Theorem (Approximation ratio). The primal-dual algorithm for \(f\)-connectivity achieves a 2-approximation for Steiner forest. For general \(f\)-connectivity with integer requirements \(f \leq k\), it gives a \(2k\)-approximation.
The factor 2 comes from the dual: the dual solution is a feasible packing, so its value lower-bounds OPT, and the cost of the (pruned) primal solution is at most twice the dual objective — in every iteration, the edges of the final solution cross the violated components with average degree at most 2.
Uncapacitated Facility Location
The uncapacitated facility location (UFL) problem models choosing which warehouses to open to serve customers. Given a set \(\mathcal{F}\) of facilities with opening costs \(f_i\), a set \(\mathcal{D}\) of clients with demands, and assignment costs \(c_{ij}\) (cost to serve client \(j\) from facility \(i\)), minimise total opening plus assignment cost.
The LP relaxation is \[ \min\;\sum_i f_i y_i + \sum_{i,j} c_{ij} x_{ij} \quad \text{s.t.}\quad \sum_i x_{ij} \geq 1 \;\forall j,\quad x_{ij} \leq y_i,\quad x_{ij}, y_i \geq 0. \] LP rounding for metric UFL (Shmoys–Tardos–Aardal 1997): the metric assumption (\(c_{ij}\) satisfies the triangle inequality) enables constant-factor LP-rounding approximations. The Jain–Vazirani (2001) primal-dual algorithm gives a 3-approximation directly. The best known approximation ratio is 1.488 (Li 2013), while metric UFL is hard to approximate below roughly 1.463 (Guha–Khuller), so little room for improvement remains.
Chapter 5: Integer Programming Methods — Branch and Bound
When approximation is insufficient and exact solutions are needed for tractable-size instances, branch-and-bound is the standard approach.
Branch-and-Bound
Branch-and-bound solves an IP by recursively partitioning the feasible set into smaller subproblems (branching) and computing LP relaxation bounds to prune infeasible or suboptimal branches.
The algorithm maintains:
- A best integer solution found so far (incumbent) with value \(\bar{z}\).
- An open list of subproblems (nodes in the branch tree), each with an LP relaxation bound.
At each node:
- Solve the LP relaxation. If it is infeasible, or its bound is no better than the incumbent (for minimisation, LP value \(\geq \bar{z}\)): prune.
- If LP solution is integral: update incumbent, prune.
- Branch: pick a fractional variable \(x_j = \bar{x}_j \notin \mathbb{Z}\). Create two children: add \(x_j \leq \lfloor \bar{x}_j \rfloor\) or \(x_j \geq \lceil \bar{x}_j \rceil\).
The LP bound at each node is a relaxation of the true IP, so it gives a valid lower bound (for minimisation). Tighter bounds reduce branching; modern solvers add cutting planes at nodes (branch-and-cut) and use sophisticated branching rules.
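A compact sketch of branch-and-bound for binary minimisation IPs, again assuming scipy for the node LP relaxations; real solvers add presolve, cutting planes, and far better branching rules:

```python
import math
from scipy.optimize import linprog

def branch_and_bound(c, A_ub, b_ub, tol=1e-6):
    """Minimise c @ x over binary x subject to A_ub @ x <= b_ub.

    Returns (value, x) for the best integer solution, or (inf, None).
    Each node is a list of per-variable bounds; LP bounds prune the tree.
    """
    n = len(c)
    best_val, best_x = math.inf, None
    stack = [[(0.0, 1.0)] * n]               # root node: all variables free in [0, 1]
    while stack:
        bounds = stack.pop()
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        if not res.success or res.fun >= best_val - tol:
            continue                          # infeasible, or LP bound no better: prune
        x = res.x
        frac = [j for j in range(n) if min(x[j], 1 - x[j]) > tol]
        if not frac:                          # integral LP solution: new incumbent
            best_val, best_x = res.fun, [int(round(v)) for v in x]
            continue
        j = max(frac, key=lambda k: min(x[k], 1 - x[k]))   # most fractional variable
        for fixed in (0.0, 1.0):              # branch: x_j <= 0  or  x_j >= 1
            child = list(bounds)
            child[j] = (fixed, fixed)
            stack.append(child)
    return best_val, best_x

# Example (0-1 knapsack, written as minimisation of negated values):
# maximise 4x0 + 5x1 + 3x2  s.t.  2x0 + 3x1 + x2 <= 4
val, x = branch_and_bound(c=[-4.0, -5.0, -3.0], A_ub=[[2.0, 3.0, 1.0]], b_ub=[4.0])
print(-val, x)   # optimal value 8 with x = [0, 1, 1]
```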
Lagrangian relaxation provides an alternative bounding method. Relax hard constraints with Lagrange multipliers \(\lambda\), solving a simpler Lagrangian subproblem: \[ z_D(\lambda) = \min_x \{c^T x + \lambda^T (Ax - b) : x \in X\}. \] The Lagrangian dual \(\max_{\lambda \geq 0} z_D(\lambda)\) is a concave maximisation problem solved by subgradient methods. For many structured problems (TSP, capacitated problems), the Lagrangian bound is tighter than the LP bound, justifying the extra computation.
TSP and the Held–Karp bound: Lagrangian relaxation of the TSP with the degree constraints relaxed gives the Held–Karp lower bound — relax the constraint that each vertex has degree 2, penalising violations with a multiplier vector \(\lambda\). The subproblem is a minimum-weight 1-tree (a spanning tree on the other vertices plus the two cheapest edges at the distinguished vertex; easy to compute), and subgradient ascent maximises the bound. This bound is tight for many instances and is the basis of the best exact TSP solvers (Concorde).
Chapter 6: Beyond This Course
Natural Next Topics
CO 353 develops the core algorithmic toolkit for discrete optimisation: graph algorithms, matroids, NP-hardness, approximation algorithms via LP rounding and primal-dual, and integer programming via branch-and-bound. Each of these areas has deep extensions.
Hardness of approximation and the PCP theorem. CO 353 establishes NP-hardness for exact optimisation. But a harder question is: how close to optimal can a polynomial-time algorithm get? The Probabilistically Checkable Proofs (PCP) theorem (Arora–Lund–Motwani–Sudan–Szegedy, 1992) is the key tool: NP = PCP\([O(\log n), O(1)]\), meaning every NP proof can be verified by a randomised algorithm that reads only a constant number of bits of the proof, with one-sided error. This yields strong inapproximability results: for many problems there is a constant factor beyond which polynomial-time approximation is NP-hard, and some (such as MAX-CLIQUE) cannot be approximated within any constant factor unless P = NP.
The PCP theorem gives inapproximability results for many problems: MAX-3-SAT cannot be approximated within a factor better than \(7/8\) unless P = NP (matching the best-known algorithm); MAX-CLIQUE cannot be approximated within \(n^{1-\varepsilon}\) for any \(\varepsilon > 0\) (Håstad’s result, requiring PCPs with low error); Set Cover cannot be approximated within \((1 - \varepsilon) \ln n\) unless NP ⊆ DTIME(\(n^{O(\log \log n)}\)) (Feige 1998). These results show that the approximation ratios achieved by the algorithms in CO 353 are often tight.
The proof of the PCP theorem involves two main ingredients: verifier composition (combining multiple probabilistic checks into one) and the low-degree test (checking that a function is approximately a low-degree polynomial over a finite field, a connection to algebraic coding theory). The algebraic machinery is related to the BCH bound and polynomial evaluation codes from CO 331.
The Unique Games Conjecture (UGC). Khot (2002) conjectured that the Unique Games problem (given a bipartite constraint graph in which each edge carries a permutation between the label sets of its two endpoints, distinguish instances where almost all constraints can be satisfied from instances where almost none can) is NP-hard. If true, UGC implies that many LP/SDP-based approximations are optimal:
- Goemans–Williamson 0.878 for MAX-CUT is optimal (Khot–Kindler–Mossel–O’Donnell).
- Best 2-approximation for vertex cover is optimal (Khot–Regev).
- Shmoys–Tardos 3-approximation for metric UFL is within a constant of optimal.

The UGC is the most important open problem in approximation algorithms and connects to the SDP hierarchy: subexponential algorithms for Unique Games (Arora–Barak–Steurer 2010, running in time \(e^{n^{\varepsilon}}\)) are known, suggesting the conjecture may be false.
Fixed-parameter tractability. Many NP-hard problems become tractable when a parameter \(k\) (e.g., solution size, treewidth, genus) is small. An FPT algorithm runs in time \(f(k) \cdot n^{O(1)}\) — polynomial in \(n\) for fixed \(k\). The key concept is treewidth: a graph of treewidth \(w\) can be solved by dynamic programming on its tree decomposition in time \(f(w) \cdot n\). Courcelle’s theorem states that any property expressible in monadic second-order logic (MSOL) — including independent set, colouring, and Hamiltonian path — can be decided in \(O(f(w) \cdot n)\) time on graphs of treewidth at most \(w\). The W-hierarchy of parameterized complexity theory provides lower bounds: Vertex Cover is FPT, but k-Clique is W[1]-hard (no FPT algorithm unless FPT = W[1]).
Submodular functions. A set function \(f : 2^E \to \mathbb{R}\) is submodular if for all \(A \subseteq B \subseteq E\) and every \(e \notin B\), \[ f(A \cup \{e\}) - f(A) \geq f(B \cup \{e\}) - f(B). \] This diminishing returns property is the discrete analogue of concavity. Submodularity unifies many functions: coverage (adding a new set covers fewer new elements as the already-covered set grows), entropy (adding a random variable to a set provides less new information as the set grows), and cut functions (which are submodular for undirected graphs).
Submodular minimisation is in polynomial time: the Grötschel–Lovász–Schrijver ellipsoid algorithm (1981) and Orlin’s combinatorial algorithm (2009) minimise submodular functions in strongly polynomial time. Submodular maximisation is NP-hard in general, but the greedy algorithm achieves a \((1 - 1/e)\)-approximation for monotone submodular functions under cardinality constraints (Nemhauser–Wolsey–Fisher 1978) — and this ratio is tight (Feige 1998). For unconstrained non-monotone submodular maximisation, a tight \(1/2\)-approximation is known (Buchbinder–Feldman–Naor–Schwartz 2012).
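A tiny sketch of the greedy rule for a monotone submodular objective (set coverage), illustrating the diminishing-returns structure the \((1 - 1/e)\) analysis exploits; the instance is made up:

```python
def greedy_max_coverage(sets, k):
    """Pick at most k sets greedily to maximise the number of covered elements.

    Coverage is monotone submodular, so the greedy solution covers at least
    a (1 - 1/e) fraction of the optimum.
    """
    covered, picked = set(), []
    for _ in range(min(k, len(sets))):
        gains = [(len(set(s) - covered), i) for i, s in enumerate(sets)
                 if i not in picked]
        gain, i = max(gains)          # set with the largest marginal gain
        if gain == 0:
            break                     # nothing new can be covered
        picked.append(i)
        covered |= set(sets[i])
    return picked, covered

# Example: greedy_max_coverage([[1, 2, 3], [3, 4], [4, 5, 6]], 2)
# picks set 2 then set 0, covering all six elements.
```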
The Lovász extension \(\hat{f} : [0,1]^E \to \mathbb{R}\) of a submodular function \(f\) is convex, and minimising \(f\) reduces to minimising \(\hat{f}\). This establishes submodular functions as the discrete analogue of convex functions, and submodular minimisation as the discrete analogue of convex optimisation.
Matroid intersection and submodular flows. CO 353 covers matroid optimisation; matroid intersection finds the maximum-weight set that is independent in both of two given matroids. The submodular flow problem (Edmonds–Giles 1977) is a vast generalisation: optimise a linear function over flows in a network where the capacity constraints are given by a submodular function. This framework unifies matroid intersection, network flows, and graph orientation problems into a single polynomial-time solvable model.
Randomised algorithms and online algorithms. CO 353 focuses on deterministic worst-case algorithms. Randomised algorithms achieve better expected-case performance: Miller–Rabin primality testing, randomised rounding of LP solutions, and randomised Quicksort. Online algorithms must make irrevocable decisions as input arrives — classic examples include online scheduling, online bipartite matching (used in internet advertising), and the ski rental problem. The competitive ratio measures how much worse an online algorithm is than the optimal offline algorithm in the worst case.
Follow-up Courses and Reading
CO 353’s content spans algorithms, complexity, and optimisation theory. The following resources continue in each direction.
At UWaterloo: CO 453 (Network Design) develops the primal-dual and Lagrangian relaxation techniques further, with applications to survivable network design and capacitated problems. CO 454 (Scheduling) covers machine scheduling models — single machine, parallel machines, jobs with precedence constraints — using matroid and LP-rounding approaches. CO 750 (Combinatorial Optimisation) covers total dual integrality, the matching polytope, perfect graphs (the Strong Perfect Graph Theorem), and Schrijver’s polyhedral theory at a graduate level. CS 487 (Introduction to Computational Complexity) develops the PCP theorem, interactive proofs, and the counting complexity class #P rigorously. CO 673 (Approximation Algorithms) is a full graduate course on the subject.
Key texts: Kleinberg–Tardos Algorithm Design (Addison-Wesley, 2005) is the most accessible graduate algorithms text and covers the core topics of CO 353 with excellent exposition. Vazirani Approximation Algorithms (Springer, 2001) is the original graduate approximation algorithms reference. Williamson–Shmoys The Design of Approximation Algorithms (Cambridge, 2011; free at designofapproxalgs.com) is the updated comprehensive reference with LP-rounding, primal-dual, and randomised rounding. Schrijver Combinatorial Optimisation: Polyhedra and Efficiency (3 volumes, Springer, 2003) is the encyclopedic polyhedral theory reference. For FPT: Cygan et al. Parameterized Algorithms (Springer, 2015; free online) covers the W-hierarchy and FPT algorithms comprehensively. For submodular functions: Fujishige Submodular Functions and Optimization, 2nd ed. (Elsevier, 2005) provides a comprehensive treatment. Arora–Barak Computational Complexity: A Modern Approach (Cambridge, 2009; draft free online) covers PCP, interactive proofs, and hardness of approximation.
Online resources: Tim Roughgarden’s CS 261 (Stanford, Optimization and Algorithmic Paradigms) lecture notes, Luca Trevisan’s CS 294 lecture notes on the PCP theorem, and the Approximation Algorithms survey by Vazirani are all freely available.
Open Problems and Active Research
P versus NP. The Clay Millennium Problem (one of seven, $1M prize). Resolving it would settle the fundamental question of whether every NP-hard problem requires exponential time. A polynomial-time algorithm for TSP would revolutionise logistics, cryptography (factoring is in NP), drug discovery (protein folding can be formulated as an NP problem), and artificial intelligence. Most researchers believe P ≠ NP, but a proof seems very far away. Natural proof barriers (Razborov–Rudich 1994), algebrization barriers (Aaronson–Wigderson 2009), and relativization barriers (Baker–Gill–Solovay 1975) rule out large classes of proof techniques, suggesting that entirely new approaches are needed.
The Unique Games Conjecture. As discussed, UGC would confirm the optimality of many known approximation ratios. But subexponential algorithms for Unique Games and connections to SDPs and the Lasserre hierarchy suggest it may be false. The deepest question: is there a polynomial-time algorithm for Unique Games achieving a non-trivial approximation guarantee? If so, the conjecture is false and the matching hardness results (for example, the factor-2 barrier for vertex cover) lose their justification. Resolving UGC is the central open problem in approximation algorithms.
TSP approximation. The metric TSP has been studied since the 1970s. The Christofides–Serdyukov 3/2-approximation (1976) was the state of the art for 44 years. Karlin, Klein, and Oveis Gharan (2020, STOC Best Paper) improved this to \(3/2 - \varepsilon\) for an explicit tiny \(\varepsilon > 0\) using the max-entropy algorithm — a randomised variant of Christofides that samples the spanning tree from a maximum-entropy distribution over spanning trees consistent with the LP solution. (Related ideas about thin trees play a central role in approximating the asymmetric TSP.) The conjectured approximability threshold is \(4/3\), matching the conjectured integrality gap of the subtour (Held–Karp) LP relaxation; closing the gap between \(3/2 - \varepsilon\) and \(4/3\) is open. Metric TSP is APX-hard, so a PTAS is impossible unless P = NP.
Submodular maximisation and the streaming setting. For monotone submodular maximisation under a cardinality constraint \(k\), the greedy algorithm achieves \(1 - 1/e\). In the streaming setting — where elements arrive one by one and the algorithm uses only roughly \(O(k)\) memory — the sieve-streaming algorithm of Badanidiyuru–Mirzasoleiman–Karbasi–Krause (2014) achieves a \(1/2 - \varepsilon\) ratio in a single pass. For non-monotone submodular maximisation in streams, the best ratio is \(1/4\) (Chakrabarti–Kale 2015), far from the offline ratio. Closing the gap between streaming and offline submodular maximisation is open.
Graph isomorphism and canonisation. Graph isomorphism (GI) asks whether two graphs are the same up to relabelling. GI is in NP (propose an isomorphism and verify it in polynomial time), and GI is also in co-AM (via interactive proofs), so it is unlikely to be NP-complete; it is not known to be in P or to be NP-complete. Babai’s quasi-polynomial algorithm (\(2^{O((\log n)^3)}\) time) was the first improvement since 1983 (Luks) and relies on the structure of primitive permutation groups via the classification of finite simple groups. Whether GI is in P is open. Canonisation — computing a canonical form for graphs — is at least as hard as GI, and whether both can be done in polynomial time remains an outstanding question in complexity theory.
Parameterized complexity and the W-hierarchy. The W-hierarchy provides lower bounds for FPT: if a problem is W[1]-hard, no FPT algorithm is expected. But the W-hierarchy is defined relative to bounded nondeterminism and is not known to relate to P vs. NP. Whether all W[1]-hard problems require \(f(k) \cdot n^{\omega(1)}\) time (a so-called “hardness in exponent”) depends on fine-grained complexity assumptions like the Exponential Time Hypothesis (ETH): 3-SAT requires \(2^{\Omega(n)}\) time. ETH implies k-Clique requires \(n^{\Omega(k)}\) time. Whether ETH is true — whether 3-SAT really has an exponential lower bound — is wide open, and is the fundamental hypothesis underlying fine-grained algorithm lower bounds.
Counting complexity and the permanent. CO 353 focuses on decision and optimisation problems: does a solution exist? What is the optimal value? A complementary direction is counting complexity: how many solutions exist? The problem #SAT — counting the number of satisfying assignments to a Boolean formula — is in the complexity class #P (Valiant 1979). Valiant defined #P and proved a landmark result: computing the permanent of a 0-1 matrix is #P-complete, even though computing the determinant is polynomial time (by Gaussian elimination). The permanent \(\text{perm}(A) = \sum_{\sigma \in S_n} \prod_{i=1}^n A_{i,\sigma(i)}\) of a 0-1 matrix is the number of perfect matchings in the bipartite graph with biadjacency matrix \(A\). The determinant and permanent have identical formulas except for the sign of permutations, yet they have vastly different computational complexity. Toda’s theorem (1991) shows that the entire polynomial hierarchy PH is contained in \(P^{\#P}\) — any problem that can be decided in the polynomial hierarchy can be computed in polynomial time with an oracle for #SAT. This makes #P arguably more powerful than NP. Approximately counting perfect matchings in bipartite graphs admits an FPRAS via Markov chain Monte Carlo (Jerrum–Sinclair 1989 for dense graphs, Jerrum–Sinclair–Vigoda 2001 in general); whether the same is possible for perfect matchings in general (non-bipartite) graphs remains a fundamental open question.
Derandomisation and pseudorandom generators. Many efficient algorithms — Quicksort, Miller–Rabin primality testing, randomised rounding of LP solutions — use randomness. Derandomisation asks whether randomness is truly necessary or whether deterministic algorithms with equal performance exist. The key question: does \(P = BPP\) (where BPP is the class of problems solvable by polynomial-time randomised algorithms with two-sided error)? Most complexity theorists believe \(P = BPP\), but the best known result (Impagliazzo–Wigderson 1997) is conditional: if there is a problem solvable in time \(2^{O(n)}\) that requires circuits of size \(2^{\Omega(n)}\), then \(P = BPP\). Pseudorandom generators (PRGs) produce sequences indistinguishable from random to polynomial-time algorithms using only \(O(\log n)\) truly random bits; Nisan–Wigderson PRGs (1994) are constructed from hard functions. For space-bounded computation there are unconditional results: Nisan (1992) built a pseudorandom generator that fools log-space algorithms, and Reingold (2008) proved that undirected reachability is in log-space (L), a seminal derandomisation result using expander graphs.
Communication complexity and circuit lower bounds. Communication complexity (Yao 1979) studies the minimum number of bits Alice and Bob must exchange to jointly compute a function \(f(x, y)\) where Alice holds \(x\) and Bob holds \(y\). This model is cleaner than Turing machine complexity and has yielded tight lower bounds for data structures, streaming algorithms, and circuit depth. The disjointness function (DISJ\(_n\): are Alice’s and Bob’s \(n\)-bit subsets disjoint?) requires \(\Omega(n)\) communication even for randomised protocols (Kalyanasundaram–Schnitger 1992; Razborov 1992), using the technique of distributional complexity. Communication complexity lower bounds imply lower bounds on circuit depth via Karchmer–Wigderson games, which convert a shallow circuit for a function into an efficient protocol for an associated communication problem. Despite decades of effort, proving that explicit functions (say, in NP) require super-polynomial circuit size remains the central open problem in circuit complexity; by the Karp–Lipton theorem, NP ⊆ P/poly would collapse the polynomial hierarchy, so such lower bounds would separate complexity classes. The natural proofs barrier (Razborov–Rudich 1994) explains why known techniques fail: any proof technique that proves lower bounds for a “natural” property of Boolean circuits implies an efficient algorithm for distinguishing pseudorandom functions from random ones — which contradicts the existence of strong PRGs. Finding circuit lower bound techniques that avoid the natural proofs barrier is the main challenge in complexity theory, and CO 353’s algorithmic foundations are part of the toolkit needed to understand it.
Streaming algorithms and sketching. CO 353 studies algorithms that process the full input. In many modern settings — network traffic analysis, database analytics, social network mining — the input is a stream of elements too large to store, and the algorithm must process each element once in small memory. Streaming algorithms maintain compact sketches of the data seen so far. The count-min sketch (Cormode–Muthukrishnan 2005) maintains an approximate frequency table in \(O(\varepsilon^{-1} \log(1/\delta))\) space, answering point queries (what is the approximate frequency of item \(i\)?) with additive error \(\varepsilon \cdot \|f\|_1\) and failure probability \(\delta\). The AMS sketch (Alon–Matias–Szegedy 1996, FOCS Test of Time Award) estimates the second moment \(\|f\|_2^2 = \sum_i f_i^2\) (stream variance) in \(O(\varepsilon^{-2})\) space using random linear maps — the first instance of dimensionality reduction for streaming. The Misra–Gries algorithm finds all items with frequency \(> n/k\) in \(O(k)\) space. For graph streams, the algorithm must process edges one by one; computing exact matchings, cuts, or connected components is impossible in sublinear space, but \(\varepsilon\)-approximations exist. Graph sparsification in streaming — maintaining a sparse subgraph that preserves cut values up to a factor \(1 \pm \varepsilon\) — can be done in \(\tilde{O}(n/\varepsilon^2)\) space. The connection to CO 353 is through approximation algorithms: sketching provides the “fast preprocessing” that enables approximation algorithms to scale to terabyte datasets.
Algorithmic game theory and mechanism design. CO 353’s primal-dual methods connect to mechanism design through the concept of dual prices as payments. In the VCG mechanism (Vickrey–Clarke–Groves), agents report their valuations and the social planner computes the optimal allocation (solving a welfare-maximisation IP) and charges each agent their externality on others. The payments equal the dual variables of the social welfare LP. For combinatorial auctions (multiple goods, exponential preference space), computing the optimal allocation is NP-hard (weighted maximum independent set in the valuations graph), and approximate mechanism design asks: can we design truthful mechanisms that approximately maximise social welfare? Lavi–Swamy (2011) show that any \(\alpha\)-approximation algorithm can be converted to an \(\alpha\)-truthful mechanism via semidefinite programming methods. The price of anarchy — how much efficiency is lost when rational agents make selfish decisions rather than cooperating — connects network flow theory (Roughgarden–Tardos 2002 on routing games) to CO 353’s min-cost flow material. For selfish routing with linear cost functions, the worst-case price of anarchy is \(4/3\); for polynomial cost functions, it grows with the degree. These results use techniques from CO 353 — LP duality, flow decompositions — in an economic context, illustrating the breadth of the course’s foundational toolkit.
Fine-grained complexity and SETH. Beyond ETH (which rules out \(2^{o(n)}\) algorithms for 3-SAT), the Strong Exponential Time Hypothesis (SETH) states that CNF-SAT on \(n\) variables and \(m\) clauses requires \((2 - \varepsilon)^n m^{O(1)}\) time for any \(\varepsilon > 0\). SETH implies tight lower bounds for many polynomial-time problems: Orthogonal Vectors (do any two vectors in a collection of \(n\) vectors of dimension \(d\) have zero inner product?) requires \(n^{2-o(1)}\) time under SETH for \(d = \omega(\log n)\). This lower bound propagates to: Edit distance between two strings (Backurs–Indyk 2015): \(\Omega(n^{2-\varepsilon})\) under SETH; LCS (Longest Common Subsequence, Abboud–Backurs–Vassilevska Williams 2015): \(\Omega(n^{2-\varepsilon})\); Diameter of sparse graphs: \(\Omega(n^{2-\varepsilon})\) in unweighted graphs. These are “hardness in the polynomial regime” — problems that currently take \(\Theta(n^2)\) or \(\Theta(n^3)\) and cannot be significantly improved unless SETH fails. This fine-grained analysis has transformed algorithm design: researchers now know that a sub-quadratic algorithm for edit distance would refute SETH, while a truly sub-cubic algorithm for APSP (all-pairs shortest paths) would refute the related APSP conjecture — either would be a major breakthrough. The CO 353 material on shortest paths (Dijkstra, Bellman–Ford) touches these lower bounds directly, since the classical algorithms are conjecturally near-optimal in these regimes. Fine-grained complexity is one of the most active areas of algorithms research, with new reductions appearing at STOC and FOCS each year; SETH-hardness results have also been proved for classic dynamic-programming problems such as sequence alignment with affine gap penalties, connecting the CS 482 material on bioinformatics algorithms to the CO 353 complexity foundations through the unifying lens of fine-grained reductions. This web of reductions — classical algorithms from CO 353 reappearing as conjecturally optimal in the fine-grained world — is one of the most striking examples of the internal coherence of theoretical computer science.