CS 341: Algorithms

Lap Chi Lau

Estimated study time: 1 hr 20 min

Based on lecture notes by Sibelius Peng

Sources and References

  • Primary textbook — Cormen, T.H., Leiserson, C.E., Rivest, R.L., and Stein, C. Introduction to Algorithms, 3rd ed. MIT Press, 2009.
  • Supplementary texts — Kleinberg, J. and Tardos, É. Algorithm Design. Pearson, 2005; Dasgupta, S., Papadimitriou, C., and Vazirani, U. Algorithms. McGraw-Hill, 2006.
  • Online resources — MIT 6.046J Design and Analysis of Algorithms, Spring 2015, MIT OpenCourseWare; Stanford CS161 Design and Analysis of Algorithms; Tim Roughgarden’s lecture notes on advanced algorithm design (Stanford CS261).
  • Lecture notes — Peng, S. CS 341 Lecture Notes (Spring 2021). https://pdf.sibeliusp.com/1215/cs341.pdf

Chapter 1: Foundations of Algorithm Analysis

The Twin Problems that Open Every Course

The course begins with a telling contrast. The Traveling Salesman Problem (TSP) asks for a minimum-cost tour visiting every vertex of a weighted graph — the delivery driver who must hit every city and return home at minimum expense. The Chinese Postman Problem asks instead for a minimum-cost tour covering every edge — the postal worker who must walk every street. Both sound like reasonable optimisation problems on the same kind of graph. Yet they belong to entirely different computational worlds. The Chinese Postman Problem is solvable in polynomial time \(O(n^4)\) using matching algorithms. The TSP, by contrast, has resisted polynomial-time solution for decades; the best known exact algorithm runs in \(O(2^n \cdot n^2)\) time, and its decision version is NP-complete. This pairing is a pedagogical masterstroke: it immediately disabuses the reader of the idea that problem similarity implies computational similarity, and motivates the entire theoretical apparatus that follows.

Asymptotic Complexity

When we say an algorithm takes time \(f(n)\), we almost never mean the exact count of machine instructions. We mean something about the scaling behaviour of the running time as the input size grows. The standard language for this is the asymptotic notation introduced by Bachmann and Landau.

Given functions \(f(n)\) and \(g(n)\), the five standard notions are defined as limits:

  • \(g(n) = O(f(n))\) (big-Oh, upper bound) if \(\lim_{n \to \infty} g(n)/f(n) \leq c\) for some constant \(c\).
  • \(g(n) = \Omega(f(n))\) (big-Omega, lower bound) if \(\lim_{n \to \infty} g(n)/f(n) \geq c\) for some positive constant \(c\).
  • \(g(n) = \Theta(f(n))\) (big-Theta, tight bound) if \(\lim_{n \to \infty} g(n)/f(n) = c\) for some positive constant \(c\).
  • \(g(n) = o(f(n))\) (little-oh, strict upper) if \(\lim_{n \to \infty} g(n)/f(n) = 0\).
  • \(g(n) = \omega(f(n))\) (little-omega, strict lower) if \(\lim_{n \to \infty} g(n)/f(n) = \infty\).

The worst-case time complexity of an algorithm is \(O(f(n))\) if, for every input of size \(n\), the algorithm requires at most \(O(f(n))\) primitive operations. By adopting asymptotic notation, we deliberately ignore leading constants and lower-order terms. This is a principled choice: constants are machine-dependent and generally uninformative about the algorithm’s intrinsic efficiency. The theory asks whether the running time is \(O(n \log n)\) or \(O(n^2)\), not whether the constant factor is 3 or 7.

A word of caution is important here. Asymptotic complexity is a coarse lens. An \(O(n^3)\) algorithm with a small constant may outperform an \(O(n \log n)\) algorithm with a large one on inputs of modest size. Engineering matters. But for understanding which problems are fundamentally tractable and which are not, asymptotic complexity is exactly the right tool.

Computation Models

An often-overlooked subtlety concerns what we count as a primitive operation. This course adopts the word-RAM model: we assume the problem instance fits in main memory, that each word occupies \(O(\log n)\) bits (enough to index into the input), and that a single word operation — addition, comparison, array access — takes constant time. This models standard CPU behaviour well for most problems.

The alternative is bit complexity, counting each bit operation separately. Under the word-RAM model, an arithmetic operation on word-sized operands counts as a single step; under bit complexity, multiplying two \(n\)-digit numbers already costs at least \(n\) bit operations, and more for the schoolbook method. For most problems in this course the two models agree up to logarithmic factors, but the distinction matters for arithmetic-intensive problems. The trial-division primality test, for example, looks fast under word-RAM but performs about \(\sqrt{N}\) divisions — polynomial in the value of \(N\), yet exponential in the input size \(\log N\).

The 3SUM Problem and Fine-Grained Complexity

The 3SUM problem serves as this course’s introductory example: given \(n+1\) numbers \(a_1, \ldots, a_n\) and \(c\), determine whether there exist indices \(i, j, k\) with \(a_i + a_j + a_k = c\). Three algorithms of decreasing complexity illuminate how algorithmic ingenuity works.

A brute-force approach enumerates all \(\binom{n}{3}\) triples: time \(O(n^3)\). A first improvement observes that \(a_i + a_j + a_k = c\) implies \(c - a_i - a_j = a_k\); after sorting in \(O(n \log n)\), each pair \((a_i, a_j)\) can be checked against the remaining elements by binary search in \(O(\log n)\), giving time \(O(n^2 \log n)\). The second improvement is more elegant: rewrite the condition as \(a_i + a_j = c - a_k\) and note that after sorting, the 2-SUM problem (does any pair sum to a target \(b\)?) can be solved in \(O(n)\) using two pointers. The pointer \(L\) starts at the left end of the sorted array and \(R\) at the right; if \(a_L + a_R\) equals the target we are done, if it exceeds the target we decrease \(R\), and if it falls short we increase \(L\). This gives time \(O(n^2)\) for 3SUM.
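
To make the two-pointer scan concrete, here is a minimal Python sketch of the \(O(n^2)\) 3SUM algorithm (not from the notes; the function name and the convention that the three indices must be distinct are my choices):

```python
def three_sum(a, c):
    """Return True iff some three entries (at distinct indices) sum to c.

    O(n^2): sort once, then for each choice of the first element run the
    two-pointer scan for a pair summing to the remaining target.
    """
    a = sorted(a)
    n = len(a)
    for i in range(n - 2):
        target = c - a[i]          # need a[lo] + a[hi] == target
        lo, hi = i + 1, n - 1
        while lo < hi:
            s = a[lo] + a[hi]
            if s == target:
                return True
            elif s < target:       # too small: move the left pointer right
                lo += 1
            else:                  # too large: move the right pointer left
                hi -= 1
    return False


assert three_sum([1, 4, 7, 10, 2], 19) is True   # 2 + 7 + 10
assert three_sum([1, 4, 7, 10, 2], 100) is False
```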

The question of whether 3SUM can be solved in truly sub-quadratic time — say \(O(n^{2-\epsilon})\) for some fixed \(\epsilon > 0\) — has been a central problem in fine-grained complexity theory for three decades. The 3SUM conjecture, that no such algorithm exists, is used as a hardness assumption to prove lower bounds for a surprising range of geometric and combinatorial problems. It represents a frontier that this course only hints at but which the extension chapter returns to in depth.

Chapter 2: Divide and Conquer

The Paradigm

Divide and conquer is the algorithmic counterpart of proof by induction. The strategy is always the same: reduce an instance of size \(n\) to two or more smaller instances of the same problem, solve them recursively, and combine the results. What makes some divide-and-conquer algorithms transcend the naive quadratic or cubic approaches is that the combination step can be cleverly designed to extract more information than simple recursion alone would suggest. The running time is characterised by a recurrence relation, and the Master Theorem is the canonical tool for solving such recurrences.

The Master Theorem

\[ T(n) = a \, T\!\left(\frac{n}{b}\right) + O(n^c), \]

where \(a > 0\) is the number of subproblems, \(b > 1\) is the factor by which the problem size shrinks, and \(n^c\) is the cost of the non-recursive work at each level. The Master Theorem provides the solution in three cases:

Master Theorem. Let \(T(n) = aT(n/b) + O(n^c)\) with \(a > 0, b > 1, c \geq 0\). Then \[ T(n) = \begin{cases} O(n^c) & \text{if } c > \log_b a, \\ O(n^c \log n) & \text{if } c = \log_b a, \\ O(n^{\log_b a}) & \text{if } c < \log_b a. \end{cases} \]

The intuition is clean: draw a recursion tree of height \(\log_b n\). The work done at each level is \(a^i \cdot O((n/b^i)^c) = O(n^c \cdot (a/b^c)^i)\). If \(a/b^c < 1\) (equivalently \(c > \log_b a\)), the work is dominated by the top level, giving \(O(n^c)\). If \(a/b^c > 1\), the work is dominated by the leaves — there are \(a^{\log_b n} = n^{\log_b a}\) of them — giving \(O(n^{\log_b a})\). If the two are equal, work is uniform across all \(O(\log n)\) levels, giving the middle case.

Merge Sort and Counting Inversions

Merge sort achieves \(O(n \log n)\) sorting by splitting the array in half, sorting each half recursively, and merging the two sorted halves in \(O(n)\) time. The recurrence \(T(n) = 2T(n/2) + O(n)\) falls into the middle case of the Master Theorem.

The problem of counting inversions — pairs \((i, j)\) with \(i < j\) but \(a_i > a_j\) — illustrates how merge sort can be enriched to compute additional information at negligible extra cost. The key insight is that when merging two sorted halves, every element placed from the right half creates a batch of inversions involving all remaining elements in the left half. Tracking this count during the merge adds only constant work per comparison, leaving the total complexity at \(O(n \log n)\). Inversions measure the “unsortedness” of a sequence; they arise naturally in comparing two rankings (say, two critics’ orderings of the same films), where the number of inversions equals the number of pairs that the two rankings disagree on.
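
The enriched merge can be written compactly. The following Python sketch (names are illustrative) returns both the sorted array and the inversion count:

```python
def sort_and_count(a):
    """Merge sort that also counts inversions: pairs i < j with a[i] > a[j]."""
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, inv_l = sort_and_count(a[:mid])
    right, inv_r = sort_and_count(a[mid:])
    merged, inv_split = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            # right[j] jumps ahead of every remaining left element:
            # each such pair is one inversion.
            inv_split += len(left) - i
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]
    return merged, inv_l + inv_r + inv_split


assert sort_and_count([2, 4, 1, 3, 5])[1] == 3   # inversions: (2,1), (4,1), (4,3)
```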

Finding the Median in Linear Time

Finding the median of \(n\) distinct numbers seems to require sorting: naive approaches need \(O(n \log n)\). Yet a beautiful divide-and-conquer algorithm achieves \(O(n)\) time deterministically — a result that surprised many when it was first proved.

The key reduction is to the more general \(k\)-th smallest element problem. Given a pivot element \(a_i\) with rank \(r\) (it is the \(r\)-th smallest), the input splits into two groups \(S_1 = \{a_j : a_j < a_i\}\) and \(S_2 = \{a_j : a_j > a_i\}\). If \(r = k\) we are done; if \(r > k\) we recurse into \(S_1\) of size \(r-1\); if \(r < k\) we recurse into \(S_2\) for the \((k-r)\)-th smallest. The problem reduces to itself, but smaller — provided we can choose a pivot not too close to the extremes.

The median-of-medians algorithm finds such a pivot deterministically in \(O(n)\) time:

  1. Divide the \(n\) numbers into groups of 5, taking the median of each group. Time: \(O(n)\).
  2. Recursively find the median of the \(n/5\) group medians. Time: \(T(n/5)\).
  3. Use this median of medians as the pivot.

A combinatorial argument shows that the median of medians always has rank between \(3n/10\) and \(7n/10\), guaranteeing that the problem size reduces by at least a factor of \(10/7\) at each step. The recurrence \(T(n) \leq T(7n/10) + T(n/5) + O(n)\) has the solution \(T(n) = O(n)\). A randomised version, choosing a random pivot at each step, also achieves expected \(O(n)\) time with a much simpler analysis — the chance of getting a pivot with rank between \(n/4\) and \(3n/4\) is always at least \(1/2\), and the expected number of rounds before the problem reduces by a factor of \(3/4\) is at most 2.
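
The randomised variant is short enough to show in full. Below is an illustrative Python quickselect: the same reduction to the \(k\)-th smallest problem, but with a random pivot and therefore expected, rather than worst-case, linear time:

```python
import random

def quickselect(a, k):
    """Return the k-th smallest element of a (k = 1 for the minimum).

    Expected O(n): a random pivot splits the input, and we recurse into
    the one side that can contain the answer.
    """
    assert 1 <= k <= len(a)
    pivot = random.choice(a)
    smaller = [x for x in a if x < pivot]
    equal   = [x for x in a if x == pivot]
    larger  = [x for x in a if x > pivot]
    if k <= len(smaller):
        return quickselect(smaller, k)
    if k <= len(smaller) + len(equal):
        return pivot
    return quickselect(larger, k - len(smaller) - len(equal))


data = [9, 1, 8, 2, 7, 3, 6, 4, 5]
assert quickselect(data, (len(data) + 1) // 2) == 5   # the median
```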

The Closest Pair of Points

Given \(n\) points in the plane, find the pair at minimum Euclidean distance. The brute-force \(O(n^2)\) algorithm checks all pairs. Divide and conquer achieves \(O(n \log n)\).

Split the points by a vertical median line \(L\) into left half \(Q\) and right half \(R\). Recursively find the closest pair in each half; let \(\delta\) be the smaller of the two minimum distances found. The only remaining candidates are pairs with one point in \(Q\) and one in \(R\), and only those within a vertical strip of width \(2\delta\) centred on \(L\).

The crucial observation is that the strip is very sparse. Because all distances within \(Q\) and within \(R\) are at least \(\delta\), at most one point can lie in any \(\delta/2 \times \delta/2\) box. Divide the strip into rows of height \(\delta/2\); each row spans at most \(2\delta/(\delta/2) = 4\) boxes, and any pair at distance less than \(\delta\) is separated by at most two rows. Scanning the strip in \(y\)-sorted order, each point therefore needs to check distances only against the points in its own row and the next two rows — at most 11 points. If the points in the strip are presorted by \(y\)-coordinate (which can be done once before the recursion begins), this scanning takes \(O(n)\). The resulting recurrence \(T(n) = 2T(n/2) + O(n)\) gives \(O(n \log n)\).

Arithmetic: Karatsuba and Strassen

The most dramatic speedups from divide and conquer occur in arithmetic, where clever algebraic identities turn 4-subproblem recursions into 3-subproblem ones.

Karatsuba multiplication (1960) multiplies two \(n\)-digit numbers in \(O(n^{\log_2 3}) \approx O(n^{1.585})\), beating the schoolbook \(O(n^2)\). The idea: write \(x = x_1 \cdot 2^{n/2} + x_2\) and \(y = y_1 \cdot 2^{n/2} + y_2\), so the product expands as \(x_1 y_1 \cdot 2^{n} + (x_1 y_2 + x_2 y_1) \cdot 2^{n/2} + x_2 y_2\). A naïve approach requires four recursive multiplications of \(n/2\)-digit numbers. Karatsuba observed that the middle term \(x_1 y_2 + x_2 y_1\) equals \((x_1 + x_2)(y_1 + y_2) - x_1 y_1 - x_2 y_2\), so the four multiplications reduce to three: \(x_1 y_1\), \(x_2 y_2\), and \((x_1+x_2)(y_1+y_2)\). The recurrence \(T(n) = 3T(n/2) + O(n)\) gives \(O(n^{\log_2 3})\) by the Master Theorem.
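
As a sketch of the recursion (assumptions: non-negative Python integers, splitting on bit-length, a small-input cutoff of my choosing):

```python
def karatsuba(x, y):
    """Multiply non-negative integers with 3 recursive multiplications."""
    if x < 16 or y < 16:                 # small inputs: fall back to built-in
        return x * y
    half = max(x.bit_length(), y.bit_length()) // 2
    x1, x2 = x >> half, x & ((1 << half) - 1)    # x = x1*2^half + x2
    y1, y2 = y >> half, y & ((1 << half) - 1)
    a = karatsuba(x1, y1)                        # high parts
    b = karatsuba(x2, y2)                        # low parts
    c = karatsuba(x1 + x2, y1 + y2) - a - b      # middle term, one multiply
    return (a << (2 * half)) + (c << half) + b


assert karatsuba(123456789, 987654321) == 123456789 * 987654321
```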

Strassen’s algorithm (1969) performs matrix multiplication of two \(n \times n\) matrices in \(O(n^{\log_2 7}) \approx O(n^{2.807})\), beating the naive \(O(n^3)\). Rather than 8 multiplications of \(n/2 \times n/2\) submatrices, Strassen identified 7 carefully constructed products from which all entries of the output can be assembled. The saving is modest in absolute terms, but Strassen’s result sparked a research programme that has driven the matrix multiplication exponent down to roughly 2.371 (Williams–Duan–Xu, 2024); whether the true exponent is 2 remains unknown.

The Fast Fourier Transform (FFT) applies the same spirit to polynomial multiplication, achieving \(O(n \log n)\) time by exploiting the structure of complex roots of unity. The FFT is arguably the most practically impactful algorithm of the 20th century, underlying signal processing, image compression, and cryptographic computations. See [DPV §2.6] for the elegant derivation.

Chapter 3: Graph Algorithms

Graphs as Computational Objects

A graph \(G = (V, E)\) consists of a vertex set \(V\) (with \(n = |V|\)) and an edge set \(E\) (with \(m = |E|\)). Two data structures dominate: the adjacency matrix, an \(n \times n\) binary array with \(A[i,j] = 1\) iff \(ij \in E\), which uses \(\Theta(n^2)\) space but supports edge queries in \(O(1)\); and the adjacency list, which stores for each vertex a linked list of its neighbours, using \(O(n + m)\) space and enabling linear-time traversals. Nearly all graph algorithms in this course assume the adjacency list representation, since only it permits \(O(n+m)\)-time algorithms.

Breadth-First Search (BFS) explores a graph layer by layer, visiting all vertices at distance \(k\) from the source before any vertex at distance \(k+1\). The algorithm maintains a queue of discovered-but-not-yet-explored vertices and a boolean array tracking which vertices have been seen. Each vertex is enqueued once and each edge is examined once, giving total time \(O(n + m)\).

BFS is uniquely suited to computing shortest path distances in unweighted graphs. The reason is structural: because BFS explores in order of distance, a vertex \(v\) is first discovered along a shortest path from the source. Recording the parent of each discovered vertex builds a BFS tree, and the path from any vertex to the source in this tree is a shortest path.

A beautiful application of BFS is bipartiteness testing: a graph is bipartite if and only if it has no odd cycle. The BFS-based algorithm assigns vertices at even distance from the source to one side \(L\) and vertices at odd distance to the other side \(R\). If any edge connects two vertices on the same side, the graph is not bipartite, and the edge together with the BFS paths to their lowest common ancestor in the tree witnesses an odd cycle. This provides not just a decision but a certificate: a short proof of non-bipartiteness that can be verified in polynomial time without re-running the algorithm.
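
A BFS-based two-colouring in Python; the adjacency-list format (a dict from vertex to neighbour list) and the function name are assumptions of this sketch:

```python
from collections import deque

def is_bipartite(adj):
    """2-colour an undirected graph given as {vertex: [neighbours]}.

    BFS assigns side 0 or 1 by parity of distance; any edge inside one
    side is a witness of an odd cycle.  O(n + m).
    """
    side = {}
    for source in adj:                      # handle disconnected graphs
        if source in side:
            continue
        side[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in side:
                    side[v] = 1 - side[u]
                    queue.append(v)
                elif side[v] == side[u]:    # same side: odd cycle exists
                    return False
    return True


square   = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
assert is_bipartite(square) and not is_bipartite(triangle)
```

Returning the offending edge together with the BFS tree paths would turn the False answer into the odd-cycle certificate described above.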

Depth-First Search (DFS) takes the opposite approach: from the current vertex, immediately recurse into an undiscovered neighbour, backtracking only when no undiscovered neighbour remains. The recursive structure of DFS produces a rich tree with structural properties that BFS does not possess.

Recording starting times \(\text{start}[v]\) and finishing times \(\text{finish}[v]\) for each vertex reveals the Parenthesis Property: the intervals \([\text{start}[u], \text{finish}[u]]\) and \([\text{start}[v], \text{finish}[v]]\) are either disjoint or one contains the other. Containment corresponds exactly to ancestor–descendant relationships in the DFS tree.

For undirected graphs, the DFS tree has a remarkable property: all non-tree edges are back edges, connecting a descendant to an ancestor. This is not a coincidence but a provable consequence of the depth-first exploration order. Suppose \(uv\) is an edge and \(u\) is discovered first. Since \(v\) is an undiscovered neighbour of \(u\), it must be discovered before the exploration of \(u\) finishes, so \(v\) lies in \(u\)’s subtree and \(uv\) is a tree edge or a back edge; a “cross edge” between unrelated subtrees is impossible. The total absence of cross edges in undirected DFS trees is why DFS reveals so much more connectivity structure than BFS.

Cut Vertices and Cut Edges

A cut vertex (articulation point) is a vertex whose removal disconnects the graph. A cut edge (bridge) is an edge whose removal disconnects the graph. Both can be found in \(O(n + m)\) time using a single DFS via the following insight.

Define \(\text{low}[v]\) as the minimum over all start-times reachable from the subtree rooted at \(v\) via at most one back edge. Informally, \(\text{low}[v]\) measures how far “up” the DFS tree one can escape from \(v\)’s subtree. Then a non-root vertex \(v\) is a cut vertex if and only if it has a child \(u\) with \(\text{low}[u] \geq \text{start}[v]\) — meaning the subtree rooted at \(u\) cannot reach any proper ancestor of \(v\). The values \(\text{low}[v]\) can be computed bottom-up in \(O(n + m)\) total time by propagating \(\min\{\text{low}[w]\}\) from children and examining all incident back edges at each vertex.
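
A recursive Python sketch of the low-value computation (fine for small graphs; an iterative DFS avoids recursion-depth limits on large ones; names are mine):

```python
def cut_vertices(adj):
    """Return the set of articulation points of an undirected graph.

    adj: {vertex: [neighbours]}.  One DFS computes start times and low
    values; a non-root v is a cut vertex iff some child u has
    low[u] >= start[v].  The root is a cut vertex iff it has >= 2 children.
    """
    start, low, cuts = {}, {}, set()
    timer = [0]

    def dfs(v, parent):
        start[v] = low[v] = timer[0]; timer[0] += 1
        children = 0
        for w in adj[v]:
            if w not in start:                 # tree edge
                children += 1
                dfs(w, v)
                low[v] = min(low[v], low[w])
                if parent is not None and low[w] >= start[v]:
                    cuts.add(v)
            elif w != parent:                  # back edge
                low[v] = min(low[v], start[w])
        if parent is None and children >= 2:
            cuts.add(v)

    for v in adj:
        if v not in start:
            dfs(v, None)
    return cuts


# Path 1-2-3: removing vertex 2 disconnects the graph.
assert cut_vertices({1: [2], 2: [1, 3], 3: [2]}) == {2}
```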

Directed Graphs: Topological Order and Kosaraju’s Algorithm

Directed graphs (digraphs) introduce asymmetric reachability. A strongly connected component (SCC) is a maximal subset \(S \subseteq V\) such that every vertex in \(S\) is reachable from every other vertex in \(S\). Contracting each SCC to a single vertex yields a DAG — the “meta-graph” of the original directed graph — and all edges between SCCs go in one direction in this DAG.

Topological sorting of a DAG (ordering vertices so all edges go forward) is achieved by DFS: run DFS on the whole graph and output vertices in decreasing order of finish times. The key lemma is that for any edge \(uv\) in a DAG, \(\text{finish}[v] < \text{finish}[u]\), so the decreasing-finish-time ordering is a valid topological order.

Kosaraju’s algorithm for SCCs is one of the most elegant two-pass algorithms in graph theory:

  1. Run DFS on \(G\) and record vertices in decreasing order of finish time.
  2. Reverse all edges of \(G\) to obtain \(G^R\).
  3. Run DFS on \(G^R\) following the ordering from step 1; each DFS tree in this pass is one SCC.

The correctness relies on a key lemma: if \(C\) and \(C'\) are SCCs with edges from \(C\) to \(C'\), the maximum finish time in \(C\) exceeds that in \(C'\). This means step 1 gives a topological order of the component DAG, and step 3 peels off sink components one by one — but in \(G^R\) the sink components of \(G\) have become source components, so they are reached first. The total time is \(O(n + m)\).
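
Both passes fit in a short Python sketch (digraph as {vertex: [out-neighbours]}; a recursive DFS is used for brevity):

```python
def kosaraju_scc(adj):
    """Return the strongly connected components of a digraph."""
    # Pass 1: DFS on G, recording vertices in order of increasing finish time.
    visited, order = set(), []
    def dfs1(v):
        visited.add(v)
        for w in adj[v]:
            if w not in visited:
                dfs1(w)
        order.append(v)                 # appended when v finishes
    for v in adj:
        if v not in visited:
            dfs1(v)

    # Build the reverse graph G^R.
    radj = {v: [] for v in adj}
    for v in adj:
        for w in adj[v]:
            radj[w].append(v)

    # Pass 2: DFS on G^R in decreasing finish-time order; each tree is one SCC.
    assigned, sccs = set(), []
    def dfs2(v, comp):
        assigned.add(v); comp.append(v)
        for w in radj[v]:
            if w not in assigned:
                dfs2(w, comp)
    for v in reversed(order):
        if v not in assigned:
            comp = []
            dfs2(v, comp)
            sccs.append(comp)
    return sccs


g = {1: [2], 2: [3], 3: [1, 4], 4: []}   # {1,2,3} form a cycle, 4 is alone
assert sorted(map(sorted, kosaraju_scc(g))) == [[1, 2, 3], [4]]
```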

Chapter 4: Greedy Algorithms

The Greedy Philosophy

Greedy algorithms commit to locally optimal decisions without reconsidering them. Unlike dynamic programming, which systematically explores all possibilities, a greedy algorithm makes a single choice at each step and moves on. The payoff, when it works, is simplicity and efficiency. The danger is that local optimality can fail to produce global optimality. The art of greedy algorithm analysis is proving — often via an exchange argument or a staying-ahead argument — that the greedy choice can always be embedded in an optimal solution.

Interval Scheduling: The Earliest-Finish-Time Rule

The interval scheduling problem is the canonical greedy example. Given \(n\) intervals \([s_i, f_i]\), find a maximum subset of pairwise disjoint intervals. Four natural greedy strategies suggest themselves: take the interval with earliest start time, shortest length, fewest conflicts, or earliest finish time. The first three can be refuted by simple counterexamples. Only earliest finish time works.

Theorem. The greedy algorithm that always selects the interval with the earliest finish time produces a maximum-cardinality set of disjoint intervals.

The proof uses a staying-ahead argument. Let the greedy solution be \([s_{i_1}, f_{i_1}] \prec [s_{i_2}, f_{i_2}] \prec \cdots \prec [s_{i_k}, f_{i_k}]\) and any optimal solution be \([s_{j_1}, f_{j_1}] \prec \cdots \prec [s_{j_\ell}, f_{j_\ell}]\). One proves by induction that \(f_{i_r} \leq f_{j_r}\) for all \(r\) — the greedy solution “finishes no later than” any other solution at every step. If \(\ell > k\), then the interval \([s_{j_{k+1}}, f_{j_{k+1}}]\) is disjoint from the greedy solution, so the greedy algorithm should have selected it — contradiction.
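
The rule itself is a few lines of Python. In this sketch two intervals that merely share an endpoint are treated as compatible, which is a convention choice:

```python
def max_disjoint_intervals(intervals):
    """Greedy earliest-finish-time selection: returns a maximum-size
    set of pairwise disjoint intervals.  O(n log n) from the sort."""
    chosen, last_finish = [], float("-inf")
    for start, finish in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_finish:        # compatible with everything chosen
            chosen.append((start, finish))
            last_finish = finish
    return chosen


lectures = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 8), (5, 9), (6, 10), (8, 11)]
assert len(max_disjoint_intervals(lectures)) == 3   # e.g. (1,4), (5,7), (8,11)
```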

The related interval coloring problem asks for the minimum number of colours to colour intervals so that no two overlapping intervals share a colour (minimising the number of rooms needed to schedule all activities). A greedy algorithm that processes intervals by start time and assigns the smallest available colour works correctly. The proof is by contradiction: if \(k\) colours are used, the first interval requiring colour \(k\) overlaps \(k-1\) others, so some time \(t\) is covered by \(k\) intervals simultaneously — a lower bound argument showing that \(k\) colours are necessary.

Minimising Total Completion Time

Given \(n\) jobs with processing times \(p_1, \ldots, p_n\), schedule them to minimise the total completion time \(\sum_{i=1}^n C_i\) where \(C_i\) is the time at which job \(i\) finishes. The greedy strategy is obvious: process jobs in non-decreasing order of processing time (shortest-job-first). The proof uses an exchange argument: any schedule with an “inversion” — a job \(k\) scheduled immediately before job \(k+1\) with \(p_k > p_{k+1}\) — can be improved by swapping them. Before swapping, the contribution of the pair is \(2c + 2p_k + p_{k+1}\); after, it is \(2c + 2p_{k+1} + p_k\), which is strictly smaller since \(p_{k+1} < p_k\). Eliminating all inversions yields the sorted order.

Huffman Coding

Huffman coding (1952) is the solution to the optimal prefix-free binary code problem. Given \(n\) symbols with frequencies \(f_1, \ldots, f_n\) summing to 1, find a binary tree \(T\) with \(n\) leaves (one per symbol) that minimises the expected code length \(\sum_i f_i \cdot \text{depth}_T(i)\). Prefix-free codes correspond bijectively to full binary trees, and the problem is to find the shape of the optimal tree.

Three structural observations drive Huffman’s algorithm:

  1. The optimal tree is full (every internal node has two children); any non-full tree can be improved.
  2. Therefore, there exist two leaves of maximum depth that are siblings.
  3. By an exchange argument, in any optimal solution these siblings can be chosen to be the two least frequent symbols.

These observations suggest the algorithm: repeatedly merge the two least-frequent symbols into a new combined symbol with frequency equal to their sum, solving the reduced \((n-1)\)-symbol problem recursively, then expanding the merged node back into two leaves. A formal inductive proof shows that \(\text{Sol}(n) = \text{Opt}(T') + f_y + f_z\) where \(T'\) is the optimal tree for the reduced problem — since any optimal solution for the original problem can be “folded” to give a solution of the same quality for the reduced problem, the induction closes. Using a min-heap, the algorithm runs in \(O(n \log n)\).
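
A compact heap-based sketch in Python; it returns only the code lengths (depths), which is enough to evaluate the objective, and the input format is an assumption:

```python
import heapq

def huffman_code_lengths(freqs):
    """freqs: {symbol: frequency}.  Return {symbol: code length} of an
    optimal prefix-free code, by repeatedly merging the two lightest
    subtrees.  O(n log n) with a binary heap."""
    depth = {s: 0 for s in freqs}
    heap = [(f, i, [s]) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)                      # tie-breaker for the heap
    while len(heap) > 1:
        f1, _, g1 = heapq.heappop(heap)      # two least frequent subtrees
        f2, _, g2 = heapq.heappop(heap)
        for s in g1 + g2:                    # every leaf in them moves down
            depth[s] += 1
        heapq.heappush(heap, (f1 + f2, counter, g1 + g2))
        counter += 1
    return depth


lengths = huffman_code_lengths({"a": 0.45, "b": 0.25, "c": 0.15, "d": 0.15})
assert lengths["a"] == 1 and lengths["b"] == 2 and lengths["c"] == 3
```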

Dijkstra’s Algorithm

Dijkstra’s algorithm solves single-source shortest paths in graphs with non-negative edge lengths. Conceptually, imagine igniting a fire at the source vertex at time 0; it spreads along edges, taking \(\ell_e\) time to cross edge \(e\). Dijkstra’s algorithm simulates this physical process efficiently: maintain a priority queue of “time to reach each vertex,” and at each step extract the nearest unvisited vertex, update its neighbours, and mark it as settled.

Invariant (Dijkstra Correctness). When a vertex \(u\) is extracted from the priority queue, \(\text{dist}[u]\) equals the true shortest-path distance from the source to \(u\).

The proof uses the key property that edge lengths are non-negative: any path to \(u\) via an unsettled vertex must have length at least \(\text{dist}[u]\), so there is no shorter path left to discover. This argument breaks completely with negative edges, as a path discovered later might use a negative edge to “go back” and achieve a shorter total length.

With a binary heap, Dijkstra runs in \(O((n + m) \log n)\). With a Fibonacci heap, it improves to \(O(n \log n + m)\) — a crucial improvement when the graph is dense. The BFS algorithm for unweighted graphs is the limiting case of Dijkstra when all edge lengths equal 1.
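
A lazy-deletion implementation with Python's binary heap, hence the \(O((n+m)\log n)\) bound; the graph format is an assumption of this sketch:

```python
import heapq

def dijkstra(adj, source):
    """Single-source shortest paths with non-negative edge lengths.

    adj: {u: [(v, length), ...]}.  Returns {v: dist}.  Uses lazy deletion:
    a popped heap entry is ignored if its vertex is already settled.
    """
    dist = {source: 0}
    settled = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue                      # stale entry
        settled.add(u)
        for v, w in adj.get(u, []):
            if v not in settled and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


g = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("t", 6)], "b": [("t", 3)], "t": []}
assert dijkstra(g, "s") == {"s": 0, "a": 1, "b": 3, "t": 6}
```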

Chapter 5: Dynamic Programming

The Dynamic Programming Framework

Dynamic programming is the algorithmic realisation of a fundamental mathematical insight: if a problem can be expressed as a recurrence over polynomially many subproblems, and each subproblem can be solved by looking up the answers to smaller subproblems, then the whole computation can be done in polynomial time by storing and reusing intermediate results.

The distinction between top-down memoisation (compute recursively, cache answers) and bottom-up computation (fill a table in topological order) is primarily one of implementation style. Top-down can skip subproblems not actually needed, while bottom-up avoids recursion overhead and makes space optimisations more transparent. For each problem, the key intellectual work is always the same: identify the right set of subproblems and write down a recurrence.

Dynamic programming can often be visualised as finding a shortest (or longest, or most valuable) path in a subproblem DAG: each vertex is a subproblem, and edges represent the recurrence. The top-down algorithm is then DFS on this DAG with memoisation; the bottom-up algorithm is a BFS-like topological traversal.

Weighted Interval Scheduling

The unweighted interval scheduling problem yields to a greedy algorithm. Add weights — each accepted interval earns profit \(w_i\) — and greed fails completely; there is no known greedy approach. Dynamic programming, however, handles it elegantly.

Sort intervals by finish time. For each interval \(i\), let \(\text{next}[i]\) be the index of the first interval that starts strictly after \(f_i\). Define \(\text{opt}(i)\) as the maximum profit using only intervals \(\{i, i+1, \ldots, n\}\). Then:

\[ \text{opt}(i) = \max\!\bigl\{w_i + \text{opt}(\text{next}[i]),\; \text{opt}(i+1)\bigr\} \]

The first alternative chooses interval \(i\) and skips all overlapping ones; the second skips interval \(i\). There are only \(n\) subproblems and each requires two table lookups, giving \(O(n \log n)\) total time (dominated by the initial sort). This matches the greedy algorithm’s complexity for the unweighted case, yet the mechanism is entirely different: we are performing a systematic exhaustive search over an exponential space, compressed to polynomial size by the observation that only the “frontier” (the last accepted interval) matters for future decisions.
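
A bottom-up Python sketch of the same idea, written in the equivalent prefix formulation so that binary search over the sorted finish times finds the last compatible interval; as in the greedy example, intervals that merely touch are treated as compatible:

```python
import bisect

def max_weight_schedule(intervals):
    """intervals: list of (start, finish, weight).  Returns the maximum total
    weight of pairwise disjoint intervals.  O(n log n).

    Prefix formulation: opt[i] = best using the first i intervals (sorted by
    finish time); p = number of earlier intervals finishing by this start.
    """
    ivs = sorted(intervals, key=lambda iv: iv[1])
    finishes = [f for _, f, _ in ivs]
    n = len(ivs)
    opt = [0] * (n + 1)
    for i in range(1, n + 1):
        s, f, w = ivs[i - 1]
        p = bisect.bisect_right(finishes, s, 0, i - 1)   # compatible prefix
        opt[i] = max(opt[i - 1], w + opt[p])
    return opt[n]


jobs = [(1, 4, 3), (3, 5, 2), (0, 6, 6), (5, 7, 4), (8, 11, 5)]
assert max_weight_schedule(jobs) == 12   # (1,4,3) + (5,7,4) + (8,11,5)
```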

Subset-Sum and the Knapsack Problem

The subset-sum problem asks: given positive integers \(a_1, \ldots, a_n\) and target \(K\), does some subset sum to exactly \(K\)? The knapsack problem generalises: given items with weights \(w_i\) and values \(v_i\) and capacity \(W\), maximise \(\sum_{i \in S} v_i\) subject to \(\sum_{i \in S} w_i \leq W\).

Define \(\text{subsum}[i][L]\) to be true iff some subset of \(\{a_i, \ldots, a_n\}\) sums to exactly \(L\); the answer to the original question is \(\text{subsum}[1][K]\). Either \(a_i\) is used or it is not:

\[ \text{subsum}[i][L] = \text{subsum}[i+1][L - a_i] \lor \text{subsum}[i+1][L]. \]

There are \(n \cdot K\) subproblems, each taking constant time. The overall complexity is \(O(nK)\) — the prototypical example of pseudo-polynomial time. The algorithm is polynomial in \(n\) and \(K\) treated as numbers, but exponential in the input size (which represents \(K\) in \(\log K\) bits). This is not a coincidence: subset-sum is NP-complete, so no polynomial-time algorithm is expected.

Knapsack admits the same treatment: define \(\text{knapsack}(i, W)\) as the maximum value obtainable from items \(\{i, \ldots, n\}\) with remaining capacity \(W\), giving

\[ \text{knapsack}(i, W) = \max\!\bigl\{v_i + \text{knapsack}(i+1, W - w_i),\; \text{knapsack}(i+1, W)\bigr\}, \]

running in \(O(nW)\) pseudo-polynomial time. The space can be compressed to \(O(W)\) by noting that only two rows of the table are ever needed simultaneously.
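
A bottom-up Python sketch of 0/1 knapsack, using a single row scanned right to left (a slight refinement of the two-row compression just mentioned; integer weights assumed, names are mine):

```python
def knapsack(items, capacity):
    """items: list of (weight, value) with integer weights.
    Returns the maximum total value with total weight <= capacity.
    O(nW) time, O(W) space (one row, scanned right-to-left)."""
    best = [0] * (capacity + 1)              # best[c] = optimum with capacity c
    for w, v in items:
        for c in range(capacity, w - 1, -1): # right-to-left: each item used once
            best[c] = max(best[c], v + best[c - w])
    return best[capacity]


items = [(2, 3), (3, 4), (4, 5), (5, 8)]
assert knapsack(items, 7) == 11              # take weights 2 and 5 (values 3 + 8)
```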

Longest Increasing Subsequence

Given a sequence \(a_1, \ldots, a_n\), its longest increasing subsequence (LIS) is a subsequence \(a_{i_1} < a_{i_2} < \cdots < a_{i_k}\) with \(i_1 < i_2 < \cdots < i_k\) of maximum length \(k\).

Define \(L(i)\) as the length of the longest increasing subsequence starting at index \(i\). Any such subsequence continues at some later index with a larger value, so

\[ L(i) = 1 + \max_{j > i,\, a_j > a_i} L(j), \]

where an empty maximum is taken to be 0 (in particular \(L(n) = 1\)). The answer is \(\max_{1 \leq i \leq n} L(i)\). Computing all \(L(i)\) bottom-up takes \(O(n^2)\).

LIS is also beautifully interpretable as a longest-path problem in a DAG: create a vertex for each index, with a directed edge from \(i\) to \(j\) whenever \(j > i\) and \(a_j > a_i\). A longest path in this DAG corresponds to an LIS.

A clever \(O(n \log n)\) algorithm improves this. Scanning the sequence from right to left, maintain for each length \(k\) the position \(\text{pos}[k]\) whose value is largest among all positions processed so far with \(L = k\). The critical monotone property is that \(a[\text{pos}[1]] > a[\text{pos}[2]] > \cdots > a[\text{pos}[m]]\), so each \(L(i)\) can be computed by binary search over this array in \(O(\log n)\) per element.
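
The standard implementation is the left-to-right mirror image of this description: maintain, for each length, the smallest possible last value (the “tails” array), which is increasing and hence binary-searchable. A Python sketch:

```python
import bisect

def lis_length(seq):
    """Length of a longest strictly increasing subsequence, O(n log n).

    tails[k] = smallest possible last value of an increasing subsequence
    of length k+1 seen so far; tails stays sorted, so each element is
    placed by binary search.
    """
    tails = []
    for x in seq:
        k = bisect.bisect_left(tails, x)   # first tail >= x
        if k == len(tails):
            tails.append(x)                # extends the longest subsequence
        else:
            tails[k] = x                   # better (smaller) tail for length k+1
    return len(tails)


assert lis_length([3, 1, 4, 1, 5, 9, 2, 6]) == 4   # e.g. 1, 4, 5, 9
```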

Longest Common Subsequence and Edit Distance

The longest common subsequence (LCS) of two strings \(a_1 \ldots a_n\) and \(b_1 \ldots b_m\) is their longest shared subsequence (in order, but not necessarily contiguous). Define \(C(i, j)\) as the LCS length for suffixes starting at positions \(i\) and \(j\) respectively. The recurrence handles three cases: match \(a_i = b_j\) (add 1 to \(C(i+1, j+1)\)), skip \(a_i\) (use \(C(i+1, j)\)), or skip \(b_j\) (use \(C(i, j+1)\)). There are \(mn\) subproblems each taking \(O(1)\) time: total \(O(mn)\).

LCS is a reduction target as well as a problem: the LIS problem reduces to LCS by comparing the original sequence against its sorted version, since sorting forces any common subsequence to be increasing.

Edit distance (Levenshtein distance) counts the minimum number of single-character insertions, deletions, or substitutions to transform one string into another. The same \(O(mn)\) DP handles it, with four cases at each step: add a character, delete a character, substitute, or match. Edit distance is the workhorse of spell-checking, DNA sequence alignment, and version control systems. Visualised as shortest-path on a grid of subproblems, with free diagonal edges at matching characters, it unifies the LCS and edit-distance computations into one geometric framework.

Dynamic Programming on Trees

Dynamic programming applies naturally to trees, exploiting the absence of cycles. For the maximum independent set on a tree — find the largest set of vertices no two of which are adjacent — define \(I(v)\) as the size of the maximum independent set in the subtree rooted at \(v\). The recurrence:

\[ I(v) = \max\!\left(1 + \sum_{\substack{w:\, w \text{ grandchild of } v}} I(w),\; \sum_{\substack{w:\, w \text{ child of } v}} I(w)\right) \]

either includes \(v\) (excluding all children but recursing on grandchildren) or excludes \(v\) (recursing directly on children). Since each vertex contributes at most two table lookups (parent and grandparent), the total time is \(O(n)\).
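
An equivalent two-state formulation (best size with the root included versus excluded) avoids explicit grandchildren and is easier to code. A Python sketch on an adjacency-list tree:

```python
def tree_max_independent_set(adj, root):
    """adj: {v: [neighbours]} of a tree.  Returns the size of a maximum
    independent set.  Two values per vertex: best size with v included /
    excluded; each edge is examined O(1) times, so O(n) total."""
    incl, excl = {}, {}

    def dfs(v, parent):
        incl[v], excl[v] = 1, 0
        for w in adj[v]:
            if w == parent:
                continue
            dfs(w, v)
            incl[v] += excl[w]                # children must be excluded
            excl[v] += max(incl[w], excl[w])  # children are free to choose
        return max(incl[v], excl[v])

    return dfs(root, None)


# A path on 5 vertices: the maximum independent set has size 3.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
assert tree_max_independent_set(path, 1) == 3
```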

This extends to “tree-like” graphs via the notion of treewidth: the maximum independent set and many other NP-hard problems on general graphs become polynomial when the graph’s treewidth is bounded. Treewidth measures how far a graph deviates from being a tree, and dynamic programming “along the tree decomposition” runs in time roughly \(O(n^{\text{treewidth}+1})\).

Optimal Binary Search Tree

Given \(n\) keys \(k_1 < \cdots < k_n\) with access frequencies \(f_1, \ldots, f_n\), the optimal binary search tree problem minimises \(\sum_{i=1}^n f_i \cdot \text{depth}_T(k_i)\) over all valid BSTs. Unlike Huffman coding, the BST constraint ties the shape of the tree to the sorted order of keys.

Define \(C(i, j)\) as the optimal cost for keys \(k_i, \ldots, k_j\), and let \(F_{i,j} = \sum_{\ell=i}^j f_\ell\). Trying all possible roots \(k_\ell\):

\[ C(i, j) = \min_{i \leq \ell \leq j} \bigl\{F_{i,j} + C(i, \ell-1) + C(\ell+1, j)\bigr\}. \]

The term \(F_{i,j}\) accounts for the fact that placing all remaining keys one level lower increases the objective by their total frequency. With \(O(n^2)\) subproblems and \(O(n)\) choices per subproblem, the algorithm runs in \(O(n^3)\). Knuth showed that exploiting a monotone property of the optimal root (it never moves left when the interval expands rightward) reduces this to \(O(n^2)\).

Bellman-Ford and Negative Cycles

When edge lengths can be negative, Dijkstra’s algorithm fails: its correctness proof relies on the monotonicity of path lengths, which negative edges destroy. The Bellman-Ford algorithm handles this at the cost of running in \(O(nm)\) time.

The key observation: if the graph has no negative cycles, any shortest path is a simple path and uses at most \(n-1\) edges. Define \(D(v, i)\) as the shortest-path distance from source \(s\) to \(v\) using at most \(i\) edges. The recurrence:

\[ D(v, i+1) = \min\!\left(D(v, i),\; \min_{u:\, uv \in E} \bigl\{D(u, i) + \ell_{uv}\bigr\}\right). \]

Running for \(n-1\) iterations gives all correct distances. A remarkable simplification: using a single array (rather than separate arrays for each \(i\)) still works — earlier updates in an iteration can only make distances tighter, which does not hurt correctness.

Detecting negative cycles uses one more iteration: if any \(D(v, n) < D(v, n-1)\), a path with \(n\) edges must have repeated a vertex, and that cycle must be negative (otherwise the cycle could be eliminated for a shorter path). The parent pointers trace back the path and expose the cycle.
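
A Python sketch with the single-array simplification and the extra detection pass (edge-list input is an assumption):

```python
def bellman_ford(n, edges, source):
    """n vertices labelled 0..n-1; edges: list of (u, v, length).

    Returns (dist, has_negative_cycle).  A single distance array is relaxed
    over every edge for n-1 passes; an n-th pass that still improves some
    distance witnesses a negative cycle reachable from the source.
    """
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    has_neg_cycle = any(dist[u] + w < dist[v] for u, v, w in edges)
    return dist, has_neg_cycle


edges = [(0, 1, 4), (0, 2, 5), (1, 2, -3), (2, 3, 2)]
dist, neg = bellman_ford(4, edges, 0)
assert dist == [0, 4, 1, 3] and not neg
```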

Floyd-Warshall: All-Pairs Shortest Paths

Running Bellman-Ford from every source gives an \(O(n^2 m)\) all-pairs algorithm. The Floyd-Warshall algorithm achieves \(O(n^3)\) by a different subproblem decomposition. Let \(D(i, j, k)\) be the shortest path from \(i\) to \(j\) using only vertices in \(\{1, \ldots, k\}\) as intermediates. The recurrence:

\[ D(i, j, k+1) = \min\!\bigl\{D(i, j, k),\; D(i, k+1, k) + D(k+1, j, k)\bigr\}. \]

Either the shortest path avoids vertex \(k+1\) (first term) or it goes through \(k+1\), in which case it splits into two optimal sub-paths (second term). Starting from \(D(i, j, 0) = \ell_{ij}\) (or \(\infty\) if no direct edge) and running the loop for \(k = 0, 1, \ldots, n-1\) yields all-pairs shortest paths. The algorithm is remarkably compact — three nested loops of \(n\) iterations — and equally applicable to detecting negative cycles.
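
The three loops, as a Python sketch over a dense distance matrix (with float('inf') for missing edges):

```python
def floyd_warshall(dist):
    """dist: n x n matrix, dist[i][j] = length of edge ij (inf if absent,
    0 on the diagonal).  Modified in place to all-pairs shortest distances.
    A negative diagonal entry afterwards signals a negative cycle."""
    n = len(dist)
    for k in range(n):                      # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


INF = float("inf")
d = [[0, 3, INF], [INF, 0, 1], [2, INF, 0]]
assert floyd_warshall(d)[0][2] == 4         # path 0 -> 1 -> 2
```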

The question of whether all-pairs shortest paths can be solved in truly sub-cubic time (i.e., \(O(n^{3-\epsilon})\) for some fixed \(\epsilon > 0\)) remains a major open problem in algorithm design.

The Held-Karp Algorithm for TSP

The Traveling Salesman Problem has a dynamic programming solution that, while still exponential, is far better than the \(\Theta(n!)\) brute-force algorithm. The Held-Karp algorithm (1962) runs in \(O(n^2 \cdot 2^n)\) time.

The key insight: a shortest TSP tour visiting all \(n\) cities visits some last city \(i \neq 1\) before returning to the starting city 1. The tour’s first portion is a shortest Hamiltonian path from 1 to \(i\) through all other cities. Define \(C(i, S)\) as the length of a shortest path from city 1 to city \(i\) that visits exactly the cities in \(S\). The recurrence:

\[ C(i, S) = \min_{j \in S \setminus \{i\}} \bigl\{C(j, S \setminus \{i\}) + \ell_{ji}\bigr\}, \]

with base case \(C(i, \{1, i\}) = \ell_{1i}\). The answer is \(\min_i \{C(i, V) + \ell_{i1}\}\). There are \(O(n \cdot 2^n)\) subproblems, each solved in \(O(n)\) time. The exponential complexity is unavoidable (assuming P \(\neq\) NP), but for \(n \approx 30\)–50 this algorithm is practical where brute-force is not.
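
A bitmask Python sketch, with city 0 as the start and the set \(S\) encoded as an integer bitmask (implementation conventions are mine):

```python
def held_karp(dist):
    """dist: n x n cost matrix.  Returns the length of a shortest tour
    starting and ending at city 0.  O(n^2 * 2^n) time, O(n * 2^n) space."""
    n = len(dist)
    INF = float("inf")
    # C[S][i] = shortest path from 0 to i visiting exactly the cities in S
    # (S always contains 0 and i).
    C = [[INF] * n for _ in range(1 << n)]
    C[1][0] = 0                                    # S = {0}, standing at 0
    for S in range(1 << n):
        if not (S & 1):
            continue                               # every subset must contain 0
        for i in range(n):
            if C[S][i] == INF:
                continue
            for j in range(n):
                if S & (1 << j):
                    continue                       # j already visited
                T = S | (1 << j)
                cand = C[S][i] + dist[i][j]
                if cand < C[T][j]:
                    C[T][j] = cand
    full = (1 << n) - 1
    return min(C[full][i] + dist[i][0] for i in range(1, n))


d = [[0, 1, 4, 3], [1, 0, 2, 5], [4, 2, 0, 1], [3, 5, 1, 0]]
assert held_karp(d) == 7                           # tour 0-1-2-3-0
```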

Chapter 6: Bipartite Graphs and Matching

The Assignment Problem and Matching

The bipartite matching problem is one of the foundational problems in combinatorial optimisation. Given a bipartite graph \(G = (X, Y; E)\), find a maximum-cardinality subset \(M \subseteq E\) of pairwise vertex-disjoint edges (a matching). The assignment problem is a special case: \(n\) jobs, \(m\) workers, and an edge between a worker and a job whenever that worker can do that job — can all jobs be assigned?

Greedy fails for matching: the first edge chosen can “block” multiple better options. Dynamic programming approaches are also unknown. The correct technique is the augmenting path method, a local-search approach with global optimality guarantees.

Augmenting Paths

Given a matching \(M\), an augmenting path is a path \(v_1, v_2, \ldots, v_{2k}\) where \(v_1\) and \(v_{2k}\) are unmatched, odd-indexed edges are not in \(M\), and even-indexed edges are in \(M\). Augmenting along such a path — adding the odd edges, removing the even ones — increases \(|M|\) by exactly 1.

Augmenting Path Theorem (Berge, 1957). A matching \(M\) is maximum if and only if \(G\) contains no augmenting path with respect to \(M\).

This is the structural bedrock of the algorithm: the algorithm repeatedly finds augmenting paths and augments. The search for an augmenting path in a bipartite graph reduces to a directed reachability problem. Direct every non-matching edge from \(X\) to \(Y\) and every matching edge from \(Y\) to \(X\). Add a super-source \(s\) connected to all unmatched vertices in \(X\) and a super-sink \(t\) connected from all unmatched vertices in \(Y\). An augmenting path exists iff \(t\) is reachable from \(s\) in this directed graph, detectable by BFS or DFS in \(O(n + m)\). The overall bipartite matching algorithm runs \(O(n)\) augmentation phases, each \(O(n + m)\): total \(O(nm)\). The Hopcroft-Karp algorithm processes multiple augmenting paths simultaneously, achieving \(O(m\sqrt{n})\).
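
The per-vertex DFS search for an augmenting path (sometimes called Kuhn’s algorithm) is a direct transcription of this reduction. A Python sketch, with the adjacency format as an assumption:

```python
def max_bipartite_matching(adj_x):
    """adj_x: {x: [compatible y's]}.  Returns a dict matching y -> x.

    For each unmatched x we search for an augmenting path by DFS: try each
    neighbour y; if y is free, or its current partner can be rematched
    elsewhere, flip the path.  O(n) augmentations of O(n + m): O(nm) total.
    """
    match_of_y = {}

    def try_augment(x, seen):
        for y in adj_x[x]:
            if y in seen:
                continue
            seen.add(y)
            if y not in match_of_y or try_augment(match_of_y[y], seen):
                match_of_y[y] = x            # flip edges along the path
                return True
        return False

    for x in adj_x:
        try_augment(x, set())
    return match_of_y


jobs = {"x1": ["y1", "y2"], "x2": ["y1"], "x3": ["y2", "y3"]}
assert len(max_bipartite_matching(jobs)) == 3    # a perfect matching exists
```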

König’s Theorem: Matching and Vertex Cover

A vertex cover is a set \(S \subseteq V\) such that every edge has at least one endpoint in \(S\). Clearly any matching provides a lower bound on the vertex cover: each matching edge requires at least one of its two endpoints in any vertex cover. König’s theorem says that for bipartite graphs, this bound is always tight.

König's Theorem (1931). In a bipartite graph, the maximum matching size equals the minimum vertex cover size.

The proof is constructive: given a maximum matching \(M\), define the directed graph above and let \(S\) be the set of vertices reachable from \(s\). The vertex cover is \((Y \cap S) \cup (X \setminus S)\). One verifies it has size \(|M|\) and covers all edges. This is a min-max theorem: the maximum of one quantity equals the minimum of another — a hallmark of beautiful combinatorial optimisation. Such theorems are far more than duality statements; they provide certificates. When reporting a maximum matching of size \(k\) that is not perfect, you can simultaneously exhibit a vertex cover of size \(k\), convincing your audience that no larger matching exists.

König’s theorem is the combinatorial analogue of LP duality for network flow. Indeed, it follows directly from the max-flow min-cut theorem when bipartite matching is formulated as a flow problem. This connection opens the door to the rich theory of network flows studied in CO 351.

Hall’s Theorem and Perfect Matchings

Hall's Marriage Theorem (1935). A bipartite graph \(G = (X, Y; E)\) with \(|X| = |Y|\) has a perfect matching if and only if for every subset \(S \subseteq X\), the neighbourhood \(N(S)\) satisfies \(|N(S)| \geq |S|\).

Hall’s condition is the structural obstruction: if some set of \(k\) vertices on the left share only \(k-1\) neighbours on the right, no perfect matching exists. The theorem is both a tool (certifying perfect matching existence) and an algorithm (König’s algorithm finds the obstructing set \(S\) when Hall’s condition fails).

A corollary: every \(d\)-regular bipartite graph has a perfect matching. Each vertex on the left has \(d\) neighbours; for any subset \(S \subseteq X\), the number of edges leaving \(S\) is \(d|S|\), and all land in \(N(S)\), which has size at most \(|Y|\). Counting these edges from both sides: \(d|S| \leq d \cdot |N(S)|\), so \(|N(S)| \geq |S|\). Hall’s condition holds.

Min-Max Theorems and LP Duality

The bipartite matching problem is a gateway to a deeper structural principle: combinatorial min-max theorems are manifestations of LP strong duality. The matching problem can be written as an integer linear program; relaxing integrality gives an LP whose dual is the fractional vertex cover. For bipartite graphs, the LP relaxation has integer optimal solutions (the constraint matrix is totally unimodular), so the LP solution is automatically an integer — and König’s theorem is LP duality made combinatorial.

This perspective — integrality of LP relaxations via total unimodularity — extends to network flows, transportation problems, and assignment problems. It is a central thread in the theory of combinatorial optimisation and is developed in CO 351 (Network Flow) and CO 452 (Integer Programming).

Chapter 7: Intractability

The Polynomial-Time Reduction

The theory of NP-completeness begins with a precise notion of what it means for one problem to be “at least as hard” as another. A polynomial-time reduction from problem \(A\) to problem \(B\) (written \(A \leq_P B\)) is a polynomial-time algorithm \(F\) that maps any instance \(I_A\) of \(A\) to an instance \(F(I_A)\) of \(B\), such that \(I_A\) is a YES-instance of \(A\) iff \(F(I_A)\) is a YES-instance of \(B\).

If \(A \leq_P B\) and \(B\) is solvable in polynomial time, then \(A\) is solvable in polynomial time. Contrapositively, if \(A\) cannot be solved in polynomial time, neither can \(B\). The second direction requires knowing that \(A\) is hard — which we do not know unconditionally — but it motivates the notion of NP-completeness: if every NP problem reduces to \(B\), then solving \(B\) in polynomial time would collapse the entire class NP to P.

Reductions are transitive: \(A \leq_P B\) and \(B \leq_P C\) imply \(A \leq_P C\). This allows the construction of a large equivalence class of problems, all inter-reducible, forming what Cook called the class of NP-complete problems.

The Class NP

A decision problem \(X\) is in NP if there exists a polynomial-time verification algorithm \(B_X\) such that an instance \(s\) is a YES-instance iff there exists a “proof” (witness) \(t\) of polynomial length with \(B_X(s, t) = \text{YES}\).

This definition captures all problems whose solutions, once found, can be checked efficiently. 3-SAT is in NP: the proof is a satisfying assignment, and checking that every clause is satisfied takes linear time. Graph coloring, clique, independent set, Hamiltonian cycle, subset-sum — all are in NP, since given a proposed solution one can verify it in polynomial time.

The class P of polynomial-time solvable problems sits inside NP. The question P = NP? asks whether every problem whose solutions can be verified efficiently can also be found efficiently. The near-universal belief is P \(\neq\) NP, but no proof exists. This is one of the seven Millennium Prize Problems, with a $1,000,000 prize for a resolution.

The related class co-NP contains problems whose NO-instances have short, verifiable proofs. A problem in NP ∩ co-NP has efficient witnesses for both YES and NO — bipartite matching is in NP ∩ co-NP precisely because König’s theorem provides a short NO-certificate (a small vertex cover). Most NP-complete problems are believed not to be in co-NP.

The Cook-Levin Theorem and NP-Completeness

The existence of NP-complete problems is not obvious. Cook (1971) and Levin (1973) independently proved:

Cook-Levin Theorem. 3-SAT is NP-complete.

The proof reduces an arbitrary NP computation to 3-SAT: given any polynomial-time verifier \(B_X\) and instance \(s\), simulate the verifier’s computation as a circuit and encode the circuit’s satisfiability as a 3-SAT formula. A satisfying assignment to this formula corresponds exactly to an accepting computation. This is a tour de force in encoding computation by logic.

Once 3-SAT is NP-complete, other problems can be shown NP-complete by reduction from 3-SAT, building a web of equivalences. The key reductions in this course are:

  • 3-SAT \(\leq_P\) Independent Set: for each clause with literals \(\ell_1 \lor \ell_2 \lor \ell_3\), create a triangle on three vertices; add edges between each occurrence of a literal and its negation across triangles. An independent set of size equal to the number of clauses (one vertex per clause triangle) exists iff the formula is satisfiable.
  • Clique \(\leq_P\) Independent Set: replace the graph by its complement.
  • Independent Set \(\leq_P\) Vertex Cover: \(S\) is independent iff \(V \setminus S\) is a vertex cover.
  • HC \(\leq_P\) TSP: embed the undirected graph in a complete graph, giving weight 1 to the edges of \(G\) and weight 2 to the non-edges; a Hamiltonian cycle exists iff there is a tour of length \(n\).

Hard Graph Problems

The following graph problems are all NP-complete:

Hamiltonian Cycle (HC): does the graph contain a cycle passing through every vertex exactly once? The reduction from 3-SAT to directed HC and then from directed HC to undirected HC are classic reductions whose structure reveals how boolean logic is encoded in graph topology.

Graph 3-Coloring: is there a proper colouring of the vertices using at most 3 colours? Trivially solvable for 1 colour (no edges) and 2 colours (bipartiteness testing in linear time), but NP-complete for 3 colours. The reduction from 3-SAT encodes each variable as a “gadget” and each clause as a triangle constraint.

Maximum Clique: does the graph contain a clique of size at least \(k\)? Hard to approximate as well: Håstad (1996) showed it is NP-hard to approximate within \(n^{1-\epsilon}\) for any \(\epsilon > 0\).

Hard Partitioning Problems

3-Dimensional Matching (3DM): given disjoint sets \(X, Y, Z\) of size \(n\) and triples \(T \subseteq X \times Y \times Z\), do there exist \(n\) triples covering each element exactly once? 3DM generalises bipartite matching (which involves pairs) to triples, and this extra dimension makes it NP-complete.

Subset-Sum and Knapsack are NP-complete despite having pseudo-polynomial algorithms. The distinction between pseudo-polynomial and polynomial time complexity highlights that NP-completeness is about the bit-length of the input, not the magnitude of the numbers it contains. Subset-sum is NP-complete by a reduction from 3DM that encodes the triple-covering structure in the digits of the target sum.

The class of NP-complete problems now contains thousands of problems spanning scheduling, packing, routing, logic, combinatorics, and number theory. Garey and Johnson’s 1979 book Computers and Intractability catalogued some 300 of them; thousands more have been identified since.

Chapter 8: Beyond This Course

Natural Next Topics

The topics in CS 341 — divide and conquer, graph search, greedy algorithms, dynamic programming, matching, and NP-completeness — constitute the foundational layer of algorithm design. Every one of them has a rich extension that forms the subject of advanced courses and research.

Network Flow

Bipartite matching is the simplest instance of network flow. A flow network is a directed graph with edge capacities \(c_e \geq 0\), a source \(s\), and a sink \(t\). The maximum \(s\)-\(t\) flow problem asks to route as much “flow” as possible from \(s\) to \(t\) without exceeding any capacity. The Max-Flow Min-Cut Theorem (Ford-Fulkerson 1956) is the network-flow analogue of König’s theorem: the maximum flow value equals the minimum capacity of an \(s\)-\(t\) cut. König’s theorem, bipartite matching, and many other min-max theorems are special cases.

The Edmonds-Karp algorithm (a BFS-based implementation of Ford-Fulkerson) runs in \(O(nm^2)\); a capacity-scaling variant runs in \(O(m^2 \log C)\), where \(C\) is the largest capacity. Modern algorithms include Dinic’s \(O(n^2 m)\) algorithm, which is substantially faster on unit-capacity graphs, and push-relabel algorithms achieving \(O(n^2 \sqrt{m})\). For the applications — project selection, image segmentation, scheduling, survey design, transportation — and the elegant LP duality theory, see CO 351.

Approximation Algorithms

For NP-hard optimisation problems, we cannot (assuming P \(\neq\) NP) find exact polynomial-time solutions. Approximation algorithms find solutions that are provably close to optimal. A \(\rho\)-approximation guarantees solution value within a factor \(\rho\) of optimal.

The greedy set cover algorithm achieves a \(\ln n\)-approximation for set cover: repeatedly pick the set covering the most uncovered elements. Remarkably, this is essentially optimal — Dinur and Steurer (2014) proved that no polynomial-time algorithm can do better than \((1-\epsilon) \ln n\) unless P = NP.

Vertex cover has a tight 2-approximation: include both endpoints of any maximal matching. Whether a \((2-\epsilon)\)-approximation is achievable remains one of the most tantalising open problems in approximation theory, connected to the Unique Games Conjecture.

The Christofides algorithm (1976) achieves a \(3/2\)-approximation for metric TSP: find a minimum spanning tree, add a minimum-weight perfect matching on odd-degree vertices, find an Eulerian circuit, and shortcut repeated vertices. In 2021, Karlin, Klein, and Oveis Gharan improved this to \((3/2 - \epsilon)\) for the first time in 45 years.

Randomised Algorithms

Randomisation is a powerful tool for algorithm design. Randomised quicksort achieves expected \(O(n \log n)\) time and is far simpler than deterministic worst-case \(O(n \log n)\) sorts. Karger’s contraction algorithm for minimum cut succeeds with probability \(\Omega(1/n^2)\) per trial; repeating \(O(n^2 \log n)\) independent trials finds a minimum cut with high probability, and the Karger–Stein refinement brings the total running time down to \(O(n^2 \log^3 n)\).

Hashing with universal or perfect hash functions achieves \(O(1)\) expected lookup/insertion. The birthday paradox governs collision probabilities. Bloom filters use randomised hashing for space-efficient approximate set membership queries. Randomised rounding converts LP solutions to integer solutions with probabilistic approximation guarantees, bridging optimisation theory and algorithm design.

The Exponential Time Hypothesis and Fine-Grained Complexity

NP-completeness tells us that certain problems are unlikely to have polynomial-time algorithms, but it says nothing about the precise exponent of the polynomial for tractable problems, or the base of the exponential for hard ones.

The Exponential Time Hypothesis (ETH), proposed by Impagliazzo and Paturi (2001), asserts that 3-SAT cannot be solved in time \(2^{o(n)}\) where \(n\) is the number of variables. The Strong ETH (SETH) strengthens this: for every \(\epsilon > 0\) there is a \(k\) such that \(k\)-SAT cannot be solved in time \(O(2^{(1-\epsilon)n})\). These conjectures imply tight lower bounds for many problems: under SETH, there is no truly subquadratic algorithm for edit distance, LCS, or many other classical dynamic programming problems.

Fine-grained complexity uses these conjectures to explain why certain algorithms seem stuck at particular exponents — \(O(n^2)\) for edit distance, \(O(nm)\) for single-source shortest paths with negative edges, \(O(n^3)\) for all-pairs shortest paths. These are not accidents; they reflect deep structural barriers, conditional on widely-believed conjectures.

No-Regret Learning and Online Algorithms

An online algorithm must make decisions as input arrives, without knowledge of future inputs. The competitive ratio measures how much worse an online algorithm performs compared to an optimal offline algorithm that sees all input in advance. For the ski rental problem (rent or buy skis each day you ski, not knowing how many days you will ski in total), the optimal deterministic competitive ratio is 2, achieved by renting until the rental cost equals the purchase price, then buying.

No-regret learning algorithms achieve sublinear regret — the difference in performance compared to the best fixed action in hindsight. The multiplicative weights update method is a meta-algorithm achieving \(O(\sqrt{T \log n})\) regret after \(T\) rounds over \(n\) actions. No-regret learning underpins modern developments in boosting, online convex optimisation, and equilibrium computation in game theory (as studied in CO 456).

Minimum Spanning Trees

Closely related to Dijkstra’s algorithm — and arguably the canonical example of a greedy algorithm on graphs — is the minimum spanning tree (MST) problem: given a weighted undirected graph, find a spanning tree of minimum total edge weight. Two classical greedy algorithms solve this:

Prim’s algorithm grows the MST one edge at a time, always adding the lightest edge crossing the cut between the current tree and the rest of the graph. With a Fibonacci heap it runs in \(O(n \log n + m)\), matching Dijkstra. The correctness proof — the “cut property” — shows that the minimum-weight edge crossing any cut must be in some MST, so greedy commitment is safe.

Kruskal’s algorithm sorts all edges by weight and adds them one by one, skipping any edge that would create a cycle, until \(n-1\) edges have been added. The cycle check is handled by a Union-Find (Disjoint Set Union) data structure supporting union and find operations in near-\(O(1)\) amortised time via path compression and union by rank (inverse Ackermann \(\alpha(n)\) per operation). Total time: \(O(m \log m)\).
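
Kruskal plus a small union-find, as an illustrative Python sketch (path compression and union by size):

```python
def kruskal_mst(n, edges):
    """n vertices 0..n-1; edges: list of (weight, u, v).
    Returns (total weight, list of chosen edges) of a minimum spanning tree.
    O(m log m) for the sort; union-find makes the cycle checks near-O(1)."""
    parent = list(range(n))
    size = [1] * n

    def find(x):                              # with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):                          # union by size
        rx, ry = find(x), find(y)
        if rx == ry:
            return False                      # edge would create a cycle
        if size[rx] < size[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        size[rx] += size[ry]
        return True

    total, tree = 0, []
    for w, u, v in sorted(edges):
        if union(u, v):
            total += w
            tree.append((u, v, w))
    return total, tree


edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
assert kruskal_mst(4, edges)[0] == 7          # edges 0-1, 1-2, 2-3
```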

MSTs appear throughout algorithm design: they seed metric TSP approximations, network design problems, and cluster analysis. They are also deeply connected to LP theory — the spanning tree polytope is integral, meaning the LP relaxation of the MST integer program always has integer optimal solutions. This places MST among the “easy” problems where LP duality gives exact algorithms rather than just relaxations.

The randomised MST algorithm of Karger, Klein, and Tarjan (1995) runs in expected \(O(m)\) time, combining random sampling with a clever verification procedure. Whether a deterministic linear-time MST algorithm exists for general graphs is still open.

Linear Programming and Algorithm Design

Linear programming (LP) — optimising a linear objective over a polyhedron defined by linear constraints — is the backbone of a substantial fraction of modern algorithm design. For this course, LP appears implicitly in shortest paths (dual variables are distance potentials), matching (König’s theorem is LP duality for a network matrix), and Dijkstra’s algorithm (the distance label is a dual variable).

The simplex algorithm (Dantzig, 1947) solves LP in practice but is exponential in the worst case. The ellipsoid method (Khachiyan, 1979) and interior point methods (Karmarkar, 1984) are provably polynomial. Interior point methods run in \(O(n^{3.5} L)\) arithmetic operations (where \(L\) is the input size in bits), and are practical for instances with millions of variables.

LP enters algorithm design in two ways beyond exact solvers. First, LP relaxation: formulate a combinatorial problem as an integer LP, relax the integrality constraints, and solve the LP. If the constraint matrix is totally unimodular (as for bipartite matching, shortest paths, and flows), the LP solution is automatically integral. Otherwise the LP gives a lower bound and the integrality gap quantifies how much is lost. Second, randomised rounding: round the fractional LP solution to an integer solution randomly, preserving the objective value in expectation up to a factor quantified by the integrality gap. This technique underlies the best known approximations for facility location, set cover, and survivable network design.

Parameterised Complexity

Many NP-complete problems become tractable on small inputs or when a natural parameter is small. Parameterised complexity makes this precise. A problem is fixed-parameter tractable (FPT) with respect to parameter \(k\) if it is solvable in time \(f(k) \cdot n^{O(1)}\) for some computable function \(f\). The key point: the exponential blowup is confined to \(k\), so if \(k\) is small the algorithm is practical even for large \(n\).

Vertex cover, for example, is NP-complete in general but FPT in the solution size \(k\): pick any uncovered edge \((u, v)\); at least one endpoint must be in the cover, so branch on including \(u\) or including \(v\) and recurse with budget \(k - 1\). The search tree has at most \(2^k\) leaves, giving \(O(2^k \cdot (n + m))\) time, which is practical for \(k\) up to roughly 25 or 30 even on large graphs; more refined branching rules bring the base of the exponent below 1.3 and extend the practical range further.
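
A minimal bounded-search-tree sketch, with the edge-list input format and function name chosen for illustration:

```python
def vertex_cover_branch(edges, k):
    """Return a vertex cover of size <= k, or None if none exists.
    Any uncovered edge (u, v) must have u or v in the cover, so branch on both
    choices; the recursion depth is at most k, giving O(2^k) branches."""
    if not edges:
        return set()                      # every edge is already covered
    if k == 0:
        return None                       # edges remain but the budget is exhausted
    u, v = edges[0]
    for chosen in (u, v):
        remaining = [e for e in edges if chosen not in e]
        sub = vertex_cover_branch(remaining, k - 1)
        if sub is not None:
            return sub | {chosen}
    return None

# vertex_cover_branch([(0, 1), (1, 2), (2, 3)], 2)  ->  a cover of size 2, e.g. {0, 2}
```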

Kernelisation is a preprocessing technique that reduces an instance to a small “kernel” whose size depends only on \(k\), not on \(n\). For vertex cover, any vertex of degree greater than \(k\) must be in every vertex cover of size at most \(k\), so it can be included immediately; after applying this rule exhaustively and discarding isolated vertices, any “yes” instance has at most \(k^2\) edges. The FPT algorithm then runs on the kernel.
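
A sketch of the high-degree reduction rule; the names and return convention are my own, and the \(k^2\) edge bound from the text serves as the final “no” check.

```python
def vertex_cover_kernel(edges, k):
    """Apply the high-degree rule for vertex cover.
    Returns (forced_vertices, remaining_edges, remaining_budget), or None if the
    kernel already certifies a 'no' instance (too many edges remain)."""
    forced = set()
    while True:
        degree = {}
        for u, v in edges:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        # A vertex of degree above the remaining budget must be in the cover.
        high = [v for v, d in degree.items() if d > k - len(forced)]
        if not high:
            break
        forced.add(high[0])
        edges = [e for e in edges if high[0] not in e]
    budget = k - len(forced)
    if budget < 0 or len(edges) > budget * budget:
        return None                       # no vertex cover of size <= k exists
    return forced, edges, budget
```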

The W hierarchy classifies parameterised problems believed not to be fixed-parameter tractable: \(W[1]\)-hard problems are unlikely to be FPT (analogous to NP-hardness for polynomial time), and \(k\)-Clique is the canonical \(W[1]\)-complete problem. This provides a framework for proving that certain parameterisations are unlikely to admit FPT algorithms.

Advanced Data Structures

CS 341 treats data structures as a black box: algorithms use heaps, queues, and arrays without asking how they work. Advanced algorithm courses study data structures as first-class objects.

van Emde Boas trees support predecessor/successor queries on integers in \(\{0, 1, \ldots, U-1\}\) in \(O(\log \log U)\) time — beating the \(O(\log n)\) of balanced BSTs. The key is a recursive structure that divides the universe into blocks of size \(\sqrt{U}\) and maintains a “summary” structure at the top level.
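
The recursion is easier to see in code. The sketch below supports only insert and successor, assumes the universe size is a power of two, and creates clusters lazily; the class name and interface are illustrative rather than canonical.

```python
class VEB:
    """Minimal van Emde Boas tree over {0, ..., u-1}, u a power of two."""

    def __init__(self, u):
        self.u = u
        self.min = None                        # the minimum is stored only here
        self.max = None
        if u > 2:
            k = u.bit_length() - 1             # u = 2**k
            self.low_size = 1 << (k // 2)      # size of each cluster
            self.clusters = {}                 # lazily created sub-trees
            self.summary = None                # which clusters are non-empty

    def insert(self, x):
        if self.min is None:
            self.min = self.max = x
            return
        if x == self.min or x == self.max:
            return
        if x < self.min:
            x, self.min = self.min, x          # new minimum; old minimum recurses
        if self.u > 2:
            h, l = divmod(x, self.low_size)
            if h not in self.clusters:
                self.clusters[h] = VEB(self.low_size)
            if self.clusters[h].min is None:   # cluster was empty: update summary
                if self.summary is None:
                    self.summary = VEB(self.u // self.low_size)
                self.summary.insert(h)
            self.clusters[h].insert(l)
        if x > self.max:
            self.max = x

    def successor(self, x):
        """Smallest stored element strictly greater than x, or None."""
        if self.u == 2:
            return 1 if (x == 0 and self.max == 1) else None
        if self.min is not None and x < self.min:
            return self.min
        h, l = divmod(x, self.low_size)
        cluster = self.clusters.get(h)
        if cluster is not None and cluster.max is not None and l < cluster.max:
            return h * self.low_size + cluster.successor(l)
        nxt = self.summary.successor(h) if self.summary is not None else None
        if nxt is None:
            return None
        return nxt * self.low_size + self.clusters[nxt].min

# t = VEB(16)
# for v in (2, 3, 7, 14): t.insert(v)
# t.successor(3), t.successor(7), t.successor(14)   # -> 7, 14, None
```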

Fibonacci heaps achieve \(O(1)\) amortised decrease-key and \(O(\log n)\) extract-min, enabling Dijkstra and MST to run in \(O(n \log n + m)\). The amortised analysis uses a potential function that charges costly restructuring to previously “lazy” operations.

Segment trees and Fenwick trees support range queries and point updates on arrays in \(O(\log n)\), enabling efficient solutions to problems that would otherwise require \(O(n)\) per query. They are indispensable in competitive programming and many practical applications.
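
As an illustration, a standard Fenwick tree sketch (class and method names are my own) supporting point updates and prefix sums in \(O(\log n)\):

```python
class Fenwick:
    """Fenwick (binary indexed) tree: point update and prefix sum in O(log n)."""
    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)          # 1-indexed internally

    def add(self, i, delta):
        """Add delta to element i (0-indexed)."""
        i += 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & (-i)                  # jump to the next responsible node

    def prefix_sum(self, i):
        """Sum of elements [0, i]."""
        i += 1
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & (-i)
        return s

    def range_sum(self, l, r):
        """Sum of elements [l, r]."""
        return self.prefix_sum(r) - (self.prefix_sum(l - 1) if l > 0 else 0)

# f = Fenwick(5); f.add(1, 3); f.add(3, 2); f.range_sum(1, 3)   # -> 5
```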

Suffix arrays support pattern matching in \(O(m \log n)\) time after \(O(n \log n)\) preprocessing, and the Burrows-Wheeler transform underlies compressed full-text indexes and the bzip2 compressor, with applications to genome sequencing and data compression. The connection to LCP arrays, and the elegant algorithms for constructing suffix arrays in \(O(n)\) time, are highlights of string algorithmics.
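
A hedged sketch of the simpler prefix-doubling construction (\(O(n \log^2 n)\) as written, not the linear-time algorithms mentioned above), followed by the \(O(m \log n)\) binary-search pattern lookup; function names are illustrative.

```python
def suffix_array(s):
    """Suffix array by prefix doubling: repeatedly sort suffixes by (rank, rank+k) pairs."""
    n = len(s)
    rank = [ord(c) for c in s]
    sa = list(range(n))
    k = 1
    while True:
        key = lambda i: (rank[i], rank[i + k] if i + k < n else -1)
        sa.sort(key=key)
        new_rank = [0] * n
        for j in range(1, n):
            new_rank[sa[j]] = new_rank[sa[j - 1]] + (key(sa[j]) != key(sa[j - 1]))
        rank = new_rank
        if rank[sa[-1]] == n - 1:          # all ranks distinct: done
            break
        k *= 2
    return sa

def find(s, sa, pattern):
    """Binary search over the suffix array: O(m log n) character comparisons."""
    lo, hi = 0, len(sa)
    while lo < hi:                          # leftmost suffix >= pattern
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + len(pattern)] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo if lo < len(sa) and s[sa[lo]:].startswith(pattern) else None

# sa = suffix_array("banana"); find("banana", sa, "ana")   # -> 1 (suffix "ana")
```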

Follow-up Courses and Reading

At the University of Waterloo, CS 341 is the gateway to a constellation of more advanced courses. CS 466 (Algorithm Design and Analysis) and CO 754 (Approximation Algorithms) develop approximation techniques and hardness of approximation for NP-hard problems. CO 351 (Network Flow Theory) develops the theory of flows, cuts, and transportation, with applications to matching, scheduling, and supply-chain problems. CO 452 (Integer Programming) studies LP relaxations, total unimodularity, cutting-plane methods, and the Gomory-Chvátal procedure. CS 761 (Sequential and Parallel Algorithms) studies randomised algorithms, advanced data structures, and computational geometry at the graduate level. CO 759 (Special Topics in Combinatorial Optimisation) covers current research themes including exact algorithms, parameterised complexity, and polyhedral combinatorics.

For algorithms on graphs with special structure, CO 342 (Introduction to Graph Theory) and CO 442 (Graph Theory II, including treewidth and Hadwiger’s conjecture) are natural companions. CS 486 (Introduction to Artificial Intelligence) applies algorithmic thinking to search, inference, and machine learning problems; search algorithms like A* are Dijkstra with a heuristic, making CS 341 prerequisite knowledge for AI.

Peer-university equivalents of CS 341 worth studying alongside include:

  • MIT 6.046J Design and Analysis of Algorithms (available free on MIT OpenCourseWare): a comparable syllabus with particularly strong problem sets. The Spring 2015 offering by Demaine, Devadas, and Lynch adds modules on van Emde Boas trees, amortised analysis, and cache-oblivious algorithms.
  • Stanford CS161 Introduction to Computer Science: Design and Analysis of Algorithms: free on Coursera with video lectures and assessed problem sets.
  • Tim Roughgarden’s CS261 at Stanford (Beyond Worst-Case Analysis): develops average-case, smoothed, and parameterised analyses, asking why algorithms that perform well in practice (like simplex for LP) cannot be explained by worst-case theory.
  • Erik Demaine’s 6.851 Advanced Data Structures (MIT OCW): an advanced graduate course covering the full depth of data structure theory, including lower bounds and cell-probe complexity.

Standard graduate references beyond the CS 341 textbooks include:

  • Mitzenmacher and Upfal, Probability and Computing, 2nd ed. (Cambridge, 2017): the standard graduate introduction to randomised algorithms, covering random graphs, Markov chains, hashing, and concentration inequalities.
  • Williamson and Shmoys, The Design of Approximation Algorithms (Cambridge, 2011): comprehensive, freely available online, covering primal-dual methods, LP rounding, semidefinite programming, and PTAS constructions.
  • Schrijver, Combinatorial Optimization: Polyhedra and Efficiency (Springer, 2003): the encyclopaedic three-volume reference for everything involving LP and polyhedral combinatorics.
  • Garey and Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness (Freeman, 1979): the canonical NP-completeness catalogue; still the definitive reference for reduction blueprints.
  • Vazirani, Approximation Algorithms (Springer, 2001): concise and elegant, covering vertex cover, set cover, metric TSP, Steiner tree, bin packing, and scheduling.
  • Roughgarden, Algorithms Illuminated (four volumes, 2017–2020): accessible, modern, and free online; based on Stanford CS161/CS261 with excellent exercises.
  • Cygan et al., Parameterized Algorithms (Springer, 2015): the standard graduate text on FPT algorithms and the W-hierarchy, freely available from the authors.
  • Ausiello et al., Complexity and Approximation (Springer, 1999): systematic survey of approximation ratios and inapproximability for hundreds of combinatorial optimisation problems.


Open Problems and Active Research

Algorithm design remains a vibrant research field. Several of the open problems most directly adjacent to CS 341 are among the deepest in theoretical computer science.

P vs NP

The central open problem: is every problem whose solutions can be verified in polynomial time also solvable in polynomial time? More than five decades after Cook’s theorem, no proof is known in either direction. The problem is entangled with circuit-complexity lower bounds, and barrier results (relativisation, natural proofs, algebrisation) suggest that genuinely new mathematical techniques will be required. A polynomial-time algorithm for any NP-complete problem would yield polynomial-time algorithms for all of NP, with profound implications for cryptography, optimisation, and AI.

The Unique Games Conjecture

Khot’s Unique Games Conjecture (2002) asserts that a particular constraint-satisfaction problem, Unique Games, is NP-hard even to solve approximately: distinguishing instances where almost all constraints can be satisfied from instances where almost none can. If true, it implies optimal hardness of approximation for vertex cover (approximating within \(2 - \epsilon\) is NP-hard), max-cut (the Goemans-Williamson ratio of roughly \(0.878\) is optimal), and dozens of other problems. Arora, Barak, and Steurer (2010) gave a sub-exponential algorithm for Unique Games, the first significant evidence against the conjecture. The question remains open and is among the most debated conjectures in approximation complexity.

All-Pairs Shortest Paths in Sub-Cubic Time

Floyd-Warshall solves all-pairs shortest paths (APSP) in \(O(n^3)\) time for arbitrary real edge weights. Can APSP be solved in \(O(n^{3-\epsilon})\) time for some \(\epsilon > 0\)? For dense graphs this has been open for over 60 years. The known improvements are modest: logarithmic-factor speedups for general weights, and genuinely sub-cubic running times via fast matrix multiplication only when the integer weights are small. The fine-grained APSP conjecture asserts that no truly sub-cubic algorithm exists, and a web of sub-cubic reductions ties many other problems to it. This is a paradigmatic fine-grained lower bound: the problem is polynomial, but the exponent seems to hit a hard barrier.
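
For reference, the cubic baseline is tiny; a minimal Python sketch (dictionary edge-weight input assumed) is:

```python
def floyd_warshall(n, weight):
    """All-pairs shortest paths in O(n^3).  weight is a dict {(u, v): w};
    assumes there are no negative cycles."""
    INF = float("inf")
    dist = [[0 if i == j else weight.get((i, j), INF) for j in range(n)]
            for i in range(n)]
    for k in range(n):                     # allow k as an intermediate vertex
        for i in range(n):
            dk = dist[i][k]
            if dk == INF:
                continue
            for j in range(n):
                if dk + dist[k][j] < dist[i][j]:
                    dist[i][j] = dk + dist[k][j]
    return dist
```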

Planar Graph Algorithms and Practical Shortest Paths

For planar graphs, the single-source problem can be solved in \(O(n)\) time (Henzinger, Klein, Rao, and Subramanian, 1997), improving on Dijkstra’s \(O(n \log n)\), and all pairwise distances can be computed in \(O(n^2)\) time, better than the \(O(n^2 \log n)\) obtained by running Dijkstra from every vertex. Since the output alone has size \(\Theta(n^2)\), current research asks instead for compact distance oracles: data structures of truly sub-quadratic size that answer planar distance queries quickly. Practical shortest-path engines for very large road networks (billions of edges) rely on preprocessing heuristics such as contraction hierarchies, which are empirically very fast but not worst-case optimal.

Faster Algorithms for Maximum Matching

For bipartite graphs, Hopcroft and Karp (1973) achieved maximum matching in \(O(m\sqrt{n})\) time; Micali and Vazirani (1980) matched this bound for general graphs. Around 2020, van den Brand and co-authors combined interior-point methods with fast matrix multiplication to improve on \(O(m\sqrt{n})\) for dense bipartite graphs, and the 2022 almost-linear-time maximum-flow algorithm of Chen et al. yields bipartite matching in \(m^{1+o(1)}\) time. Bringing general-graph matching to the same complexity as bipartite matching, or proving a separation, is an active research frontier.

Dynamic Graph Algorithms

CS 341 focuses on static graphs — the input is fixed. In many applications (social networks, road networks, communication networks), the graph changes over time: edges are inserted or deleted, and we want to maintain properties like shortest paths, connected components, and matchings under these updates. Dynamic graph algorithms balance the update time against the query time. Fully dynamic connectivity (maintaining connected components as edges are inserted and deleted) is solvable in \(O(\log^2 n)\) amortised per operation (Holm, de Lichtenberg, Thorup, 2001); fully dynamic shortest paths remain much harder.

Sublinear Algorithms and Property Testing

As inputs grow to terabyte scale, even linear-time algorithms can be too slow. Sublinear algorithms read only a small fraction of the input. Property testing (Rubinfeld and Sudan, 1996) asks: can we decide whether an input has a property (is the graph bipartite? is the function linear?) or is \(\epsilon\)-far from having it, using only \(\text{poly}(1/\epsilon)\) queries, independent of the input size, with constant error probability? For monotonicity testing and other classical problems, optimal query complexity is now known. The field connects to coding theory, harmonic analysis, and pseudorandomness.
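
One concrete example is the Blum-Luby-Rubinfeld linearity test: query the function on random pairs and check the linearity identity over GF(2). The sketch below is illustrative; the function names and the trial count are my own choices.

```python
import random

def blr_linearity_test(f, n_bits, eps, trials=None):
    """BLR test: f should map n_bits-bit integers to 0/1.  A function that is
    eps-far from every linear function fails a random trial with probability
    depending on eps, so O(1/eps) trials suffice to reject it with constant
    probability; the query count is independent of the domain size 2**n_bits."""
    trials = trials or int(3 / eps)
    for _ in range(trials):
        x = random.getrandbits(n_bits)
        y = random.getrandbits(n_bits)
        if f(x) ^ f(y) != f(x ^ y):
            return False                  # witnessed a violation of linearity
    return True                           # consistent with linear (may still err)

# Example: parity of the bits selected by a mask is linear over GF(2).
# linear = lambda x: bin(x & 0b101).count("1") % 2
# blr_linearity_test(linear, 8, 0.1)      # -> True
```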

Quantum Algorithms

Quantum computers offer super-polynomial speedups for specific problems. Grover’s algorithm finds a marked item in an unsorted list of \(N\) items using \(O(\sqrt{N})\) quantum queries, a quadratic speedup over classical search. Applied to brute-force search for NP-complete problems the savings are quadratic, not exponential: for example, the \(2^n\) truth assignments of a SAT formula can be searched in \(O(2^{n/2}\,\text{poly}(n))\) operations. Shor’s algorithm factors integers in polynomial quantum time, breaking RSA cryptography.
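
Grover’s iteration (an oracle phase flip followed by inversion about the mean) can be simulated classically on small instances; the NumPy sketch below is purely illustrative.

```python
import numpy as np

def grover_search(n_qubits, marked):
    """Classical simulation of Grover's algorithm on N = 2**n_qubits items.
    About (pi/4) * sqrt(N) iterations concentrate the amplitude on the marked item."""
    N = 2 ** n_qubits
    state = np.full(N, 1 / np.sqrt(N))            # uniform superposition
    iterations = int(np.floor(np.pi / 4 * np.sqrt(N)))
    for _ in range(iterations):
        state[marked] *= -1                       # oracle: phase-flip the marked item
        state = 2 * state.mean() - state          # diffusion: inversion about the mean
    return int(np.argmax(state ** 2))             # measurement ~ most probable index

# grover_search(10, 777) returns 777 with high probability after ~25 iterations.
```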

More relevant to algorithm design: quantum-walk algorithms give polynomial speedups for problems including element distinctness (\(O(n^{2/3})\) queries versus \(\Theta(n)\) classically) and triangle detection. The class BQP (bounded-error quantum polynomial time) contains P, and NP is not believed to be contained in BQP, so quantum computers are not expected to solve NP-complete problems efficiently; the exact placement of BQP in the complexity hierarchy is an open problem.

ETH-Tight Algorithms

For many NP-complete problems the best known algorithms run in \(2^{O(n)}\) time, and ETH implies that no \(2^{o(n)}\)-time algorithm exists. For \(k\)-SAT, the Paturi-Pudlák-Saks-Zane algorithm runs in \(O^*(2^{(1-\Omega(1/k))n})\) time, and SETH asserts that the savings must vanish as \(k\) grows: no single \(2^{(1-\epsilon)n}\) algorithm works for all \(k\). Active research seeks exact algorithms that match these conditional lower bounds, pinning down the precise complexity of problems such as directed Hamiltonian Path (\(O^*(2^n)\)), Set Cover (\(O^*(2^n)\)), and the travelling salesman problem (\(O^*(2^n)\) via the Held-Karp dynamic programme sketched below).
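
A minimal Python sketch of the Held-Karp dynamic programme (distance-matrix input assumed):

```python
from itertools import combinations

def held_karp_tsp(dist):
    """Held-Karp dynamic programme for TSP: O*(2^n) time, O(2^n * n) space.
    dist[i][j] is the cost from city i to city j; the tour starts and ends at city 0."""
    n = len(dist)
    # dp[(S, j)] = cheapest path that starts at 0, visits exactly the set S, and ends at j
    dp = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            S = sum(1 << j for j in subset)
            for j in subset:
                prev = S ^ (1 << j)               # same set without the endpoint j
                dp[(S, j)] = min(dp[(prev, i)] + dist[i][j]
                                 for i in subset if i != j)
    full = sum(1 << j for j in range(1, n))
    return min(dp[(full, j)] + dist[j][0] for j in range(1, n))

# held_karp_tsp([[0, 1, 4], [1, 0, 2], [4, 2, 0]])   # -> 7 (tour 0-1-2-0)
```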

The quest for ETH-tight algorithms is driving a renaissance in exact exponential algorithm design, with new techniques from inclusion-exclusion, fast subset convolution, and algebraic methods over finite fields.
