CO 485: Mathematics of Public-Key Cryptography

Sam Jaques



Sources and References

Primary textbook:
  • Jeffrey Hoffstein, Jill Pipher, and Joseph H. Silverman, An Introduction to Mathematical Cryptography, 2nd ed., Springer Undergraduate Texts in Mathematics, 2014.

Supplementary texts:
  • Dan Boneh and Victor Shoup, A Graduate Course in Applied Cryptography (draft v0.6, freely available at toc.cryptobook.us).
  • Michael A. Nielsen and Isaac L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000.
  • Peter Shor, “Algorithms for quantum computation: discrete logarithms and factoring,” FOCS 1994.
  • Lov K. Grover, “A fast quantum mechanical algorithm for database search,” STOC 1996.

Online resources:
  • MIT 6.875 Cryptography lecture notes (Vinod Vaikuntanathan).
  • Stanford CS 359C Post-Quantum Cryptography notes.
  • NIST Post-Quantum Cryptography Standardization documentation (csrc.nist.gov/projects/post-quantum-cryptography).
  • Daniel J. Bernstein and Tanja Lange, “Post-Quantum Cryptography,” Nature 549 (2017), 188–194.


Chapter 1: The Discrete Logarithm Problem and Diffie-Hellman

1.1 Diffie-Hellman Key Exchange

Public-key cryptography rests on a simple but powerful asymmetry: certain mathematical operations are easy to perform but hard to reverse. The Diffie-Hellman protocol, introduced in 1976, was the first published scheme to exploit this asymmetry for key agreement, allowing two parties to establish a shared secret over an entirely public channel.

Protocol (Diffie-Hellman Key Exchange). Let \( G \) be a group and \( g \in G \) a fixed public generator. Alice and Bob each choose private exponents uniformly at random: Alice picks \( a \in \{1,\ldots,|G|-1\} \) and Bob picks \( b \). They exchange public values \( g^a \) and \( g^b \). The shared secret is \( g^{ab} \), which Alice computes as \( (g^b)^a \) and Bob computes as \( (g^a)^b \).

An eavesdropper Eve sees \( g \), \( g^a \), and \( g^b \), but not \( a \), \( b \), or \( g^{ab} \) — provided the discrete logarithm problem (DLOG) is hard in \( G \). The DLOG problem asks: given \( g \) and \( h = g^x \), find \( x \). The security of DH rests on the intractability of this problem.

1.2 Square-and-Multiply

For DH to be practical, computing \( g^a \) must be efficient even when \( a \) is hundreds of bits long. Naive repeated multiplication requires \( a \) group operations, which is exponential in the bit-length of \( a \). The square-and-multiply algorithm brings this down to \( O(\log a) \) operations.

Algorithm (Square-and-Multiply). Write \( a = \sum_{i=0}^{k-1} a_i 2^i \) in binary with \( a_{k-1} = 1 \). Initialize \( r \leftarrow g \). For \( i \) from \( k-2 \) down to \( 0 \): set \( r \leftarrow r^2 \), and if \( a_i = 1 \) set \( r \leftarrow r \cdot g \). Return \( r \).

The algorithm processes each bit of \( a \) with at most two group operations (one squaring, one optional multiplication), yielding at most \( 2\lfloor \log_2 a \rfloor \) operations total. This is \( O(\log a) \), making exponentiation efficient even for 256-bit or 2048-bit exponents.
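The loop above translates directly into code. A minimal Python sketch (the helper name square_and_multiply is ours; Python's built-in pow(g, a, p) does the same thing):

```python
def square_and_multiply(g, a, p):
    """Compute g^a mod p by scanning the bits of a from most to least significant."""
    assert a >= 1
    bits = bin(a)[2:]          # binary expansion of a; the leading bit is always 1
    r = g % p                  # initialize with the leading bit a_{k-1} = 1
    for bit in bits[1:]:       # process a_{k-2} down to a_0
        r = (r * r) % p        # always square
        if bit == '1':
            r = (r * g) % p    # multiply only when the bit is set
    return r

assert square_and_multiply(5, 15, 23) == pow(5, 15, 23) == 19
```

Each loop iteration costs one squaring plus at most one multiplication, matching the \( 2\lfloor \log_2 a \rfloor \) bound above.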

1.3 Design Requirements for the Group G

Not every group is suitable for Diffie-Hellman. Five requirements must be satisfied simultaneously, each motivated by a concrete practical or security concern.

Requirements (DH Group). A group \( G \) with generator \( g \) is suitable for Diffie-Hellman if it satisfies all of the following:
  1. Hard DLOG: The discrete logarithm problem must be computationally infeasible. Without this, Eve directly recovers \( a \) from \( g^a \) and then computes the shared secret \( (g^b)^a \) herself.
  2. Compact representation: Group elements should be representable in a small number of bits (e.g., 256 bits). A DH handshake transmits at least two group elements; if each element is kilobytes, the protocol becomes a bandwidth bottleneck for real-time applications.
  3. Efficient group operations: Multiplication (or addition) in \( G \) must be computable in polynomial time. Exponentiation requires \( O(\log a) \) group operations via square-and-multiply; slow group operations make the protocol impractical even on fast hardware.
  4. Large prime-order subgroup: The generator \( g \) should have prime order \( q \), ideally \( q \approx |G| \), to avoid small-subgroup attacks. Pohlig-Hellman shows that DLOG in a group of order \( N = \prod p_i^{e_i} \) reduces to DLOG in each prime-power subgroup — at cost \( O(\sum e_i\sqrt{p_i}) \). If \( |G| \) has small prime factors, DLOG becomes easy regardless of the size of \( |G| \).
  5. Message embedding: It must be possible to encode plaintexts as group elements (for ElGamal-style encryption). Without a natural encoding, the group can be used for key agreement but not directly for encryption.

These five requirements drastically narrow the field. Two families of groups satisfy all of them in practice: the multiplicative group \( (\mathbb{Z}/p\mathbb{Z})^* \) for a large prime \( p \), and the group of points on an elliptic curve over a finite field. All other commonly studied groups either have easy DLOG (e.g., \( (\mathbb{Z}/n\mathbb{Z}, +) \)) or fail other criteria.

Example (Small DH Key Exchange). Let \( G = \mathbb{Z}_{23}^* \) with prime \( p = 23 \) and generator \( g = 5 \) (a primitive root mod 23, since \( |\mathbb{Z}_{23}^*| = 22 \) and 5 has order 22). Alice chooses private key \( a = 6 \) and computes \[ A = 5^6 \bmod 23 = 15625 \bmod 23. \]

Since \( 15625 = 679 \times 23 + 8 \), we get \( A = 8 \).

Bob chooses private key \( b = 15 \) and computes \( B = 5^{15} \bmod 23 \) by repeated squaring:

\[ 5^1 = 5,\quad 5^2 = 25 \equiv 2,\quad 5^4 \equiv 4,\quad 5^8 \equiv 16 \pmod{23}. \]

Then \( 5^{15} = 5^8 \cdot 5^4 \cdot 5^2 \cdot 5^1 = 16 \cdot 4 \cdot 2 \cdot 5 = 640 \). Now \( 640 = 27 \times 23 + 19 \), so \( B = 19 \).

Alice computes the shared secret \( B^a = 19^6 \bmod 23 \):

\[ 19^2 \equiv 361 \equiv 361 - 15 \times 23 = 16,\quad 19^3 \equiv 16 \times 19 = 304 \equiv 304 - 13\times 23 = 5, \]\[ 19^6 \equiv 5^2 = 25 \equiv 2 \pmod{23}. \]

For Bob: \( 8^2 = 64 \equiv 18 \), \( 8^4 \equiv 18^2 = 324 \equiv 324 - 14\times 23 = 2 \), \( 8^8 \equiv 4 \), \( 8^{15} = 8^8 \cdot 8^4 \cdot 8^2 \cdot 8^1 \equiv 4 \cdot 2 \cdot 18 \cdot 8 = 1152 \equiv 1152 - 50\times 23 = 2 \pmod{23} \).

Both parties recover shared secret \( 2 \). An eavesdropper sees \( p=23 \), \( g=5 \), \( A=8 \), \( B=19 \) but must solve the discrete logarithm problem to find \( a \) or \( b \).
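The full exchange from this example can be replayed in a few lines of Python (a sketch; variable names are ours):

```python
p, g = 23, 5                  # public parameters from the example
a, b = 6, 15                  # Alice's and Bob's private exponents

A = pow(g, a, p)              # Alice sends A = g^a
B = pow(g, b, p)              # Bob sends B = g^b

alice_secret = pow(B, a, p)   # Alice computes B^a = g^(ab)
bob_secret = pow(A, b, p)     # Bob computes A^b = g^(ab)

assert (A, B) == (8, 19)
assert alice_secret == bob_secret == 2
```

Note that neither party ever transmits \( a \), \( b \), or the shared secret.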

1.4 Why \((\mathbb{Z}/n\mathbb{Z}, +)\) Fails

The additive group of integers modulo \( n \) looks superficially like a cyclic group suitable for DH. However, its DLOG problem — given \( g \) and \( h = g \cdot x \pmod{n} \) (i.e., \( gx \equiv h \pmod{n} \)), find \( x \) — reduces to computing \( x = g^{-1} h \pmod{n} \). The modular inverse \( g^{-1} \pmod{n} \) is computed efficiently by the Extended Euclidean Algorithm, making DLOG trivial. This disqualifies the additive group entirely.
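The attack is two lines in Python (using the built-in modular inverse pow(g, -1, n), available since Python 3.8; the function name is ours):

```python
def additive_dlog(g, h, n):
    """Solve g*x ≡ h (mod n) for x, assuming gcd(g, n) = 1."""
    return (pow(g, -1, n) * h) % n

# A "public key" h = g*x mod n in the additive group is broken instantly.
n, g, x = 2**255 - 19, 7, 123456789
h = (g * x) % n
assert additive_dlog(g, h, n) == x
```

Even with a 255-bit modulus, recovery of the "private key" \( x \) is immediate.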

1.5 The Extended Euclidean Algorithm

The Extended Euclidean Algorithm (EEA) computes \( \gcd(a, b) \) and simultaneously finds Bézout coefficients \( s, t \) such that \( sa + tb = \gcd(a,b) \). It is the backbone of modular arithmetic.

Algorithm (Extended Euclidean). Maintain triples \( (r_i, s_i, t_i) \) satisfying \( s_i a + t_i b = r_i \). Initialize \( (r_0, s_0, t_0) = (a, 1, 0) \) and \( (r_1, s_1, t_1) = (b, 0, 1) \). At each step compute \( q_i = \lfloor r_{i-1}/r_i \rfloor \) and set \( r_{i+1} = r_{i-1} - q_i r_i \), \( s_{i+1} = s_{i-1} - q_i s_i \), \( t_{i+1} = t_{i-1} - q_i t_i \). Stop when \( r_{i+1} = 0 \); then \( \gcd(a,b) = r_i \).
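The triple iteration transcribes directly into code (a Python sketch; the function name is ours):

```python
def extended_euclid(a, b):
    """Return (g, s, t) with s*a + t*b = g = gcd(a, b)."""
    r0, s0, t0 = a, 1, 0      # invariant: s_i * a + t_i * b = r_i
    r1, s1, t1 = b, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return r0, s0, t0

g, s, t = extended_euclid(240, 46)
assert g == 2 and s * 240 + t * 46 == 2
```

Setting \( b = n \) and \( \gcd(a, n) = 1 \) yields \( s \equiv a^{-1} \pmod n \), the modular inverse used throughout these notes.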

Runtime analysis via Fibonacci argument. Each step produces a remainder \( r_{i+1} = r_{i-1} \bmod r_i < r_{i-1} \). More precisely, if \( q_i \geq 2 \) then \( r_{i+1} \leq r_{i-1}/2 \), and if \( q_i = 1 \) then \( r_{i+2} < r_i/2 \) (since two steps with quotient 1 cut the remainder at least in half). In the worst case (all quotients equal 1), the sequence \( r_0, r_1, r_2, \ldots \) follows the Fibonacci recurrence, and since \( F_k \approx \phi^k / \sqrt{5} \), the number of steps is \( O(\log_\phi b) = O(\log b) \). Thus EEA runs in \( O(\log n) \) divisions, or equivalently \( O((\log n)^2) \) bit operations.

1.6 DLOG Algorithms

Several classical algorithms attack DLOG. Understanding them guides parameter selection.

Remark (Baby-Step Giant-Step — Shanks). To compute \( \log_g h \) in \( \mathbb{Z}_p^* \), write the unknown exponent as \( a = qm + r \) where \( m = \lceil \sqrt{p-1} \rceil \) and \( 0 \le r, q < m \). Then \( g^a = h \) means \( g^r = h \cdot (g^{-m})^q \).

Baby steps: precompute and store the table \( \{(g^r \bmod p,\, r) : r = 0, 1, \ldots, m-1\} \).

Giant steps: compute \( h \cdot (g^{-m})^q \bmod p \) for \( q = 0, 1, \ldots, m-1 \) until a match with the table is found. A match yields both \( r \) and \( q \), giving \( a = qm + r \).

Total work: \( O(\sqrt{p}) \) time and \( O(\sqrt{p}) \) space. Against BSGS alone, a 2048-bit prime is vast overkill: a square-root attack would cost \( \sqrt{2^{2048}} = 2^{1024} \) operations, far beyond any conceivable classical computation. The real reason \( p \) must be at least 2048 bits for 112-bit security is the index calculus algorithm (below), which solves DLOG in \( \mathbb{Z}_p^* \) in sub-exponential time; BSGS serves mainly as a baseline for groups (like elliptic curves) where index calculus has no analogue.
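A Python sketch of BSGS, using a dictionary for the baby-step table (function name ours):

```python
from math import isqrt

def bsgs(g, h, p, order):
    """Solve g^x ≡ h (mod p) for x in [0, order); returns None if no solution."""
    m = isqrt(order) + 1
    table = {}
    e = 1
    for r in range(m):                # baby steps: store g^r -> r
        table.setdefault(e, r)
        e = (e * g) % p
    factor = pow(g, -m, p)            # g^{-m} mod p (Python >= 3.8)
    gamma = h % p
    for q in range(m):                # giant steps: h * (g^{-m})^q
        if gamma in table:
            return q * m + table[gamma]
        gamma = (gamma * factor) % p
    return None

# Recover Bob's exponent from the Section 1.3 example: 5^15 ≡ 19 (mod 23).
assert bsgs(5, 19, 23, 22) == 15
```

The dictionary lookup makes each giant step \( O(1) \) expected time, so the total cost is the advertised \( O(\sqrt{q}) \) group operations, at the price of \( O(\sqrt{q}) \) stored table entries.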

Theorem (Pohlig-Hellman Reduction). If \( |G| = p_1^{e_1} \cdots p_k^{e_k} \) is a smooth number (all prime factors \( p_i \) are small), then DLOG in \( G \) reduces to DLOG in groups of prime order \( p_i \) by the Chinese Remainder Theorem. The total cost is \[ O\!\left(\sum_{i=1}^k e_i \bigl(\log|G| + \sqrt{p_i}\bigr)\right). \] If all prime factors \( p_i \) are small, then each subproblem \( \sqrt{p_i} \) is easy, and DLOG in \( G \) is entirely tractable regardless of the size of \( |G| \). This is why DH groups must have a large prime subgroup of order \( q \): if \( q \approx |G| \), Pohlig-Hellman gains nothing, and the attacker is forced to work in the full prime-order subgroup at cost \( O(\sqrt{q}) \).

Pohlig-Hellman is the precise algebraic reason behind requirement 4 in the DH group definition: choosing a group whose order has large prime factors is not merely a convention but a hard necessity. A group of order \( 2^{100} \), for instance, would have \( p_1 = 2 \) and \( e_1 = 100 \), and DLOG would cost only \( O(100 \cdot (\log |G| + \sqrt{2})) = O(\log^2 |G|) \) operations — polynomial in the key size.
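The reduction can be sketched in Python for the simplified case of a square-free group order \( N = p_1 \cdots p_k \) (function names ours; brute force stands in for BSGS in each small subgroup):

```python
def dlog_bruteforce(g, h, p, order):
    """Naive DLOG in the subgroup of the given order; fine when the order is tiny."""
    e = 1
    for x in range(order):
        if e == h:
            return x
        e = (e * g) % p
    raise ValueError("no solution")

def pohlig_hellman(g, h, p, factors):
    """Solve g^x ≡ h (mod p) where g has square-free order N = prod(factors)."""
    N = 1
    for q in factors:
        N *= q
    residues = []
    for q in factors:
        gq = pow(g, N // q, p)        # project into the subgroup of order q
        hq = pow(h, N // q, p)
        residues.append(dlog_bruteforce(gq, hq, p, q))
    x = 0                             # recombine via the Chinese Remainder Theorem
    for q, r in zip(factors, residues):
        m = N // q
        x = (x + r * m * pow(m, -1, q)) % N
    return x

# Z_23^* has order 22 = 2 * 11; recover Bob's exponent b = 15 from the example.
assert pohlig_hellman(5, 19, 23, [2, 11]) == 15
```

Each subproblem lives in a group of size \( p_i \), so the attacker's cost is governed by the largest prime factor, exactly as the theorem states.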



Index calculus. In \( (\mathbb{Z}/p\mathbb{Z})^* \), a factor base \( \mathcal{B} = \{\text{small primes}\} \) is chosen. One collects relations by checking whether random powers \( g^k \bmod p \) factor over \( \mathcal{B} \). A linear algebra step over \( \mathbb{Z}_{p-1} \) then recovers discrete logs for all elements of \( \mathcal{B} \), and individual logs follow quickly. The resulting sub-exponential complexity is \( L_p[1/2, c] = e^{c(\log p)^{1/2}(\log \log p)^{1/2}} \), which is much faster than \( O(\sqrt{p}) \) for large \( p \). Index calculus has no known analogue on elliptic curves, which is why ECC offers much shorter key sizes.

Pollard rho. Uses a pseudo-random walk on \( G \) with Floyd’s cycle detection, achieving \( O(\sqrt{q}) \) expected time but only \( O(1) \) space (unlike BSGS which needs \( O(\sqrt{q}) \) space). For elliptic curves, Pollard rho is essentially optimal among known algorithms.

The table below summarizes the complexity landscape:

Algorithm            | Group                             | Time                          | Space
---------------------|-----------------------------------|-------------------------------|---------------------------
Baby-step Giant-step | Any                               | \( O(\sqrt{q}) \)             | \( O(\sqrt{q}) \)
Pohlig-Hellman       | Composite-order                   | \( O(\sum e_i \sqrt{p_i}) \)  | \( O(\max \sqrt{p_i}) \)
Index calculus       | \( (\mathbb{Z}/p\mathbb{Z})^* \)  | sub-exponential in \( p \)    | sub-exponential
Pollard rho          | Any                               | \( O(\sqrt{q}) \)             | \( O(1) \)

The existence of index calculus for \( (\mathbb{Z}/p\mathbb{Z})^* \) explains why elliptic curve keys can be 8–12× shorter than DH keys at equivalent security.


Chapter 2: Modular Arithmetic and Algebraic Structures

The cryptographic constructions in Chapter 1 require working in specific algebraic structures — the multiplicative group \( \mathbb{Z}_p^* \), rings of integers modulo composites, and polynomial quotient rings — and their security depends intimately on the arithmetic properties of those structures. This chapter develops the toolkit: cyclic groups and generators (which explain when DH is well-defined), Euler’s theorem (the engine behind RSA correctness), the Chinese Remainder Theorem (the reason CRT-RSA is 4× faster than naive RSA), quadratic residues (which explain why DDH fails in \( \mathbb{Z}_p^* \) and motivate the move to prime-order subgroups), and polynomial rings (the algebraic setting for Ring-LWE in Chapter 5). Each section is both self-contained and forward-pointing: the abstract algebra here will reappear concretely in every subsequent protocol.

2.1 The Group \(\mathbb{Z}_p^*\) and Cyclic Structure

Fix a prime \( p \). The multiplicative group \( \mathbb{Z}_p^* = \{1, 2, \ldots, p-1\} \) under multiplication modulo \( p \) is a cyclic group of order \( p-1 \). Being cyclic means there exists a generator (primitive root) \( g \) such that every element of \( \mathbb{Z}_p^* \) is a power of \( g \). This is non-trivial: not every group is cyclic, and even for \( \mathbb{Z}_p^* \) the proof requires some field theory. The upshot for cryptography is that DH key exchange makes sense in \( \mathbb{Z}_p^* \) — we can take arbitrary powers of a fixed base and cover the whole group.

Theorem (Fermat's Little Theorem). For any \( a \not\equiv 0 \pmod{p} \), we have \( a^{p-1} \equiv 1 \pmod{p} \).

Fermat’s little theorem is a consequence of Lagrange’s theorem applied to \( \mathbb{Z}_p^* \): the order of any element divides the group order \( p-1 \). More generally, Euler’s theorem states \( a^{\phi(n)} \equiv 1 \pmod{n} \) for any \( a \) with \( \gcd(a,n) = 1 \), where \( \phi \) is Euler’s totient function.

2.2 Order of an Element

Definition. The order of \( a \in \mathbb{Z}_n^* \) is the smallest positive integer \( t \) such that \( a^t \equiv 1 \pmod{n} \). Denote it \( \text{ord}(a) \).

Several key properties follow:

  • \( a^{|G|} = 1 \) (Lagrange), so \( \text{ord}(a) \mid |G| \).
  • \( a^s = 1 \) if and only if \( \text{ord}(a) \mid s \).
  • \( a^x = a^y \) if and only if \( x \equiv y \pmod{\text{ord}(a)} \).

These properties are central to analyzing algorithms involving modular exponentiation, including primality tests and key generation procedures.

Example (Primitive Root mod 7). The group \( \mathbb{Z}_7^* = \{1,2,3,4,5,6\} \) has order \( \phi(7) = 6 = 2 \times 3 \). An element \( g \) is a primitive root (generator) if and only if its order is exactly 6, which by Lagrange requires that \( g^2 \not\equiv 1 \) and \( g^3 \not\equiv 1 \pmod 7 \).

Test \( g = 3 \):

\[ 3^1 = 3,\quad 3^2 = 9 \equiv 2,\quad 3^3 = 27 \equiv 6 \equiv -1,\quad 3^6 \equiv (-1)^2 = 1 \pmod 7. \]

Since \( 3^2 = 2 \ne 1 \) and \( 3^3 = 6 \ne 1 \), the order of 3 is not 1, 2, or 3. The order must divide 6, so \( \text{ord}(3) = 6 \): the element 3 is a primitive root mod 7.

The number of primitive roots mod 7 is \( \phi(\phi(7)) = \phi(6) = 2 \). They are 3 and 5 (one can verify \( 5^1=5,\, 5^2=4,\, 5^3=6,\, 5^6=1 \) with no earlier return to 1). The other elements have orders dividing 6 but strictly less than 6: \( \text{ord}(2) = 3 \), \( \text{ord}(4) = 3 \), \( \text{ord}(6) = 2 \).
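These orders can be checked with a short loop (a Python sketch; the helper name order is ours):

```python
def order(a, p):
    """Multiplicative order of a modulo a prime p: smallest t with a^t ≡ 1."""
    t, e = 1, a % p
    while e != 1:
        e = (e * a) % p
        t += 1
    return t

orders = {a: order(a, 7) for a in range(1, 7)}
assert orders == {1: 1, 2: 3, 3: 6, 4: 3, 5: 6, 6: 2}
assert [a for a in range(1, 7) if orders[a] == 6] == [3, 5]   # the primitive roots
```

Every order divides \( |\mathbb{Z}_7^*| = 6 \), as Lagrange's theorem demands.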

2.3 Euler’s Phi Function

Definition. The Euler totient function \( \phi(n) \) counts the number of integers in \( \{1,\ldots,n\} \) coprime to \( n \). Equivalently, \( \phi(n) = |(\mathbb{Z}/n\mathbb{Z})^*| \).

Key formulas:

  • For prime \( p \): \( \phi(p) = p - 1 \).
  • For prime power: \( \phi(p^k) = p^{k-1}(p-1) \).
  • Multiplicativity: if \( \gcd(m,n)=1 \) then \( \phi(mn) = \phi(m)\phi(n) \).
  • For \( n = p_1^{e_1} \cdots p_r^{e_r} \): \(\phi(n) = n \prod_{p \mid n} (1 - 1/p)\).
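The product formula can be turned into a short routine via trial-division factorization (a sketch; the function name is ours, and trial division is only sensible for small \( n \)):

```python
def phi(n):
    """Euler's totient via phi(n) = n * prod(1 - 1/p) over distinct primes p | n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:         # strip all copies of the prime p
                m //= p
            result -= result // p     # multiply result by (1 - 1/p), exactly
        p += 1
    if m > 1:                         # one prime factor > sqrt(n) may remain
        result -= result // m
    return result

assert phi(7) == 6 and phi(8) == 4 and phi(12) == 4
assert phi(23 * 29) == 22 * 28        # multiplicativity on coprime factors
```

For cryptographic-size \( n = pq \), computing \( \phi(n) \) this way requires factoring \( n \) — which is exactly the hard problem protecting RSA.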

2.4 Chinese Remainder Theorem

The Chinese Remainder Theorem (CRT) is one of the most powerful tools in modular arithmetic. It establishes that working modulo a product \( mn \) (when \( \gcd(m,n) = 1 \)) is equivalent to working modulo \( m \) and \( n \) simultaneously.

Theorem (CRT). If \( \gcd(m, n) = 1 \), then the natural map \(\mathbb{Z}/mn\mathbb{Z} \to \mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/n\mathbb{Z}\) sending \( x \mapsto (x \bmod m,\, x \bmod n) \) is a ring isomorphism.

Constructive proof. Given \( (a, b) \in \mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/n\mathbb{Z} \), apply Bézout’s theorem to find \( s, t \) with \( sm + tn = 1 \). Set \( x = a \cdot tn + b \cdot sm \pmod{mn} \). Then \( x \equiv a \cdot tn \equiv a(1 - sm) \equiv a \pmod{m} \) and similarly \( x \equiv b \pmod{n} \). Uniqueness follows since the map is between sets of the same finite size.
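The constructive proof is already an algorithm; a Python sketch (function name ours, using the built-in modular inverse in place of an explicit Bézout computation):

```python
def crt(a, m, b, n):
    """Find x mod m*n with x ≡ a (mod m) and x ≡ b (mod n); requires gcd(m, n) = 1."""
    # From Bezout s*m + t*n = 1: t*n ≡ 1 (mod m) and s*m ≡ 1 (mod n).
    s = pow(m, -1, n)
    t = pow(n, -1, m)
    return (a * t * n + b * s * m) % (m * n)

x = crt(2, 5, 3, 7)        # x ≡ 2 (mod 5), x ≡ 3 (mod 7)
assert x % 5 == 2 and x % 7 == 3
```

The term \( a \cdot tn \) vanishes mod \( n \) and equals \( a \) mod \( m \), and symmetrically for \( b \cdot sm \), mirroring the proof exactly.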

CRT underlies CRT-RSA (for fast decryption), Pohlig-Hellman, and the ring structure of cyclotomic rings used in lattice cryptography.

Remark (CRT as the Foundation of RSA). When \( N = pq \) with \( p, q \) distinct primes, the CRT isomorphism \( \mathbb{Z}_{pq} \cong \mathbb{Z}_p \times \mathbb{Z}_q \) is the algebraic reason RSA encryption is invertible. The exponent identity \( ed \equiv 1 \pmod{\phi(N)} \) ensures that both \( m^{ed} \equiv m \pmod p \) (by Fermat) and \( m^{ed} \equiv m \pmod q \) simultaneously, so by CRT, \( m^{ed} \equiv m \pmod{pq} \). Without knowing the factorization \( p, q \), an adversary cannot compute \( \phi(N) = (p-1)(q-1) \) and therefore cannot find \( d \) — the factoring problem is the computational barrier protecting the CRT structure. The same isomorphism powers CRT-RSA decryption (Section 6.5): working mod \( p \) and mod \( q \) separately then lifting via CRT reduces the cost from one large exponentiation to two smaller ones, giving a 4× speedup.
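A toy numerical illustration of CRT-RSA decryption (all parameters are for illustration only and far too small for real use):

```python
p, q = 11, 13
N, phi_N = p * q, (p - 1) * (q - 1)    # N = 143, phi(N) = 120
e = 7
d = pow(e, -1, phi_N)                  # private exponent: e*d ≡ 1 (mod phi(N))

m = 42
c = pow(m, e, N)                       # encrypt

# CRT decryption: one small exponentiation mod p, one mod q, then recombine.
mp = pow(c, d % (p - 1), p)            # Fermat lets us reduce d mod p-1
mq = pow(c, d % (q - 1), q)
x = (mp * q * pow(q, -1, p) + mq * p * pow(p, -1, q)) % N

assert x == m == pow(c, d, N)          # agrees with plain RSA decryption
```

The two CRT exponentiations use half-size moduli and half-size exponents, which is where the roughly 4× speedup comes from at real key sizes.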

2.5 Quadratic Residues

Quadratic residues appear naturally when studying the structure of \( \mathbb{Z}_p^* \), but their cryptographic significance is subtle and important: they provide computable partial information about elements of \( \mathbb{Z}_p^* \) that can be used to distinguish \( g^{ab} \) from a random group element, thereby breaking DDH in the full group \( \mathbb{Z}_p^* \). This is the precise reason why DDH-based schemes (like ElGamal) must work in a prime-order subgroup, not the full group.

Definition. For odd prime \( p \), an element \( a \in \mathbb{Z}_p^* \) is a quadratic residue (QR) if there exists \( x \) with \( x^2 \equiv a \pmod{p} \). Otherwise \( a \) is a quadratic non-residue (QNR).
Theorem (Euler's Criterion). \( a \) is a QR mod \( p \) if and only if \( a^{(p-1)/2} \equiv 1 \pmod{p} \). It is a QNR if and only if \( a^{(p-1)/2} \equiv -1 \pmod{p} \).

The Legendre symbol \( \left(\frac{a}{p}\right) \) is defined as \( +1 \) if \( a \) is a QR, \( -1 \) if QNR, and \( 0 \) if \( p \mid a \). There are exactly \( (p-1)/2 \) quadratic residues in \( \mathbb{Z}_p^* \). The Legendre symbol is multiplicative: \( \left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right) \). This multiplicativity is the DDH killer. For a generator \( g \), the element \( g^a \) is a QR exactly when \( a \) is even, so the Legendre symbols of the public values \( g^a \) and \( g^b \) reveal the parities of \( a \) and \( b \), which in turn determine the Legendre symbol of \( g^{ab} \): it is \( -1 \) precisely when both \( a \) and \( b \) are odd. A random group element satisfies no such constraint, so an eavesdropper can distinguish real DH triples from random ones with noticeable advantage.

Computing square roots mod \( p \) can be done efficiently via the Tonelli-Shanks algorithm: write \( p - 1 = 2^s \cdot q \) with \( q \) odd, find a QNR \( n \), and iteratively reduce the exponent using the Sylow 2-subgroup structure. The algorithm runs in \( O(\log^2 p) \) time.

Example (Quadratic Residues mod 11). The group \( \mathbb{Z}_{11}^* \) has order 10. The quadratic residues are the squares: \[ 1^2=1,\; 2^2=4,\; 3^2=9,\; 4^2=5,\; 5^2=3 \pmod{11}, \] so \( \mathrm{QR} = \{1, 3, 4, 5, 9\} \) and \( \mathrm{QNR} = \{2, 6, 7, 8, 10\} \). There are exactly \( (11-1)/2 = 5 \) quadratic residues, as expected. Applying Euler's criterion to \( a = 2 \): \[ 2^{(11-1)/2} = 2^5 = 32 \equiv 32 - 2\times 11 = 10 \equiv -1 \pmod{11}. \]

So \( \left(\frac{2}{11}\right) = -1 \), confirming 2 is a QNR mod 11. (Note: \( 11 \equiv 3 \pmod 8 \), and the standard formula gives \( \left(\frac{2}{p}\right) = -1 \) whenever \( p \equiv 3 \) or \( 5 \pmod 8 \).)

Relevance to DDH. The Legendre symbol distinguishes \( g^{ab} \) from a random group element in \( \mathbb{Z}_p^* \): the symbols of \( g^a \) and \( g^b \) reveal the parities of \( a \) and \( b \), and hence determine the symbol of \( g^{ab} \), a constraint that a random \( g^c \) violates half the time. DDH therefore fails in the full group, and prime-order subgroups must be used.
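The distinguisher can be sketched in Python for the full group \( \mathbb{Z}_p^* \) with a primitive root \( g \) (function names ours):

```python
def legendre(a, p):
    """Euler's criterion: returns 1 for a QR, p - 1 (i.e., -1 mod p) for a QNR."""
    return pow(a, (p - 1) // 2, p)

def ddh_guess(A, B, C, p):
    """Guess whether (A, B, C) = (g^a, g^b, g^ab): for a real triple, g^ab is a
    non-residue exactly when both A and B are non-residues (both a, b odd)."""
    expected = p - 1 if (legendre(A, p) != 1 and legendre(B, p) != 1) else 1
    return legendre(C, p) == expected

p, g = 23, 5
a, b = 6, 15
A, B = pow(g, a, p), pow(g, b, p)
assert ddh_guess(A, B, pow(g, a * b, p), p)            # real DH triples always pass
assert not ddh_guess(A, B, pow(g, a * b + 1, p), p)    # this fake triple is caught
```

A random \( C \) passes the test only half the time, so the guesser wins the DDH game with advantage \( 1/2 \), far from negligible.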

2.6 Rings, Fields, and Polynomial Quotient Rings

Definition. A ring \( R \) is a set with addition and multiplication satisfying: \( (R, +) \) is an abelian group; multiplication is associative and distributes over addition. A field is a commutative ring where every nonzero element has a multiplicative inverse.

The ring \( \mathbb{Z}_q = \mathbb{Z}/q\mathbb{Z} \) is a field if and only if \( q \) is prime. For composite \( q = ab \) with \( 1 < a, b < q \), the elements \( a \) and \( b \) are zero divisors with no inverses.

Polynomial quotient rings are essential for lattice-based cryptography. Given a prime \( q \) and a polynomial \( p(x) \in \mathbb{Z}_q[x] \), the ring \( \mathbb{Z}_q[x]/(p(x)) \) consists of all polynomials with coefficients in \( \mathbb{Z}_q \) reduced modulo \( p(x) \). Each element has a unique representative of degree less than \( \deg(p(x)) \), obtained via polynomial long division.

Theorem. \( \mathbb{Z}_q[x]/(p(x)) \) is a field if and only if \( p(x) \) is irreducible over \( \mathbb{Z}_q \).

This parallels the fact that \( \mathbb{Z}/n\mathbb{Z} \) is a field iff \( n \) is prime. Irreducible polynomials play the role of primes in the polynomial ring setting. When \( p(x) = x^d + 1 \) for \( d = 2^k \), this polynomial is irreducible over \( \mathbb{Q} \) (it is the \( 2d \)-th cyclotomic polynomial), making \( \mathbb{Z}_q[x]/(x^d+1) \) a natural choice for ring-LWE — though note that it may still factor modulo a particular prime \( q \), in which case the quotient ring is not a field.
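Arithmetic in \( \mathbb{Z}_q[x]/(x^d + 1) \) reduces to polynomial multiplication with a sign flip on wrap-around (a "negacyclic" convolution, since \( x^d \equiv -1 \)). A schoolbook sketch (function name ours; real ring-LWE implementations use faster transforms):

```python
def negacyclic_mul(f, g, q, d):
    """Multiply f, g (coefficient lists of length d) in Z_q[x]/(x^d + 1).
    A product term at degree i + j >= d wraps to degree i + j - d with sign -1."""
    out = [0] * d
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            k = i + j
            if k < d:
                out[k] = (out[k] + fi * gj) % q
            else:
                out[k - d] = (out[k - d] - fi * gj) % q
    return out

# x * x^3 = x^4 ≡ -1 in Z_17[x]/(x^4 + 1)
assert negacyclic_mul([0, 1, 0, 0], [0, 0, 0, 1], 17, 4) == [16, 0, 0, 0]
```

Every element of the quotient ring is represented by its unique coefficient list of length \( d \), matching the unique-representative statement above.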

2.7 Primality Testing

Generating large primes is a prerequisite for RSA and DH parameter generation. The naive approach (trial division up to \( \sqrt{n} \)) is exponential in the bit length. Two modern approaches:

Miller-Rabin test (probabilistic). Write \( n-1 = 2^s d \) with \( d \) odd. A witness for compositeness is an \( a \) such that \( a^d \not\equiv 1 \pmod{n} \) and \( a^{2^r d} \not\equiv -1 \pmod{n} \) for all \( r \in \{0,\ldots,s-1\} \). If \( n \) is prime, no witness exists: \( a^{n-1} \equiv 1 \) by Fermat, and since \( \pm 1 \) are the only square roots of 1 modulo a prime, repeatedly halving the exponent must either stay at 1 or pass through \( -1 \). If \( n \) is composite, at least \( 3/4 \) of bases are witnesses. Running \( k \) rounds with random bases gives error probability \( \leq 4^{-k} \). With \( k = 40 \), the false-prime probability is \( 2^{-80} \), negligible in practice.
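A Python sketch of the test with random bases (function name ours; production code would use fixed base sets with proven deterministic ranges):

```python
import random

def miller_rabin(n, rounds=40):
    """Probabilistic primality test; error probability at most 4^(-rounds)."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    s, d = 0, n - 1
    while d % 2 == 0:                  # write n - 1 = 2^s * d with d odd
        s += 1
        d //= 2
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue                   # a is not a witness
        for _ in range(s - 1):
            x = (x * x) % n
            if x == n - 1:
                break                  # hit -1 while squaring: not a witness
        else:
            return False               # a witnesses compositeness
    return True

assert miller_rabin(2**127 - 1)        # a Mersenne prime
assert not miller_rabin(252601)        # a Carmichael number (41 * 61 * 101)
```

The Carmichael number 252601 passes the plain Fermat test for every coprime base, yet Miller-Rabin rejects it, illustrating why the squaring chain matters.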

AKS primality test. Agrawal, Kayal, and Saxena (2004) proved that PRIMES is in P: there is a deterministic polynomial-time algorithm. AKS checks, for appropriate \( r \), whether \( (x+a)^n \equiv x^n + a \pmod{x^r-1, n} \) for several values of \( a \). Runtime is \( \tilde{O}((\log n)^{6}) \), deterministic. Though slower than Miller-Rabin in practice, AKS resolved a decades-old open question and is a landmark in complexity theory.

Carmichael numbers are composite integers \( n \) satisfying \( a^{n-1} \equiv 1 \pmod{n} \) for all \( \gcd(a,n)=1 \). They fool the naive Fermat test but not Miller-Rabin (Miller-Rabin is not based on \( a^{n-1} \equiv 1 \) alone). The smallest Carmichael number is 561 = 3·11·17.


Chapter 3: Elliptic Curves

The multiplicative group \( \mathbb{Z}_p^* \) supports Diffie-Hellman but has a critical weakness: the index calculus algorithm (Section 1.6) solves DLOG in sub-exponential time \( L_p[1/2] \). This means a 2048-bit prime is needed to achieve 112-bit security — a significant bandwidth and computation cost. Elliptic curves offer a different trade-off: their group of rational points \( E(\mathbb{F}_q) \) has no known analogue of index calculus, so the best DLOG algorithms are Pollard rho at \( O(\sqrt{q}) \) — fully exponential in the key size. The consequence is that a 256-bit elliptic curve provides the same 128-bit security as a 3072-bit DH group, with 12× smaller keys and proportionally faster operations.

The reason index calculus fails for elliptic curves is structural: index calculus relies on factoring random group elements over a “factor base” of small primes, which works because the integers have multiplicative structure. Points on an elliptic curve have no such multiplicative decomposition; there is no notion of “small” or “smooth” that applies. This structural difference is also, from a different angle, why elliptic curves remain quantum-vulnerable despite ECC’s classical advantages: Shor’s algorithm needs only the group’s abelian structure, which ECC possesses.

3.1 Definition and Weierstrass Form

An elliptic curve over a field \( \mathbb{F}_q \) is, in its simplest form, the set of solutions to a short Weierstrass equation together with a distinguished point at infinity.

Definition. An elliptic curve over \( \mathbb{F}_q \) (char \( \neq 2, 3 \)) in short Weierstrass form is: \[ E: y^2 = x^3 + ax + b, \quad a, b \in \mathbb{F}_q, \] subject to the non-singularity condition \( 4a^3 + 27b^2 \neq 0 \) (equivalently, the cubic \( x^3 + ax + b \) has no repeated roots). The set of affine points is \( \{(x,y) \in \mathbb{F}_q^2 : y^2 = x^3 + ax + b\} \), to which we adjoin a formal point at infinity \( \mathcal{O} \).

The discriminant condition ensures that the curve is smooth, which is necessary for the group law to be well-defined. In projective coordinates \( [X:Y:Z] \), the point at infinity is \( [0:1:0] \), and the Weierstrass equation becomes \( Y^2 Z = X^3 + aXZ^2 + bZ^3 \). Projective coordinates are used in efficient implementations to avoid field inversions.

3.2 The Group Law

The chord-and-tangent construction gives \( E(\mathbb{F}_q) \) the structure of an abelian group with identity \( \mathcal{O} \).

Group Law (Chord-and-Tangent). Given \( P = (x_1, y_1) \) and \( Q = (x_2, y_2) \) on \( E \):
  • \( \mathcal{O} \) is the identity: \( P + \mathcal{O} = P \).
  • The inverse of \( P = (x,y) \) is \( -P = (x,-y) \).
  • If \( P \neq Q \) and \( x_1 \neq x_2 \): let \( \lambda = (y_2-y_1)/(x_2-x_1) \). Then \( P + Q = (x_3, y_3) \) where \( x_3 = \lambda^2 - x_1 - x_2 \) and \( y_3 = \lambda(x_1-x_3)-y_1 \).
  • Point doubling (\( P = Q \), \( y_1 \neq 0 \)): \( \lambda = (3x_1^2+a)/(2y_1) \), same formulas for \( x_3, y_3 \).

The geometric picture: draw the line through \( P \) and \( Q \); it meets the curve in a third point \( R \); reflect \( R \) over the \( x \)-axis to get \( P + Q \). Associativity of this law is non-trivial and requires a careful algebraic verification (via explicit polynomial identities) or an appeal to the theory of divisors and the Riemann-Roch theorem on algebraic curves.
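The case analysis above can be sketched over a prime field in affine coordinates (function name ours; no projective coordinates or constant-time care, so this is purely illustrative):

```python
O = None   # the point at infinity

def ec_add(P, Q, a, p):
    """Chord-and-tangent addition on y^2 = x^3 + a*x + b over F_p.
    (The coefficient b does not appear in the addition formulas.)"""
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return O                                          # Q = -P
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p  # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p         # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

# On y^2 = x^3 + x + 1 over F_7 (the example curve below):
assert ec_add((0, 1), (2, 2), 1, 7) == (0, 6)
assert ec_add((0, 1), (0, 1), 1, 7) == (2, 5)   # doubling
```

Repeatedly adding \( P = (0,1) \) to itself returns to \( \mathcal{O} \) after five steps, matching the curve's group order computed in the point-counting example.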

Example (Points on \( E: y^2 = x^3 + x + 1 \) over \( \mathbb{F}_7 \)). The quadratic residues mod 7 are \( \{0, 1, 2, 4\} \) (since \( 1^2=1,\, 2^2=4,\, 3^2=2 \) mod 7). For each \( x \in \{0,1,\ldots,6\} \), compute \( f(x) = x^3 + x + 1 \bmod 7 \) and check whether \( f(x) \) is a square mod 7:
  • \( x=0 \): \( f(0)=1 \). Since \( 1 = 1^2 \), it is a QR. Points: \( (0,1) \) and \( (0,6) \).
  • \( x=1 \): \( f(1)=3 \). Is 3 a QR mod 7? Check \( 3^{(7-1)/2} = 3^3 = 27 \equiv 6 \equiv -1 \pmod 7 \). So 3 is a QNR. No points.
  • \( x=2 \): \( f(2)=8+2+1=11\equiv 4 \). Since \( 2^2=4 \), it is a QR. Points: \( (2,2) \) and \( (2,5) \).
  • \( x=3 \): \( f(3)=27+3+1=31\equiv 3 \). QNR (as above). No points.
  • \( x=4 \): \( f(4)=64+4+1=69\equiv 69-9\times 7=6 \). Is 6 a QR mod 7? \( 6^3 = 216 \equiv 6 \equiv -1 \pmod 7 \). QNR. No points.
  • \( x=5 \): \( f(5)=125+5+1=131\equiv 131-18\times 7=5 \). Is 5 a QR mod 7? \( 5^3=125\equiv 6\equiv -1 \pmod 7 \). QNR. No points.
  • \( x=6 \): \( f(6)=216+6+1=223\equiv 223-31\times 7=6 \). QNR. No points.

Together with the point at infinity \( \mathcal{O} \), the curve has \( \#E(\mathbb{F}_7) = 5 \) points: \( \{\mathcal{O},\, (0,1),\, (0,6),\, (2,2),\, (2,5)\} \). By Hasse’s theorem, \( |5 - (7+1)| = 3 \leq 2\sqrt{7} \approx 5.29 \). Consistent.

Example (Point Addition on \( E \) over \( \mathbb{F}_7 \)). Using the same curve \( y^2 = x^3 + x + 1 \) over \( \mathbb{F}_7 \), compute \( P + Q \) where \( P = (0, 1) \) and \( Q = (2, 2) \).

Since \( x_1 = 0 \ne 2 = x_2 \), use the chord formula:

\[ \lambda = \frac{y_2 - y_1}{x_2 - x_1} = \frac{2 - 1}{2 - 0} = \frac{1}{2} \equiv 1 \cdot 2^{-1} \pmod 7. \]

The inverse of 2 mod 7: \( 2 \times 4 = 8 \equiv 1 \), so \( 2^{-1} \equiv 4 \). Thus \( \lambda = 4 \).

\[ x_3 = \lambda^2 - x_1 - x_2 = 16 - 0 - 2 = 14 \equiv 0 \pmod 7. \]\[ y_3 = \lambda(x_1 - x_3) - y_1 = 4(0 - 0) - 1 = -1 \equiv 6 \pmod 7. \]

So \( P + Q = (0, 6) \). Verify: is \( (0,6) \) on the curve? \( 6^2 = 36 \equiv 1 \equiv 0^3 + 0 + 1 \pmod 7 \). Yes. Note also that \( (0,6) = -(0,1) = -P \) since negation flips the \( y \)-coordinate, so \( P + Q = -P \), i.e., \( 2P + Q = \mathcal{O} \). This matches the geometry: the chord through \( P \) and \( Q \) is in fact tangent to the curve at \( P \), so it meets \( E \) at \( P \) (with multiplicity two) and at \( Q \), and the three intersection points of any line with \( E \) sum to \( \mathcal{O} \).

3.3 Hasse’s Theorem and Torsion

Theorem (Hasse, 1936). For an elliptic curve \( E \) over \( \mathbb{F}_q \), \[ \left| \#E(\mathbb{F}_q) - (q+1) \right| \leq 2\sqrt{q}. \]

This is a profound result: the number of rational points is close to \( q+1 \), with deviation controlled by the trace of Frobenius \( t = q+1-\#E(\mathbb{F}_q) \). Hasse’s theorem amounts to the Riemann hypothesis for the zeta function of the curve over a finite field. It implies that the group order lies between \( (\sqrt{q}-1)^2 \) and \( (\sqrt{q}+1)^2 \), so \( \#E(\mathbb{F}_q) = q + O(\sqrt{q}) \), a tight estimate that anchors both point-counting algorithms and security parameter selection.

The \( n \)-torsion subgroup \( E[n] = \{P \in E(\bar{\mathbb{F}}_q) : nP = \mathcal{O}\} \) over the algebraic closure satisfies \( E[n] \cong (\mathbb{Z}/n\mathbb{Z})^2 \) when \( \text{char}(\mathbb{F}_q) \nmid n \). Torsion subgroups underlie the definition of Weil and Tate pairings, which are used in pairing-based cryptography.

Remark (Hasse's Theorem and ECC Efficiency). Writing \( \#E(\mathbb{F}_p) = p + 1 - t \) where \( |t| \leq 2\sqrt{p} \), the group order is approximately \( p \). This means: to achieve 128-bit security (requiring group order \( \geq 2^{256} \) for Pollard rho to cost \( 2^{128} \)), an elliptic curve over a 256-bit prime \( p \) suffices, since \( \#E(\mathbb{F}_p) \approx p \approx 2^{256} \). In contrast, the multiplicative group \( \mathbb{Z}_p^* \) requires a 3072-bit prime at the same security level because of index calculus. ECC thus offers a 12× reduction in key size, which translates directly into smaller certificates, faster handshakes, and reduced power consumption on embedded devices.
Remark (Why ECC Still Falls to Shor's Algorithm). Shor's algorithm attacks any abelian group in which the order of a group element can be encoded as a quantum period-finding problem. The group \( E(\mathbb{F}_p) \) is abelian: by the structure theorem for finite abelian groups, \( E(\mathbb{F}_p) \cong \mathbb{Z}_n \) or \( \mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \) for some integers \( n, n_1, n_2 \). This abelian structure is all Shor needs: the unitary \( U_P : |Q\rangle \mapsto |Q + P\rangle \) on elliptic curve points is efficiently implementable as a quantum circuit, and phase estimation on this unitary recovers the order of \( P \) — equivalently, the discrete logarithm. The index calculus obstruction that makes classical ECC better than classical DH simply does not exist in the quantum setting: Shor's algorithm is agnostic to whether the group is \( \mathbb{Z}_p^* \) or \( E(\mathbb{F}_p) \). Both fall in quantum polynomial time.

3.4 Discrete Logarithm on Elliptic Curves

The absence of a known sub-exponential DLOG algorithm for elliptic curves (over prime-order subgroups without special structure) is the central security advantage of ECC over DH in \( (\mathbb{Z}/p\mathbb{Z})^* \). The best known algorithms — Pollard rho and BSGS — take \( O(\sqrt{q}) \) time. For a 256-bit prime-order curve, this is \( O(2^{128}) \), equivalent in security to a 3072-bit DH modulus or 3072-bit RSA key, while requiring only 256 bits to represent a point.

The Pohlig-Hellman reduction applies to ECC as well: if the curve order has small prime factors, DLOG becomes much easier. This is why real-world curves have prime (or near-prime) order.

3.5 Real-World Curves

NIST P-256 (secp256r1). Defined by \( y^2 = x^3 - 3x + b \) over a 256-bit prime field \( \mathbb{F}_p \), with \( p = 2^{256} - 2^{224} + 2^{192} + 2^{96} - 1 \). The constants \( b \) and the base point were chosen by NIST with a hash of a seed string (a “nothing-up-my-sleeve” procedure). P-256 is widely deployed in TLS and used in NIST’s ECDSA standard.

Curve25519. Proposed by Bernstein (2006). Uses Montgomery form \( Bv^2 = u^3 + Au^2 + u \) with \( A = 486662 \), \( B = 1 \), over \( \mathbb{F}_p \) with \( p = 2^{255} - 19 \). The prime \( p \) has special structure enabling fast arithmetic. Crucially, the Montgomery ladder algorithm computes scalar multiplication using only \( x \)-coordinates, achieving constant-time execution with no secret-dependent branches — a major advantage against side-channel attacks. Curve25519 is used in Signal, WireGuard, and TLS 1.3. Its design is transparent and carefully documented for security.

secp256k1. Weierstrass form \( y^2 = x^3 + 7 \) over the 256-bit prime \( p = 2^{256} - 2^{32} - 977 \), used in Bitcoin’s ECDSA. It is notable for the simplicity and rigidity of its curve constants, which were not derived from an unexplained seed; the special form of \( p \) still permits fast modular reduction.

3.6 Montgomery Ladder

The Montgomery ladder computes \( nP \) for scalar \( n \) using a loop that processes the bits of \( n \) from most to least significant, maintaining two accumulators \( R_0 = kP \) and \( R_1 = (k+1)P \), where \( k \) is the integer formed by the bits processed so far, updated as:

\[ (R_0, R_1) \leftarrow \begin{cases} (2R_0,\, R_0 + R_1) & \text{if bit is 0} \\ (R_0 + R_1,\, 2R_1) & \text{if bit is 1} \end{cases} \]

The key property: both branches perform one doubling and one addition, so the sequence of operations is independent of the scalar bits. This eliminates timing differences, protecting against timing side-channel attacks. Montgomery ladder is the recommended method for scalar multiplication on Curve25519 and similar curves.
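The update rule above can be run directly. Below is a minimal Python sketch over a toy short-Weierstrass curve (illustrative parameters \( y^2 = x^3 + 2x + 3 \) over \( \mathbb{F}_{97} \), not a real cryptographic curve; real Curve25519 code uses \( x \)-only Montgomery-form formulas, and Python itself offers no constant-time guarantees — the point here is only the ladder's control flow):

```python
# Montgomery ladder over a toy curve y^2 = x^3 + 2x + 3 mod 97.
p, a = 97, 2
O = None  # point at infinity

def add(P, Q):
    """Affine point addition (handles doubling and the identity)."""
    if P is None: return Q
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) = O
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ladder(n, P):
    """Compute nP; every bit costs one doubling and one addition."""
    R0, R1 = O, P  # invariant: R1 = R0 + P
    for bit in bin(n)[2:]:
        if bit == '0':
            R0, R1 = add(R0, R0), add(R0, R1)
        else:
            R0, R1 = add(R0, R1), add(R1, R1)
    return R0

P = (3, 6)  # on the curve: 6^2 = 36 = 3^3 + 2*3 + 3 mod 97
naive = O
for n in range(1, 30):
    naive = add(naive, P)          # nP by repeated addition
    assert ladder(n, P) == naive   # ladder agrees for every scalar
```

Note that both branches of the loop body perform exactly one `add(R, R)` and one `add(R0, R1)`, mirroring the uniform operation count described above.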

3.7 Failure of DDH on \((\mathbb{Z}/p\mathbb{Z})^*\)

The Decisional Diffie-Hellman (DDH) assumption asks whether an adversary can distinguish \( (g, g^a, g^b, g^{ab}) \) from \( (g, g^a, g^b, g^c) \) with \( c \) random. In \( (\mathbb{Z}/p\mathbb{Z})^* \), DDH is false as stated for the full group: a generator \( g \) is a quadratic non-residue, so the Legendre symbol \( \left(\frac{g^a}{p}\right) = (-1)^a \) reveals the parity of \( a \), and likewise for \( b \). These parities determine \( \left(\frac{g^{ab}}{p}\right) = (-1)^{ab} \), which an eavesdropper can check against the last element of the tuple; a random \( g^c \) fails the check roughly half the time. For this reason, DDH is assumed only in the prime-order subgroup of \( \mathbb{Z}_p^* \) of order \( q \mid p-1 \). On elliptic curve groups of prime order, no analogous distinguisher is known, and DDH is believed to hold.
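The Legendre-symbol distinguisher can be demonstrated concretely. A Python sketch with toy parameters \( p = 23 \), \( g = 5 \) (Euler's criterion computes quadratic residuosity):

```python
# The Legendre-symbol distinguisher against DDH in the full group Z_p^*.
# Toy parameters p = 23, generator g = 5: g is a non-residue, so the
# Legendre symbol of g^a is (-1)^a and reveals the parity of a.
import random

p, g = 23, 5

def is_qr(x):
    """Euler's criterion: x is a QR mod p iff x^((p-1)/2) = 1 mod p."""
    return pow(x, (p - 1) // 2, p) == 1

def consistent(ga, gb, gz):
    # g^{ab} is a QR iff ab is even, i.e. iff a is even or b is even.
    return is_qr(gz) == (is_qr(ga) or is_qr(gb))

random.seed(1)
# Real DDH tuples always pass the consistency check...
for _ in range(200):
    a, b = random.randrange(1, p - 1), random.randrange(1, p - 1)
    assert consistent(pow(g, a, p), pow(g, b, p), pow(g, a * b, p))

# ...while random tuples (g^a, g^b, g^c) fail it about half the time.
fails = 0
for _ in range(200):
    a, b, c = (random.randrange(1, p - 1) for _ in range(3))
    if not consistent(pow(g, a, p), pow(g, b, p), pow(g, c, p)):
        fails += 1
assert fails > 0  # non-negligible distinguishing advantage
```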


Chapter 4: Formal Security Definitions

Chapters 1–3 established the computational problems that underpin cryptographic security. But saying “DH is secure because DLOG is hard” is not yet a mathematical statement — it conflates the hardness of an abstract problem with the security of a specific protocol. A protocol’s security must be defined independently of any particular attack, then proved by reduction. This chapter develops the formal framework that makes such proofs possible.

The game-based approach, pioneered by Goldwasser-Micali and refined into the modern asymmetric-cryptography canon through the work of Bellare and Rogaway, addresses a simple question: what does it mean for an adversary to have “broken” an encryption scheme? Colloquially, breaking encryption means reading the message. But this is too strong and too narrow: too strong because we cannot hope to prevent a lucky guess; too narrow because a scheme that leaks the parity of the message is also broken in a meaningful sense. The right notion, semantic security, says a scheme is secure if a ciphertext reveals nothing about the plaintext — formally, the ciphertext provides zero computational advantage over guessing. IND-CPA formalizes this.

4.1 Game-Based Security

Modern cryptography defines security through games between a challenger and an adversary. The challenger runs a scheme; the adversary tries to break it. The adversary wins if they achieve a specific goal (e.g., guess a hidden bit) with probability noticeably better than random chance.

Definition. A function \( \varepsilon : \mathbb{N} \to \mathbb{R}_{\geq 0} \) is negligible if for every polynomial \( p \), there exists \( N \) such that for all \( n \geq N \), \( \varepsilon(n) < 1/p(n) \). An adversary's advantage is its winning probability minus the baseline (usually \( 1/2 \) for bit-guessing games); the scheme is secure if all efficient adversaries have negligible advantage.

This framework makes “security” a precise, falsifiable mathematical statement. A scheme is broken if there exists an efficient adversary with non-negligible advantage; it is secure if no such adversary exists (assuming hard mathematical problems).

4.2 DLOG, CDH, DDH Hierarchy

Security proofs always reduce a protocol’s security to the assumed hardness of some computational problem. The three most important problems in discrete-log-based cryptography form a strict hierarchy, and choosing the right assumption for a proof matters both for confidence (a weaker assumption yields a stronger theorem) and for diagnosing exactly what breaks if the assumption fails.

The hardness hierarchy is:

\[ \text{DDH hard} \Rightarrow \text{CDH hard} \Rightarrow \text{DLOG hard.} \]

Each arrow means “if the left problem is hard, so is the right.” Reading contrapositively: if the right problem turns out to be easy, the left is also easy. Cryptographic protocols should be proved under the weakest necessary assumption — that assumption is the most credible and hardest to break.

  • DLOG: Given \( g^x \), find \( x \). The hardest of the three problems, so DLOG hardness is the weakest (most plausible) assumption; but by itself it does not directly imply the security of passive DH.
  • CDH (Computational DH): Given \( g^a, g^b \), compute \( g^{ab} \). Breaking CDH enables passive recovery of the DH shared secret. CDH reduces to DLOG (a DLOG oracle solves CDH), but no reduction in the reverse direction is known in general groups.
  • DDH (Decisional DH): Distinguish \( (g^a, g^b, g^{ab}) \) from \( (g^a, g^b, g^c) \) with \( c \) random. The easiest problem of the three, so DDH hardness is the strongest assumption, and also the most useful one for protocol proofs. ElGamal’s IND-CPA security reduces precisely to DDH: hardness of DDH implies that the ciphertext \( (g^b,\, g^{ab} \cdot m) \) is computationally indistinguishable from the one-time pad \( (g^b,\, g^c \cdot m) \), revealing nothing about \( m \).

The hierarchy is strict in general. The canonical example where DDH fails while CDH is believed hard is \( (\mathbb{Z}/p\mathbb{Z})^* \): as in Section 3.7, the Legendre symbols of \( g^a \) and \( g^b \) reveal the parities of \( a \) and \( b \), and those parities determine \( \left(\frac{g^{ab}}{p}\right) = (-1)^{ab} \), a checkable constraint that a random \( g^c \) violates with constant probability. For this reason, DDH is only assumed in the prime-order subgroup of \( \mathbb{Z}_p^* \), and elliptic curves — where no analogous algebraic distinguisher is known — are strictly preferable for DDH-based constructions. Pairing-friendly ("gap") groups give another strict separation: the pairing makes DDH easy while CDH remains believed hard. The gap between the assumptions is real and must be respected in concrete instantiations.

4.3 IND-CPA Security

Definition (IND-CPA). A public-key encryption scheme \( (\text{KeyGen}, \text{Enc}, \text{Dec}) \) is IND-CPA secure if for all efficient adversaries \( \mathcal{A} \), the advantage in the following game is negligible:
  1. Challenger runs \( (pk, sk) \leftarrow \text{KeyGen}(1^\lambda) \), sends \( pk \) to \( \mathcal{A} \).
  2. \( \mathcal{A} \) (with oracle access to \( \text{Enc}_{pk}(\cdot) \)) submits two equal-length messages \( m_0, m_1 \).
  3. Challenger samples \( b \leftarrow \{0,1\} \) uniformly, sends \( c^* = \text{Enc}_{pk}(m_b) \).
  4. \( \mathcal{A} \) outputs \( b' \); wins if \( b' = b \).
Advantage: \( |\Pr[b' = b] - 1/2| \).

IND-CPA formalizes semantic security: the ciphertext reveals nothing about which message was encrypted. Any deterministic encryption scheme (like textbook RSA) immediately fails IND-CPA, since the adversary can simply re-encrypt \( m_0 \) and \( m_1 \) and compare with \( c^* \).
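The re-encryption attack on a deterministic scheme is a few lines of code. A sketch against textbook RSA with classroom-sized toy parameters (\( n = 61 \cdot 53 \), \( e = 17 \); real RSA uses randomized padding precisely to avoid this):

```python
# Why deterministic encryption fails IND-CPA: the adversary simply
# re-encrypts both candidate messages and compares with the challenge.
import random

n, e = 61 * 53, 17  # toy textbook-RSA public key

def enc(m):
    return pow(m, e, n)  # deterministic: same m always gives same c

def adversary(c_star, m0, m1):
    """Wins the IND-CPA game with probability 1."""
    return 0 if enc(m0) == c_star else 1

random.seed(0)
wins = 0
for _ in range(100):
    m0, m1 = 42, 1337
    b = random.randrange(2)
    c_star = enc(m0 if b == 0 else m1)
    wins += adversary(c_star, m0, m1) == b
assert wins == 100  # advantage 1/2, maximally non-negligible
```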

4.4 IND-CCA1 and IND-CCA2

Chosen-ciphertext attacks model a more powerful adversary who can also query a decryption oracle.

  • IND-CCA1: The adversary may query \( \text{Dec}_{sk}(\cdot) \) before seeing the challenge ciphertext \( c^* \), but not after. This models a “lunchtime attack” scenario.
  • IND-CCA2: The adversary may query \( \text{Dec}_{sk}(\cdot) \) both before and after seeing \( c^* \), with the sole restriction that they cannot query \( c^* \) itself. This is the strongest standard notion.

IND-CCA2 matters because many natural encryption schemes are malleable: given a ciphertext \( c \) of \( m \), one can construct \( c' \neq c \) that decrypts to a related message \( m' \). A malleable scheme cannot be IND-CCA2 secure. ElGamal is a classic example.

4.5 El-Gamal Encryption

Scheme (El-Gamal Encryption). Let \( G = \langle g \rangle \) be a cyclic group of prime order \( q \).
  • KeyGen: Sample \( a \leftarrow \mathbb{Z}_q \); public key \( pk = (G, g, h = g^a) \), secret key \( sk = a \).
  • Enc\( (pk, m) \): Sample \( b \leftarrow \mathbb{Z}_q \); output \( (A, B) = (g^b,\, h^b \cdot m) \).
  • Dec\( (sk, (A,B)) \): Output \( B / A^a = B \cdot (g^{ab})^{-1} = m \).
Example (ElGamal Encryption and Decryption). Work in \( G = \mathbb{Z}_{23}^* \) with \( p = 23 \), \( g = 5 \). Alice's private key \( x = 6 \), public key \( h = g^x = 5^6 \bmod 23 = 8 \). Bob encrypts the message \( m = 10 \) using ephemeral randomness \( r = 4 \): \[ C_1 = g^r = 5^4 = 625 \bmod 23. \]

\( 625 = 27 \times 23 + 4 \), so \( C_1 = 4 \). Next:

\[ C_2 = m \cdot h^r = 10 \cdot 8^4 \bmod 23. \]

Compute \( 8^2 = 64 \equiv 18 \), \( 8^4 \equiv 18^2 = 324 \equiv 324 - 14\times 23 = 2 \pmod{23} \). So \( C_2 = 10 \times 2 = 20 \bmod 23 \). Ciphertext: \( (C_1, C_2) = (4, 20) \).

Decryption. Alice computes the shared value \( s = C_1^x = 4^6 \bmod 23 \):

\[ 4^2 = 16,\quad 4^3 = 64 \equiv 18,\quad 4^6 \equiv 18^2 = 324 \equiv 2 \pmod{23}. \]

So \( s = 2 \). Now \( s^{-1} \bmod 23 \): since \( 2 \times 12 = 24 \equiv 1 \), we have \( s^{-1} = 12 \). Recover:

\[ m = C_2 \cdot s^{-1} = 20 \times 12 = 240 \bmod 23. \]

\( 240 = 10 \times 23 + 10 \), so \( m = 10 \). Message successfully recovered.
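The worked example can be checked mechanically. A Python sketch with exactly the numbers above:

```python
# The worked ElGamal example as code: p = 23, g = 5, private key x = 6,
# message m = 10, ephemeral randomness r = 4.
p, g = 23, 5
x = 6                      # Alice's private key
h = pow(g, x, p)           # public key h = g^x

def enc(m, r):
    return pow(g, r, p), (m * pow(h, r, p)) % p

def dec(c1, c2):
    s = pow(c1, x, p)                  # shared value s = C1^x
    return (c2 * pow(s, -1, p)) % p    # m = C2 * s^{-1} mod p

assert h == 8
assert enc(10, 4) == (4, 20)   # matches (C1, C2) in the text
assert dec(4, 20) == 10        # message recovered
```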

IND-CPA security proof sketch. Suppose \( \mathcal{A} \) wins the IND-CPA game with non-negligible advantage. We construct a DDH distinguisher: given a DDH tuple \( (g, g^a, g^b, Z) \) (where \( Z = g^{ab} \) or \( Z = g^c \)), simulate the El-Gamal challenger with \( h = g^a \) and challenge ciphertext \( c^* = (g^b, Z \cdot m_b) \). If \( Z = g^{ab} \), this is a valid El-Gamal encryption, and \( \mathcal{A} \) wins with non-negligible advantage. If \( Z = g^c \) is random, then \( c^* \) is a one-time pad on \( m_b \), perfectly hiding \( b \). So \( \mathcal{A} \)’s advantage equals the DDH distinguisher’s advantage, contradicting DDH hardness.

Malleability. Given \( (A, B) = (g^b, g^{ab} m) \), one can form \( (A, B \cdot m') = (g^b, g^{ab} \cdot m \cdot m') \), which decrypts to \( mm' \). This homomorphic property is useful in some protocols but makes El-Gamal not IND-CCA2 secure.

Remark (The Random Oracle Model). Many "secure" schemes — RSA-OAEP, ElGamal over a hashed shared secret, Schnorr signatures — are proved secure only in the Random Oracle Model (ROM), where hash functions are treated as truly random functions that map every input to an independent uniformly random output. In the ROM, an adversary can only query the hash function at specific points; it cannot exploit any algebraic or structural properties. This is a useful proof technique but a heuristic, not a guarantee of concrete security: no real hash function (SHA-256, SHA-3) is a random oracle, and there exist artificial encryption schemes that are ROM-secure but provably insecure when instantiated with any real hash. The gap between ROM security and standard-model security has motivated decades of work on standard-model proofs (e.g., Cramer-Shoup encryption, which achieves IND-CCA2 without any hash assumption). In practice, the ROM is almost universally accepted as a sound heuristic for well-designed hash functions, but its limitations should be understood by any cryptographer reading a security proof.
Example (Hybrid Encryption in Practice). Suppose Bob wants to send Alice a 1 GB video file. Encrypting the file directly under Alice's ElGamal or RSA public key is impractical: ElGamal would need expensive group exponentiations for every message block, and RSA's modulus size limits a single plaintext to a few hundred bytes.

The solution is hybrid encryption:

  1. Bob generates a fresh 256-bit symmetric key \( K \) uniformly at random.
  2. Bob encrypts the 1 GB file under \( K \) using AES-256-GCM, producing a ciphertext \( C_{\text{data}} \) and authentication tag.
  3. Bob encrypts \( K \) under Alice's public key using ElGamal (or RSA-OAEP), producing a short ciphertext \( C_K \) of a few hundred bytes.
  4. Bob transmits \( (C_K, C_{\text{data}}) \). Alice decrypts \( C_K \) to recover \( K \), then decrypts \( C_{\text{data}} \).
Security: the symmetric encryption of the file is semantically secure and authenticated (AES-GCM provides IND-CPA and integrity). The public-key encryption of \( K \) is IND-CPA (or IND-CCA2 with OAEP), so an adversary cannot distinguish \( K \) from a random key without breaking the algebraic hardness assumption. The composition is secure if both components are. This is precisely how TLS, PGP, and S/MIME work: the public-key operation encrypts only a short session key, and bulk data is protected by symmetric cryptography.
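The four steps above can be sketched in code. This is a toy KEM/DEM composition, not production cryptography: a 127-bit group stands in for a real curve, and a SHA-256 counter-mode keystream stands in for AES-256-GCM (so there is no authentication tag here):

```python
# Hybrid encryption sketch: ElGamal encapsulates a short session secret,
# and bulk data is encrypted under a key derived from it.
import hashlib, secrets

p = 2**127 - 1  # Mersenne prime (toy modulus, far too small for practice)
g = 3
x = secrets.randbelow(p - 2) + 1   # Alice's private key
h = pow(g, x, p)                   # Alice's public key

def stream(key: bytes, length: int) -> bytes:
    """SHA-256 in counter mode: a toy stand-in for AES-256-GCM."""
    out = b""
    for ctr in range(0, length, 32):
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
    return out[:length]

def hybrid_encrypt(plaintext: bytes):
    k = secrets.randbelow(p - 2) + 2                 # fresh session secret
    r = secrets.randbelow(p - 2) + 1
    c_k = (pow(g, r, p), (k * pow(h, r, p)) % p)     # ElGamal on k (KEM part)
    dek = hashlib.sha256(k.to_bytes(16, "big")).digest()
    c_data = bytes(a ^ b for a, b in zip(plaintext, stream(dek, len(plaintext))))
    return c_k, c_data                               # short C_K plus long C_data

def hybrid_decrypt(c_k, c_data):
    c1, c2 = c_k
    k = (c2 * pow(pow(c1, x, p), -1, p)) % p         # ElGamal decryption of k
    dek = hashlib.sha256(k.to_bytes(16, "big")).digest()
    return bytes(a ^ b for a, b in zip(c_data, stream(dek, len(c_data))))

msg = b"a large file would go here" * 100
c_k, c_data = hybrid_encrypt(msg)
assert hybrid_decrypt(c_k, c_data) == msg
```

The design point to notice: only `c_k` involves public-key arithmetic; the cost of protecting the bulk data is pure symmetric work, independent of the file size relative to the group.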

4.6 Fujisaki-Okamoto Transform

The FO transform is a powerful generic construction that converts an IND-CPA scheme into an IND-CCA2 KEM (Key Encapsulation Mechanism) in the random oracle model.

Construction (FO Transform, sketch). Given an IND-CPA PKE scheme \( \Pi \):
  1. To encapsulate: sample \( m \leftarrow \{0,1\}^\lambda \); compute \( c = \text{Enc}(pk, m; H_1(m)) \) using \( H_1(m) \) as deterministic randomness; output \( (c, K = H_2(m)) \).
  2. To decapsulate: compute \( m' = \text{Dec}(sk, c) \); recheck \( c = \text{Enc}(pk, m'; H_1(m')) \); if so output \( H_2(m') \), else output reject.

The re-encryption check is critical: it prevents the adversary from submitting malformed ciphertexts that would reveal information about the secret key. In the random oracle model, the FO transform is tightly secure if the underlying scheme is \( \delta \)-correct (decryption failure probability \( \delta \)) and IND-CPA. The Kyber/ML-KEM standard applies precisely this transform.
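The transform can be sketched concretely, instantiated here over toy derandomized ElGamal rather than a lattice PKE (the 127-bit group, hash labels, and parameters are illustrative, not a real FO instantiation):

```python
# FO-style KEM sketch: H1 derives the encryption randomness from m,
# H2 derives the session key, and decapsulation re-encrypts and checks.
import hashlib, secrets

p = 2**127 - 1  # toy modulus (Mersenne prime)
g = 3
x = secrets.randbelow(p - 2) + 1
h = pow(g, x, p)

def H(tag: bytes, m: int) -> int:
    return int.from_bytes(hashlib.sha256(tag + m.to_bytes(16, "big")).digest(), "big")

def enc(m, r):                       # ElGamal with explicit randomness r
    return pow(g, r, p), (m * pow(h, r, p)) % p

def encapsulate():
    m = secrets.randbelow(p - 2) + 2
    r = H(b"H1", m) % (p - 1)        # deterministic randomness H1(m)
    c = enc(m, r)
    return c, H(b"H2", m)            # (ciphertext, session key K)

def decapsulate(c):
    c1, c2 = c
    m = (c2 * pow(pow(c1, x, p), -1, p)) % p
    if enc(m, H(b"H1", m) % (p - 1)) != c:
        return None                  # re-encryption check failed: reject
    return H(b"H2", m)

c, K = encapsulate()
assert decapsulate(c) == K                           # honest case succeeds
assert decapsulate((c[0], (c[1] * 2) % p)) is None   # mauled ciphertext rejected
```

The last assertion is the point of the transform: a mauled ciphertext still decrypts to *some* message, but that message no longer explains the ciphertext's randomness, so the re-encryption check rejects.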

4.7 Zero-Knowledge Proofs and Sigma Protocols

A zero-knowledge proof allows a prover to convince a verifier of a statement’s truth without revealing anything beyond the truth of the statement.

Definition. An interactive proof system \( (P, V) \) for language \( L \) is zero-knowledge if it satisfies:
  • Completeness: If \( x \in L \), an honest prover convinces an honest verifier with overwhelming probability.
  • Soundness: If \( x \notin L \), no (possibly cheating) prover convinces an honest verifier with non-negligible probability.
  • Zero-Knowledge: There exists a simulator \( S \) such that for all \( x \in L \), the distribution of \( S(x) \) is computationally indistinguishable from the real view of an honest verifier interacting with the prover.

Sigma (\( \Sigma \)) protocols are a particularly elegant class of 3-round honest-verifier zero-knowledge proofs: (1) prover sends a commitment, (2) verifier sends a random challenge, (3) prover sends a response. They are simple to analyze and widely deployed.

4.8 Schnorr Identification Protocol

Protocol (Schnorr Identification). Public statement: \( X = g^x \) (prover knows \( x \)).
  1. Commit: Prover samples \( r \leftarrow \mathbb{Z}_q \), sends \( R = g^r \).
  2. Challenge: Verifier sends \( c \leftarrow \mathbb{Z}_q \).
  3. Respond: Prover sends \( z = r + cx \pmod{q} \).
  4. Verify: Check \( g^z = R \cdot X^c \).

Honest-verifier zero-knowledge. The simulator, knowing \( c \) in advance, samples \( z \leftarrow \mathbb{Z}_q \) uniformly, computes \( R = g^z X^{-c} \), and outputs \( (R, c, z) \). This is identically distributed to an honest transcript because in both cases \( z \) is uniform in \( \mathbb{Z}_q \) and \( R = g^z X^{-c} \).

Special soundness. Given two accepting transcripts \( (R, c, z) \) and \( (R, c', z') \) with \( c \neq c' \) (same commitment, different challenges), one can extract \( x = (z - z')/(c - c') \pmod{q} \). This means a prover who can answer two different challenges for the same \( R \) must know the witness.
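Both properties can be checked in code. A sketch using the toy subgroup \( p = 23 \), \( q = 11 \), \( g = 4 \), witness \( x = 7 \) (the same parameters as the signature example in Section 4.10):

```python
# Schnorr identification: HVZK simulator and special-soundness extractor.
import random

p, q, g = 23, 11, 4
x = 7                 # prover's witness
X = pow(g, x, p)      # public statement X = g^x

def honest_transcript(c):
    r = random.randrange(q)
    return pow(g, r, p), c, (r + c * x) % q           # (R, c, z)

def simulate(c):
    """Simulator: pick z first, solve for R = g^z X^{-c}."""
    z = random.randrange(q)
    R = pow(g, z, p) * pow(pow(X, c, p), -1, p) % p
    return R, c, z

def verify(R, c, z):
    return pow(g, z, p) == R * pow(X, c, p) % p

random.seed(7)
for c in range(q):
    assert verify(*honest_transcript(c))
    assert verify(*simulate(c))   # simulated transcripts verify too

# Special soundness: same commitment R, two different challenges.
r = 3
c1, c2 = 2, 5
z1, z2 = (r + c1 * x) % q, (r + c2 * x) % q
extracted = (z1 - z2) * pow(c1 - c2, -1, q) % q
assert extracted == x             # the witness is extracted
```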

4.9 Fiat-Shamir Transform

The Fiat-Shamir transform removes interaction from a \( \Sigma \)-protocol by replacing the verifier’s challenge with a hash of the transcript so far.

Transform (Fiat-Shamir). Given a \( \Sigma \)-protocol with commitment \( R \), replace the challenge \( c \) by \( c = H(\text{statement}, R) \) where \( H \) is a hash function modeled as a random oracle. The resulting non-interactive proof \( (R, z) \) is a signature on a message \( m \) if \( m \) is included in the hash: \( c = H(\text{statement}, R, m) \).

In the random oracle model, Fiat-Shamir produces signatures that are existentially unforgeable under chosen-message attacks (EUF-CMA) via the forking lemma (see below).

4.10 Digital Signatures: Schnorr and ECDSA

Schnorr signatures. Applying Fiat-Shamir to Schnorr ID:

  • Sign\( (x, m) \): Sample \( r \leftarrow \mathbb{Z}_q \); compute \( R = g^r \), \( c = H(R, m) \), \( z = r - cx \pmod{q} \). Output \( (z, c) \).
  • Verify\( (X, m, (z,c)) \): Accept iff \( H(g^z X^c, m) = c \).
Example (Schnorr Signature on a Small Group). Work in \( \mathbb{Z}_{23}^* \) with prime \( p = 23 \). The subgroup of order \( q = 11 \) (since \( 11 \mid 22 = p-1 \)) is generated by \( g = 4 \); we verify that \( 4^{11} \equiv 1 \pmod{23} \).

Compute powers of 4 mod 23: \( 4^1=4 \), \( 4^2=16 \), \( 4^4=256\equiv 256-11\times 23=3 \), \( 4^8\equiv 9 \), \( 4^{11}=4^8\cdot 4^2\cdot 4^1=9\times 16\times 4=576\equiv 576-25\times 23=576-575=1\pmod{23} \). So ord\((4)=11\). Good.

Alice’s private key \( x = 7 \). Public key:

\[ y = 4^7 = 4^4\cdot 4^2\cdot 4^1 = 3\times 16\times 4 = 192 \equiv 192-8\times 23 = 8 \pmod{23}. \]

To sign a message \( M \), Alice samples an ephemeral nonce \( k = 3 \) and commits:

\[ R = g^k = 4^3 = 64 \equiv 64-2\times 23 = 18 \pmod{23}. \]

Compute \( e = H(M \| R) = 5 \) (illustrative). Response:

\[ s = (k - x\cdot e)\bmod q = (3 - 7\times 5)\bmod 11 = (3-35)\bmod 11 = -32\bmod 11. \]

Since \( -32 = -3\times 11 + 1 \), we get \( s = 1 \). Signature: \( (e, s) = (5, 1) \).

Verification. Recompute \( R' = g^s y^e = 4^1 \cdot 8^5 \bmod 23 \):

\[ 8^2 = 64\equiv 18,\quad 8^4\equiv 18^2=324\equiv 324-14\times 23=2,\quad 8^5\equiv 2\times 8=16\pmod{23}. \]

So \( R' = 4\times 16 = 64\equiv 18\pmod{23} \). Now check: \( H(M\|R') = H(M\|18) \). Since \( R' = 18 = R \), the hash equals \( e = 5 \). Verification passes. The algebraic reason: \( g^s y^e = g^{k-xe} \cdot (g^x)^e = g^k = R \).
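The same example runs as code, with SHA-256 reduced mod \( q \) standing in for the illustrative hash value \( e = 5 \) (so the numeric signature differs from the hand computation, but verification goes through identically):

```python
# Schnorr signatures on the toy group p = 23, q = 11, g = 4.
import hashlib

p, q, g = 23, 11, 4
x = 7                      # Alice's private key
y = pow(g, x, p)           # public key, y = 8

def H(M: bytes, R: int) -> int:
    return int.from_bytes(hashlib.sha256(M + bytes([R])).digest(), "big") % q

def sign(M: bytes, k: int):
    # In practice k MUST be fresh and uniform for every signature.
    R = pow(g, k, p)
    e = H(M, R)
    s = (k - x * e) % q
    return e, s

def verify(M: bytes, e: int, s: int) -> bool:
    R = pow(g, s, p) * pow(y, e, p) % p   # g^s y^e = g^{k-xe} g^{xe} = g^k
    return H(M, R) == e

e, s = sign(b"hello", k=3)
assert verify(b"hello", e, s)
```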

ECDSA. In ECDSA (used in Bitcoin, TLS), the signing procedure is:

  • Sample ephemeral key \( k \leftarrow \mathbb{Z}_q \); compute \( R = kG \), \( r = x(R) \bmod q \) (\( x \)-coordinate).
  • Compute \( s = k^{-1}(H(m) + rx) \bmod q \); output \( (r, s) \).
  • Verify: \( u_1 = s^{-1}H(m) \), \( u_2 = s^{-1}r \); check \( x(u_1 G + u_2 X) \equiv r \pmod{q} \).

ECDSA’s security is harder to analyze formally than Schnorr’s because the \( r = x(R) \bmod q \) step introduces an irregular structure. Schnorr signatures have a tight EUF-CMA proof in the ROM via the forking lemma: any forger can be rewound to produce two signatures with the same \( R \) but different challenges, enabling extraction of the secret key. ECDSA has no comparably tight proof, and it is brittle in practice: a single nonce reuse (\( k_1 = k_2 \), visible as a repeated \( r \)) immediately leaks the private key via \( x = (s_2 H(m_1) - s_1 H(m_2)) / (r(s_1 - s_2)) \bmod q \) — a catastrophic failure exploited against the PlayStation 3 in 2010.
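The nonce-reuse key recovery can be exercised on toy DSA-style parameters (the same \( p = 23 \), \( q = 11 \), \( g = 4 \) group as above; the hash values \( h_1, h_2 \) are fixed illustrative digests, not real hashes):

```python
# Nonce reuse in (EC)DSA-style signatures: two signatures with the same k
# leak the private key to anyone who sees them.
p, q, g = 23, 11, 4
x = 7                        # victim's private key
k = 5                        # reused nonce (the bug)
h1, h2 = 3, 9                # H(m1), H(m2) mod q, illustrative values

r = pow(g, k, p) % q                        # same r in both signatures
s1 = pow(k, -1, q) * (h1 + r * x) % q
s2 = pow(k, -1, q) * (h2 + r * x) % q

# The attacker sees (r, s1), (r, s2), h1, h2; the repeated r is the tell.
k_rec = (h1 - h2) * pow(s1 - s2, -1, q) % q
x_rec = (s2 * h1 - s1 * h2) * pow(r * (s1 - s2), -1, q) % q
assert (k_rec, x_rec) == (k, x)   # nonce and private key both recovered
```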


Chapter 5: Learning with Errors and Lattices

The constructions in Chapters 1–4 — Diffie-Hellman, ElGamal, Schnorr, ECDSA — all rest on a common foundation: the hardness of the discrete logarithm problem in a group where the group operation is fast. This foundation is elegant and well-studied, but it carries a structural vulnerability: every DLOG-based scheme is broken in polynomial time by Shor’s algorithm (Chapter 7), which reduces order-finding in any finite abelian group to quantum phase estimation. Elliptic-curve groups enjoy no special protection — Shor’s algorithm extracts the group periodicity regardless of the ambient geometry. RSA (Chapter 6) falls by the same algorithm via the integer factoring reduction. If large-scale quantum computers are built, the entire algebraic paradigm collapses simultaneously.

This motivates the search for hard problems whose structure is fundamentally different. Two desiderata guide the choice: first, no known quantum algorithm should provide more than a modest (square-root) speedup; second, and more subtly, the problem should admit a worst-case to average-case reduction — if SVP (Shortest Vector Problem) on an arbitrary lattice is hard in the worst case, then the specific random instances used in cryptography are also hard. This second property is remarkable. DLOG and factoring have no such reduction; their hardness on random instances is an assumption with no worst-case backing. Lattice problems, by contrast, do: Regev’s landmark 2005 paper showed that solving random LWE samples is as hard as solving worst-case lattice problems. This reduction provides a qualitatively different level of confidence than anything achievable in the DLOG world.

5.1 The Learning With Errors Problem

The Learning With Errors (LWE) problem, introduced by Regev (2005), is the central hardness assumption for post-quantum cryptography. The intuition is elementary: if \( \mathbf{s} \in \mathbb{Z}_q^n \) is a secret vector and you are given many linear equations \( b_i = \mathbf{a}_i^\top \mathbf{s} \pmod{q} \) with no noise, solving for \( \mathbf{s} \) by Gaussian elimination is trivial. But add a small random error \( e_i \) to each right-hand side: the exact system becomes inconsistent, Gaussian elimination fails catastrophically, and even distinguishing such noisy samples from purely random \( (\mathbf{a}_i, b_i) \) pairs appears to require exponential time. Regev's worst-case to average-case reduction backs this intuition: an efficient distinguisher for random LWE samples would yield an efficient (quantum) algorithm for worst-case lattice problems.

Definition (LWE). Let parameters \( m, n, q \in \mathbb{Z} \) and distributions \( \chi_s, \chi_e \) over \( \mathbb{Z}_q \) be given. Fix a secret \( \mathbf{s} \leftarrow \chi_s^n \). Each LWE sample is generated by drawing \( \mathbf{a} \leftarrow U(\mathbb{Z}_q^n) \) and \( e \leftarrow \chi_e \), and outputting \( (\mathbf{a}, b = \mathbf{a}^\top \mathbf{s} + e \bmod q) \). The decision-LWE problem asks to distinguish \( m \) samples from this distribution (all under the same \( \mathbf{s} \)) vs. \( m \) uniform samples \( (\mathbf{a}, b) \leftarrow U(\mathbb{Z}_q^{n+1}) \). The search-LWE problem asks to recover \( \mathbf{s} \) given \( m \) LWE samples.

The error \( e \) is small (magnitude \( \ll q/2 \)), so \( b \approx \mathbf{a}^\top \mathbf{s} \) but with enough noise to prevent lattice attacks from easily recovering \( \mathbf{s} \).

Example (Small LWE Instance). Take \( n = 3 \) dimensions and \( q = 7 \). Let the secret \( \mathbf{s} = (1, 2, 3)^\top \in \mathbb{Z}_7^3 \).

Three LWE samples with small errors \( e_i \in \{-2,\ldots,2\} \):

  • \( \mathbf{a}_1 = (1, 0, 2) \): noiseless dot product \( \mathbf{a}_1^\top \mathbf{s} = 1\cdot 1 + 0\cdot 2 + 2\cdot 3 = 7 \equiv 0 \pmod 7 \). With error \( e_1 = 1 \): \( b_1 = 1 \).
  • \( \mathbf{a}_2 = (3, 1, 0) \): \( \mathbf{a}_2^\top \mathbf{s} = 3 + 2 + 0 = 5 \). With error \( e_2 = -1 \): \( b_2 = 4 \).
  • \( \mathbf{a}_3 = (0, 2, 1) \): \( \mathbf{a}_3^\top \mathbf{s} = 0 + 4 + 3 = 7 \equiv 0 \pmod 7 \). With error \( e_3 = 2 \): \( b_3 = 2 \).

The observed samples are \( (\mathbf{a}_i, b_i) \): \( ((1,0,2),\, 1) \), \( ((3,1,0),\, 4) \), \( ((0,2,1),\, 2) \).

An attacker trying to solve \( \mathbf{a}_1 \cdot \mathbf{s} \equiv b_1 \pmod 7 \) as if there were no noise would write \( s_1 + 2s_3 \equiv 1 \pmod 7 \). With the true secret \( s_1=1, s_3=3 \): \( 1 + 6 = 7 \equiv 0 \ne 1 \pmod 7 \). The equation is wrong precisely because \( e_1 = 1 \) shifted the right-hand side. Gaussian elimination on noisy equations produces contradictions; recovering \( \mathbf{s} \) requires either knowing the errors or solving the equivalent BDD lattice problem.
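A short script makes the failure concrete: brute-forcing the three noiseless equations from this example yields a unique exact "solution" \( (1,1,0) \), which is not the true secret at all:

```python
# The toy LWE instance above: solving the equations exactly (ignoring
# the noise) returns a wrong secret, not the true one.
from itertools import product

q = 7
true_s = (1, 2, 3)
samples = [((1, 0, 2), 1), ((3, 1, 0), 4), ((0, 2, 1), 2)]

def exact_fit(s):
    return all(sum(ai * si for ai, si in zip(a, s)) % q == b
               for a, b in samples)

exact = [s for s in product(range(q), repeat=3) if exact_fit(s)]
assert exact == [(1, 1, 0)]      # the noiseless "solution"...
assert not exact_fit(true_s)     # ...while the true secret fails exactness
```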

Example (Regev LWE Encryption). In Regev's scheme, the public key consists of \( m \) LWE samples \( (\mathbf{A}, \mathbf{b}) \) where \( \mathbf{A} \in \mathbb{Z}_q^{m \times n} \) and \( \mathbf{b} = \mathbf{A}\mathbf{s} + \mathbf{e} \pmod q \) for secret \( \mathbf{s} \) and small error vector \( \mathbf{e} \). To encrypt a bit \( \mu \in \{0,1\} \), sample \( \mathbf{r} \leftarrow \{0,1\}^m \) and compute \[ \mathbf{u} = \mathbf{A}^\top \mathbf{r} \in \mathbb{Z}_q^n, \qquad v = \mathbf{b}^\top \mathbf{r} + \mu \left\lfloor q/2 \right\rfloor \in \mathbb{Z}_q. \]

Ciphertext: \( (\mathbf{u}, v) \).

To decrypt, compute

\[ v - \mathbf{s}^\top \mathbf{u} = \mathbf{b}^\top \mathbf{r} + \mu\lfloor q/2\rfloor - \mathbf{s}^\top \mathbf{A}^\top \mathbf{r} = (\mathbf{A}\mathbf{s} + \mathbf{e})^\top \mathbf{r} + \mu\lfloor q/2\rfloor - \mathbf{s}^\top \mathbf{A}^\top \mathbf{r} = \mathbf{e}^\top \mathbf{r} + \mu\lfloor q/2\rfloor. \]

Since \( \mathbf{e} \) is small and \( \mathbf{r} \in \{0,1\}^m \), the noise term \( \mathbf{e}^\top \mathbf{r} \) has magnitude well below \( q/4 \) for suitable parameters. Decryption outputs \( \mu = 0 \) if \( v - \mathbf{s}^\top \mathbf{u} \) is closer to \( 0 \) (mod \( q \)) than to \( \lfloor q/2 \rfloor \), and \( \mu = 1 \) otherwise. Correctness holds as long as \( |\mathbf{e}^\top \mathbf{r}| < q/4 \), which is guaranteed by the parameter choices in Regev’s theorem.
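A runnable sketch of the scheme with small illustrative parameters, chosen so that \( |\mathbf{e}^\top \mathbf{r}| \le m < q/4 \) and decryption is therefore always correct (real parameters are far larger):

```python
# Regev encryption sketch: n = 8, m = 32, q = 257, errors in {-1, 0, 1}.
import random

n, m, q = 8, 32, 257
random.seed(0)

s = [random.randrange(q) for _ in range(n)]                    # secret
A = [[random.randrange(q) for _ in range(n)] for _ in range(m)]
e = [random.choice([-1, 0, 1]) for _ in range(m)]              # small errors
b = [(sum(A[i][j] * s[j] for j in range(n)) + e[i]) % q for i in range(m)]

def encrypt(mu):
    r = [random.randrange(2) for _ in range(m)]                # r in {0,1}^m
    u = [sum(A[i][j] * r[i] for i in range(m)) % q for j in range(n)]
    v = (sum(b[i] * r[i] for i in range(m)) + mu * (q // 2)) % q
    return u, v

def decrypt(u, v):
    d = (v - sum(s[j] * u[j] for j in range(n))) % q
    # closer to 0 (mod q) than to q/2 => bit 0, else bit 1
    return 0 if min(d, q - d) < abs(d - q // 2) else 1

for mu in (0, 1):
    for _ in range(20):
        assert decrypt(*encrypt(mu)) == mu   # always correct at these params
```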

5.2 Hardness and Parameter Taxonomy

Error distribution. The standard choice is a discrete Gaussian \( \chi_e = D_{\mathbb{Z},\sigma} \), truncated to \( \mathbb{Z}_q \). Larger \( \sigma \) means more noise and (intuitively) harder LWE. Gaussian errors are theoretically “hardest” in the sense that average-case LWE with Gaussian errors is as hard as worst-case lattice problems (via Regev’s reduction).

Secret distribution. The secret \( \mathbf{s} \) may be uniform over \( \mathbb{Z}_q^n \) (standard secret) or drawn from \( \chi_s = \chi_e \) (normal form secret). Both are believed equally hard; the normal form simplifies security proofs.

Normal form equivalence. Tall LWE has \( m > n \) samples with a uniform secret. Normal form LWE has \( m = n \) samples with secret from \( \chi_e \). The equivalence proof (sketch): Given tall LWE samples \( (\mathbf{A}, \mathbf{b}) \) with \( \mathbf{A} \in \mathbb{Z}_q^{m \times n} \) and \( m > n \), split \( \mathbf{A} = \begin{pmatrix} \mathbf{A}_0 \\ \mathbf{A}_1 \end{pmatrix} \) where \( \mathbf{A}_0 \in \mathbb{Z}_q^{n \times n} \) is invertible (happens with high probability). Compute \( \mathbf{A}_1 \mathbf{A}_0^{-1} \mathbf{b}_0 - \mathbf{b}_1 = \mathbf{A}_1 \mathbf{A}_0^{-1}(\mathbf{A}_0 \mathbf{s} + \mathbf{e}_0) - (\mathbf{A}_1 \mathbf{s} + \mathbf{e}_1) = \mathbf{A}_1 \mathbf{A}_0^{-1}\mathbf{e}_0 - \mathbf{e}_1 \), which is a normal-form LWE sample.

Search–Decision reduction. Search-LWE reduces to \( O(nq) \) calls to a decision-LWE oracle via a hybrid argument: guess each coordinate of \( \mathbf{s} \) by checking consistency of modified samples with the oracle. This means that if decision-LWE is hard, so is search-LWE (over the same parameters).

5.3 Lattices and Lattice Problems

Definition. A lattice \( \mathcal{L} \) in \( \mathbb{R}^n \) is the set of all integer linear combinations of a basis \( B = \{\mathbf{b}_1,\ldots,\mathbf{b}_n\} \subseteq \mathbb{R}^n \): \[ \mathcal{L}(B) = \left\{ \sum_{i=1}^n z_i \mathbf{b}_i : z_i \in \mathbb{Z} \right\}. \] The determinant (or volume) of the lattice is \( \det(\mathcal{L}) = |\det(B)| \), independent of the choice of basis. The successive minima \( \lambda_1(\mathcal{L}) \leq \lambda_2(\mathcal{L}) \leq \cdots \leq \lambda_n(\mathcal{L}) \) are the smallest values of \( r \) such that \( \mathcal{L} \) contains \( i \) linearly independent vectors of norm \( \leq r \).
Theorem (Minkowski). For any \( n \)-dimensional lattice \( \mathcal{L} \), \[ \lambda_1(\mathcal{L}) \leq \sqrt{n} \cdot \det(\mathcal{L})^{1/n}. \]

This is a fundamental existence theorem: any lattice must contain a non-zero vector of length at most \( \sqrt{n} \cdot \det(\mathcal{L})^{1/n} \). Algorithms for finding such short vectors — lattice reduction — are the main tools for attacking LWE.
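A quick numerical check of the bound on an illustrative 2D lattice (basis vectors \( (5,1) \) and \( (1,4) \), chosen arbitrarily):

```python
# Minkowski's bound lambda_1 <= sqrt(n) * det^(1/n), checked by brute force.
from itertools import product
from math import sqrt

b1, b2 = (5, 1), (1, 4)
det = abs(b1[0] * b2[1] - b1[1] * b2[0])        # det = 19

# shortest nonzero lattice vector over a generous coefficient range
lam1 = min(
    sqrt((z1 * b1[0] + z2 * b2[0]) ** 2 + (z1 * b1[1] + z2 * b2[1]) ** 2)
    for z1, z2 in product(range(-10, 11), repeat=2)
    if (z1, z2) != (0, 0)
)

bound = sqrt(2) * sqrt(det)          # sqrt(n) * det^(1/n) with n = 2
assert lam1 <= bound                 # Minkowski: a short vector must exist
assert abs(lam1 - sqrt(17)) < 1e-9   # here it is the basis vector (1, 4)
```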

The geometric picture deserves emphasis. A lattice is like a multidimensional periodic grid, but the grid need not align with coordinate axes and its basis vectors can be highly skewed — long, nearly parallel, far from orthogonal. Given such a bad basis, the lattice appears intractable: computing an integer combination that lands on a particular short vector is like navigating a tilted, oblique grid without a compass. A good basis — short, nearly orthogonal vectors — makes many computations easy. Lattice reduction is the algorithmic challenge of transforming any given basis into a nearly-good one. The LLL algorithm (Lenstra-Lenstra-Lovász, 1982) achieves polynomial-time reduction but only produces a moderately good basis, sufficient to break small parameters. BKZ-\(\beta\) achieves progressively better reduction at exponential cost \( \sim 2^{0.292\beta} \), and concrete LWE parameters are chosen so that even BKZ-\(\beta\) at optimal \( \beta \) falls short.

Key lattice problems:

  • SVP (Shortest Vector Problem): Find a non-zero \( \mathbf{v} \in \mathcal{L} \) with \( \|\mathbf{v}\| = \lambda_1(\mathcal{L}) \).
  • CVP (Closest Vector Problem): Given target \( \mathbf{t} \notin \mathcal{L} \), find \( \mathbf{v} \in \mathcal{L} \) minimizing \( \|\mathbf{t} - \mathbf{v}\| \).
  • BDD (Bounded Distance Decoding): As CVP, but promised that \( \min_{\mathbf{v} \in \mathcal{L}} \|\mathbf{t} - \mathbf{v}\| < \lambda_1(\mathcal{L})/2 \).

Kannan’s embedding. BDD reduces to SVP: given target \( \mathbf{t} \) and lattice \( \mathcal{L} \), form the augmented lattice \( \mathcal{L}' \) with basis \( \begin{pmatrix} B & 0 \\ \mathbf{t}^\top & \varepsilon \end{pmatrix} \). The closest lattice point to \( \mathbf{t} \) corresponds to a short vector in \( \mathcal{L}' \).

LWE as BDD. An LWE sample \( (\mathbf{A}, \mathbf{b}) \) with \( \mathbf{b} = \mathbf{A}\mathbf{s} + \mathbf{e} \) and small \( \mathbf{e} \) defines a BDD instance: \( \mathbf{b} \) is close to the lattice point \( \mathbf{A}\mathbf{s} \) in the \( q \)-ary lattice \( \mathcal{L}_q(\mathbf{A}) = \{\mathbf{y} \in \mathbb{Z}^m : \mathbf{y} = \mathbf{A}\mathbf{s} \pmod{q},\, \mathbf{s} \in \mathbb{Z}_q^n\} \). The error \( \mathbf{e} \) is the deviation from the lattice. Solving BDD on this lattice recovers \( \mathbf{s} \).

Best known attack: BKZ. The BKZ (Block Korkine-Zolotarev) lattice reduction algorithm interpolates between LLL (polynomial time, weak reduction quality) and full SVP. BKZ-\( \beta \) uses an SVP oracle in dimension \( \beta \) as a subroutine, running in time roughly \( 2^{0.292\beta} \) (heuristically). Concrete LWE security estimates use the lattice estimator (Albrecht et al.) which optimizes over all BKZ-based attacks.

Remark (SVP Hardness and Quantum Resistance). The best known algorithm for finding the shortest vector in an \( n \)-dimensional lattice is BKZ with a lattice sieving subroutine. With block size \( \beta \), sieving-based BKZ runs in time \( 2^{0.292\beta + o(\beta)} \) and achieves an approximation factor of roughly \( 2^{n/\beta} \) (the smaller the block size, the worse the approximation). Crucially, no known quantum algorithm improves the SVP exponent by more than a constant factor: the best quantum attack is Grover-enhanced sieving, which reduces the exponent from \( \approx 0.292\beta \) to \( \approx 0.265\beta \) — a constant-factor improvement in the exponent, not an exponential speedup. This contrasts sharply with DLOG and factoring, where Shor's algorithm gives an exponential classical-to-quantum improvement. The absence of an exponential quantum speedup for SVP is the fundamental reason lattice problems are post-quantum candidates: even with a large quantum computer, an attacker would need to run BKZ at enormous block sizes, at exponential cost.
Remark (NIST PQC Standardization). After a 7-year competition (2017–2024), NIST standardized the following post-quantum schemes: CRYSTALS-Kyber (now ML-KEM, FIPS 203) for key encapsulation, based on Module-LWE; CRYSTALS-Dilithium (ML-DSA, FIPS 204) and FALCON (FN-DSA, FIPS 206) for digital signatures, based on Module-LWE/SIS and NTRU lattices respectively; and SPHINCS+ (SLH-DSA, FIPS 205) for stateless hash-based signatures, whose security relies only on the collision resistance of an underlying hash function and is therefore independent of all lattice assumptions.

The transition from RSA and ECC to these standards is actively underway. Apple deployed ML-KEM in iMessage’s PQ3 protocol in 2024. Signal upgraded to ML-KEM-1024 in 2023. Cloudflare and Google Chrome have supported hybrid X25519Kyber768 TLS key exchange since 2023. The migration pressure is driven primarily by “harvest now, decrypt later” attacks: an adversary archiving TLS ciphertexts today can retroactively decrypt them once a sufficiently large quantum computer is available, making the effective deadline for migration earlier than the availability of quantum computers themselves.

5.4 Ring-LWE and Module-LWE

Standard LWE requires matrices \( \mathbf{A} \in \mathbb{Z}_q^{m \times n} \), which have key sizes quadratic in the dimension. Ring-LWE and Module-LWE reduce key sizes dramatically by replacing \( \mathbb{Z}_q^n \) with polynomial rings.

Definition (Ring-LWE). Fix \( R_q = \mathbb{Z}_q[x]/(x^d+1) \) with \( d = 2^k \). Ring-LWE samples are \( (a, b = as + e) \) with \( a \leftarrow U(R_q) \), \( s,e \leftarrow \chi \) over \( R_q \).

Polynomial multiplication in \( R_q \) can be viewed as multiplication by a structured matrix — the negacyclic analogue of a circulant matrix, in which each row is the previous row shifted with the wrapped-around entry negated (reflecting the rule \( x^d \equiv -1 \)). This structure admits fast algorithms. The Number Theoretic Transform (NTT) is a finite-field analogue of the FFT: when \( 2d \mid q-1 \), \( \mathbb{Z}_q \) contains a primitive \( 2d \)-th root of unity, so all \( d \) roots of \( x^d + 1 \) lie in \( \mathbb{Z}_q \), and one can evaluate polynomials at these roots in \( O(d \log d) \) operations, reducing multiplication in \( R_q \) to componentwise multiplication in the NTT domain. The NTT is the key algorithmic ingredient making lattice-based schemes practical.

The efficiency advantage over plain LWE is dramatic. In standard LWE with dimension \( n \), a public key consists of a matrix \( \mathbf{A} \in \mathbb{Z}_q^{m \times n} \), costing \( O(n^2) \) ring elements. In Ring-LWE, a single polynomial \( a \in R_q \) plays the role of the entire matrix — polynomial multiplication by \( a \) acts as the “random linear map,” and the ring structure forces all the algebraic relationships that the full matrix would carry. This drops the key size from \( O(n^2) \) to \( O(n) \) elements: at \( n = 256 \) and \( q = 3329 \) (12-bit coefficients), a full \( n \times n \) matrix occupies roughly 100 KB, while a single ring element fits in under 400 bytes.

Example (Ring-LWE Arithmetic, \( n=4 \), \( q=17 \)). Work in \( R_{17} = \mathbb{Z}_{17}[x]/(x^4+1) \). Every element is a polynomial of degree at most 3 with coefficients in \( \mathbb{Z}_{17} \). The reduction rule is \( x^4 \equiv -1 \pmod{x^4+1} \), i.e., \( x^4 = 16 \) in \( \mathbb{Z}_{17} \).

Setup. Let the secret be \( s = 1 + 2x + 0x^2 + 1x^3 \) (small coefficients in \( \{0,1,2\} \)) and let a public polynomial be \( a = 3 + 5x + 7x^2 + 2x^3 \) (chosen uniformly from \( R_{17} \)).

\[ a \cdot s = (3 + 5x + 7x^2 + 2x^3)(1 + 2x + x^3). \]

Distributing term by term and using \( x^4 = -1 \), \( x^5 = -x \), \( x^6 = -x^2 \), \( x^7 = -x^3 \) (all mod \( x^4+1 \)): \begin{align*} 3\cdot 1 &= 3, & 3\cdot 2x &= 6x, & 3\cdot x^3 &= 3x^3, \\ 5x\cdot 1 &= 5x, & 5x\cdot 2x &= 10x^2, & 5x\cdot x^3 &= 5x^4 = -5, \\ 7x^2\cdot 1 &= 7x^2, & 7x^2\cdot 2x &= 14x^3, & 7x^2\cdot x^3 &= 7x^5 = -7x, \\ 2x^3\cdot 1 &= 2x^3, & 2x^3\cdot 2x &= 4x^4 = -4, & 2x^3\cdot x^3 &= 2x^6 = -2x^2. \end{align*} Collecting by degree:

  • Degree 0: \( 3 + (-5) + (-4) = -6 \equiv 11 \pmod{17} \).
  • Degree 1: \( 6x + 5x + (-7x) = 4x \).
  • Degree 2: \( 10x^2 + 7x^2 + (-2x^2) = 15x^2 \).
  • Degree 3: \( 3x^3 + 14x^3 + 2x^3 = 19x^3 \equiv 2x^3 \pmod{17} \).
So \( a \cdot s = 11 + 4x + 15x^2 + 2x^3 \in R_{17} \). Taking the small error \( e = 1 - x^2 \), \[ b = as + e = (11+1) + (4+0)x + (15-1)x^2 + (2+0)x^3 = 12 + 4x + 14x^2 + 2x^3. \]
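The hand computation above can be checked mechanically. The sketch below multiplies \( a \) and \( s \) by schoolbook negacyclic convolution, and also illustrates the NTT viewpoint: at each of the four roots of \( x^4 + 1 \) in \( \mathbb{Z}_{17} \), the product evaluates to the pointwise product of the factors.

```python
# Checking the example in R_17 = Z_17[x]/(x^4 + 1), plus the NTT viewpoint:
# at each root of x^4 + 1 mod 17, multiplication becomes pointwise.
q, d = 17, 4
roots = [r for r in range(1, q) if (r**d + 1) % q == 0]   # the 4 roots of x^4 + 1

def negacyclic_mul(a, b):
    """Schoolbook multiplication in Z_q[x]/(x^d + 1)."""
    c = [0] * d
    for i in range(d):
        for j in range(d):
            k, sign = (i + j) % d, -1 if i + j >= d else 1
            c[k] = (c[k] + sign * a[i] * b[j]) % q
    return c

def evaluate(poly, x):
    return sum(coef * pow(x, i, q) for i, coef in enumerate(poly)) % q

a = [3, 5, 7, 2]    # a = 3 + 5x + 7x^2 + 2x^3
s = [1, 2, 0, 1]    # s = 1 + 2x + x^3

prod = negacyclic_mul(a, s)
assert prod == [11, 4, 15, 2]        # matches the hand computation above
for r in roots:
    assert evaluate(prod, r) == evaluate(a, r) * evaluate(s, r) % q
```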

The Ring-LWE pair \( (a, b) = (3+5x+7x^2+2x^3,\; 12+4x+14x^2+2x^3) \) is publicly observable. The secret \( s \) and error \( e \) are hidden. An attacker cannot recover \( s \) by polynomial division because \( b \ne as \) exactly — the error \( e \) makes the relationship inconsistent, and finding \( s \) is as hard as the worst-case ideal lattice problem.

Comparison with plain LWE. In standard LWE at the same dimension \( n=4 \) and \( q=17 \), a single sample is a pair \( (\mathbf{a}, b) \in \mathbb{Z}_{17}^4 \times \mathbb{Z}_{17} \) — 5 field elements. A Ring-LWE sample is also 5 coefficients (the two polynomials of degree \( \le 3 \)), but the algebraic structure of \( R_{17} \) means one single-polynomial “matrix” \( a \) generates all the random linear equations at once, just as a circulant matrix is fully specified by its first row.

Remark (Why \( x^d + 1 \) and not \( x^d - 1 \)?). The polynomial \( x^d - 1 \) factors over \( \mathbb{Z} \) as \( \prod_{e \mid d} \Phi_e(x) \), a product of cyclotomic polynomials — in particular \( (x-1) \mid x^d - 1 \) — so \( \mathbb{Z}[x]/(x^d-1) \) has zero-divisors. For \( d = 2^k \), the polynomial \( x^d + 1 \) is the \( 2^{k+1} \)-th cyclotomic polynomial, which is irreducible over \( \mathbb{Q} \). This irreducibility gives \( \mathbb{Z}[x]/(x^d+1) \) the structure of the ring of integers of a number field, which is exactly the algebraic setting in which Lyubashevsky–Peikert–Regev (2010) proved the Ring-LWE hardness reduction from ideal-lattice SVP.

Module-LWE. Generalizes Ring-LWE to small matrices over \( R_q \):

\[ (\mathbf{A}, \mathbf{b} = \mathbf{A}\mathbf{s} + \mathbf{e}), \quad \mathbf{A} \in R_q^{k \times \ell},\; \mathbf{s} \in R_q^\ell,\; \mathbf{e} \in R_q^k. \]

Module-LWE interpolates between standard LWE (large \( k = n, \ell = 1 \)) and Ring-LWE (\( k = \ell = 1 \)). Kyber uses small \( k, \ell \in \{2,3,4\} \), giving compact keys (hundreds of bytes) while relying on a well-studied hardness assumption.

Theorem 5.4.6 (Kyber IND-CPA security). If decisional module-LWE\( (k+1, k, q, x^{256}+1, \chi_s, \chi_e) \) is hard, then the Kyber PKE scheme is IND-CPA secure.

The proof reduces an IND-CPA distinguisher to an MLWE distinguisher by a hybrid argument over the rows of \( \mathbf{A} \) and the components of the ciphertext.

5.5 Kyber KEM (ML-KEM)

Kyber (CRYSTALS-Kyber) was selected by NIST in 2022 as the primary post-quantum KEM standard, now called ML-KEM (FIPS 203).

Scheme (Kyber PKE). Parameters: module rank \( k \), polynomial ring \( R_q = \mathbb{Z}_q[x]/(x^{256}+1) \), error distribution \( \chi \) (centered binomial).
  • KeyGen: Sample \( \mathbf{A} \leftarrow U(R_q^{k \times k}) \), \( \mathbf{s}, \mathbf{e} \leftarrow \chi^k \); compute \( \mathbf{t} = \mathbf{A}\mathbf{s} + \mathbf{e} \). Public key: \( (\mathbf{A}, \mathbf{t}) \); secret key: \( \mathbf{s} \).
  • Enc\( (pk, m \in \{0,1\}^{256}) \): Sample \( \mathbf{r}, \mathbf{e}_1 \leftarrow \chi^k \), \( e_2 \leftarrow \chi \). Compute \( \mathbf{u} = \mathbf{A}^\top \mathbf{r} + \mathbf{e}_1 \), \( v = \mathbf{t}^\top \mathbf{r} + e_2 + \lfloor q/2 \rfloor m \), where \( m \) is encoded as a polynomial with 0/1 coefficients. Ciphertext: \( (\mathbf{u}, v) \).
  • Dec\( (sk, (\mathbf{u}, v)) \): Compute \( w = v - \mathbf{s}^\top \mathbf{u} = \lfloor q/2 \rfloor m + \text{(small noise)} \). Round each coefficient to nearest multiple of \( \lfloor q/2 \rfloor \) to recover \( m \).

Correctness. Substituting, \( w = v - \mathbf{s}^\top \mathbf{u} = \lfloor q/2 \rfloor m + \mathbf{e}^\top \mathbf{r} + e_2 - \mathbf{s}^\top \mathbf{e}_1 \), where every noise term is small. As long as the total noise magnitude in each coefficient is less than \( q/4 \), rounding correctly recovers \( m \). This holds with overwhelming probability for the chosen parameters.
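The whole scheme fits in a few lines once the ring structure is stripped away. The sketch below is a plain-LWE toy (vectors over \( \mathbb{Z}_q \), no module or ring structure, single-bit messages, illustrative dimension and noise), not real Kyber, but its KeyGen/Enc/Dec mirror the rounding logic above.

```python
# Plain-LWE single-bit encryption sketch (not real Kyber: no module/ring
# structure, toy dimension and noise bounds).
import random

n, q = 64, 3329
chi = lambda: random.randint(-2, 2)   # toy small-noise distribution

def keygen():
    A = [[random.randrange(q) for _ in range(n)] for _ in range(n)]
    s = [chi() for _ in range(n)]
    t = [(sum(A[i][j] * s[j] for j in range(n)) + chi()) % q for i in range(n)]
    return (A, t), s

def encrypt(pk, m):                   # m is a single bit
    A, t = pk
    r = [chi() for _ in range(n)]
    u = [(sum(A[i][j] * r[i] for i in range(n)) + chi()) % q for j in range(n)]  # A^T r + e1
    v = (sum(t[i] * r[i] for i in range(n)) + chi() + (q // 2) * m) % q          # t^T r + e2 + floor(q/2)*m
    return u, v

def decrypt(sk, ct):
    u, v = ct
    w = (v - sum(sk[j] * u[j] for j in range(n))) % q   # floor(q/2)*m + small noise
    return 1 if q // 4 < w < 3 * q // 4 else 0          # round to 0 or floor(q/2)

pk, sk = keygen()
assert decrypt(sk, encrypt(pk, 0)) == 0
assert decrypt(sk, encrypt(pk, 1)) == 1
```

With these bounds the worst-case noise is \( 64 \cdot 4 + 2 + 64 \cdot 4 = 514 < q/4 = 832 \), so decryption here never fails.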

The Fujisaki-Okamoto transform is applied to produce the IND-CCA2 KEM: the message \( m \) is sampled as \( m = H_1(\text{random seed}) \), the encapsulation recomputes from \( m \), and decapsulation re-encrypts to verify. The shared key is \( K = H_2(m, H(\mathbf{u}, v)) \).

Concrete parameters:

Variant | \( k \) | \( q \) | \( \eta_1, \eta_2 \) | Public key (bytes) | Security
Kyber-512 | 2 | 3329 | 3, 2 | 800 | ~128-bit PQ
Kyber-768 | 3 | 3329 | 2, 2 | 1184 | ~192-bit PQ
Kyber-1024 | 4 | 3329 | 2, 2 | 1568 | ~256-bit PQ

5.6 Dilithium Signatures (ML-DSA)

CRYSTALS-Dilithium (now ML-DSA, FIPS 204) is the primary NIST-standardized post-quantum signature scheme. It combines module-LWE with the Short Integer Solution (SIS) hardness assumption.

The signature scheme follows a “Fiat-Shamir with aborts” paradigm:

Scheme (Dilithium, simplified).
  • KeyGen: Sample \( \mathbf{A} \leftarrow U(R_q^{k \times \ell}) \), \( \mathbf{s}_1 \leftarrow \chi^\ell \), \( \mathbf{s}_2 \leftarrow \chi^k \); set \( \mathbf{t} = \mathbf{A}\mathbf{s}_1 + \mathbf{s}_2 \). Public key: \( (\mathbf{A}, \mathbf{t}) \).
  • Sign\( (sk, \mu) \): Sample \( \mathbf{y} \leftarrow \chi_y^\ell \) (uniform in \( [-\gamma_1, \gamma_1] \)); compute \( \mathbf{w} = \mathbf{A}\mathbf{y} \); compute challenge \( c = H(\mu, \text{HighBits}(\mathbf{w})) \in \{0,\pm 1\}^{256} \); compute \( \mathbf{z} = \mathbf{y} + c\mathbf{s}_1 \). If \( \|\mathbf{z}\|_\infty \geq \gamma_1 - \beta \), abort and resample. Output \( (\mathbf{z}, c) \).
  • Verify\( (pk, \mu, (\mathbf{z}, c)) \): Check \( \|\mathbf{z}\|_\infty < \gamma_1 - \beta \); check \( c = H(\mu, \text{HighBits}(\mathbf{A}\mathbf{z} - c\mathbf{t})) \).

The rejection sampling step (abort if \( \|\mathbf{z}\|_\infty \) is too large) ensures that the distribution of \( \mathbf{z} = \mathbf{y} + c\mathbf{s}_1 \) is independent of the secret \( \mathbf{s}_1 \): by choosing \( \gamma_1 \gg \|c\mathbf{s}_1\|_\infty \), the rejection probability is bounded, and the accepted \( \mathbf{z} \) is statistically close to uniform. This is the key privacy property (analogous to the Schnorr ZK simulator).
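The independence claim can be checked empirically in one dimension. In the toy below, \( cs \) stands in for \( c\mathbf{s}_1 \) with \( |cs| \le \beta \): every accepted \( z \) lands in \( [-(\gamma_1 - \beta), \gamma_1 - \beta] \), and for each such \( z \) exactly one \( y \) produces it, so the accepted distribution is uniform regardless of the secret.

```python
# 1-D toy of Fiat-Shamir with aborts: accepted z = y + cs is uniform on the
# clipped range whatever cs is (cs plays the role of c*s1, with |cs| <= beta).
import random

gamma, beta = 100, 10

def sample_z(cs):
    while True:                          # resample ("abort") until the bound holds
        y = random.randint(-gamma, gamma)
        z = y + cs
        if abs(z) <= gamma - beta:
            return z

# For each z in [-(gamma-beta), gamma-beta], exactly one y maps to it, so the
# accepted distribution is uniform and independent of cs.
for cs in (-10, 0, 7):
    zs = [sample_z(cs) for _ in range(20000)]
    assert all(abs(z) <= gamma - beta for z in zs)
```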

Security rests on the module-LWE problem (for key recovery) and the module-SIS problem (for forgery): producing \( (\mathbf{z}, c) \) satisfying verification without knowing \( \mathbf{s}_1, \mathbf{s}_2 \) requires finding a short vector in a module lattice.

5.7 NTRU Cryptosystem

NTRU (Hoffstein–Pipher–Silverman, 1998) is one of the oldest lattice-based cryptosystems and remains deployed in production systems today. Its hardness rests on the difficulty of recovering short vectors in an NTRU lattice, a structured 2\( n \)-dimensional lattice. Unlike LWE-based systems, NTRU does not have a known worst-case hardness reduction, but it has withstood more than 25 years of cryptanalysis and is the basis of FALCON (FN-DSA), one of the four NIST-standardized schemes.

The central algebraic setting is the polynomial ring \( R = \mathbb{Z}[x]/(x^n - 1) \) (or the negacyclic version \( \mathbb{Z}[x]/(x^n+1) \) in FALCON). NTRU’s security rests on the observation that a quotient of two small polynomials modulo \( q \) hides both of them: if \( f \) and \( g \) are small, the quotient \( h = g \cdot f^{-1} \pmod{q} \) looks like a uniformly random polynomial modulo \( q \), yet carries hidden short-vector structure that enables efficient decryption (the second modulus \( p \) enters only in decryption).

NTRU Key Generation. Fix positive integers \( n, p, q \) with \( \gcd(p,q) = 1 \) and \( q \gg p \). Work in \( R = \mathbb{Z}[x]/(x^n - 1) \). Choose "small" polynomials \( f, g \in R \) (coefficients in \( \{-1, 0, 1\} \), with \( f \) invertible mod both \( p \) and \( q \)).
  • Compute \( f_p = f^{-1} \bmod p \) and \( f_q = f^{-1} \bmod q \) (using extended Euclidean in \( R \)).
  • Compute the public key \( h = p \cdot f_q \cdot g \bmod q \).
  • Private key: \( f \) (and \( f_p \) for decryption).
NTRU Encryption and Decryption.
  • Encrypt\( (h, m) \): Message \( m \in R \) has small coefficients in \( \{0,1,\ldots,p-1\} \). Choose random small polynomial \( r \in R \) (blinding polynomial). Ciphertext: \( e = r \cdot h + m \bmod q \).
  • Decrypt\( (f, f_p, e) \):
    1. Compute \( a = f \cdot e \bmod q \), centering coefficients in \( (-q/2, q/2] \).
    2. Compute \( b = a \bmod p \) (reduce coefficients mod \( p \)).
    3. Recover \( m = f_p \cdot b \bmod p \).
Decryption Correctness (Centering Argument). Expand \( a = f \cdot e \bmod q \): \[ a = f \cdot (r \cdot h + m) = f \cdot r \cdot h + f \cdot m = f \cdot r \cdot p \cdot f_q \cdot g + f \cdot m = p \cdot r \cdot g + f \cdot m \pmod{q}. \] The polynomial \( p \cdot r \cdot g + f \cdot m \) has small coefficients because \( r, g, f, m \) are all small and \( p \) is small. If these coefficients lie in \( (-q/2, q/2] \), then reducing mod \( q \) changes nothing — we have the exact integer equality \( a = p \cdot r \cdot g + f \cdot m \) (not just mod \( q \)). Now reduce mod \( p \): since \( p \mid p \cdot r \cdot g \), we get \( b = a \bmod p = f \cdot m \bmod p \). Finally, multiply by \( f_p = f^{-1} \bmod p \): \( f_p \cdot b = f_p \cdot f \cdot m = m \bmod p \). Since \( m \) has coefficients in \( \{0,\ldots,p-1\} \), this recovers \( m \) exactly.

The centering step — lifting \( a \bmod q \) to the symmetric interval \( (-q/2, q/2] \) — is the analogue of the rounding step in LWE decryption, and fails if the noise polynomial \( p \cdot r \cdot g + f \cdot m \) has any coefficient outside that interval. Choosing \( q \gg p \cdot n \cdot \|r\|_\infty \cdot \|g\|_\infty \) ensures correctness with overwhelming probability.

Example (NTRU with \( n=5 \), \( p=3 \), \( q=11 \)). Work in \( R = \mathbb{Z}[x]/(x^5-1) \). Elements are polynomials of degree at most 4 with integer coefficients, reduced modulo \( x^5 = 1 \) (cyclic convolution).

Key generation. Choose \( f = 1 + x - x^3 \) and \( g = 1 - x + x^2 + x^4 \) (coefficients in \( \{-1,0,1\} \)). Note \( f(1) = 1 \), so \( f \) is not divisible by the factor \( x - 1 \) of \( x^5 - 1 \) — a necessary condition for invertibility mod 3 and mod 11.

Invert \( f \): a direct expansion shows \( f \) is a unit in \( R \) itself: \[ (1 + x - x^3)(x - x^2 + x^3) = x + x^5 - x^6 \equiv x + 1 - x = 1 \pmod{x^5 - 1}. \] Hence \( f_3 = f^{-1} \bmod 3 = x + 2x^2 + x^3 \) and \( f_q = f^{-1} \bmod 11 = x + 10x^2 + x^3 \). (In general the inverses mod \( p \) and mod \( q \) are unrelated and are computed by the extended Euclidean algorithm in \( \mathbb{F}_p[x] \) and \( \mathbb{F}_q[x] \).)

Public key: \( h = p \cdot f_q \cdot g \bmod 11 \). Working in \( R \), \[ f_q \cdot g = (x - x^2 + x^3)(1 - x + x^2 + x^4) = 2 - x^2 + 3x^3 - 2x^4, \] so \( h = 3(2 - x^2 + 3x^3 - 2x^4) \equiv 6 + 8x^2 + 9x^3 + 5x^4 \pmod{11} \).

Encryption. Take the message \( m = 1 + x^2 \) and the blinding polynomial \( r = x \):

\[ e = r \cdot h + m \bmod 11. \]

Multiplication by \( r = x \) cyclically shifts the coefficient vector of \( h = (6, 0, 8, 9, 5) \), giving \( r \cdot h = (5, 6, 0, 8, 9) = 5 + 6x + 8x^3 + 9x^4 \). Adding \( m = 1 + x^2 \):

\[ e = 6 + 6x + x^2 + 8x^3 + 9x^4 \pmod{11}. \]

Decryption. Compute \( a = f \cdot e \bmod 11 \), centering coefficients in \( (-11/2, 11/2] \). The cyclic convolution gives \( f \cdot e = 14 + 4x - 2x^2 + 3x^3 + 11x^4 \equiv 3 + 4x + 9x^2 + 3x^3 \pmod{11} \), which centers to \( a = 3 + 4x - 2x^2 + 3x^3 \). This is exactly the integer polynomial \( p \cdot r \cdot g + f \cdot m = (3 + 3x - 3x^2 + 3x^3) + (x + x^2) \), as the correctness argument predicts, because every coefficient lies in \( (-11/2, 11/2] \). Reduce mod \( p = 3 \): \( b = a \bmod 3 = x + x^2 \). Multiply by \( f_3 \): \( m' = f_3 \cdot b = (x + 2x^2 + x^3)(x + x^2) \equiv 1 + x^2 \pmod{3,\, x^5-1} \). Recovered \( m' = m \). Decryption successful.
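A complete toy NTRU, with small hand-picked parameters and precomputed inverses (which one would normally obtain via the extended Euclidean algorithm), can be written directly from the cyclic-convolution definition:

```python
# Toy NTRU over Z[x]/(x^5 - 1) with p = 3, q = 11: illustrative parameters only,
# far too small for any security.
n, p, q = 5, 3, 11

def mul(a, b, mod):
    """Cyclic convolution of coefficient lists a, b in Z_mod[x]/(x^n - 1)."""
    c = [0] * n
    for i in range(n):
        for j in range(n):
            c[(i + j) % n] = (c[(i + j) % n] + a[i] * b[j]) % mod
    return c

def center(a, mod):
    """Lift coefficients into the symmetric interval (-mod/2, mod/2]."""
    return [x - mod if x > mod // 2 else x for x in a]

# Small polynomials; f is chosen invertible mod p and mod q, inverses precomputed.
f  = [1, 1, 0, -1, 0]   # f = 1 + x - x^3
g  = [1, -1, 1, 0, 1]   # g = 1 - x + x^2 + x^4
fp = [0, 1, 2, 1, 0]    # f^{-1} mod 3
fq = [0, 1, 10, 1, 0]   # f^{-1} mod 11
assert mul(f, fp, p) == [1, 0, 0, 0, 0] and mul(f, fq, q) == [1, 0, 0, 0, 0]

h = [(p * c) % q for c in mul(fq, g, q)]      # public key h = p * fq * g mod q

def encrypt(m, r):
    """e = r*h + m mod q, for small message m and blinding polynomial r."""
    return [(x + y) % q for x, y in zip(mul(r, h, q), m)]

def decrypt(e):
    a = center(mul(f, e, q), q)   # equals p*r*g + f*m exactly when noise is small
    b = [c % p for c in a]        # the p*r*g term vanishes mod p
    return [c % p for c in mul(fp, b, p)]

m = [1, 0, 1, 0, 0]               # message 1 + x^2
r = [0, 1, 0, 0, 0]               # blinding polynomial r = x
assert decrypt(encrypt(m, r)) == m
```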

Remark (NTRU Security and FALCON). Attacking NTRU amounts to finding short vectors \( (f, g) \) in the 2\( n \)-dimensional NTRU lattice \( \Lambda = \{(u,v) \in \mathbb{Z}^{2n} : u \cdot h \equiv v \pmod q\} \). This is an instance of the Shortest Vector Problem. Unlike LWE, no quantum reduction from worst-case lattice problems to NTRU is known; however, the scheme has survived extensive cryptanalysis since 1998, and its structured lattice enables particularly efficient Gaussian sampling — the property that FALCON exploits to achieve signature sizes of 666 bytes at NIST Level 1, smaller than any other post-quantum signature scheme. FALCON (FN-DSA, FIPS 206) uses the negacyclic variant \( R_q = \mathbb{Z}_q[x]/(x^n+1) \) for \( n \in \{512, 1024\} \), with the GPV trapdoor construction for hash-and-sign signatures.

Chapter 6: RSA (Brief Treatment)

RSA predates elliptic curves and lattice-based cryptography by decades — it was invented in 1977, more than a quarter-century before Regev’s LWE. It appears in these notes after Chapter 5 for a deliberate pedagogical reason: having seen what makes lattice hardness quantum-resistant, you can now understand precisely what RSA lacks. RSA’s hardness rests on integer factoring, a problem with rich algebraic structure inside \( (\mathbb{Z}/n\mathbb{Z})^* \), and Shor’s algorithm exploits that structure directly. The contrast with LWE is clarifying: lattice problems resist Shor because they have no exploitable abelian periodicity; RSA falls because its core structure is an abelian group on which order-finding is exactly what phase estimation computes.

Beyond the quantum vulnerability, RSA’s catalogue of implementation attacks — PKCS#1 v1.5 padding oracles (Bleichenbacher 1998), small exponent attacks, CRT fault attacks (Bellcore attack), timing side channels (Kocher 1996) — provides an invaluable cautionary template. Many of these attack patterns recur in subtle forms in lattice-based schemes, making a thorough understanding of RSA attacks a prerequisite for auditing post-quantum implementations.

6.1 Setup and Key Generation

RSA (Rivest-Shamir-Adleman, 1977) was the first practical public-key cryptosystem. Its security rests on the presumed hardness of factoring large integers and computing \( e \)-th roots modulo a composite.

RSA Key Generation.
  1. Generate large primes \( p, q \) (each ~1024–2048 bits); set \( n = pq \).
  2. Compute \( \phi(n) = (p-1)(q-1) \).
  3. Choose public exponent \( e \) with \( \gcd(e, \phi(n)) = 1 \); standard choice \( e = 65537 = 2^{16}+1 \) (fast exponentiation).
  4. Compute \( d = e^{-1} \bmod \phi(n) \) via EEA.
  5. Public key: \( (n, e) \). Secret key: \( (n, d) \) (or \( (p, q, d) \)).

Textbook RSA. Encryption: \( c = m^e \bmod n \). Decryption: \( m = c^d \bmod n \). Correctness follows from Euler’s theorem: \( c^d = m^{ed} = m^{1 + k\phi(n)} = m \cdot (m^{\phi(n)})^k \equiv m \pmod{n} \) (for \( \gcd(m,n)=1 \)).

Example (Full RSA Key Generation with Small Primes). Take \( p = 11 \), \( q = 23 \). Then \( N = pq = 253 \) and \( \phi(N) = (p-1)(q-1) = 10 \times 22 = 220 \).

Choose public exponent \( e = 7 \). Check: \( \gcd(7, 220) = 1 \) (since 7 is prime and does not divide 220). Compute \( d = 7^{-1} \bmod 220 \) via the extended Euclidean algorithm:

\[ 220 = 31\times 7 + 3,\quad 7 = 2\times 3 + 1,\quad 3 = 3\times 1. \]

Back-substitute: \( 1 = 7 - 2\times 3 = 7 - 2(220 - 31\times 7) = 63\times 7 - 2\times 220 \). So \( d = 63 \). Verify: \( 7 \times 63 = 441 = 2\times 220 + 1 \equiv 1 \pmod{220} \). Correct.

Public key: \( (N, e) = (253, 7) \). Private key: \( (N, d) = (253, 63) \).

Encryption of \( m = 5 \): \( c = 5^7 \bmod 253 \). Using square-and-multiply: \( 5^2=25 \), \( 5^4=625\equiv 625-2\times 253=119 \), \( 5^7=5^4\cdot 5^2\cdot 5^1=119\times 25\times 5 \bmod 253 \). First \( 119\times 25 = 2975 \equiv 2975 - 11\times 253 = 2975-2783=192 \). Then \( 192\times 5=960\equiv 960-3\times 253=960-759=201 \). So \( c = 201 \).

Decryption: \( m = 201^{63} \bmod 253 \). One may compute using CRT: \( d_p = 63\bmod 10 = 3 \) and \( d_q = 63\bmod 22 = 19 \). Compute \( 201^3 \bmod 11 \): \( 201\equiv 201-18\times 11=3\pmod{11} \), \( 3^3=27\equiv 5\pmod{11} \). Compute \( 201^{19}\bmod 23 \): \( 201\equiv 201-8\times 23=17\pmod{23} \), \( 17^{19}\bmod 23 \). By Fermat \( 17^{22}\equiv 1 \), so \( 17^{19}=17^{22-3}=(17^{22})(17^{-3})\equiv 17^{-3}\pmod{23} \). Now \( 17^2=289\equiv 13 \), \( 17^3\equiv 13\times 17=221\equiv 221-9\times 23=14 \), and \( 14^{-1}\bmod 23 \): \( 14\times 5=70\equiv 1\pmod{23} \), so \( 17^{-3}\equiv 5 \pmod{23} \). CRT: need \( x\equiv 5\pmod{11} \) and \( x\equiv 5\pmod{23} \). Since both residues are 5 and \( \gcd(11,23)=1 \), the solution is \( x=5 \). Recovered \( m = 5 \). Correct.
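The whole example can be replayed in a few lines; Python's built-in pow handles both the modular exponentiations and (since 3.8) the modular inverses:

```python
# Replaying the small-prime example: keygen, encryption, CRT decryption.
p, q, e = 11, 23, 7
N, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)          # modular inverse (extended Euclid under the hood)
assert d == 63 and (e * d) % phi == 1

m = 5
c = pow(m, e, N)             # square-and-multiply
assert c == 201

assert pow(c, d, N) == m     # direct decryption

# CRT decryption: two half-size exponentiations plus recombination.
dp, dq = d % (p - 1), d % (q - 1)            # 3 and 19
mp, mq = pow(c, dp, p), pow(c, dq, q)        # both equal 5 here
h = ((mq - mp) * pow(p, -1, q)) % q          # Garner's recombination step
assert mp + p * h == m
```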

6.2 Security Issues with Textbook RSA

Textbook RSA is insecure in multiple ways:

  1. Deterministic: Encrypting the same \( m \) always gives the same \( c \), violating IND-CPA. An adversary can simply encrypt \( m_0 \) and compare with \( c^* \).
  2. Multiplicative homomorphism: \( (m_1^e)(m_2^e) = (m_1 m_2)^e \bmod n \), so a ciphertext for \( m_1 m_2 \) is the product of ciphertexts. An adversary can transform a challenge ciphertext \( c^* = m_b^e \) to \( c^* \cdot 2^e \) (which decrypts to \( 2m_b \)), query decryption (in CCA model), and determine \( b \).
  3. Small exponent attack: If \( e = 3 \) and \( m \) is small, \( m^3 < n \), so \( c = m^3 \) exactly over the integers; compute \( m = c^{1/3} \) via integer cube root with no modular arithmetic needed.
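The third attack is worth seeing concretely. The sketch below uses a stand-in 2048-bit modulus and a short message; since \( m^3 \) never wraps around \( n \), an exact integer cube root (binary search, no floating point) recovers \( m \):

```python
# Small-exponent attack: e = 3 and m^3 < n means c is m^3 over the integers.
def icbrt(c):
    """Integer cube root by binary search, exact for perfect cubes."""
    lo, hi = 0, 1 << (c.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid**3 <= c:
            lo = mid
        else:
            hi = mid - 1
    return lo

n = 2**2048 + 1                             # stand-in modulus; only its size matters
m = int.from_bytes(b"short secret", "big")  # a message far below n^(1/3)
c = pow(m, 3, n)                            # m^3 < n, so no reduction occurred
assert icbrt(c) == m
```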

6.3 OAEP and RSA-PSS

RSA-OAEP (Optimal Asymmetric Encryption Padding, Bellare-Rogaway 1994) is the standard for RSA-based encryption. It uses two hash functions to create a randomized, “full-domain” encoding of the message before exponentiation. In the random oracle model, RSA-OAEP achieves IND-CCA2 security assuming RSA is a one-way permutation. PKCS#1 v2.1 and v2.2 mandate OAEP.

RSA-PSS (Probabilistic Signature Scheme, Bellare-Rogaway 1996) applies a similar randomized hash encoding for signatures. RSA-PSS has a tight security proof in the ROM: any EUF-CMA forger can be converted into an RSA inverter with essentially the same advantage. This contrasts with PKCS#1 v1.5 signatures (still widely used in TLS certificates), which lack a clean security proof.

6.4 Factoring Algorithms

The security of RSA ultimately depends on the hardness of factoring \( n = pq \). Known factoring algorithms:

Algorithm | Time complexity
Trial division | \( O(\sqrt{n}) = O(2^{k/2}) \) for \( k \)-bit \( n \)
Pollard rho | \( O(n^{1/4}) = O(2^{k/4}) \)
Quadratic Sieve | \( L_n[1/2, 1] \)
Number Field Sieve (NFS) | \( L_n[1/3, (64/9)^{1/3}] \approx L_n[1/3, 1.923] \)

Here \( L_n[\alpha, c] = \exp(c(\log n)^\alpha (\log \log n)^{1-\alpha}) \). NFS is sub-exponential (between polynomial and exponential) and is the best known classical algorithm. For 2048-bit RSA, NFS requires roughly \( 2^{112} \) operations, giving roughly 112-bit classical security; NIST recommends 2048-bit RSA as the minimum and 3072-bit for long-term security beyond 2030.
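Plugging numbers into the \( L \)-notation reproduces the quoted security estimate to within a few bits (the formula ignores the \( 1+o(1) \) factor in the exponent, so this is a ballpark, not a precise cost):

```python
# Approximate NFS cost in bits of work, via L_n[alpha, c] with o(1) terms dropped.
import math

def L(bits, alpha, c):
    """log2 of L_n[alpha, c] for an n of the given bit-length."""
    ln_n = bits * math.log(2)
    return c * ln_n**alpha * math.log(ln_n)**(1 - alpha) / math.log(2)

print(round(L(2048, 1/3, 1.923)))   # ~117, same ballpark as the quoted 2^112
```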

6.5 CRT-RSA and Side Channels

CRT speedup. Rather than computing \( c^d \bmod n \) directly (expensive for large \( n, d \)), compute \( c^{d_p} \bmod p \) and \( c^{d_q} \bmod q \) separately, where \( d_p = d \bmod (p-1) \) and \( d_q = d \bmod (q-1) \). By Fermat’s little theorem, \( c^d \equiv c^{d_p} \pmod{p} \). Combine the two results using CRT. Exponents \( d_p, d_q \) are about half the bit-length of \( d \), and the moduli are half the bit-length of \( n \), giving approximately a \( 4\times \) speedup.

Remark (CRT-RSA Speedup — Worked Through). In the example above: \( p=11 \), \( q=23 \), \( d=63 \). Reduced exponents: \( d_p = 63 \bmod (p-1) = 63 \bmod 10 = 3 \) and \( d_q = 63 \bmod (q-1) = 63 \bmod 22 = 19 \).

To decrypt ciphertext \( c = 201 \):

  • Mod \( p = 11 \): \( c \bmod 11 = 201 \bmod 11 = 3 \). Compute \( 3^{d_p} = 3^3 = 27 \equiv 5 \pmod{11} \).
  • Mod \( q = 23 \): \( c \bmod 23 = 201 \bmod 23 = 17 \). Compute \( 17^{d_q} = 17^{19} \bmod 23 = 5 \pmod{23} \) (computed above).
Apply CRT to find \( m \equiv 5 \pmod{11} \) and \( m \equiv 5 \pmod{23} \): answer \( m = 5 \).

The speedup: each sub-exponentiation uses a tiny exponent (\( d_p = 3 \), \( d_q = 19 \)) and the small moduli 11 and 23, compared to the full exponent \( d = 63 \) and modulus 253. In a 2048-bit RSA system, each CRT sub-exponentiation uses a 1024-bit exponent and 1024-bit modulus, while the direct computation uses a 2048-bit exponent and 2048-bit modulus. Since modular exponentiation cost scales roughly as \( O(k^3) \) in the bit-length \( k \), the two half-size exponentiations cost \( 2 \times (1024^3 / 2048^3) = 2/8 = 1/4 \) of the direct computation — a 4× speedup. This CRT optimization is standard in all production RSA implementations (OpenSSL, BoringSSL, Java JCE).

Side-channel attacks. Kocher (1996) showed that timing measurements of modular exponentiation can reveal the secret key bit by bit. The fix is exponent blinding: replace \( d \) by \( d + r\phi(n) \) for a random \( r \), so timing depends on a different exponent each run. Montgomery ladder provides a similar countermeasure for ECC.

Remark (Timing and Side-Channel Attacks on RSA). In square-and-multiply exponentiation, the processing time of each bit differs: the algorithm always squares but multiplies only when the current bit is 1. Kocher (1996) showed that by measuring precise timing of hundreds of decryption operations, an adversary can recover the private key \( d \) bit by bit: for each candidate bit \( d_i \in \{0, 1\} \), one hypothetical value predicts slightly faster execution (no multiply) and the other predicts slower execution (with multiply). Correlating these predictions against measured timings across many decryptions yields the correct key.

Defenses fall into two categories:

  1. Constant-time exponentiation (Montgomery ladder): Execute the same sequence of operations regardless of which bits of \( d \) are 0 or 1. On RSA this typically means using an always-multiply variant of square-and-multiply, or the Montgomery ladder adapted to integer exponentiation.
  2. Message blinding: Before decryption, multiply the ciphertext \( c \) by a random \( r^e \bmod n \): compute \( m' = (c \cdot r^e)^d \bmod n = m \cdot r \). After decryption, divide by \( r \) to recover \( m \). Since \( r \) is fresh each time, the exponentiation timing depends on a randomized input and leaks no information about \( d \).
Side-channel attacks remain a live threat in embedded systems where physical measurements are easy and constant-time code is hard to guarantee. The Bellcore fault attack (Boneh-DeMillo-Lipton 1997) is even more devastating: a single faulty decryption under CRT immediately leaks a prime factor of \( N \) via a GCD computation, completely breaking RSA-CRT.
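Message blinding from the list above can be sketched in a few lines (toy ~20-bit primes, not production sizes; the random factor is kept below both primes so it is trivially coprime to \( n \)):

```python
# Message blinding: the secret-key exponentiation runs on a freshly randomized
# input each call. Toy primes 1000003 and 1000033, illustrative only.
import secrets

p, q, e = 1000003, 1000033, 65537
n, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)

def blinded_decrypt(c):
    r = secrets.randbelow(1000) + 2          # r < p, q, hence coprime to n
    c_blind = (c * pow(r, e, n)) % n
    m_blind = pow(c_blind, d, n)             # timing depends on c * r^e, not on c
    return (m_blind * pow(r, -1, n)) % n     # strip the blinding factor r

m = 123456789
assert blinded_decrypt(pow(m, e, n)) == m
```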

6.6 RSA vs. ECC: Security Comparison

Security level | RSA/DH key size | ECC key size
80-bit | 1024 bits | 160 bits
128-bit | 3072 bits | 256 bits
192-bit | 7680 bits | 384 bits
256-bit | 15360 bits | 521 bits

The ratio grows as security level increases, because index calculus for DH/RSA scales as \( L_n[1/3] \) (faster than exponential) while ECC Pollard rho scales as \( O(2^{n/2}) \) (fully exponential in key size \( n \)).


Chapter 7: Quantum Algorithms and Post-Quantum Cryptography

The emergence of quantum computing does not merely threaten one or two cryptographic constructions — it threatens the entire algebraic paradigm that underlies public-key cryptography as it has been practised since Diffie-Hellman (1976). Shor’s algorithm (1994) reduces both integer factorization and discrete logarithm — in finite fields and on elliptic curves — to a single abstract problem solvable in quantum polynomial time. Grover’s algorithm (1996) halves the bit-security of every symmetric key and hash function. Simon’s algorithm (1994) breaks a surprising number of symmetric constructions outright when an adversary can query them quantumly. Understanding these algorithms at a mathematical level is now a core competency for anyone who designs or deploys cryptographic systems.

7.1 Quantum Computing Model

A classical bit holds either 0 or 1. A qubit is a unit vector in \( \mathbb{C}^2 \):

\[ |\psi\rangle = \alpha|0\rangle + \beta|1\rangle, \qquad |\alpha|^2 + |\beta|^2 = 1. \]

Measuring in the computational basis yields \( |0\rangle \) with probability \( |\alpha|^2 \) and \( |1\rangle \) with probability \( |\beta|^2 \), collapsing the superposition. An \( n \)-qubit register lives in \( \mathbb{C}^{2^n} \):

\[ |\psi\rangle = \sum_{x \in \{0,1\}^n} \alpha_x |x\rangle, \qquad \sum_x |\alpha_x|^2 = 1. \]

Entanglement arises when the joint state of two qubits cannot be written as a tensor product of individual qubit states. The Bell state \( \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle) \) is the canonical example: measuring the first qubit instantly determines the second, regardless of physical separation.

Quantum gates are unitary matrices acting on the Hilbert space. Unitarity ensures reversibility and preserves the normalization constraint. Key single-qubit gates:

Hadamard gate: \( H = \frac{1}{\sqrt{2}}\begin{pmatrix}1 & 1 \\ 1 & -1\end{pmatrix} \). Maps \( |0\rangle \mapsto \frac{|0\rangle+|1\rangle}{\sqrt{2}} \) and \( |1\rangle \mapsto \frac{|0\rangle-|1\rangle}{\sqrt{2}} \). Applied to all \( n \) qubits of \( |0\cdots 0\rangle \): \( H^{\otimes n}|0^n\rangle = \frac{1}{\sqrt{2^n}}\sum_{x=0}^{2^n-1}|x\rangle \) — uniform superposition over all \( 2^n \) inputs.
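The claimed action of \( H^{\otimes n} \) on \( |0^n\rangle \) can be verified numerically by building the operator as a Kronecker product:

```python
import numpy as np

# H applied to every qubit of |000> yields the uniform superposition over 8 states.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

n = 3
Hn = H
for _ in range(n - 1):
    Hn = np.kron(Hn, H)          # H tensored with itself n times, as an 8x8 matrix

state = np.zeros(2**n)
state[0] = 1.0                   # |000>
out = Hn @ state
assert np.allclose(out, 1 / np.sqrt(2**n))   # amplitude 1/sqrt(8) everywhere
```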

Phase gate \( R_k \): \( R_k = \begin{pmatrix}1 & 0 \\ 0 & e^{2\pi i/2^k}\end{pmatrix} \). Applies a phase \( e^{2\pi i/2^k} \) to \( |1\rangle \). These controlled phase gates are the building blocks of the QFT circuit.

CNOT gate: Maps \( |c\rangle|t\rangle \mapsto |c\rangle|t \oplus c\rangle \). Entangles two qubits; used to create Bell states.

The quantum circuit model composes unitary gates in sequence. A circuit on \( n \) qubits using \( g \) gates runs in time \( O(g) \); efficiency is measured in gate count as a function of input size. The BQP complexity class contains all decision problems solvable by polynomial-size quantum circuits with bounded error. BQP is known to contain BPP (classical randomized polynomial time) and to be contained in PSPACE; its relationship to NP is unknown, but factoring (believed outside BPP) lies in BQP.

Quantum parallelism alone is not magic: applying a circuit \( U_f \) to a superposition \( \frac{1}{\sqrt{N}}\sum_x |x\rangle|0\rangle \) yields \( \frac{1}{\sqrt{N}}\sum_x|x\rangle|f(x)\rangle \), encoding \( f \) evaluated at all inputs simultaneously. But measuring this state collapses it to a single \( (x, f(x)) \) pair, no better than random sampling. The genuine power comes from quantum interference: careful sequences of gates cancel amplitudes at wrong answers and reinforce amplitudes at right answers. Both Shor and Grover exploit interference in sophisticated ways.

7.2 Quantum Fourier Transform

The Quantum Fourier Transform (QFT) is the engine of Shor’s algorithm. It is the quantum analogue of the Discrete Fourier Transform (DFT).

Definition (QFT\(_N\)). For \( N = 2^n \), the QFT acts on basis states as: \[ \text{QFT}_N|j\rangle = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} \omega^{jk}|k\rangle, \qquad \omega = e^{2\pi i/N}. \] On a general state \( \sum_j a_j|j\rangle \), it maps amplitudes \( (a_j) \) to their DFT \( (\hat{a}_k) \) where \( \hat{a}_k = \frac{1}{\sqrt{N}}\sum_j a_j \omega^{jk} \).

The classical FFT computes the DFT of \( N \) numbers in \( O(N \log N) \) operations. The QFT achieves the same transformation of the \( N \) amplitudes using only \( O(n^2) = O((\log N)^2) \) quantum gates — an exponential reduction. This is not because the QFT reads out all \( N \) Fourier coefficients simultaneously (measurement collapses the state to a single outcome), but because the amplitude vector becomes its own DFT, whose large coefficients are exactly the likely measurement outcomes.

Circuit decomposition. Writing \( j \) in binary as \( j = j_1 2^{n-1} + \cdots + j_n 2^0 \), the QFT has a product representation:

\[ \text{QFT}_N|j_1,\ldots,j_n\rangle = \frac{1}{\sqrt{N}} \bigotimes_{l=1}^{n} \left(|0\rangle + e^{2\pi i j / 2^l}|1\rangle\right). \]

Each factor \( |0\rangle + e^{2\pi i j/2^l}|1\rangle \) depends only on \( j \bmod 2^l \), i.e., on the last \( l \) bits of \( j \). This enables a recursive circuit: apply Hadamard to qubit \( l \), then apply controlled-\( R_k \) gates from qubits \( l+1, \ldots, n \) to add the fractional phases, then finish with a bit-reversal swap. The total gate count is \( \sum_{l=1}^n l = n(n+1)/2 = O(n^2) \).

What QFT finds. If the input state has period \( r \) — that is, \( a_j = a_{j+r} \) for all \( j \) — then the QFT concentrates amplitude at multiples of \( N/r \). Measuring yields \( \approx s \cdot N/r \) for a random integer \( s \), and the ratio \( s/r \) can be recovered via continued fractions. This period-finding power is the core of Shor’s algorithm.
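This peak structure is easy to see numerically. A small sketch (assuming NumPy is available) applies the DFT matrix to an exactly \( r \)-periodic amplitude vector:

```python
import numpy as np

# A length-N amplitude vector with exact period r: nonzero only at
# multiples of r (a toy stand-in for the post-oracle state in Shor).
N, r = 64, 4                     # r divides N, so the peaks are exact
a = np.zeros(N)
a[::r] = 1.0
a /= np.linalg.norm(a)

# DFT of the amplitudes = what the QFT does to the state vector.
omega = np.exp(2j * np.pi / N)
F = omega ** np.outer(np.arange(N), np.arange(N)) / np.sqrt(N)
a_hat = F @ a

# Measurement probabilities concentrate on multiples of N/r = 16.
probs = np.abs(a_hat) ** 2
peaks = np.nonzero(probs > 1e-9)[0]
print(peaks)                     # peaks at multiples of N/r = 16
```

When \( r \) does not divide \( N \), the amplitude instead clusters sharply *near* multiples of \( N/r \), which is why the continued-fraction step below is needed.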

7.3 Quantum Phase Estimation

Phase estimation is the central subroutine in Shor’s algorithm, generalizing period-finding to arbitrary unitary operators.

Problem (Phase Estimation). Given a unitary \( U \) and an eigenvector \( |u\rangle \) with \( U|u\rangle = e^{2\pi i \theta}|u\rangle \), estimate \( \theta \in [0,1) \) to \( t \) bits of precision using \( t \) controlled-\( U^{2^k} \) operations, \( k = 0, \ldots, t-1 \).

Circuit. Use two registers: \( t \) ancilla qubits initialized to \( |0\rangle \), and the eigenvector register initialized to \( |u\rangle \).

  1. Apply \( H^{\otimes t} \) to the ancilla register: \( \frac{1}{\sqrt{2^t}}\sum_{j=0}^{2^t-1}|j\rangle \otimes |u\rangle \).
  2. Apply controlled-\( U^{2^k} \) gates for \( k = 0, 1, \ldots, t-1 \), where the \( k \)-th ancilla qubit controls \( U^{2^k} \) on the eigenvalue register. Because \( U^{2^k}|u\rangle = e^{2\pi i \cdot 2^k \theta}|u\rangle \), the ancilla register accumulates: \[ \frac{1}{\sqrt{2^t}}\sum_{j=0}^{2^t-1} e^{2\pi i j\theta}|j\rangle \otimes |u\rangle. \]
  3. Apply \( \text{QFT}^{-1}_{2^t} \) to the ancilla register. If \( \theta = s/2^t \) exactly, the state collapses to \( |s\rangle \), recovering \( \theta \) perfectly. If \( \theta \) is not an exact \( t \)-bit fraction, measuring gives the closest \( t \)-bit approximation with probability \( \geq 4/\pi^2 \approx 0.405 \).

Each controlled-\( U^{2^k} \) requires the \( 2^k \)-th power of \( U \); for the modular-arithmetic unitaries used in Shor's algorithm, the constant \( a^{2^k} \bmod N \) is precomputed classically by repeated squaring, so each is a single controlled modular multiplication. Total cost: \( t \) controlled multiplications plus \( O(t^2) \) gates for the inverse QFT.
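The behavior of step 3 can be checked by direct state-vector simulation of the ancilla register (a NumPy sketch; the phase \( \theta = 0.3141 \) is an arbitrary illustrative value):

```python
import numpy as np

# Simulate the ancilla register of phase estimation for a phase theta
# that is NOT an exact t-bit fraction, and check that measuring after
# the inverse QFT returns the closest t-bit approximation.
t = 8
T = 2 ** t
theta = 0.3141                           # arbitrary phase in [0, 1)

j = np.arange(T)
state = np.exp(2j * np.pi * j * theta) / np.sqrt(T)   # state after step 2

# Inverse QFT matrix: conjugate transpose of QFT_{k,j} = omega^{jk}/sqrt(T).
omega = np.exp(2j * np.pi / T)
F = omega ** np.outer(j, j) / np.sqrt(T)
final = F.conj().T @ state               # step 3

probs = np.abs(final) ** 2
best = int(np.argmax(probs))
print(best, round(theta * T))            # most likely outcome vs. theta * 2^t
print(probs[best])                       # exceeds the 4/pi^2 ~ 0.405 bound
```

The most likely measurement outcome is the nearest integer to \( \theta \cdot 2^t \), with probability comfortably above \( 4/\pi^2 \).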

7.4 Shor’s Algorithm

Shor’s algorithm reduces integer factorization and discrete logarithm to order-finding, which is solved by phase estimation.

Reduction: Factoring to Order-Finding

Given \( N = pq \) (odd, composite, not a prime power), pick random \( a \) with \( \gcd(a, N) = 1 \). Define the function \( f(x) = a^x \bmod N \); it is periodic with period \( r = \text{ord}_N(a) \). If \( r \) is even and \( a^{r/2} \not\equiv -1 \pmod{N} \) (which holds with probability \( \geq 1/2 \) over random \( a \)), then:

\[ a^r \equiv 1 \pmod{N} \implies N \mid (a^{r/2}-1)(a^{r/2}+1). \]

Since \( a^{r/2} \not\equiv 1 \pmod{N} \) (as \( r \) is the exact order) and \( a^{r/2} \not\equiv -1 \pmod{N} \) (by assumption), \( N \) divides the product but neither factor alone, so \( \gcd(a^{r/2}-1, N) \) and \( \gcd(a^{r/2}+1, N) \) are both non-trivial factors. Repeat with fresh \( a \) until a non-trivial factor is found; \( O(1) \) repetitions are expected.
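The classical post-processing can be sketched in a few lines. The toy modulus \( N = 3233 = 53 \cdot 61 \) and base \( a = 3 \) are illustrative choices; the brute-force order computation stands in for the quantum step:

```python
from math import gcd

# Classical post-processing of Shor: given the order r of a mod N,
# split N. The order is found here by brute force; the quantum
# algorithm replaces only this search.
def order(a, N):
    r, x = 1, a % N
    while x != 1:
        x = (x * a) % N
        r += 1
    return r

N, a = 3233, 3                            # N = 53 * 61, a toy RSA modulus
r = order(a, N)
assert r % 2 == 0 and pow(a, r // 2, N) != N - 1   # the "good a" conditions
p = gcd(pow(a, r // 2, N) - 1, N)
q = gcd(pow(a, r // 2, N) + 1, N)
print(r, p, q)                            # both gcds are non-trivial factors
```

Here \( r = 260 \), and the two gcds recover 61 and 53.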

Order-Finding via Phase Estimation

The unitary \( U_a : |x\rangle \mapsto |ax \bmod N\rangle \) acts on \( \lceil \log_2 N \rceil \)-qubit registers. Its eigenvectors are:

\[ |u_s\rangle = \frac{1}{\sqrt{r}}\sum_{j=0}^{r-1} e^{-2\pi i js/r}|a^j \bmod N\rangle, \qquad s = 0, 1, \ldots, r-1, \]

with eigenvalues \( e^{2\pi i s/r} \). Observe that \( \frac{1}{\sqrt{r}}\sum_{s=0}^{r-1}|u_s\rangle = |1\rangle \), so starting from \( |1\rangle \) in the second register (which we can prepare easily) gives an equal superposition of all eigenvectors.

Phase estimation on \( U_a \) with input \( |1\rangle \) yields a random \( s/r \) (for random \( s \)), to \( t \) bits of precision, after which:

Continued Fractions Step. From a measured approximation \( \tilde{\phi} \approx s/r \), run the classical continued fraction algorithm to find the best rational approximation \( p/q \) with \( q \leq N \). If \( \gcd(s,r) = 1 \), then \( p/q = s/r \) exactly and \( q = r \). Since \( \gcd(s,r) = 1 \) holds for a \( \phi(r)/r \) fraction of values \( s \), the success probability per trial is \( \Omega(1/\log\log N) \); repeating \( O(\log\log N) \) times recovers \( r \) with constant probability.
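This step can be sketched with Python's `fractions` module. The order \( r = 260 \), numerator \( s = 77 \), and precision \( t = 24 \) bits (twice the bit-length of \( N = 3233 \)) are illustrative values:

```python
from fractions import Fraction

# Recover r from a t-bit phase measurement, as in Shor's classical
# post-processing: the measurement returns the integer closest to
# (s/r) * 2^t, and continued fractions recover s/r.
t, r, s = 24, 260, 77                 # gcd(77, 260) = 1
measured = round(s / r * 2 ** t)      # integer output of phase estimation
phi = Fraction(measured, 2 ** t)

# Best rational approximation with denominator at most N = 3233.
approx = phi.limit_denominator(3233)
print(approx)                         # 77/260: the denominator is r
```

With \( t = 2n \) bits of precision, the measured fraction is close enough that \( s/r \) is the unique candidate with denominator below \( N \), so `limit_denominator` returns it exactly.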

Complexity. Each controlled modular multiplication needs \( O((\log N)^2) \) quantum gates via schoolbook multiplication (or \( O(\log N \cdot (\log\log N)^2) \) with FFT-based multiplication). Phase estimation with \( t = 2\lceil \log_2 N \rceil + O(1) \) precision bits requires \( O(\log N) \) such controlled applications. Total quantum gate count: \( O((\log N)^3) \), polynomial in the bit-length of \( N \). Compare to the classical NFS running time \( \exp(O((\log N)^{1/3}(\log\log N)^{2/3})) \).

Discrete Logarithm

Shor also solves DLP: given \( g, h \in G \) with \( h = g^x \), find \( x \). Consider the function \( f(a,b) = g^a h^{-b} = g^{a - xb} \) on \( \mathbb{Z}_q \times \mathbb{Z}_q \) (where \( q = |G| \)); it is constant exactly on cosets of the subgroup \( \langle (x, 1) \rangle \). Preparing a superposition over \( (a,b) \), computing \( f \), and applying the QFT to both exponent registers reveals this subgroup and hence \( x = \log_g h \). The algorithm works in any group whose operation is efficiently computable as a quantum circuit, in particular in \( (\mathbb{Z}/p\mathbb{Z})^* \) and in elliptic curve groups over finite fields. The latter is critical: ECC was designed to resist classical DLP attacks (no subexponential algorithm is known), but Shor's algorithm breaks it in \( O((\log p)^3) \) quantum gate operations.

Abelian Hidden Subgroup Problem

Both factoring and DLP are instances of the Abelian Hidden Subgroup Problem (HSP):

Abelian HSP. Let \( G \) be a finite abelian group (given as a product of cyclic groups) and \( f: G \to S \) be a function that is constant and distinct on cosets of an unknown subgroup \( H \leq G \). Find the generators of \( H \).

QFT over the abelian group \( G \) solves HSP in quantum polynomial time. Order-finding (and hence factoring) is period-finding over \( \mathbb{Z} \), implemented via a large cyclic group \( \mathbb{Z}_{2^t} \); DLP corresponds to \( G = \mathbb{Z}_q \times \mathbb{Z}_q \). For non-abelian groups, such as the symmetric group \( S_n \) (relevant for Graph Isomorphism), QFT-based approaches work only partially; no polynomial-time quantum algorithm is known for the non-abelian HSP in general, which is why lattice problems (which lack exploitable abelian structure) resist Shor-style attacks.

Resource Estimates and Gate Complexity

Gate count. For an \( n \)-bit integer \( N \) (so \( n = \lceil \log_2 N \rceil \)):

  • Phase estimation requires \( t = 2n + O(1) \) ancilla qubits for sufficient precision.
  • Each controlled-\( U_a^{2^k} \) application is a controlled modular multiplication by the classically precomputed constant \( a^{2^k} \bmod N \), costing \( O(n^2) \) Toffoli gates by schoolbook arithmetic (\( O(n) \) controlled additions, each of \( O(n) \) gates). Total across \( 2n \) steps: \( O(n^3) \) Toffoli gates.
  • The QFT inverse uses \( O(n^2) \) gates.
  • Total: \( O(n^3) \) gates; for RSA-2048 (\( n = 2048 \)) this is roughly \( 2048^3 \approx 8.6 \times 10^9 \) Toffoli gates, up to constant factors.

Breaking RSA-2048 requires approximately 4,000 logical qubits with full quantum error correction. Each logical qubit requires roughly 1,000 physical qubits under the surface code (assumed error rate \( p \approx 10^{-3} \)). Total physical qubit count: \( \sim 4 \times 10^6 \). Runtime: a few hours at a clock rate of 1 MHz logical operations. As of 2025, the best superconducting quantum processors (Google Willow, IBM Heron) have on the order of a hundred physical qubits with error rates around \( 10^{-3} \), still far from the fault-tolerant regime needed for Shor at cryptographic scale. The gap is roughly four orders of magnitude in qubit count and larger still in quality; however, quantum error correction overhead may decrease significantly with improved hardware, making the timeline deeply uncertain.

Remark (Why RSA-2048 Requires ~4000 Logical Qubits). The \( 2n \)-qubit ancilla register (for \( n = 2048 \): 4096 qubits) dominates the logical qubit count. The \( n \)-qubit register for \( x \bmod N \) adds 2048 more, but optimized circuits (Beauregard 2003, Häner–Roetteler–Svore 2017) reduce the total to roughly \( 2n + 3 \) qubits through clever qubit reuse during modular arithmetic. For RSA-2048, the optimized Beauregard circuit uses \( 2 \times 2048 + 3 = 4099 \) logical qubits — the source of the "~4000 logical qubits" figure. At a surface code distance \( d = 27 \) (sufficient for physical error rate \( p \approx 10^{-3} \), below threshold), each logical qubit costs \( 2d^2 - 1 = 1457 \) physical qubits. Total: \( 4099 \times 1457 \approx 6 \times 10^6 \) physical qubits. A full end-to-end analysis (Gidney–Ekerå, "How to factor 2048 bit RSA integers in 8 hours using 20 million noisy qubits," 2021) arrives at roughly \( 2 \times 10^7 \) physical qubits once magic-state distillation and routing overheads (omitted from the simple count above) are included, using windowed arithmetic to trade space for time; the fundamental \( O(n^3) \) gate complexity remains.

7.5 Simon’s Algorithm and Symmetric Cryptography Attacks

Simon’s algorithm (1994) predates Shor and was the first to demonstrate an exponential quantum speedup. It has surprising consequences for symmetric cryptography.

Simon's Problem. Given oracle access to \( f: \{0,1\}^n \to \{0,1\}^n \) promised to be two-to-one with \( f(x) = f(y) \) iff \( y \in \{x, x \oplus s\} \), for a hidden string \( s \in \{0,1\}^n \setminus \{0^n\} \), find \( s \). Classical query complexity: \( \Omega(2^{n/2}) \) (birthday bound). Quantum query complexity: \( O(n) \).

Algorithm. Prepare \( \frac{1}{\sqrt{2^n}}\sum_x|x\rangle|0\rangle \). Apply the oracle: \( \frac{1}{\sqrt{2^n}}\sum_x|x\rangle|f(x)\rangle \). Measure the second register; this collapses the first to an equal superposition \( \frac{1}{\sqrt{2}}(|x_0\rangle + |x_0 \oplus s\rangle) \) for a random \( x_0 \). Apply Hadamard to the first register; the resulting state has support only on \( \{y : y \cdot s = 0\pmod{2}\} \). Collect \( O(n) \) such vectors \( y \) and solve the resulting linear system over \( \mathbb{F}_2 \) to recover \( s \).
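A classical simulation of the measurement statistics makes the linear-algebra step concrete: each simulated run returns a uniform \( y \) with \( y \cdot s = 0 \), and the hidden string here is an arbitrary example:

```python
import random

# Classical simulation of Simon measurement statistics: each run yields
# a uniform y with y . s = 0 over F_2. Collect n - 1 independent y's,
# then s is the unique nonzero vector orthogonal to all of them.
random.seed(1)
n, s = 6, 0b101011              # hidden string (an arbitrary example)

def dot(x, y):                  # F_2 inner product of bit vectors
    return bin(x & y).count("1") % 2

def sample():                   # one simulated Simon measurement
    while True:
        y = random.randrange(2 ** n)
        if dot(y, s) == 0:
            return y

rows = []                       # F_2 basis with distinct leading bits
while len(rows) < n - 1:
    y = sample()
    for r in rows:              # eliminate each basis row's leading bit
        y = min(y, y ^ r)
    if y:                       # nonzero after reduction => independent
        rows.append(y)
        rows.sort(reverse=True)

# Brute-force the 1-dimensional orthogonal complement: {0, s}.
cand = [x for x in range(1, 2 ** n) if all(dot(x, r) == 0 for r in rows)]
print(bin(cand[0]))             # recovers s = 0b101011
```

The quantum algorithm's only role is producing the samples; everything after `sample()` is classical Gaussian elimination over \( \mathbb{F}_2 \).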

Application to symmetric cryptography. Consider the Even-Mansour construction \( E_{k_1,k_2}(x) = P(x \oplus k_1) \oplus k_2 \), where \( P \) is a public random permutation. Classically, Even-Mansour is secure up to roughly \( 2^{n/2} \) queries (any attack making \( D \) cipher queries and \( T \) permutation queries needs \( DT \gtrsim 2^n \)). Quantumly, Simon's algorithm recovers \( k_1 \) in \( O(n) \) superposition queries: the function \( f(x) = E(x) \oplus P(x) = P(x \oplus k_1) \oplus P(x) \oplus k_2 \) satisfies \( f(x \oplus k_1) = f(x) \), so it has exactly Simon's hidden-shift structure with period \( k_1 \); once \( k_1 \) is known, \( k_2 = E(x) \oplus P(x \oplus k_1) \) follows from a single query. More generally:

  • CBC-MAC and PMAC are broken in \( O(n) \) quantum queries (Kaplan–Leurent–Leverrier–Naya-Plasencia, 2016).
  • GHASH (used in AES-GCM) is vulnerable to polynomial evaluation attacks lifted to quantum.
  • Any construction with an XOR-periodic internal function is potentially vulnerable.
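The Even-Mansour periodicity can be checked classically for a toy 8-bit permutation (illustrative keys; note the quantum speedup lies in *finding* the period, not verifying it):

```python
import random

# Check the Simon structure of Even-Mansour: with a random public
# permutation P on 8-bit strings and E(x) = P(x ^ k1) ^ k2, the
# function f(x) = E(x) ^ P(x) is periodic with hidden period k1.
random.seed(0)
n = 8
table = list(range(2 ** n))
random.shuffle(table)                    # the public permutation P

def P(x):
    return table[x]

k1, k2 = 0b10110101, 0b01101100          # secret Even-Mansour keys

def E(x):
    return P(x ^ k1) ^ k2

def f(x):                                # the Simon-periodic function
    return E(x) ^ P(x)

assert all(f(x) == f(x ^ k1) for x in range(2 ** n))
print("f has hidden period k1 =", bin(k1))
```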

These attacks require a quantum adversary — one who can query the cipher on a superposition of inputs, which is not a standard threat model. In the Q1 model (classical queries, quantum computation), Even-Mansour and AES-GCM retain their classical security. In the Q2 model (quantum queries), many symmetric constructions require redesign. The NIST symmetric standards (AES, SHA-3) are believed secure in Q1; their Q2 security is an active research topic.

7.6 Grover’s Algorithm

Grover’s algorithm solves unstructured search quadratically faster than any classical algorithm.

Grover's Algorithm. Given an oracle \( O_f: |x\rangle \mapsto (-1)^{f(x)}|x\rangle \) for \( f:\{0,1\}^n \to \{0,1\} \) with \( M \) solutions out of \( N = 2^n \) inputs, Grover's algorithm finds a solution in \( O\!\left(\sqrt{N/M}\right) \) oracle queries. The classical lower bound is \( \Omega(N/M) \). This speedup is provably optimal for quantum query algorithms.

Geometric interpretation. Let \( |\alpha\rangle = \frac{1}{\sqrt{N-M}}\sum_{f(x)=0}|x\rangle \) (non-solutions) and \( |\beta\rangle = \frac{1}{\sqrt{M}}\sum_{f(x)=1}|x\rangle \) (solutions). The initial state \( H^{\otimes n}|0\rangle = \sqrt{\frac{N-M}{N}}|\alpha\rangle + \sqrt{\frac{M}{N}}|\beta\rangle \) lies at angle \( \theta/2 \) from \( |\alpha\rangle \), where \( \sin(\theta/2) = \sqrt{M/N} \).

Each Grover iteration — oracle phase flip \( O_f \) followed by inversion-about-mean \( 2H|0\rangle\langle 0|H - I \) — rotates the state by angle \( \theta \) in the 2D plane spanned by \( |\alpha\rangle \) and \( |\beta\rangle \). After \( k \) iterations, the amplitude at \( |\beta\rangle \) is \( \sin\!\left((2k+1)\theta/2\right) \). This is maximized at \( k \approx \frac{\pi}{4}\sqrt{N/M} \) iterations, at which point measurement yields a solution with probability \( \geq 1 - O(M/N) \).
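The rotation analysis can be checked numerically; a sketch for \( N = 2^{16} \) with a single marked item:

```python
import math

# Success probability after k Grover iterations is sin^2((2k+1) theta/2)
# with sin(theta/2) = sqrt(M/N); check that it peaks near (pi/4)sqrt(N/M).
N, M = 2 ** 16, 1
theta = 2 * math.asin(math.sqrt(M / N))

def p_success(k):
    return math.sin((2 * k + 1) * theta / 2) ** 2

k_opt = round(math.pi / 4 * math.sqrt(N / M))      # analytic estimate
best = max(range(2 * k_opt), key=p_success)        # numerically best k
print(k_opt, best, p_success(best))
```

The numerically optimal iteration count matches the analytic estimate (201 iterations here), with success probability essentially 1. Note that over-rotating past \( k_{\text{opt}} \) *decreases* the success probability, so the iteration count must be chosen, not merely made large.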

Optimality. Bennett, Bernstein, Brassard, and Vazirani (1997) proved that any quantum algorithm requires \( \Omega(\sqrt{N/M}) \) oracle queries. Grover’s algorithm is therefore optimal.

Unknown number of solutions. When \( M \) is unknown, run Grover with geometrically increasing iteration counts \( 1, 2, 4, \ldots \) until a solution is found; expected total queries \( O(\sqrt{N/M}) \).

Amplitude amplification. Grover’s iteration generalizes to any quantum algorithm \( \mathcal{A} \) that finds a solution with probability \( p \): the amplified algorithm solves the problem in \( O(1/\sqrt{p}) \) applications of \( \mathcal{A} \), regardless of the structure of the search space. This makes Grover applicable as an inner loop in more complex quantum algorithms.

7.7 Impact on Cryptographic Security Levels

Grover’s algorithm halves the bit-security of every primitive that relies on exhaustive search.

Block ciphers. An \( n \)-bit key has classical security \( 2^n \) (exhaustive key search). Grover gives a quantum key search in \( O(2^{n/2}) \) evaluations. Consequence:

| Cipher | Key size | Classical security | Quantum security |
|---|---|---|---|
| AES-128 | 128 bits | \( 2^{128} \) | \( \approx 2^{64} \) (insecure) |
| AES-256 | 256 bits | \( 2^{256} \) | \( \approx 2^{128} \) (safe) |
| 3DES-112 | 112 bits | \( 2^{112} \) | \( \approx 2^{56} \) (insecure) |

Hash functions. Preimage resistance: Grover finds a preimage of an \( n \)-bit hash in \( O(2^{n/2}) \) queries. Collision resistance: the Brassard-Høyer-Tapp (BHT) algorithm finds collisions in \( O(2^{n/3}) \) quantum queries (Grover search against a precomputed table of \( 2^{n/3} \) hash values, assuming cheap QRAM), better than the generic \( O(2^{n/2}) \).

| Hash | Output | Classical collision | Quantum collision (BHT) |
|---|---|---|---|
| SHA-256 | 256 bits | \( 2^{128} \) | \( \approx 2^{85} \) |
| SHA-384 | 384 bits | \( 2^{192} \) | \( \approx 2^{128} \) |
| SHA3-256 | 256 bits | \( 2^{128} \) | \( \approx 2^{85} \) |

NIST symmetric recommendations: for post-quantum deployments, use AES-256 for encryption, and SHA-384 (or SHA3-384) where collision resistance beyond the \( \approx 2^{85} \) quantum bound is required; SHA3-256 suffices when a \( 2^{85} \) collision cost is an acceptable margin. The KEM and signature schemes standardized by NIST already build on SHA-3/SHAKE with explicit quantum security analysis.

7.8 Why Lattice Problems Resist Quantum Algorithms

The abelian group structure of \( (\mathbb{Z}/N\mathbb{Z})^* \) and elliptic curve groups is precisely what Shor’s algorithm exploits. Lattice problems have no known exploitable group structure.

HSP obstruction. SVP and CVP are not known to reduce to any HSP. The closest structural problem is the dihedral hidden subgroup problem (relevant for ideal lattices), for which no efficient quantum algorithm is known despite decades of effort. The best known quantum algorithm for dihedral HSP (Kuperberg 2005) runs in \( 2^{O(\sqrt{\log N})} \) time — subexponential but not polynomial.

Quantum speedups for lattice algorithms. The best classical lattice algorithm for \( n \)-dimensional SVP is the BKZ-\(\beta\) algorithm with sieving subroutine, running in time \( \sim 2^{0.292\beta} \) per BKZ tour. Quantum speedups:

  1. Grover speedup on classical sieving: The inner sieve can use Grover search, reducing the exponent from \( 0.292 \) to \( 0.265 \). This is only a modest constant-factor improvement in the exponent, not exponential.
  2. Quantum walk speedup on sieving: quantum-walk variants of sieving (Chailloux–Loyer 2021, building on Laarhoven 2016) lower the exponent slightly further, to roughly \( 2^{0.257n} \). Again only a constant improvement in the exponent.
  3. No known exponential speedup exists for lattice sieving or enumeration. The best quantum attack on LWE remains classical BKZ with minor Grover enhancement.

Concrete numbers. For Kyber-768 (\( n = 768 \), \( q = 3329 \)):

  • Classical core-SVP hardness: \( \approx 2^{183} \)
  • Quantum core-SVP hardness: \( \approx 2^{166} \)
  • NIST target: Category 3 (comparable to AES-192 exhaustive key search)

The conservative \( \approx 2^{166} \) quantum core-SVP estimate sits comfortably above the 128-bit security floor, a healthy margin. This is the strongest current evidence that lattice cryptography is post-quantum secure.
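The quoted core-SVP figures come from applying the sieve exponents to the BKZ block size. A block size \( \beta \approx 626 \) for Kyber-768 is an assumed value chosen here to be consistent with the quoted numbers, not a figure taken from the Kyber specification:

```python
# Core-SVP cost model: time 2^(c * beta) for BKZ block size beta, with
# c = 0.292 for the classical sieve and c = 0.265 for the
# Grover-accelerated quantum sieve. beta = 626 is an assumed value.
beta = 626
classical_bits = 0.292 * beta
quantum_bits = 0.265 * beta
print(round(classical_bits), round(quantum_bits))   # matches 183 and 166
```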

7.9 The NIST PQC Standardization Process

NIST initiated its Post-Quantum Cryptography Standardization in 2016, receiving 69 complete submissions. After four public rounds and extensive cryptanalysis:

| FIPS Standard | Scheme | Primitive | Mathematical Basis | Year |
|---|---|---|---|---|
| FIPS 203 (ML-KEM) | CRYSTALS-Kyber | KEM | Module-LWE | 2024 |
| FIPS 204 (ML-DSA) | CRYSTALS-Dilithium | Signature | Module-LWE + SIS | 2024 |
| FIPS 205 (SLH-DSA) | SPHINCS+ | Signature | Hash functions (XMSS/FORS) | 2024 |
| FIPS 206 (FN-DSA) | FALCON | Signature | NTRU lattices + GPV | draft |

Why four schemes? Kyber and Dilithium share the same mathematical assumption (Module-LWE/SIS); a breakthrough against one likely breaks the other. SPHINCS+ provides an independent safety net based only on hash function security — if all lattice assumptions fail, SPHINCS+ still stands, at the cost of larger signatures (8 KB for SPHINCS+-128s). FALCON offers smaller signatures than Dilithium (about 666 bytes vs 2.4 KB at NIST Level 1) but requires precise Gaussian sampling over lattices, making it harder to implement without side-channel leakage.

SIKE breach (2022). The Supersingular Isogeny Key Encapsulation (SIKE) scheme was a round-4 alternate candidate. In July 2022, Castryck and Decru published a polynomial-time classical attack exploiting the additional structure of SIDH’s auxiliary torsion points via isogeny computations on higher-dimensional abelian varieties. SIKE was completely broken in under one hour on a laptop. This episode is a cautionary tale: SIDH had been studied for nearly a decade before a fatal flaw was discovered. NIST’s conservative multi-assumption approach is directly vindicated by SIKE’s failure.

7.10 Hybrid Cryptography and Post-Quantum Migration

The uncertainty in quantum computing timelines, combined with the long lifespan of encrypted archives, creates a case for immediate action.

“Harvest now, decrypt later.” A well-resourced adversary intercepting TLS handshakes today can store the classical RSA/ECDH ciphertexts and decrypt them retroactively once a large-scale quantum computer becomes available. Secrets with long lifespans — diplomatic communications, financial records, health data, intellectual property — are at risk even before any quantum computer is built.

Hybrid key exchange. During the transition period, NIST recommends combining a classical scheme with a post-quantum scheme:

Hybrid Key Exchange (X25519Kyber768). In a TLS handshake:
  1. Run X25519 ECDH: both parties compute shared secret \( K_\text{EC} = \text{X25519}(a, B) \).
  2. Run Kyber-768 KEM: encapsulate a second shared secret \( K_\text{PQ} \).
  3. Derive session key: \( K = \text{KDF}(K_\text{EC} \| K_\text{PQ} \| \text{transcript}) \).
Security: the scheme is secure if either X25519 or Kyber is secure. A classical attacker breaks neither; a quantum attacker breaks X25519 but not Kyber. The combination is safe against both.
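The key-combination step can be sketched with an HMAC-based KDF (the secrets, transcript, and KDF label below are placeholder values; real TLS 1.3 uses its own key schedule):

```python
import hashlib
import hmac

# Combiner step of a hybrid handshake: derive the session key from
# both shared secrets plus the transcript, HKDF-Extract style.
K_ec = bytes(32)                       # stand-in X25519 shared secret
K_pq = bytes(32)                       # stand-in ML-KEM shared secret
transcript = b"client_hello || server_hello"

def kdf(secret, context):
    # Illustrative label; a real protocol fixes its own domain separator.
    return hmac.new(b"hybrid-kdf", secret + context, hashlib.sha256).digest()

K = kdf(K_ec + K_pq, transcript)
print(len(K), "byte session key")      # 32-byte session key
```

Because both secrets are hashed together, an attacker must recover *both* \( K_\text{EC} \) and \( K_\text{PQ} \) to learn \( K \).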

X25519Kyber768 (since superseded by the X25519MLKEM768 hybrid, specified for TLS 1.3 in an IETF draft) is deployed in Chrome 116+, Firefox 132+, and Cloudflare’s TLS termination infrastructure.

Migration checklist for cryptographic engineers:

  • Replace RSA and ECDH with ML-KEM-768 or ML-KEM-1024 for key exchange.
  • Replace ECDSA and RSA signatures with ML-DSA-65 or FN-DSA-512 for authentication.
  • Replace AES-128 with AES-256; replace SHA-256 with SHA-384 for integrity.
  • Audit protocols for Simon-vulnerable constructions (CBC-MAC, counter-mode with short tags) if quantum oracle access is a concern in the future threat model.
  • Deploy hybrid schemes during transition to retain classical security until post-quantum deployment is complete.

Chapter 8: Pairing-Based Cryptography and Advanced Topics

8.1 Bilinear Pairings

Pairing-based cryptography derives additional functionality — identity-based encryption, short signatures, non-interactive zero knowledge — from special bilinear maps between elliptic curve groups.

Definition. Let \( G_1, G_2, G_T \) be cyclic groups of prime order \( q \) with generators \( P_1, P_2, P_T \). A bilinear pairing is a map \( e : G_1 \times G_2 \to G_T \) satisfying:
  • Bilinearity: \( e(aP, bQ) = e(P, Q)^{ab} \) for all \( a, b \in \mathbb{Z}_q \).
  • Non-degeneracy: \( e(P_1, P_2) \neq 1_{G_T} \).
  • Efficiency: \( e \) is computable in polynomial time.

The standard examples are the Weil pairing and Tate pairing on supersingular or pairing-friendly elliptic curves. The Weil pairing \( e_n(P, Q) \) for \( n \)-torsion points \( P, Q \in E[n] \) is defined via rational functions (Weil’s \( f_P \) functions), evaluated at divisors supported outside the torsion points.

Bilinearity is the key property: \( e(aP, bQ) = e(P, Q)^{ab} \), so the pairing collapses two separate DH-style computations into one group target element.

Weil pairing intuition. For an elliptic curve \( E \) over \( \mathbb{F}_q \) and integer \( n \) with \( \gcd(n, q) = 1 \), the \( n \)-torsion subgroup \( E[n] \cong (\mathbb{Z}/n\mathbb{Z})^2 \) over the algebraic closure. For each \( P \in E[n] \), there exists a rational function \( f_P \) on the curve with divisor \( n(P) - n(\mathcal{O}) \). The Weil pairing is:

\[ e_n(P, Q) = \frac{f_P(D_Q)}{f_Q(D_P)}, \]

where \( D_P, D_Q \) are divisors of degree 0 supported away from each other’s support, linearly equivalent to \( (P) - (\mathcal{O}) \) and \( (Q) - (\mathcal{O}) \) respectively. Miller’s algorithm (1986) computes \( f_P(D_Q) \) in \( O(\log n) \) steps by a double-and-add procedure on the function values, making pairing computation practical. The bilinearity \( e_n(aP, Q) = e_n(P, Q)^a \) follows from the theory of divisors and Weil reciprocity.

Protocol (Joux Tripartite Diffie-Hellman, 2000). Let \( e: G_1 \times G_1 \to G_T \) be a symmetric bilinear pairing (Type-1: \( G_1 = G_2 \)) with generator \( P \in G_1 \). Alice, Bob, and Carol choose private exponents \( a, b, c \in \mathbb{Z}_q \) respectively and broadcast public values \( aP, bP, cP \in G_1 \).
  • Alice computes: \( e(bP, cP)^a = e(P,P)^{abc} \).
  • Bob computes: \( e(aP, cP)^b = e(P,P)^{abc} \).
  • Carol computes: \( e(aP, bP)^c = e(P,P)^{abc} \).
All three compute the same shared secret \( K = e(P,P)^{abc} \in G_T \) with a single pairing evaluation. In classical DH, achieving three-party key agreement non-interactively in one round is impossible — two rounds are required. The pairing collapses the second round entirely, enabling one-round three-party key agreement. Security relies on the BDDH (Bilinear Decisional DH) assumption in \( G_T \): given \( (P, aP, bP, cP, T) \), distinguish \( T = e(P,P)^{abc} \) from \( T = e(P,P)^r \) for random \( r \).

Remark (Why Pairings Enable Tripartite DH but Classical Groups Cannot). In a classical group \( G \) with generator \( g \), three parties with public keys \( g^a, g^b, g^c \) cannot compute \( g^{abc} \) non-interactively. Alice can compute \( (g^b)^a = g^{ab} \), but to get \( g^{abc} \) she would need \( (g^{ab})^c \) — which requires Carol's private key \( c \), unavailable to Alice. The pairing resolves this by mapping into a target group: \( e(g^b, g^c)^a = e(g,g)^{abc} \) combines Bob's and Carol's public values in a way that Alice can exponentiate. The target group element \( e(g,g)^{abc} \) is computable by each party from the other two's public keys — something impossible in a group without a bilinear structure. This is the fundamental new capability that pairings add to public-key cryptography.
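The tripartite protocol can be run end-to-end with a toy pairing: model \( G_1 = (\mathbb{Z}_q, +) \) with generator \( P = 1 \), and set \( e(u, v) = g^{uv} \bmod p \). This map is bilinear but completely insecure (discrete logs in \( \mathbb{Z}_q \) are trivial), so it illustrates only the algebra:

```python
import random

# Toy symmetric pairing: G1 = (Z_q, +) with generator P = 1, G_T = the
# order-q subgroup of Z_p^*, and e(u, v) = g^(u*v) mod p. Bilinear but
# INSECURE; real schemes use Weil/Tate pairings on curves.
q, p = 1019, 2039                # q prime, p = 2q + 1 prime
g = pow(3, 2, p)                 # element of order q in Z_p^*

def e(u, v):
    return pow(g, (u * v) % q, p)

random.seed(7)
a, b, c = (random.randrange(1, q) for _ in range(3))
A, B, C = a % q, b % q, c % q    # broadcast values aP, bP, cP (P = 1)

K_alice = pow(e(B, C), a, p)     # e(bP, cP)^a
K_bob   = pow(e(A, C), b, p)     # e(aP, cP)^b
K_carol = pow(e(A, B), c, p)     # e(aP, bP)^c
assert K_alice == K_bob == K_carol == pow(g, (a * b * c) % q, p)
print("shared secret:", K_alice)
```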

8.2 Decisional Bilinear Diffie-Hellman (DBDH)

The security of pairing-based schemes typically relies on the DBDH assumption: given \( (g, g^a, g^b, g^c, T) \in G_1^4 \times G_T \), distinguish \( T = e(g,g)^{abc} \) from \( T = e(g,g)^r \) with \( r \) random. This assumption does not follow from standard CDH/DDH in \( G_1 \), making it an independent hardness assumption.

Pairings destroy DDH hardness in \( G_1 \): given \( (g, g^a, g^b, Z) \), compute \( e(g^a, g^b) \) and compare to \( e(g, Z) \); they are equal iff \( Z = g^{ab} \). So DDH in \( G_1 \) is easy whenever a pairing exists, and the DDH-based IND-CPA proof for ElGamal does not apply in pairing groups. Pairing-based schemes instead build security on the separate DBDH or BDH assumptions.

8.3 Boneh-Franklin Identity-Based Encryption

The most famous application of pairings is Identity-Based Encryption (IBE), where a party’s public key is simply their identity string (email address, name, employee ID). A trusted Private Key Generator (PKG) issues private keys corresponding to identities.

Scheme (Boneh-Franklin IBE, sketch).
  • Setup: PKG samples master secret \( s \leftarrow \mathbb{Z}_q^* \); computes \( P_\text{pub} = sP \). Public parameters include \( (G_1, G_T, e, P, P_\text{pub}, H_1, H_2) \).
  • KeyGen(ID): PKG computes \( Q_{ID} = H_1(ID) \in G_1 \) and issues \( d_{ID} = s \cdot Q_{ID} \) to the user.
  • Enc(ID, M): Compute \( Q_{ID} = H_1(ID) \), sample \( r \leftarrow \mathbb{Z}_q \); ciphertext \( (rP,\, M \oplus H_2(e(Q_{ID}, P_\text{pub})^r)) \).
  • Dec(\( d_{ID} \), \( (U, V) \)): Compute \( V \oplus H_2(e(d_{ID}, U)) \). Correctness: \( e(d_{ID}, U) = e(sQ_{ID}, rP) = e(Q_{ID}, P)^{sr} = e(Q_{ID}, P_\text{pub})^r \).

Security is proven in the random oracle model under the BDH assumption. IBE enables powerful applications: timed-release encryption (identity = “2030-01-01”), revocation via key expiry, and hierarchical key delegation.
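The correctness chain can be exercised with a toy pairing \( e(u, v) = g^{uv} \bmod p \) over \( G_1 = (\mathbb{Z}_q, +) \). This pairing is bilinear but insecure, and all parameters and hash instantiations below are illustrative stand-ins:

```python
import hashlib
import random

# Toy run of the Boneh-Franklin equations. G1 is modeled additively as
# (Z_q, +) with generator P = 1; e(u, v) = g^(u*v) mod p is a toy
# bilinear map (INSECURE). H1, H2 are illustrative hash stand-ins.
q, p = 1019, 2039                         # q prime, p = 2q + 1
g = 9                                     # element of order q in Z_p^*
e = lambda u, v: pow(g, (u * v) % q, p)
H1 = lambda ident: int.from_bytes(hashlib.sha256(ident).digest(), "big") % q
H2 = lambda x: hashlib.sha256(str(x).encode()).digest()
xor = lambda a, b: bytes(s ^ t for s, t in zip(a, b))

random.seed(3)
s = random.randrange(1, q)                # PKG master secret
P_pub = s % q                             # P_pub = sP

ID = b"alice@example.com"
Q = H1(ID)                                # Q_ID = H1(ID)
d = (s * Q) % q                           # private key d_ID = s * Q_ID

# Enc(ID, M): (U, V) = (rP, M xor H2(e(Q_ID, P_pub)^r))
M = b"attack at dawn"
r = random.randrange(1, q)
U = r % q
V = xor(M, H2(pow(e(Q, P_pub), r, p))[:len(M)])

# Dec(d_ID, (U, V)): e(d_ID, U) equals e(Q_ID, P_pub)^r by bilinearity
M2 = xor(V, H2(e(d, U))[:len(V)])
assert M2 == M
print(M2)
```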

Remark (IBE as the Killer App of Pairings). Identity-based encryption was proposed by Shamir in 1984 — but Shamir only posed the problem; he had no construction. For 17 years, the question of whether a practical IBE scheme existed remained open. The Boneh-Franklin construction (2001) was the first efficient solution, and it required bilinear pairings in an essential way: the pairing is what allows the PKG's master secret \( s \) to be "threaded through" both the key derivation (\( d_{ID} = s\cdot Q_{ID} \)) and the encryption (\( e(Q_{ID}, P_\text{pub})^r = e(Q_{ID}, sP)^r \)) so that decryption becomes self-consistent without revealing \( s \). Without a pairing, no algebraic tool for this "key escrow without key exposure" was known.

IBE eliminates the need for a public-key certificate infrastructure (PKI): there are no certificates to manage, no CAs to trust, and no certificate revocation lists. A sender encrypting to “alice@example.com” needs only the public parameters and Alice’s email address — they do not need Alice’s certificate or even Alice’s prior participation. The practical appeal is significant for closed systems (corporate email, government communications), and the theoretical reach extends to attribute-based encryption and functional encryption, which are direct generalizations of IBE enabled by the same pairing structure.

8.4 BLS Short Signatures

The Boneh-Lynn-Shacham (BLS) signature scheme uses pairings to achieve extremely short signatures (about 48 bytes at the 128-bit security level, compared to ECDSA’s 64 bytes and Schnorr’s 64 bytes), with a particularly clean algebraic structure.

Scheme (BLS Signatures).
  • KeyGen: \( sk = x \leftarrow \mathbb{Z}_q \), \( pk = xP \in G_1 \).
  • Sign(sk, m): \( \sigma = x \cdot H(m) \in G_1 \) where \( H: \{0,1\}^* \to G_1 \).
  • Verify(pk, m, \( \sigma \)): Check \( e(\sigma, P) = e(H(m), pk) \). Correctness: \( e(xH(m), P) = e(H(m), P)^x = e(H(m), xP) \).

BLS signatures are aggregatable: \( n \) signatures \( \sigma_1, \ldots, \sigma_n \) on messages \( m_1, \ldots, m_n \) under keys \( pk_1, \ldots, pk_n \) aggregate to a single signature \( \sigma = \sum_i \sigma_i \), verifiable as \( e(\sigma, P) = \prod_i e(H(m_i), pk_i) \). This enables blockchain applications (Ethereum 2.0 uses BLS for validator signatures) where thousands of signatures must be aggregated.

Example (BLS Signature Scheme). The BLS scheme uses an asymmetric pairing \( e: G_1 \times G_2 \to G_T \) where \( G_1, G_2, G_T \) are cyclic groups of prime order \( q \) with generators \( g_1 \in G_1 \) and \( g_2 \in G_2 \).

Key generation: Private key \( x \leftarrow \mathbb{Z}_q \). Public key \( \mathit{pk} = g_2^x \in G_2 \).

Signing: Compute \( \sigma = H(m)^x \in G_1 \), where \( H:\{0,1\}^* \to G_1 \) is a hash function modeled as mapping to random curve points.

Verification: Check \( e(\sigma, g_2) = e(H(m), \mathit{pk}) \).

Correctness:

\[ e(\sigma, g_2) = e(H(m)^x, g_2) = e(H(m), g_2)^x = e(H(m), g_2^x) = e(H(m), \mathit{pk}). \]

Aggregation: two signatures \( \sigma_1, \sigma_2 \) on messages \( m_1, m_2 \) under keys \( \mathit{pk}_1, \mathit{pk}_2 \) combine to \( \sigma_{\text{agg}} = \sigma_1\sigma_2 \), verified by

\[ e(\sigma_{\text{agg}}, g_2) = e(\sigma_1\sigma_2, g_2) = e(\sigma_1, g_2)\cdot e(\sigma_2, g_2) = e(H(m_1), \mathit{pk}_1)\cdot e(H(m_2), \mathit{pk}_2). \]

A verifier with public keys \( \mathit{pk}_1, \mathit{pk}_2 \) checks both signatures against the single aggregate: in general, \( n \) signatures on distinct messages verify with \( n + 1 \) pairings instead of \( 2n \), and signatures on a common message verify with just two pairings regardless of the number of signers. This is why Ethereum’s proof-of-stake protocol uses BLS to aggregate the signatures of hundreds of thousands of validators into a single compact proof per block.
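Aggregation can likewise be demonstrated with a toy pairing \( e(u, v) = g^{uv} \bmod p \) over \( G_1 = G_2 = (\mathbb{Z}_q, +) \) (insecure; all parameters are illustrative):

```python
import hashlib
import random

# Toy BLS aggregation. Pairing e(u, v) = g^(u*v) mod p over additive
# groups mod q; hash-to-group is SHA-256 reduced mod q. Algebra only,
# no security.
q, p, g = 1019, 2039, 9

def e(u, v):
    return pow(g, (u * v) % q, p)

def H(m):
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % q

random.seed(5)
g2 = 1                                    # generator of G2
sks = [random.randrange(1, q) for _ in range(3)]
pks = [(x * g2) % q for x in sks]         # pk_i = x_i * g2
msgs = [b"block 1", b"block 2", b"block 3"]
sigs = [(x * H(m)) % q for x, m in zip(sks, msgs)]  # sigma_i = x_i * H(m_i)

agg = sum(sigs) % q                       # aggregate signature
lhs = e(agg, g2)
rhs = 1
for m, pk in zip(msgs, pks):              # one pairing per signer, plus one
    rhs = rhs * e(H(m), pk) % p
assert lhs == rhs
print("aggregate verifies")
```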

8.5 Post-Quantum Status of Pairing-Based Cryptography

Pairings operate in elliptic curve groups \( G_1, G_2 \) and the multiplicative group \( G_T \) of a finite field extension. Shor’s algorithm breaks DLP in all of these groups (both elliptic curve and finite field). Consequently, all current pairing-based schemes are broken by quantum computers. There is active research on post-quantum analogues (e.g., isogeny-based pairings), but no standard post-quantum pairing scheme exists as of 2025.

This means that IBE, BLS aggregation, and other pairing-based applications will require entirely new post-quantum constructions — likely based on lattices or other quantum-resistant assumptions. Lattice-based IBE and signatures exist but tend to have larger keys and signatures than pairing-based counterparts.

8.6 Key Encapsulation and Hybrid Encryption Summary

All practical public-key cryptography uses a KEM-DEM (Key Encapsulation Mechanism / Data Encapsulation Mechanism) hybrid:

  1. KEM: Encapsulate a symmetric key \( K \) using the recipient’s public key; produces ciphertext \( c_1 \).
  2. DEM: Encrypt the actual message \( M \) under \( K \) using an authenticated symmetric cipher (AES-GCM, ChaCha20-Poly1305); produces ciphertext \( c_2 \).
  3. Send \( (c_1, c_2) \).

This architecture separates the algebraic public-key operation (performed once) from the bulk data encryption (symmetric, fast, provably secure). Security of the hybrid scheme follows from IND-CCA2 security of the KEM combined with authenticated encryption security of the DEM.
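The data flow can be sketched as follows. The KEM below is a stub (it derives \( K \) from public randomness, so it provides no secrecy at all), and the DEM is a toy encrypt-then-MAC built from SHA-256 and HMAC standing in for a real AEAD such as AES-GCM. Every function name here is illustrative; the point is purely the separation between the two stages.

```python
# Structural sketch of hybrid encryption: KEM produces (c1, K), DEM
# encrypts the message under K. The "KEM" is a stub with NO secrecy;
# the "DEM" is a toy stream cipher + HMAC tag, not a real AEAD.
import hashlib
import hmac
import os

def kem_encaps(pk: bytes):
    """Stub for KEM.Encaps(pk) -> (c1, K). A real KEM hides K using pk."""
    c1 = os.urandom(32)
    return c1, hashlib.sha256(pk + c1).digest()

def kem_decaps(sk: bytes, pk: bytes, c1: bytes) -> bytes:
    """Stub for KEM.Decaps(sk, c1); this toy ignores sk entirely."""
    return hashlib.sha256(pk + c1).digest()

def _keystream(k: bytes, n: int) -> bytes:
    out, ctr = b"", 0
    while len(out) < n:  # SHA-256 in counter mode as a toy stream cipher
        out += hashlib.sha256(k + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def dem_encrypt(K: bytes, msg: bytes) -> bytes:
    k_enc = hashlib.sha256(K + b"enc").digest()
    k_mac = hashlib.sha256(K + b"mac").digest()
    ct = bytes(a ^ b for a, b in zip(msg, _keystream(k_enc, len(msg))))
    return ct + hmac.new(k_mac, ct, hashlib.sha256).digest()

def dem_decrypt(K: bytes, blob: bytes) -> bytes:
    ct, tag = blob[:-32], blob[-32:]
    k_enc = hashlib.sha256(K + b"enc").digest()
    k_mac = hashlib.sha256(K + b"mac").digest()
    if not hmac.compare_digest(hmac.new(k_mac, ct, hashlib.sha256).digest(), tag):
        raise ValueError("authentication failure")
    return bytes(a ^ b for a, b in zip(ct, _keystream(k_enc, len(ct))))

def hybrid_encrypt(pk: bytes, msg: bytes):
    c1, K = kem_encaps(pk)          # step 1: encapsulate symmetric key
    return c1, dem_encrypt(K, msg)  # step 2: encrypt message under K

def hybrid_decrypt(sk: bytes, pk: bytes, c1: bytes, c2: bytes) -> bytes:
    return dem_decrypt(kem_decaps(sk, pk, c1), c2)
```

Swapping in ML-KEM for the stub KEM leaves the DEM and the overall flow untouched — which is exactly why the post-quantum transition described below is a one-component change.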

For post-quantum transition:

  • Replace the classical KEM (X25519, RSA-OAEP) with ML-KEM (Kyber).
  • The DEM (AES-GCM with 256-bit key) is already quantum-safe.
  • This is the minimal change needed for post-quantum security in practice.

Chapter 9: Hash-Based Signatures

Hash-based signature schemes occupy a unique position in post-quantum cryptography: their security rests on nothing more than the collision resistance (and one-wayness) of the underlying hash function, with no algebraic structure whatsoever. This makes them the most conservative post-quantum assumption available. If SHA-256 or SHA3-256 remains secure — a proposition that the entire security community has been trying to falsify for decades — then hash-based signatures are secure. No lattice assumption, no isogeny assumption, no number-theoretic assumption is needed. This conservatism comes at a cost: signatures are larger than those of algebraic schemes, and most constructions are either one-time (Lamport, Winternitz) or require careful state management (XMSS). SPHINCS+ (SLH-DSA) eliminates the statefulness at the cost of somewhat larger signatures, and was standardized by NIST in 2024 as FIPS 205.

9.1 Lamport One-Time Signatures

The Lamport signature scheme (1979) is conceptually the simplest public-key signature scheme imaginable. Its security reduces directly and tightly to one-way functions — no collision resistance is needed.

Scheme (Lamport One-Time Signature). Fix a one-way function \( f: \{0,1\}^\lambda \to \{0,1\}^\lambda \) (e.g., SHA-256 with \( \lambda = 256 \)) and a hash \( H: \{0,1\}^* \to \{0,1\}^N \) to reduce messages to \( N \)-bit digests. For each of the \( N \) message digest bits, generate two random preimages.
  • KeyGen: For \( i = 1, \ldots, N \): sample \( sk_{i,0}, sk_{i,1} \leftarrow \{0,1\}^\lambda \) uniformly. Compute \( pk_{i,b} = f(sk_{i,b}) \) for \( b \in \{0,1\} \). Private key: \( \{sk_{i,b}\} \). Public key: \( \{pk_{i,b}\} \).
  • Sign\( (sk, m) \): Compute digest \( d = H(m) = d_1 d_2 \cdots d_N \in \{0,1\}^N \). For each bit \( d_i \), reveal \( \sigma_i = sk_{i,d_i} \). Signature: \( \sigma = (\sigma_1, \ldots, \sigma_N) \).
  • Verify\( (pk, m, \sigma) \): Compute \( d = H(m) \). For each \( i \): check \( f(\sigma_i) = pk_{i,d_i} \). Accept iff all checks pass.

The signing procedure reveals exactly one preimage per bit — the preimage corresponding to the actual message bit. This is the fundamental limitation: if the signer uses the same key pair to sign two different messages \( m \ne m' \), the adversary can combine the two signatures to forge a signature on any message whose bit string is a componentwise “mix” of \( d = H(m) \) and \( d' = H(m') \), since both preimages for differing bits have been revealed. Hence Lamport is a one-time signature: one key pair, one message, never reuse.
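The scheme is short enough to implement directly. A minimal sketch with \( f = H = \) SHA-256 (so \( \lambda = N = 256 \)), following the KeyGen/Sign/Verify steps above:

```python
# Minimal Lamport one-time signature with f = H = SHA-256.
import hashlib
import os

N = 256  # digest length in bits

def f(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def digest_bits(m: bytes):
    """H(m) as a list of N bits, most significant first."""
    d = int.from_bytes(hashlib.sha256(m).digest(), "big")
    return [(d >> (N - 1 - i)) & 1 for i in range(N)]

def keygen():
    sk = [(os.urandom(32), os.urandom(32)) for _ in range(N)]
    pk = [(f(s0), f(s1)) for (s0, s1) in sk]
    return sk, pk

def sign(sk, m: bytes):
    # reveal exactly one preimage per digest bit
    return [sk[i][b] for i, b in enumerate(digest_bits(m))]

def verify(pk, m: bytes, sig) -> bool:
    return all(f(sig[i]) == pk[i][b] for i, b in enumerate(digest_bits(m)))
```

Signing a second message with the same `sk` would reveal preimages for both bit values at every position where the digests differ — exactly the mix-and-match forgery described above.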

Example (Lamport Signature on a 4-bit Digest). For illustration, set \( N = 4 \) and \( f = \text{SHA-256} \) (truncated to 8 bits for display). Suppose the key pairs are:

| \( i \) | \( sk_{i,0} \) | \( sk_{i,1} \) | \( pk_{i,0} = f(sk_{i,0}) \) | \( pk_{i,1} = f(sk_{i,1}) \) |
|---|---|---|---|---|
| 1 | \( a_0 \) | \( a_1 \) | \( A_0 \) | \( A_1 \) |
| 2 | \( b_0 \) | \( b_1 \) | \( B_0 \) | \( B_1 \) |
| 3 | \( c_0 \) | \( c_1 \) | \( C_0 \) | \( C_1 \) |
| 4 | \( d_0 \) | \( d_1 \) | \( D_0 \) | \( D_1 \) |

Message digest: \( H(m) = 1011 \). Signature: \( \sigma = (a_1, b_0, c_1, d_1) \).

Verification: check \( f(a_1) = A_1 \), \( f(b_0) = B_0 \), \( f(c_1) = C_1 \), \( f(d_1) = D_1 \). Each check is a single hash evaluation. An adversary who sees \( \sigma \) learns \( a_1, b_0, c_1, d_1 \); the public key reveals all eight images, but the preimages \( a_0, b_1, c_0, d_0 \) have never been revealed. Forging a signature on a different digest, say \( 1101 \), requires knowing \( b_1 \) and \( c_0 \) (the unrevealed preimages at the two positions where the digests differ), which means inverting \( f \) at \( B_1 \) and \( C_0 \) — computationally infeasible under the one-wayness of \( f \).

The public key is \( 8 \) strings of 256 bits = 2048 bits total. The signature is 4 strings of 256 bits = 1024 bits total. For \( N = 256 \) (real SHA-256 output), public key = 512 × 256 bits = 16 KB and signature = 256 × 256 bits = 8 KB. These are large but the scheme is conceptually clean and tight.

Winternitz improvement. The Winternitz One-Time Signature (W-OTS) reduces key and signature sizes by signing \( w \) bits at a time using a hash chain of length \( 2^w \). Instead of revealing one of two preimages per bit, the signer reveals an intermediate link of a chain \( x \to f(x) \to f(f(x)) \to \cdots \to f^{2^w-1}(x) \), with the position in the chain encoding a \( w \)-bit chunk of the digest (plus a short checksum that prevents an adversary from advancing a chain forward). Choosing \( w = 2, 4, \) or \( 8 \) shrinks the signature by roughly a factor of \( w \), at the cost of up to \( 2^w - 1 \) hash evaluations per chain split between signing and verification.
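A single Winternitz chain can be sketched in a few lines. This toy handles one \( w \)-bit chunk with \( w = 4 \) and omits the checksum, so on its own it is forgeable (anyone can walk the chain forward); real W-OTS signs many chunks plus checksum chains.

```python
# One Winternitz chain, w = 4 (chain length 2^w = 16). Signing a 4-bit
# value v reveals f^v(x); the verifier walks the remaining 2^w - 1 - v
# steps and compares against the public chain end. The checksum chains
# of real W-OTS are omitted here.
import hashlib
import os

w = 4
STEPS = 2**w - 1  # 15

def f_iter(x: bytes, n: int) -> bytes:
    """Apply SHA-256 n times."""
    for _ in range(n):
        x = hashlib.sha256(x).digest()
    return x

x = os.urandom(32)         # secret chain start
pk_end = f_iter(x, STEPS)  # public key: chain end f^15(x)

def sign_chunk(v: int) -> bytes:
    """Sign a chunk value 0 <= v <= 15 by revealing f^v(x)."""
    return f_iter(x, v)

def verify_chunk(v: int, sig: bytes) -> bool:
    return f_iter(sig, STEPS - v) == pk_end
```

The signature for a chunk is a single hash value instead of the four preimage strings Lamport would need for the same 4 bits — the factor-of-\( w \) saving, paid for with up to 15 extra hash calls.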

9.2 Merkle Tree Signatures (Many-Time)

Lamport solves the one-message problem. To sign many messages, Merkle (1979) introduced a construction using a binary hash tree to authenticate an entire family of Lamport public keys with a single short root value. The signer generates \( 2^h \) Lamport key pairs for tree height \( h \), arranges them as leaves, and builds a binary tree by hashing pairs upward.

Merkle Signature Scheme (MSS). Let \( H: \{0,1\}^{2\lambda} \to \{0,1\}^\lambda \) be a collision-resistant hash function and \( f \) a one-way function. Fix tree height \( h \) (supports \( 2^h \) signatures per key).
  • KeyGen: For each leaf \( i \in \{0,\ldots,2^h-1\} \), generate a Lamport key pair \( (sk_i, pk_i) \). Build the Merkle tree: leaf nodes are \( L_i = H(\mathit{pk}_i) \); internal nodes are \( \text{Node}(v) = H(\text{Node}(\text{left}(v)) \| \text{Node}(\text{right}(v))) \). The public key is the root \( \text{Root} \).
  • Sign\( (sk, i, m) \): Use the \( i \)-th Lamport key to sign: \( \sigma_\text{Lamp} = \text{Lamport.Sign}(sk_i, m) \). Include the leaf public key \( pk_i \) (the verifier needs all of it to recompute \( L_i \)) and an authentication path: the sequence of sibling nodes from leaf \( i \) to the root, enabling recomputation of the root. Output \( (\sigma_\text{Lamp}, pk_i, i, \text{AuthPath}_i) \).
  • Verify: Verify the Lamport signature under \( pk_i \). Recompute the root from \( L_i = H(pk_i) \) and \( \text{AuthPath}_i \). Accept iff the recomputed root equals \( \text{Root} \).

The Merkle root binds all \( 2^h \) leaf keys. Forging a signature on the \( i \)-th message would require forging the Lamport signature (breaks one-wayness of \( f \)) or forging the authentication path (breaks collision resistance of \( H \)). The authentication path has exactly \( h \) hash values, so the signature overhead is \( h \cdot \lambda \) bits plus the Lamport signature.

Example (Merkle Tree of Height \( h = 3 \)). The tree has \( 2^3 = 8 \) leaves and supports up to 8 distinct signatures. The leaf nodes are \( L_0 = H(pk_0), \ldots, L_7 = H(pk_7) \). Internal nodes at level 1: \( N_{00} = H(L_0 \| L_1) \), \( N_{01} = H(L_2 \| L_3) \), \( N_{10} = H(L_4 \| L_5) \), \( N_{11} = H(L_6 \| L_7) \). Level 2: \( N_0 = H(N_{00} \| N_{01}) \), \( N_1 = H(N_{10} \| N_{11}) \). Root: \( R = H(N_0 \| N_1) \).

To sign with leaf 2 (index \( i = 2 \)): include authentication path \( (L_3, N_{00}, N_1) \). The verifier recomputes: \( H(pk_2) = L_2 \), then \( H(L_2 \| L_3) = N_{01} \), then \( H(N_{00} \| N_{01}) = N_0 \), then \( H(N_0 \| N_1) = R \). The three sibling values in the authentication path — and four hash evaluations — are all the verifier needs; in general the work grows only with the tree height \( h \), not with the number of leaves.
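The height-3 example can be executed directly. The sketch below uses placeholder byte strings for the leaf values \( H(pk_i) \); the sibling-selection trick is that at each level the sibling of index \( i \) is index \( i \oplus 1 \).

```python
# Merkle tree of height 3: build all levels, extract the authentication
# path for a leaf, and recompute the root from leaf + path.
import hashlib

H = lambda b: hashlib.sha256(b).digest()

# Placeholder leaves standing in for L_i = H(pk_i), i = 0..7.
leaves = [H(b"pk" + bytes([i])) for i in range(8)]

def build_levels(nodes):
    """Return [leaves, level1, level2, ..., [root]]."""
    levels = [nodes]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([H(prev[j] + prev[j + 1]) for j in range(0, len(prev), 2)])
    return levels

def auth_path(levels, i):
    """Sibling node at every level from leaf i up to (not including) the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[i ^ 1])  # i ^ 1 flips the last bit: the sibling
        i //= 2
    return path

def verify_path(leaf, i, path, root):
    node = leaf
    for sib in path:
        node = H(node + sib) if i % 2 == 0 else H(sib + node)
        i //= 2
    return node == root
```

For \( i = 2 \) the path comes out as \( (L_3, N_{00}, N_1) \), matching the worked example, and `verify_path` performs exactly the three chained hash evaluations shown above.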

The total signature size is roughly \( h \cdot \lambda + N \cdot \lambda \) bits (ignoring the leaf index and the leaf public key that accompany the signature): \( h = 3 \) hash values for the auth path and \( N = 256 \) preimages for the Lamport part (with SHA-256, \( \lambda = 256 \)). A real deployment with \( h = 20 \) supports \( 2^{20} \approx 10^6 \) signatures; the auth-path overhead is \( 20 \times 256 = 5120 \) bits — 640 bytes — a modest addition to the Lamport core.

9.3 XMSS and SPHINCS+

XMSS (eXtended Merkle Signature Scheme, RFC 8391). Extends MSS by replacing Lamport with the Winternitz variant W-OTS+; its multi-tree variant XMSS^MT stacks Merkle trees into a hypertree for increased capacity. XMSS is stateful: the signer must track which leaf indices have been used and must never repeat one. If the state is lost or duplicated (as can happen on a restored backup), security is catastrophically violated — two signatures under the same one-time key reveal enough secret values to enable forgeries. XMSS was standardized in NIST SP 800-208 (2020) and is appropriate for settings with reliable state management (e.g., hardware security modules).

Remark (SPHINCS+: Stateless Hash-Based Signatures). SPHINCS+ (SLH-DSA, FIPS 205) solves the statefulness problem with pseudorandom index selection: instead of sequentially using leaf \( i = 0, 1, 2, \ldots \), the signer derives the leaf index pseudorandomly from the message and a secret seed. With \( 2^h \) leaves for large \( h \) (e.g., \( h = 63 \)), index collisions are rare, and the bottom layer uses a few-time signature (FORS) precisely so that the occasional reuse of a leaf leaks only a negligible amount of information. No external counter is needed.

SPHINCS+ uses a hypertree of \( d \) layers, each a Merkle tree of height \( h/d \), with FORS (Forest Of Random Subsets) few-time signatures at the bottom level. The parameter tradeoffs are:

  • SPHINCS+-128s ("small"; \( n=16, h=63, d=7, k=14, a=12 \)): signature size ~8 KB, slower signing.
  • SPHINCS+-128f ("fast"): signature size ~17 KB, signing roughly an order of magnitude faster.

Security relies on three hash function properties: one-wayness (for W-OTS+ chains), second-preimage resistance (for tree authentication), and pseudorandomness (for index selection). No algebraic structure is assumed. If SHA-256 and SHAKE-256 remain secure, SPHINCS+ is secure — a statement no other NIST PQC standard can make.

Chapter 10: NIST PQC Standardization — Scheme-by-Scheme Analysis

The four schemes standardized by NIST in 2024 each embody different mathematical tradeoffs. Understanding them individually — not just as names — equips a cryptographic engineer to make informed deployment decisions.

ML-KEM (CRYSTALS-Kyber, FIPS 203). A key encapsulation mechanism based on Module-LWE over the ring \( R_q = \mathbb{Z}_q[x]/(x^{256}+1) \) with \( q = 3329 \). Three parameter sets: Kyber-512 (\( k=2 \), ~128-bit PQ security, 800-byte public key), Kyber-768 (\( k=3 \), ~192-bit), Kyber-1024 (\( k=4 \), ~256-bit). Polynomial multiplication uses the NTT; all operations are designed to run in constant time. The FO transform gives IND-CCA2 security. ML-KEM is the primary recommendation for key exchange in TLS 1.3, SSH, and similar protocols. Its algebraic foundation — Module-LWE — has a worst-case hardness reduction from the shortest independent vectors problem (SIVP) on module lattices. Deployed in: Apple iMessage PQ3, Signal (X3DH upgrade), Chrome/Firefox (hybrid TLS).

ML-DSA (CRYSTALS-Dilithium, FIPS 204). A signature scheme based on Module-LWE and Module-SIS, using the "Fiat-Shamir with aborts" paradigm. Three parameter sets (Dilithium2/3/5) targeting NIST Levels 2/3/5. Signature sizes: 2420/3293/4595 bytes; public key sizes: 1312/1952/2592 bytes. The rejection sampling step makes the signature distribution independent of the secret key — the zero-knowledge property that prevents key leakage through signatures. ML-DSA is the primary recommendation for general-purpose digital signatures. A 2420-byte Dilithium2 signature is far larger than ECDSA's 64 bytes or RSA-2048's 256 bytes, but this is an acceptable cost for post-quantum security. Advantage: a relatively simple implementation without floating-point arithmetic, making it suitable for constrained environments.

SLH-DSA (SPHINCS+, FIPS 205). A stateless hash-based signature scheme. Security relies only on standard symmetric properties (one-wayness, second-preimage resistance, pseudorandomness) of SHA-256 or SHAKE-256 — the most conservative assumption of any NIST PQC standard. Three parameter families (small/fast variants at security levels 1/3/5). SPHINCS+-128s has an ~8 KB signature and a 32-byte public key; SPHINCS+-128f has a ~17 KB signature. Because security depends only on hash functions, SLH-DSA provides an independent safety net: even if every lattice assumption breaks simultaneously, SLH-DSA remains secure. Recommended use: long-term certificates, root-of-trust signing operations, and any setting where signature size is not the binding constraint but maximum conservatism is required.

FN-DSA (FALCON, FIPS 206). A signature scheme based on NTRU lattices and the GPV hash-and-sign framework (Gentry–Peikert–Vaikuntanathan 2008). Two parameter sets: FALCON-512 (NIST Level 1, 666-byte signatures, 897-byte public key) and FALCON-1024 (NIST Level 5, 1280-byte signatures). FN-DSA achieves the smallest signatures of any NIST-standardized post-quantum signature scheme. The core operation — sampling a short lattice vector from a discrete Gaussian distribution using an NTRU trapdoor — requires high-precision floating-point arithmetic, making constant-time implementation challenging. Side-channel attacks on the Gaussian sampler remain an active research concern. Recommended use: bandwidth-constrained settings (embedded IoT, blockchain validators) where implementation complexity is manageable and signatures must be small.

HQC (Hamming Quasi-Cyclic) — alternate candidate. NIST selected HQC as a fifth standard (draft anticipated in 2026) to provide diversity beyond lattice assumptions. HQC is based on the hardness of decoding random quasi-cyclic binary linear codes — a variant of syndrome decoding, which has no known quantum speedup beyond Grover. Its security assumption is independent of all lattice problems, providing a second safety net alongside SLH-DSA. HQC ciphertexts are larger than Kyber's (~4 KB vs. ~1.1 KB for Kyber-768), but it provides assurance against unanticipated attacks on Module-LWE specifically.

The table below summarizes the standardized schemes:

| Standard | Primitive | Math Basis | PK Size | Sig/CT Size | Quantum Safety Net |
|---|---|---|---|---|---|
| ML-KEM (FIPS 203) | KEM | Module-LWE | 800–1568 B | 768–1568 B | Lattices |
| ML-DSA (FIPS 204) | Signature | Module-LWE + SIS | 1312–2592 B | 2420–4595 B | Lattices |
| SLH-DSA (FIPS 205) | Signature | Hash functions | 32–64 B | 8–50 KB | Hash only |
| FN-DSA (FIPS 206) | Signature | NTRU lattices | 897–1793 B | 666–1280 B | NTRU lattices |

The multi-scheme approach is deliberate: no single mathematical assumption should be a single point of failure for global cryptographic infrastructure. If a breakthrough against Module-LWE emerges, ML-KEM and ML-DSA would both be broken simultaneously, but SLH-DSA would stand. If hash functions (SHA-256, SHAKE-256) are weakened, SLH-DSA would be affected but all lattice schemes would remain secure. This complementarity mirrors the defense-in-depth principle applied at the algorithmic level.
