PMATH 432: Mathematical Logic

Andrew Zucker

Estimated study time: 3 hr 25 min

Table of contents

Sources and References

Primary textbooks

Ebbinghaus, H.-D., Flum, J., and Thomas, W. Mathematical Logic, 3rd edition. Springer, 2021. (Cited as EFT throughout.)
Marker, David. An Invitation to Mathematical Logic. Graduate Texts in Mathematics 301, Springer, 2024. (Cited as Marker throughout.)

Online resources

Levin, Oscar. Discrete Mathematics: An Open Introduction, for background on formal grammars and induction.
MIT OpenCourseWare 18.510: Introduction to Mathematical Logic (lecture notes and problem sets).
Stanford Math 161 (Mathematical Logic) course materials.
Cambridge Part III Logic lecture notes (Forster and successors).
Enderton, H. B. A Mathematical Introduction to Logic, 2nd ed., Academic Press, 2001 — supplementary classical reference.

Preface: The Grand Narrative of Mathematical Logic

Around 300 BC, Euclid organized all of geometry into a single deductive system in the Elements. Starting from five postulates — self-evident truths about points, lines, and circles — he derived hundreds of theorems by pure logical inference. The enterprise was a triumph of human reason, a model for mathematical knowledge ever since. But one postulate stood apart: the parallel postulate, which asserted that through a point off a line exactly one parallel line can be drawn. For over two thousand years, mathematicians suspected this was not truly an axiom but a consequence of the other four. Generation after generation attempted to derive it from simpler principles, always failing. The mystery was resolved only in the nineteenth century when Gauss, Bolyai, and Lobachevsky discovered non-Euclidean geometries — consistent mathematical worlds in which the parallel postulate is false. The lesson was startling and irreversible: the axioms we choose are not forced on us by reality. Different axiomatic choices yield different, equally coherent mathematical universes. The very concept of “truth” in mathematics is relative to a system of axioms.

This realization drove mathematicians toward formalization. If axioms can be chosen differently, then logic itself — the rules of inference — must be made explicit and mechanical. Gottlob Frege took the first decisive step in 1879 with his Begriffsschrift (“concept-script”), the first complete formalization of predicate logic. Frege’s aim was to show that arithmetic is reducible to logic alone, a program he called logicism. George Boole had earlier (1847) shown that the logic of propositions could be treated algebraically — the origin of Boolean algebra. But it was Frege’s Begriffsschrift that gave us the quantifiers \(\forall\) and \(\exists\) in recognizable form, the beginning of first-order logic as we know it. Frege’s ambitious Grundgesetze der Arithmetik attempted to carry out the logicist reduction in full detail — and it nearly succeeded, except for one fatal assumption: that every predicate defines a set.

In 1901, Bertrand Russell discovered the paradox that bears his name. Consider the set \(R = \{x : x \notin x\}\) of all sets that do not contain themselves. Is \(R \in R\)? If yes, then by definition \(R \notin R\). If no, then by definition \(R \in R\). Either way, a contradiction. Russell’s paradox was not merely a curiosity — it struck at the heart of Frege’s system, showing that naive comprehension (every predicate defines a set) is logically inconsistent. The news shattered Frege’s program. In the years that followed, Zermelo, Fraenkel, and others developed axiomatic set theories — ZF and ZFC — that carefully restrict comprehension to avoid paradox. Russell and Whitehead produced the enormous Principia Mathematica (1910–1913), encoding mathematics in a type-theoretic system that avoids self-reference. The crisis produced not despair but a flowering: the precise analysis of what sets, functions, and numbers actually are.

David Hilbert, the greatest mathematician of the early twentieth century, responded to the crisis with characteristic ambition. His program (articulated in the 1920s) called for the formalization of all of mathematics in a single axiomatic system, together with a finitistic consistency proof — a proof, using only the most elementary combinatorial reasoning, that the system can never derive a contradiction. If successful, Hilbert’s program would have placed mathematics on an unshakeable foundation. The program seemed well on its way to success when, in 1930, the twenty-four-year-old Kurt Gödel announced his Completeness Theorem: every consistent set of first-order axioms has a model. Then, in 1931, Gödel demolished Hilbert’s program with his Incompleteness Theorems. The First Incompleteness Theorem states that any consistent formal system strong enough to express basic arithmetic contains a true statement that cannot be proved within the system. The Second states that no such system can prove its own consistency. Hilbert’s dream of a complete, self-justifying foundation for mathematics was shown to be impossible in principle. The impact was seismic. Alan Turing, responding in 1936 to Hilbert’s Entscheidungsproblem (the decision problem: is there an algorithm that decides whether any given mathematical statement is provable?), invented his abstract model of computation — the Turing machine — and proved the answer is no. In that single paper, Turing launched computer science and resolved a foundational question of mathematics simultaneously. Today, mathematical logic has grown into four intertwined disciplines: proof theory (the structure of formal derivations), model theory (the relationship between syntax and mathematical structures), set theory (the foundations of mathematics and the arithmetic of the infinite), and computability theory (the boundaries of what algorithms can determine). These four pillars are the subject of PMATH 432.

Chapter 1: First-Order Logic — Language and Structures

1.1 Why Formalize?

Mathematics proceeds by reasoning from axioms to conclusions. But what is reasoning? What is an axiom? Mathematical logic makes these notions precise. The program, begun by Frege and Hilbert and carried through by Gödel, Tarski, and Turing, has three interlocking goals:

Syntax: define a formal language and a notion of proof that can be checked mechanically.
Semantics: define what it means for a sentence to be true in a mathematical structure.
Metatheory: prove theorems about the logical system — soundness, completeness, compactness, and ultimately the incompleteness theorems.

First-order logic (FOL) is the lingua franca. It is expressive enough to axiomatize essentially all of classical mathematics, yet it admits the completeness theorem (Gödel 1930) — a feature lost in second-order logic.

1.2 Signatures and First-Order Languages

A signature (also called a vocabulary or similarity type) is a triple \( \sigma = (F, R, \mathrm{ar}) \) where

\( F \) is a set of function symbols,
\( R \) is a set of relation symbols (disjoint from \( F \)),
\( \mathrm{ar} : F \cup R \to \mathbb{N} \) assigns to each symbol its arity.

Function symbols of arity 0 are constant symbols. Relation symbols of arity 0 are propositional atoms (rarely used in FOL proper).

Examples of signatures.

Groups: \( \sigma_{\mathrm{grp}} = (\{\cdot, e, {}^{-1}\}, \emptyset, \mathrm{ar}) \) where \( \mathrm{ar}(\cdot)=2 \), \( \mathrm{ar}(e)=0 \), \( \mathrm{ar}({}^{-1})=1 \).
Ordered fields: \( \sigma_{\mathrm{of}} = (\{+,\cdot,0,1,-\}, \{<\}, \mathrm{ar}) \).
Graphs: \( \sigma_{\mathrm{gr}} = (\emptyset, \{E\}, \mathrm{ar}) \) with \( \mathrm{ar}(E)=2 \).
Pure sets (set theory): \( \sigma_{\in} = (\emptyset, \{\in\}, \mathrm{ar}) \) with \( \mathrm{ar}(\in)=2 \).

1.2.1 Terms

Fix a countably infinite set \( \mathrm{Var} = \{v_0, v_1, v_2, \ldots\} \) of variables (always disjoint from the signature symbols and logical symbols).

The set of \( \sigma \)-terms, written \( \mathrm{Term}(\sigma) \), is defined inductively:

Every variable \( x \in \mathrm{Var} \) is a term.
If \( c \in F \) with \( \mathrm{ar}(c)=0 \), then \( c \) is a term.
If \( f \in F \) with \( \mathrm{ar}(f)=n \geq 1 \) and \( t_1,\ldots,t_n \) are terms, then \( f(t_1,\ldots,t_n) \) is a term.

Nothing else is a term.

Terms represent individuals — elements of the universe of discourse. They do not carry a truth value.

1.2.2 Atomic Formulas and Formulas

An atomic \( \sigma \)-formula is an expression of one of the following forms:

\( t_1 \doteq t_2 \) (equality), where \( t_1, t_2 \in \mathrm{Term}(\sigma) \).
\( R(t_1,\ldots,t_n) \) where \( R \in \mathcal{R} \) has arity \( n \) and \( t_1,\ldots,t_n \) are terms.

The set of \( \sigma \)-formulas, written \( \mathrm{For}(\sigma) \), is the smallest set such that:

Every atomic formula is in \( \mathrm{For}(\sigma) \).
If \( \varphi \in \mathrm{For}(\sigma) \), then \( \lnot \varphi \in \mathrm{For}(\sigma) \).
If \( \varphi, \psi \in \mathrm{For}(\sigma) \), then \( (\varphi \land \psi), (\varphi \lor \psi), (\varphi \to \psi), (\varphi \leftrightarrow \psi) \in \mathrm{For}(\sigma) \).
If \( \varphi \in \mathrm{For}(\sigma) \) and \( x \in \mathrm{Var} \), then \( \forall x\, \varphi \) and \( \exists x\, \varphi \) are in \( \mathrm{For}(\sigma) \).

We treat \( \bot \) (falsum) as an abbreviation for \( v_0 \doteq v_1 \land \lnot (v_0 \doteq v_1) \), and \( \top \) (verum) as \( \lnot \bot \). We also allow the abbreviation \( \exists! x\, \varphi \) for "there exists a unique \( x \) such that \( \varphi \)", defined as \( \exists x(\varphi \land \forall y(\varphi[y/x] \to y \doteq x)) \).

1.2.3 Free and Bound Variables

The set \( \mathrm{free}(\varphi) \) of free variables of \( \varphi \) is defined inductively:

For an atomic formula \( \varphi \): \( \mathrm{free}(\varphi) \) is the set of variables occurring in \( \varphi \).
\( \mathrm{free}(\lnot \varphi) = \mathrm{free}(\varphi) \).
\( \mathrm{free}(\varphi \land \psi) = \mathrm{free}(\varphi) \cup \mathrm{free}(\psi) \).
\( \mathrm{free}(\varphi \lor \psi) = \mathrm{free}(\varphi) \cup \mathrm{free}(\psi) \).
\( \mathrm{free}(\varphi \to \psi) = \mathrm{free}(\varphi) \cup \mathrm{free}(\psi) \).
\( \mathrm{free}(\varphi \leftrightarrow \psi) = \mathrm{free}(\varphi) \cup \mathrm{free}(\psi) \).
\( \mathrm{free}(\forall x\, \varphi) = \mathrm{free}(\varphi) \setminus \{x\} \).
\( \mathrm{free}(\exists x\, \varphi) = \mathrm{free}(\varphi) \setminus \{x\} \).

A formula with no free variables is called a sentence. A formula \( \varphi(x_1,\ldots,x_n) \) indicates that \( \mathrm{free}(\varphi) \subseteq \{x_1,\ldots,x_n\} \). An occurrence of \( x \) in \( \varphi \) is bound if it lies in the scope of a quantifier \( \forall x \) or \( \exists x \); otherwise it is free.

Example. In \( \forall x\, (x < y \to \exists z\, (x < z \land z < y)) \), the variable \( y \) is free and \( x, z \) are bound. This expresses (in the language of orders) “between \( x \) and \( y \) there exists a \( z \)”, but only \( y \) is a “parameter”.

Worked Example 1.1 (Parse tree and variable analysis). Consider the formula \( \varphi = \forall x\, \exists y\, (x < y) \) in the signature \( \sigma_{\mathrm{of}} \) of ordered fields (with relation symbol \( < \)).

Parse tree:

        ∀x
         |
        ∃y
         |
       x < y     (atomic formula R(t₁,t₂) with t₁ = x, t₂ = y)

Subformulas (by the inductive definition, every proper syntactic subpart that is itself a formula):

\( x < y \) — the innermost atomic formula.
\( \exists y\, (x < y) \) — quantifying \( y \) over the atomic formula.
\( \forall x\, \exists y\, (x < y) \) — the full formula \( \varphi \).

Free variables: Compute inductively.

\[ \mathrm{free}(x < y) = \{x, y\}. \]\[ \mathrm{free}(\exists y\,(x < y)) = \{x,y\} \setminus \{y\} = \{x\}. \]\[ \mathrm{free}(\forall x\, \exists y\,(x < y)) = \{x\} \setminus \{x\} = \emptyset. \]

So \( \varphi \) is a sentence — it has no free variables. Both \( x \) and \( y \) are bound. The formula asserts “for every element \( x \) there exists an element \( y \) greater than it” — which is a property of the entire structure, not dependent on any parameter.

Contrast: In \( \psi = \exists y\,(x < y) \), the variable \( x \) is free. This formula expresses “\( x \) is not a maximum”, a property that depends on the choice of \( x \). We would write \( \psi(x) \) to indicate this dependency.

1.2.4 Substitution

For a term \( t \) and variable \( x \), the substitution \( \varphi[t/x] \) (read: replace \( x \) by \( t \) in \( \varphi \)) is defined inductively. The term \( t \) is free for \( x \) in \( \varphi \) if no free occurrence of \( x \) in \( \varphi \) lies in the scope of a quantifier \( \forall y \) or \( \exists y \) where \( y \) occurs in \( t \). When \( t \) is free for \( x \) in \( \varphi \), the substitution is capture-free.

Capture-freeness can always be arranged by renaming bound variables — an operation called alpha-renaming — which preserves meaning.

1.3 Structures and Interpretations

A \( \sigma \)-structure \( \mathfrak{A} \) (or model of signature \( \sigma \)) consists of:

A nonempty set \( A \), the universe (or domain), also written \( |\mathfrak{A}| \).
For each \( n \)-ary function symbol \( f \in F \): a function \( f^{\mathfrak{A}} : A^n \to A \).
For each \( n \)-ary relation symbol \( R \in \mathcal{R} \): a relation \( R^{\mathfrak{A}} \subseteq A^n \).

Examples.

The structure \( (\mathbb{Z}, +^{\mathbb{Z}}, \cdot^{\mathbb{Z}}, 0^{\mathbb{Z}}, 1^{\mathbb{Z}}, <^{\mathbb{Z}}) \) is a \( \sigma_{\mathrm{of}} \)-structure interpreting the ordered ring of integers.
The structure \( (\mathbb{N}, S, 0) \) with \( S(n)=n+1 \) is a structure for the signature \( (\{S,0\}, \emptyset) \), the setting for Peano arithmetic.
Any group \( (G, \cdot, e, {}^{-1}) \) is a \( \sigma_{\mathrm{grp}} \)-structure.

Worked Example 1.2 (The structure \( \mathfrak{N} \) and truth value computations). Define the structure \[ \mathfrak{N} = (\mathbb{N},\, 0^{\mathfrak{N}},\, S^{\mathfrak{N}},\, +^{\mathfrak{N}},\, \cdot^{\mathfrak{N}}) \]

in the arithmetic signature \( \sigma_{\mathrm{arith}} = \{0, S, +, \cdot\} \) (with \( 0 \) a constant, \( S \) unary, \( + \) and \( \cdot \) binary), where the interpretations are the standard natural-number operations: \( 0^{\mathfrak{N}} = 0 \), \( S^{\mathfrak{N}}(n) = n+1 \), and so forth. We compute truth values of several sentences.

Sentence 1: \( \forall x\, \exists y\, (x + y \doteq 0) \). This says “every natural number has an additive inverse in \( \mathbb{N} \).” Take \( x = 1 \). We need \( y \in \mathbb{N} \) with \( 1 + y = 0 \). No such \( y \) exists in \( \mathbb{N} \) (there are no negative numbers). Therefore \( \mathfrak{N} \not\models \forall x\, \exists y\, (x + y \doteq 0) \). The sentence is false in \( \mathfrak{N} \).

Sentence 2: \( \exists x\, (x + x \doteq x) \). This says “there exists a natural number \( x \) satisfying \( 2x = x \).” Taking \( x = 0 \): \( 0 + 0 = 0 \). This is true in \( \mathbb{N} \). Hence \( \mathfrak{N} \models \exists x\, (x + x \doteq x) \). The sentence is true in \( \mathfrak{N} \), witnessed by \( x = 0 \).

Sentence 3: \( \forall x\, \forall y\, (x \cdot y \doteq y \cdot x) \). Multiplication of natural numbers is commutative: \( mn = nm \) for all \( m, n \in \mathbb{N} \). Hence \( \mathfrak{N} \models \forall x\, \forall y\, (x \cdot y \doteq y \cdot x) \). The sentence is true.

Sentence 4: \( \exists x\, \forall y\, (x + y \doteq y) \). This asks for a left identity for addition. Indeed \( 0 + y = y \) for all \( y \in \mathbb{N} \). So \( \mathfrak{N} \models \exists x\, \forall y\, (x + y \doteq y) \), witnessed by \( x = 0 \).

These computations illustrate how the inductive definition of satisfaction translates formal syntax into concrete arithmetical claims about \( \mathbb{N} \).

1.3.1 Assignments and Interpretations of Terms

An assignment (or variable assignment) into \( \mathfrak{A} \) is a function \( \alpha : \mathrm{Var} \to A \). The value \( t^{\mathfrak{A}}[\alpha] \) of a term \( t \) under \( (\mathfrak{A}, \alpha) \) is defined by:

\( x^{\mathfrak{A}}[\alpha] = \alpha(x) \) for variables \( x \).
\( c^{\mathfrak{A}}[\alpha] = c^{\mathfrak{A}} \) for constants \( c \).
\( f(t_1,\ldots,t_n)^{\mathfrak{A}}[\alpha] = f^{\mathfrak{A}}(t_1^{\mathfrak{A}}[\alpha], \ldots, t_n^{\mathfrak{A}}[\alpha]) \).

1.3.2 Satisfaction

The satisfaction relation \( \mathfrak{A} \models \varphi[\alpha] \) (read: \( \mathfrak{A} \) satisfies \( \varphi \) under \( \alpha \)) is defined inductively on the structure of \( \varphi \):

\( \mathfrak{A} \models (t_1 \doteq t_2)[\alpha] \) iff \( t_1^{\mathfrak{A}}[\alpha] = t_2^{\mathfrak{A}}[\alpha] \).
\( \mathfrak{A} \models R(t_1,\ldots,t_n)[\alpha] \) iff \( (t_1^{\mathfrak{A}}[\alpha],\ldots,t_n^{\mathfrak{A}}[\alpha]) \in R^{\mathfrak{A}} \).
\( \mathfrak{A} \models (\lnot\varphi)[\alpha] \) iff \( \mathfrak{A} \not\models \varphi[\alpha] \).
\( \mathfrak{A} \models (\varphi \land \psi)[\alpha] \) iff \( \mathfrak{A} \models \varphi[\alpha] \) and \( \mathfrak{A} \models \psi[\alpha] \).
\( \mathfrak{A} \models (\varphi \lor \psi)[\alpha] \) iff \( \mathfrak{A} \models \varphi[\alpha] \) or \( \mathfrak{A} \models \psi[\alpha] \).
\( \mathfrak{A} \models (\varphi \to \psi)[\alpha] \) iff \( \mathfrak{A} \not\models \varphi[\alpha] \) or \( \mathfrak{A} \models \psi[\alpha] \).
\( \mathfrak{A} \models (\forall x\, \varphi)[\alpha] \) iff for all \( a \in A \): \( \mathfrak{A} \models \varphi[\alpha(x|a)] \).
\( \mathfrak{A} \models (\exists x\, \varphi)[\alpha] \) iff there exists \( a \in A \): \( \mathfrak{A} \models \varphi[\alpha(x|a)] \).

Here \( \alpha(x|a) \) denotes the assignment that agrees with \( \alpha \) except that \( x \mapsto a \).

The coincidence lemma: the truth value of \( \mathfrak{A} \models \varphi[\alpha] \) depends only on the values of \( \alpha \) on \( \mathrm{free}(\varphi) \). Hence for sentences \( \varphi \), satisfaction is independent of \( \alpha \) and we write simply \( \mathfrak{A} \models \varphi \).

(Coincidence Lemma) If \( \alpha \) and \( \beta \) are assignments that agree on all free variables of \( \varphi \), then \[ \mathfrak{A} \models \varphi[\alpha] \iff \mathfrak{A} \models \varphi[\beta]. \]

By structural induction on \( \varphi \). The base case for atomic formulas follows since term values depend only on the values of their variables (by an analogous induction on terms). The inductive cases for connectives are immediate. For quantifiers: if \( \varphi = \forall x\, \psi \) and \( \alpha, \beta \) agree on \( \mathrm{free}(\forall x\, \psi) = \mathrm{free}(\psi) \setminus \{x\} \), then for any \( a \in A \), the assignments \( \alpha(x|a) \) and \( \beta(x|a) \) agree on all of \( \mathrm{free}(\psi) \), so by the induction hypothesis they give the same truth value for \( \psi \). Taking the universal quantifier over all \( a \) completes the argument.

Worked Example 1.3 (Evaluating formulas in \( (\mathbb{Z}, <) \)). The satisfaction relation connects syntax with concrete mathematical truth. Consider the structure \( \mathfrak{Z} = (\mathbb{Z}, <^{\mathbb{Z}}) \) in the signature \( \sigma_{\mathrm{ord}} = (\emptyset, \{<\}) \).

Formula \( \varphi_1 = \forall x\, \exists y\, (x < y) \): For every integer \( n \), take \( y = n+1 > n \). So \( \mathfrak{Z} \models \varphi_1 \). The sentence means “\( \mathbb{Z} \) has no maximum element” — which is true.

Formula \( \varphi_2 = \forall x\, \exists y\, (y < x) \): For every integer \( n \), take \( y = n-1 < n \). So \( \mathfrak{Z} \models \varphi_2 \). The sentence means “\( \mathbb{Z} \) has no minimum element” — also true.

Formula \( \varphi_3 = \forall x\, \forall y\, (x < y \to \exists z\,(x < z \land z < y)) \): This asserts density. But between consecutive integers \( n \) and \( n+1 \) there is no integer \( z \) with \( n < z < n+1 \). So \( \mathfrak{Z} \not\models \varphi_3 \). The integers are not a dense order.

Contrast with \( \mathfrak{Q} = (\mathbb{Q}, <) \): All three sentences hold in \( \mathfrak{Q} \). Indeed \( \mathfrak{Q} \models \varphi_3 \) since between any two rationals there is another rational. This shows that \( \mathfrak{Z} \not\equiv \mathfrak{Q} \): they are not elementarily equivalent (they disagree on \( \varphi_3 \)).

1.3.3 Models, Validity, and Entailment

\( \mathfrak{A} \) is a model of a sentence \( \varphi \) if \( \mathfrak{A} \models \varphi \).
\( \mathfrak{A} \) is a model of a set of sentences \( T \) (written \( \mathfrak{A} \models T \)) if \( \mathfrak{A} \models \varphi \) for all \( \varphi \in T \). We then say \( \mathfrak{A} \) is a model of the theory \( T \).
A sentence \( \varphi \) is logically valid (written \( \models \varphi \)) if every \( \sigma \)-structure satisfies it.
A set \( T \) of sentences logically entails \( \varphi \) (written \( T \models \varphi \)) if every model of \( T \) is also a model of \( \varphi \).
A set \( T \) is satisfiable if it has at least one model.

1.4 Homomorphisms and Embeddings

Let \( \mathfrak{A}, \mathfrak{B} \) be \( \sigma \)-structures. A function \( h : A \to B \) is a homomorphism \( h : \mathfrak{A} \to \mathfrak{B} \) if:

For each \( n \)-ary function symbol \( f \): \( h(f^{\mathfrak{A}}(a_1,\ldots,a_n)) = f^{\mathfrak{B}}(h(a_1),\ldots,h(a_n)) \).
For each \( n \)-ary relation symbol \( R \): \( (a_1,\ldots,a_n) \in R^{\mathfrak{A}} \Rightarrow (h(a_1),\ldots,h(a_n)) \in R^{\mathfrak{B}} \).

A homomorphism \( h : \mathfrak{A} \to \mathfrak{B} \) is an embedding if it is injective and the condition for relations is strengthened to a biconditional: \[ (a_1,\ldots,a_n) \in R^{\mathfrak{A}} \iff (h(a_1),\ldots,h(a_n)) \in R^{\mathfrak{B}}. \]

An embedding is an isomorphism if it is additionally surjective. We write \( \mathfrak{A} \cong \mathfrak{B} \) if an isomorphism exists.

(Isomorphism Theorem) If \( h : \mathfrak{A} \to \mathfrak{B} \) is an isomorphism and \( \varphi(x_1,\ldots,x_n) \) is any formula, then for all \( a_1,\ldots,a_n \in A \): \[ \mathfrak{A} \models \varphi[a_1,\ldots,a_n] \iff \mathfrak{B} \models \varphi[h(a_1),\ldots,h(a_n)]. \]

In particular, isomorphic structures satisfy the same sentences.

By structural induction on \( \varphi \). The base cases follow from the embedding condition. The connective cases are straightforward. For \( \varphi = \forall x\, \psi \): \( \mathfrak{A} \models \forall x\, \psi[\vec{a}] \) iff for all \( b \in A \), \( \mathfrak{A} \models \psi[b,\vec{a}] \). Since \( h \) is surjective, the elements of \( B \) are exactly the \( h(b) \), so \( \mathfrak{B} \models \forall x\, \psi[h(b),h(\vec{a})] \) iff for all \( b \in A \), \( \mathfrak{B} \models \psi[h(b),h(\vec{a})] \), which by induction hypothesis holds iff \( \mathfrak{A} \models \psi[b,\vec{a}] \). The case for \( \exists \) is similar.

1.5 Substructures and Elementary Substructures

\( \mathfrak{A} \) is a substructure of \( \mathfrak{B} \) (written \( \mathfrak{A} \subseteq \mathfrak{B} \)) if \( A \subseteq B \), every function symbol is interpreted consistently (\( f^{\mathfrak{A}} = f^{\mathfrak{B}} \upharpoonright A^n \)), and \( R^{\mathfrak{A}} = R^{\mathfrak{B}} \cap A^n \) for every relation symbol \( R \).

Note: for a substructure, we need \( A \) to be closed under the functions of \( \mathfrak{B} \).

\( \mathfrak{A} \) is an elementary substructure of \( \mathfrak{B} \) (written \( \mathfrak{A} \preccurlyeq \mathfrak{B} \)) if \( \mathfrak{A} \subseteq \mathfrak{B} \) and for every formula \( \varphi(x_1,\ldots,x_n) \) and every \( a_1,\ldots,a_n \in A \): \[ \mathfrak{A} \models \varphi[a_1,\ldots,a_n] \iff \mathfrak{B} \models \varphi[a_1,\ldots,a_n]. \]

(Tarski–Vaught Test) Let \( \mathfrak{A} \subseteq \mathfrak{B} \). Then \( \mathfrak{A} \preccurlyeq \mathfrak{B} \) if and only if for every formula \( \varphi(x, y_1,\ldots,y_n) \) and every \( a_1,\ldots,a_n \in A \): if \( \mathfrak{B} \models \exists x\, \varphi[a_1,\ldots,a_n] \), then there exists \( a \in A \) such that \( \mathfrak{B} \models \varphi[a, a_1,\ldots,a_n] \).

The forward direction is clear from the definition. For the backward direction, induct on formula complexity. Atomic formulas and Boolean combinations are immediate since \( \mathfrak{A} \subseteq \mathfrak{B} \). For \( \varphi = \exists x\, \psi \): if \( \mathfrak{B} \models \exists x\, \psi[\vec{a}] \), the Tarski–Vaught condition gives \( a \in A \) with \( \mathfrak{B} \models \psi[a,\vec{a}] \), and by induction \( \mathfrak{A} \models \psi[a,\vec{a}] \), so \( \mathfrak{A} \models \exists x\, \psi[\vec{a}] \). Conversely, if \( \mathfrak{A} \models \exists x\, \psi[\vec{a}] \), then there is \( a \in A \subseteq B \) with \( \mathfrak{A} \models \psi[a,\vec{a}] \), and by induction \( \mathfrak{B} \models \psi[a,\vec{a}] \), so \( \mathfrak{B} \models \exists x\, \psi[\vec{a}] \).

Chapter 2: Proof Theory and Completeness

2.1 Formal Proof Systems

The satisfaction relation \( \models \) connects syntax (formulas as strings of symbols) with semantics (truth in structures). This connection is the heart of mathematical logic, but it raises an immediate practical problem: to check whether \( T \models \varphi \), we would need to verify \( \varphi \) in every model of \( T \) — an infinite and typically unchecked task. The purpose of a proof system is to replace semantic verification with a mechanical syntactic procedure: a finite sequence of steps, each of which can be checked algorithmically, ending in \( \varphi \). The central question is whether the proof system is complete — whether every semantic consequence is also a syntactic consequence. Gödel’s 1930 Completeness Theorem answers yes, for first-order logic.

There are several equivalent proof systems for first-order logic: Hilbert-style axiom systems, natural deduction, sequent calculus (Gentzen’s \( \mathbf{LK} \)), and tableaux. We present the Hilbert-style system following EFT, which trades ease of use for minimality of primitives.

2.1.1 The Hilbert Calculus

Fix a signature \( \sigma \). The logical axioms of the Hilbert calculus \( \mathcal{H} \) include:

Propositional axioms:

\[ \varphi \to (\psi \to \varphi) \]\[ (\varphi \to (\psi \to \chi)) \to ((\varphi \to \psi) \to (\varphi \to \chi)) \]\[ (\lnot \psi \to \lnot \varphi) \to (\varphi \to \psi) \]

Quantifier axioms (where \( t \) is free for \( x \) in \( \varphi \)):

\[ \forall x\, \varphi \to \varphi[t/x] \]\[ \varphi[t/x] \to \exists x\, \varphi \]

Quantifier distribution:

\[ \forall x\, (\varphi \to \psi) \to (\forall x\, \varphi \to \forall x\, \psi) \]\[ \forall x\, (\varphi \to \psi) \to (\exists x\, \varphi \to \exists x\, \psi) \]

Vacuous quantification (\( x \notin \mathrm{free}(\varphi) \)):

\[ \varphi \to \forall x\, \varphi \]

Equality axioms:

\[ x \doteq x \]\[ x \doteq y \to (\varphi(x) \to \varphi(y)) \quad \text{(for atomic } \varphi \text{)} \]

The sole rule of inference is Modus Ponens: from \( \varphi \) and \( \varphi \to \psi \), derive \( \psi \). There are also the generalization rule: from \( \varphi \), derive \( \forall x\, \varphi \).

A derivation (or proof) from a set of premises \( \Phi \) is a finite sequence \( \varphi_1, \varphi_2, \ldots, \varphi_n \) where each \( \varphi_i \) is either (i) a logical axiom, (ii) a member of \( \Phi \), or (iii) obtained from earlier formulas by modus ponens or generalization. We write \( \Phi \vdash \varphi \) if there exists a derivation from \( \Phi \) with last line \( \varphi \).

The generalization rule must be applied with care: \( \Phi \vdash \forall x\, \varphi \) if the last step uses generalization only when \( x \) is not free in any formula of \( \Phi \) actually used in the derivation. This restriction prevents proving \( \forall x\, x = c \) from the premise \( x = c \).

Worked Derivation 2.1: Proving \( \vdash \forall x\, P(x) \to \exists x\, P(x) \).

We exhibit a Hilbert-style derivation with no premises. Let \( P \) be a unary relation symbol.

\( \forall x\, P(x) \to P(c) \) — Quantifier axiom (instantiate \( x \) with any term \( c \)).
\( P(c) \to \exists x\, P(x) \) — Quantifier axiom (existential introduction with term \( c \)).
\( (\forall x\, P(x) \to P(c)) \to ((P(c) \to \exists x\, P(x)) \to (\forall x\, P(x) \to \exists x\, P(x))) \) — Propositional axiom (hypothetical syllogism): \( (A \to B) \to ((B \to C) \to (A \to C)) \), an instance of the second propositional axiom.
\( (P(c) \to \exists x\, P(x)) \to (\forall x\, P(x) \to \exists x\, P(x)) \) — Modus Ponens on lines 1 and 3.
\( \forall x\, P(x) \to \exists x\, P(x) \) — Modus Ponens on lines 2 and 4.

This completes the derivation. Semantically, this is obvious: if \( P(a) \) holds for every element \( a \), then in particular there exists some \( a \) (the domain is nonempty) for which \( P(a) \) holds.

Worked Derivation 2.2: Deriving \( \vdash \lnot\lnot\varphi \to \varphi \) (double negation elimination).

This is a fundamental instance of classical (as opposed to intuitionistic) logic. The key propositional axiom is the contrapositive axiom: \( (\lnot\psi \to \lnot\varphi) \to (\varphi \to \psi) \).

\( (\lnot\varphi \to \lnot\lnot\varphi) \to (\lnot\varphi \to \varphi) \to (\lnot\lnot\varphi \to \varphi) \) — Wait: let us proceed more carefully. We use the contrapositive axiom with \( \psi := \varphi \) and \( \varphi := \lnot\varphi \): \( (\lnot\varphi \to \lnot\lnot\varphi) \to (\lnot\varphi \to \varphi) \). Hmm — we need \( \lnot\varphi \to \lnot\lnot\varphi \). This is not immediate.

A cleaner route uses the contrapositive axiom \( (\lnot\psi \to \lnot\varphi) \to (\varphi \to \psi) \) with \( \psi = \varphi \) and \( \varphi = \lnot\varphi \) (substituting \( \lnot\varphi \) for the meta-variable \( \varphi \)):

\( (\lnot\varphi \to \lnot\lnot\varphi) \to (\lnot\varphi \to \varphi) \) — This is an instance of the third propositional axiom with \( \varphi := \lnot\varphi \), \( \psi := \varphi \): \( (\lnot\psi \to \lnot\varphi) \to (\varphi \to \psi) \)... Actually let us use the standard derivation for \( \lnot\lnot\varphi \to \varphi \) directly via the third propositional axiom.

The third propositional axiom is \( (\lnot\psi \to \lnot\varphi) \to (\varphi \to \psi) \). Set \( \psi := \varphi \) and treat \( \varphi \) as \( \lnot\varphi \):

\[ (\lnot\varphi \to \lnot\lnot\varphi) \to (\lnot\varphi \to \varphi). \]

We also need \( \lnot\lnot\varphi \to (\lnot\varphi \to \lnot\lnot\varphi) \) — an instance of axiom 1 with \( \varphi := \lnot\lnot\varphi \) and \( \psi := \lnot\varphi \).

\( \lnot\lnot\varphi \to (\lnot\varphi \to \lnot\lnot\varphi) \) — Axiom 1 (\( A \to (B \to A) \)).
\( (\lnot\varphi \to \lnot\lnot\varphi) \to (\lnot\varphi \to \varphi) \) — Axiom 3 instantiated as above... this requires additional steps. In full detail the derivation is approximately 10 lines. The key idea: from \( \lnot\lnot\varphi \) and the axioms, derive \( \varphi \) by using the contrapositive axiom to convert double negation to a positive statement. The derivation exists and is finite — demonstrating that classical double negation elimination is an admissible derived rule of the Hilbert calculus.

Upshot: In classical logic (with the third propositional axiom), \( \lnot\lnot\varphi \to \varphi \) is provable. In intuitionistic logic (which omits the third axiom), this is not provable — double negation elimination is precisely the principle that distinguishes classical from constructive mathematics.

2.1.2 Derived Rules and Metatheorems

(Deduction Theorem) \( \Phi \cup \{\psi\} \vdash \varphi \) if and only if \( \Phi \vdash \psi \to \varphi \), provided that \( x \) is not free in \( \psi \) whenever generalization on \( x \) is used in the derivation.

The deduction theorem allows us to reason informally with hypotheses in the usual way.

2.1.3 Natural Deduction (Brief Overview)

In Gentzen’s natural deduction system \( \mathbf{NK} \), we have introduction and elimination rules for each connective:

\( \land \)-Intro: from \( \varphi \) and \( \psi \), derive \( \varphi \land \psi \).
\( \land \)-Elim: from \( \varphi \land \psi \), derive \( \varphi \) (or \( \psi \)).
\( \to \)-Intro: if \( \varphi \) proves \( \psi \) (closing an assumption), derive \( \varphi \to \psi \).
\( \forall \)-Intro: if \( \varphi(c) \) is proved with \( c \) a fresh constant (an eigenvariable), derive \( \forall x\, \varphi(x) \).
\( \forall \)-Elim: from \( \forall x\, \varphi \), derive \( \varphi[t/x] \).

Natural deduction is more intuitive and is the basis of the Curry–Howard correspondence with type theory.

2.2 Soundness

(Soundness Theorem) If \( \Phi \vdash \varphi \), then \( \Phi \models \varphi \).

By induction on the length of the derivation. We check that each logical axiom is logically valid, i.e., true in every structure under every assignment. We then check that modus ponens and generalization preserve truth in every structure.

Logical axioms: We verify a few. For \( \varphi \to (\psi \to \varphi) \): if \( \mathfrak{A} \models \varphi[\alpha] \), then certainly \( \mathfrak{A} \models \psi \to \varphi[\alpha] \) regardless of \( \psi \). The quantifier axiom \( \forall x\, \varphi \to \varphi[t/x] \): if \( \mathfrak{A} \models \forall x\, \varphi[\alpha] \), then in particular substituting the value of \( t \) under \( \alpha \) gives \( \mathfrak{A} \models \varphi[t^{\mathfrak{A}}[\alpha]/x][\alpha] \). The equality axiom \( x \doteq y \to (\varphi(x) \to \varphi(y)) \) follows from Leibniz’s law.

Modus Ponens: If \( \mathfrak{A} \models \Phi \), \( \mathfrak{A} \models \varphi[\alpha] \), and \( \mathfrak{A} \models (\varphi \to \psi)[\alpha] \), then \( \mathfrak{A} \models \psi[\alpha] \).

Generalization: Suppose \( \Phi \vdash \varphi \) with \( x \notin \mathrm{free}(\psi) \) for any \( \psi \in \Phi \) used. By the induction hypothesis, \( \Phi \models \varphi \). Given \( \mathfrak{A} \models \Phi \) and any \( a \in A \): the assignment \( \alpha(x|a) \) still satisfies \( \Phi \) (since \( x \) is not free in \( \Phi \)), so \( \mathfrak{A} \models \varphi[\alpha(x|a)] \). Since \( a \) was arbitrary, \( \mathfrak{A} \models \forall x\, \varphi[\alpha] \).

Corollary. If \( \varphi \) is provable (\( \emptyset \vdash \varphi \)), then \( \varphi \) is logically valid. In particular, if \( T \vdash \varphi \) and \( T \) has a model, then \( \varphi \) is true in that model.

Corollary (Consistency). If \( T \) has a model, then \( T \) is consistent — i.e., \( T \not\vdash \bot \).

2.3 The Completeness Theorem

One of the most surprising and powerful results in all of mathematics is Gödel’s Completeness Theorem. It establishes that the syntactic machinery of first-order logic — a finite list of axiom schemas and two rules of inference — is exactly powerful enough to capture semantic truth in all structures. There is no gap between what is true in every model and what is provable. This result was far from obvious: the proof system looks weak (only finitely many axiom schemas, no mention of any specific structure), yet it can derive everything that holds in every conceivable interpretation. The proof, via the Henkin construction, is itself a masterpiece: it builds a model directly out of the proof system, turning syntactic objects (terms and formulas) into semantic objects (elements and truth values).

(Gödel's Completeness Theorem, 1930) Let \( \sigma \) be a countable signature and \( \Phi \) a set of \( \sigma \)-sentences. For any sentence \( \varphi \): \[ \Phi \models \varphi \iff \Phi \vdash \varphi. \]

Equivalently, every consistent set of sentences has a model.

We prove the completeness theorem via the Henkin method, constructing a canonical model from the proof system itself.

2.3.1 Henkin Constants and Extensions

A set \( \Phi \) of sentences is consistent if \( \Phi \not\vdash \bot \). It is maximally consistent if it is consistent and no proper extension (adding any sentence not already in \( \Phi \)) remains consistent.

A maximally consistent set \( \Phi \) has the witness property if for every formula \( \varphi(x) \) and sentence \( \exists x\, \varphi(x) \in \Phi \), there exists a constant \( c \) (a Henkin witness) such that \( \varphi(c) \in \Phi \).

Step 1: Extend the signature. Given a consistent \( \Phi \) in signature \( \sigma \), form \( \sigma' = \sigma \cup \{c_\varphi : \varphi(x) \in \mathrm{For}(\sigma)\} \) — adding one new constant for each formula with one free variable.

Step 2: Add Henkin axioms. Let \( \Phi' = \Phi \cup \{\exists x\, \varphi(x) \to \varphi(c_\varphi) : \varphi(x) \in \mathrm{For}(\sigma)\} \). One verifies that \( \Phi' \) is still consistent (adding a finitely many axioms at a time preserves consistency since inconsistency requires a finite derivation).

Step 3: Maximal extension (Lindenbaum’s lemma). Enumerate all \( \sigma' \)-sentences as \( \psi_0, \psi_1, \psi_2, \ldots \) (possible since \( \sigma' \) is countable). Define:

\[ \Phi_0 = \Phi', \quad \Phi_{n+1} = \begin{cases} \Phi_n \cup \{\psi_n\} & \text{if this is consistent,} \\ \Phi_n \cup \{\lnot \psi_n\} & \text{otherwise.} \end{cases} \]

Set \( \Delta = \bigcup_{n} \Phi_n \). Then \( \Delta \) is maximally consistent and has the witness property.

(Lindenbaum's Lemma) Every consistent set of sentences can be extended to a maximally consistent set.

2.3.2 The Canonical (Term) Model

From \( \Delta \) (maximally consistent with witness property), build the canonical structure \( \mathfrak{A}_\Delta \):

Universe: the set of closed \( \sigma' \)-terms, modulo the equivalence \( s \sim t \) iff \( (s \doteq t) \in \Delta \). Write \( [t] \) for the equivalence class of \( t \).
Functions: \( f^{\mathfrak{A}_\Delta}([t_1],\ldots,[t_n]) = [f(t_1,\ldots,t_n)] \) (well-defined since \( \Delta \) contains the relevant equality axioms).
Relations: \( ([t_1],\ldots,[t_n]) \in R^{\mathfrak{A}_\Delta} \) iff \( R(t_1,\ldots,t_n) \in \Delta \).

(Truth Lemma) For every closed \( \sigma' \)-formula \( \varphi \): \[ \mathfrak{A}_\Delta \models \varphi \iff \varphi \in \Delta. \]

By induction on \( \varphi \). Atomic: \( \mathfrak{A}_\Delta \models R(t_1,\ldots,t_n) \) iff \( R(t_1,\ldots,t_n) \in \Delta \) by definition. Equality similarly. \( \lnot \): Since \( \Delta \) is maximally consistent, \( \lnot \varphi \in \Delta \) iff \( \varphi \notin \Delta \). Conjunction: \( (\varphi \land \psi) \in \Delta \) iff both \( \varphi \in \Delta \) and \( \psi \in \Delta \) (since \( \Delta \) is maximal and consistent). Existential: \( \exists x\, \varphi(x) \in \Delta \) iff (by the witness property) \( \varphi(c_{\varphi}) \in \Delta \), and by induction \( \mathfrak{A}_\Delta \models \varphi(c_\varphi) \), so \( \mathfrak{A}_\Delta \models \exists x\, \varphi(x) \).

Proof of Completeness. Suppose \( \Phi \not\vdash \varphi \). Then \( \Phi \cup \{\lnot \varphi\} \) is consistent. Extend to \( \Delta \) (maximally consistent with witness property) in \( \sigma' \supseteq \sigma \). The canonical model \( \mathfrak{A}_\Delta \) satisfies \( \Phi \) (since \( \Phi \subseteq \Delta \)) and satisfies \( \lnot \varphi \) (since \( \lnot \varphi \in \Delta \)), so \( \Phi \not\models \varphi \). By contrapositive: \( \Phi \models \varphi \Rightarrow \Phi \vdash \varphi \). \(\square\)

2.4 Compactness

(Compactness Theorem) A set \( \Phi \) of sentences is satisfiable if and only if every finite subset of \( \Phi \) is satisfiable.

One direction is trivial: if \( \Phi \) has a model \( \mathfrak{A} \), then so does every finite subset. For the other direction: if every finite subset of \( \Phi \) is satisfiable, then every finite subset is consistent (by soundness). Hence \( \Phi \) itself is consistent (since any derivation of \( \bot \) from \( \Phi \) uses only finitely many premises). By completeness, \( \Phi \) has a model.

Remark. The compactness theorem can also be proved directly via ultraproducts (see Chapter 6), giving an independent proof that does not go through the proof system.

2.4.1 Applications of Compactness

Example 1: Non-standard models of arithmetic. Let \( T = \mathrm{Th}(\mathbb{N}) \) be the complete theory of the standard natural numbers, in the signature \( (0, S, +, \cdot, <) \). Add a new constant symbol \( c \) and the sentences \( \{c > \overline{n} : n \in \mathbb{N}\} \) (where \( \overline{n} = S^n(0) \)). Every finite subset of this extended set is satisfiable (interpret \( c \) as a sufficiently large natural number). By compactness, there is a model where \( c \) is “infinitely large”. This is a non-standard model of arithmetic.

Example 2: Torsion groups. Let \( T_{\mathrm{grp}} \) be the group axioms and suppose \( T_{\mathrm{grp}} \models \varphi \) for every sentence \( \varphi \) of the form “every element has finite order”. Then the class of groups in which every element has finite order (torsion groups) cannot be axiomatized by a first-order theory alone — a compactness argument shows any axiomatization has models with elements of infinite order.

Example 3: Finite fields have characteristic-zero extensions. The theory \( T_{\mathrm{ACF}_0} \) of algebraically closed fields of characteristic 0 (see Chapter 3) is the limit of theories \( T_{\mathrm{ACF}_p} \) in the following sense: if \( \varphi \) holds in all \( \mathrm{ACF}_p \) for all sufficiently large primes \( p \), then \( \varphi \) holds in \( \mathrm{ACF}_0 \). This follows from compactness by adding axioms \( \mathrm{char} \neq p \) for each prime \( p \).

Worked Example 2.3 (Compactness and graph coloring). We use compactness to transfer a global property to a finitistic one.

Theorem. Let \( G \) be an infinite graph such that every finite subgraph of \( G \) is \( k \)-colorable (properly colorable with \( k \) colors). Then \( G \) itself is \( k \)-colorable.

Proof via Compactness. For each vertex \( v \in V(G) \), introduce a constant symbol \( c_v \). Let \( \sigma \) be the signature with constant symbols \( \{c_v : v \in V(G)\} \) and unary relation symbols \( \{C_1, \ldots, C_k\} \) (representing the \( k \) color classes). Form the theory \( T \) consisting of:

For each vertex \( v \): \( C_1(c_v) \lor C_2(c_v) \lor \cdots \lor C_k(c_v) \) (every vertex gets at least one color).
For each vertex \( v \) and each pair \( i \neq j \): \( \lnot(C_i(c_v) \land C_j(c_v)) \) (at most one color).
For each edge \( \{u,v\} \in E(G) \) and each \( i \): \( \lnot(C_i(c_u) \land C_i(c_v)) \) (no monochromatic edge).

A model of \( T \) is exactly a proper \( k \)-coloring of \( G \). Any finite subset \( T_0 \subseteq T \) involves only finitely many constant symbols \( c_{v_1}, \ldots, c_{v_m} \) and hence only finitely many edge constraints. The induced subgraph on \( \{v_1,\ldots,v_m\} \) is a finite subgraph of \( G \), which by assumption is \( k \)-colorable. That coloring gives a model of \( T_0 \). By compactness, \( T \) has a model — a proper \( k \)-coloring of all of \( G \). \(\square\)

This result is significant: it means that for testing \( k \)-colorability, global infinite structure reduces to finitary local structure. It is equivalent to the fact that \( k \)-colorability is a finitary property — expressible by a first-order theory.

Worked Example 2.4 (Non-standard analysis via compactness). One of the most striking applications of compactness is the construction of Abraham Robinson's non-standard real numbers.

Claim. There exists an ordered field extension \( {}^*\!\mathbb{R} \supset \mathbb{R} \) satisfying all first-order sentences true in \( \mathbb{R} \) (the transfer principle), yet containing a positive infinitesimal \( \varepsilon \) — an element with \( 0 < \varepsilon < \frac{1}{n} \) for every \( n \in \mathbb{N} \).

Construction via Compactness. Work in the signature \( \sigma_{\mathrm{of}} \cup \{c_r : r \in \mathbb{R}\} \cup \{\varepsilon\} \), where each \( c_r \) is a constant naming the real number \( r \), and \( \varepsilon \) is a new constant. Form the theory

\[ T = \mathrm{Th}(\mathbb{R}, r)_{r \in \mathbb{R}} \cup \{\varepsilon > 0\} \cup \left\{\varepsilon < c_{1/n} : n \in \mathbb{N}, n \geq 1\right\}. \]

Here \( \mathrm{Th}(\mathbb{R}, r)_{r \in \mathbb{R}} \) is the theory of \( \mathbb{R} \) in the expanded language with a constant for each real number. For any finite subset \( T_0 \subseteq T \), the constraints on \( \varepsilon \) are finitely many: \( \varepsilon < c_{1/n_1}, \ldots, \varepsilon < c_{1/n_k} \) for some \( n_1, \ldots, n_k \). Setting \( \varepsilon := \frac{1}{2\max(n_i)} \) gives a real number satisfying \( T_0 \). By compactness, \( T \) has a model \( {}^*\!\mathbb{R} \). The element \( \varepsilon^{{}^*\!\mathbb{R}} \) is a positive infinitesimal. Since \( {}^*\!\mathbb{R} \models \mathrm{Th}(\mathbb{R}, r)_{r \in \mathbb{R}} \), every first-order statement true in \( \mathbb{R} \) remains true in \( {}^*\!\mathbb{R} \) — the transfer principle. This is the logical foundation for Robinson’s non-standard analysis (1966), which gives a rigorous treatment of Leibniz’s infinitesimals.

Chapter 3: Model Theory

3.1 Theories and Elementary Equivalence

The complete theory (or theory) of a structure \( \mathfrak{A} \) is \[ \mathrm{Th}(\mathfrak{A}) = \{\varphi : \varphi \text{ is a sentence and } \mathfrak{A} \models \varphi\}. \]

Two structures \( \mathfrak{A} \) and \( \mathfrak{B} \) are elementarily equivalent (written \( \mathfrak{A} \equiv \mathfrak{B} \)) if \( \mathrm{Th}(\mathfrak{A}) = \mathrm{Th}(\mathfrak{B}) \).

Isomorphic structures are elementarily equivalent (Isomorphism Theorem), but the converse fails: \( (\mathbb{Q}, <) \equiv (\mathbb{R}, <) \) since both are dense linear orders without endpoints, but they are not isomorphic (different cardinalities).

A theory \( T \) is complete if for every sentence \( \varphi \), either \( T \vdash \varphi \) or \( T \vdash \lnot \varphi \) (equivalently, \( T \models \varphi \) or \( T \models \lnot \varphi \)). Equivalently, all models of \( T \) are elementarily equivalent.

3.2 Löwenheim–Skolem Theorems

(Downward Löwenheim–Skolem Theorem) Let \( \mathfrak{A} \) be a \( \sigma \)-structure with \( |A| \geq \aleph_0 \) and let \( X \subseteq A \) with \( |X| \leq \kappa \) for some infinite cardinal \( \kappa \geq |\sigma| \). Then there is an elementary substructure \( \mathfrak{B} \preccurlyeq \mathfrak{A} \) with \( X \subseteq B \) and \( |B| \leq \kappa \).

We build \( B \) using a Skolem hull construction. Start with \( B_0 = X \cup \{f^{\mathfrak{A}}(\vec{a}) : f \in F, \vec{a} \in X^{\mathrm{ar}(f)}\} \) — the closure of \( X \) under all function symbols. Iterate: at each step \( n \), for each formula \( \varphi(x, \vec{y}) \) and each tuple \( \vec{a} \in B_n^{|\vec{y}|} \), if \( \mathfrak{A} \models \exists x\, \varphi(x, \vec{a}) \), pick a witness \( b \in A \) and add it to \( B_{n+1} \). Set \( B = \bigcup_n B_n \). The Tarski–Vaught criterion is satisfied by construction: any existential witness required is present. The size \( |B| \leq \max(|X|, |\sigma|, \aleph_0) = \kappa \) since at each step we add at most \( |\mathrm{For}(\sigma)| \cdot |B_n| = \kappa \) many elements.

(Upward Löwenheim–Skolem Theorem) Let \( \mathfrak{A} \) be an infinite \( \sigma \)-structure. Then for every cardinal \( \lambda \geq |A| + |\sigma| + \aleph_0 \), there exists \( \mathfrak{B} \equiv \mathfrak{A} \) with \( |B| = \lambda \).

Add \( \lambda \) new constant symbols \( \{c_i : i < \lambda\} \) to \( \sigma \). Expand the theory to \( T' = \mathrm{Th}(\mathfrak{A}) \cup \{c_i \neq c_j : i < j < \lambda\}. \) Every finite subset of \( T' \) is satisfiable (interpret finitely many \( c_i \) as distinct elements of the infinite domain \( A \)). By compactness, \( T' \) has a model \( \mathfrak{B}' \). The reduct \( \mathfrak{B} = \mathfrak{B}' \upharpoonright \sigma \) is a model of \( \mathrm{Th}(\mathfrak{A}) \) with \( |B| \geq \lambda \). Taking an elementary substructure of cardinality exactly \( \lambda \) (using the downward theorem) completes the proof.

Corollary (Skolem’s Paradox). The theory ZFC has a countable model (by the downward Löwenheim–Skolem theorem), even though it proves the existence of uncountable sets. There is no contradiction: the model is countable from the outside, but “sees” itself as uncountable from the inside (no internal bijection with \( \omega \) exists within the model).

3.3 Vaught’s Test

(Vaught's Test) Let \( T \) be a satisfiable theory with no finite models. Suppose that all models of \( T \) of some infinite cardinality \( \kappa \geq |\sigma| + \aleph_0 \) are isomorphic. Then \( T \) is complete.

Suppose \( T \) is not complete. Then there is a sentence \( \varphi \) such that both \( T \cup \{\varphi\} \) and \( T \cup \{\lnot\varphi\} \) are satisfiable. By Löwenheim–Skolem, both have models of cardinality \( \kappa \). But these models satisfy different sentences, hence are not isomorphic — contradicting \( \kappa \)-categoricity.

Example. The theory \(\mathrm{DLO}\) of dense linear orders without endpoints (axioms: linear order, density \( \forall x \forall y (x < y \to \exists z (x < z \land z < y)) \), no endpoints) is \( \aleph_0 \)-categorical. All countable dense linear orders without endpoints are isomorphic to \( (\mathbb{Q}, <) \) by a back-and-forth argument (Cantor, 1895). By Vaught’s test, \(\mathrm{DLO}\) is complete.

3.3.1 Back-and-Forth Systems

Let \( \mathfrak{A}, \mathfrak{B} \) be countable \( \sigma \)-structures. A back-and-forth system is a nonempty family \( \mathcal{I} \) of finite partial isomorphisms (injections \( p : A_0 \to B_0 \) with \( A_0 \subseteq A, B_0 \subseteq B \) preserving all atomic formulas) such that:

(Forth) For every \( p \in \mathcal{I} \) and \( a \in A \), there exists \( q \in \mathcal{I} \) extending \( p \) with \( a \in \mathrm{dom}(q) \).
(Back) For every \( p \in \mathcal{I} \) and \( b \in B \), there exists \( q \in \mathcal{I} \) extending \( p \) with \( b \in \mathrm{ran}(q) \).

If a back-and-forth system exists, then \( \mathfrak{A} \cong \mathfrak{B} \) (when both are countable).

3.4 Algebraically Closed Fields

A field \( K \) is algebraically closed if every nonconstant polynomial \( f \in K[x] \) has a root in \( K \).

The theory \(\mathrm{ACF}\) of algebraically closed fields is axiomatized in the language \( \{+, \cdot, -, 0, 1\} \) by:

Field axioms.
For each \( n \geq 1 \): \( \forall a_0 \cdots a_n\, \exists x\, (x^n + a_{n-1}x^{n-1} + \cdots + a_0 = 0) \) (every monic polynomial of degree \( n \) has a root).

For a prime \( p \), the theory \(\mathrm{ACF}_p = \mathrm{ACF} \cup \{\underbrace{1+\cdots+1}_{p} = 0\}\). The theory \(\mathrm{ACF}_0 = \mathrm{ACF} \cup \{n \cdot 1 \neq 0 : n \geq 1\}\).

For each prime \( p \) and for \( p = 0 \), the theory \( \mathrm{ACF}_p \) is complete and is \( \kappa \)-categorical for every uncountable cardinal \( \kappa \).

Two algebraically closed fields of characteristic \( p \) with the same uncountable transcendence degree over their prime field are isomorphic (classical algebra). For uncountable \( \kappa \), all models of \( \mathrm{ACF}_p \) of cardinality \( \kappa \) have transcendence degree \( \kappa \) over the prime field, so they are all isomorphic. By Vaught's test (applied to any uncountable \( \kappa \)), \( \mathrm{ACF}_p \) is complete.

Corollary (Chevalley–Tarski). A statement expressible as a first-order sentence in the language of fields is true in \( \mathbb{C} \) if and only if it is true in every algebraically closed field of characteristic 0. More strikingly: a first-order sentence true in \( \mathbb{C} \) is also true in \( \overline{\mathbb{F}_p} \) for all sufficiently large primes \( p \).

Example: Ax’s theorem. If a polynomial map \( f : \mathbb{C}^n \to \mathbb{C}^n \) is injective, then it is surjective. (This is false over \( \mathbb{R} \).) This can be proved by first showing it over finite fields \( \mathbb{F}_q \) (where injectivity trivially implies surjectivity), then transferring via compactness and completeness of \( \mathrm{ACF}_0 \).

3.5 Quantifier Elimination

A theory \( T \) admits quantifier elimination if for every formula \( \varphi(\vec{x}) \) there is a quantifier-free formula \( \psi(\vec{x}) \) such that \( T \models \varphi(\vec{x}) \leftrightarrow \psi(\vec{x}) \).

(Tarski) \( \mathrm{ACF}_p \) admits quantifier elimination. Consequently, the definable sets in an algebraically closed field are exactly the constructible sets (Boolean combinations of algebraic varieties).

(Tarski) The theory \( \mathrm{RCF} \) of real closed fields (ordered fields in which every positive element has a square root and every odd-degree polynomial has a root) admits quantifier elimination in the language \( \{+, \cdot, -, 0, 1, <\} \). In particular, the definable sets in \( \mathbb{R} \) are exactly the semialgebraic sets.

Corollary. Since \( \mathrm{RCF} \) admits quantifier elimination and is complete (as \( \mathbb{R} \) is a real closed field), the first-order theory of \( \mathbb{R} \) (as an ordered field) is decidable. This is in sharp contrast to the theory of \( \mathbb{N} \), which is undecidable (Gödel).

3.6 Types

Let \( T \) be a complete theory and \( A \subseteq M \models T \). An \( n \)-type over \( A \) is a set \( p(\vec{x}) \) of formulas with parameters from \( A \) such that \( T \cup \mathrm{diagram}(A) \cup p(\vec{x}) \) is satisfiable. The type is complete if for every formula \( \varphi(\vec{x}) \) over \( A \), either \( \varphi \in p \) or \( \lnot \varphi \in p \).

The type space \( S_n(A) \) is the set of complete \( n \)-types over \( A \), topologized by the basis \( [\varphi] = \{p \in S_n(A) : \varphi \in p\} \). This is the Stone space of the Lindenbaum–Tarski algebra.

A type \( p \in S_n(M) \) is realized in \( M \) if there is \( \vec{a} \in M^n \) with \( M \models \varphi(\vec{a}) \) for all \( \varphi \in p \). A model \( M \) is \( \omega \)-saturated if every type over a finite subset of \( M \) is realized in \( M \).

3.7 Fraïssé Theory

Fraïssé’s theorem provides a way to build canonical countable structures as limits of finite approximations.

Let \( \mathcal{K} \) be a class of finite \( \sigma \)-structures. We say \( \mathcal{K} \) is a Fraïssé class if it satisfies:

Hereditary property (HP): If \( \mathfrak{A} \in \mathcal{K} \) and \( \mathfrak{B} \) embeds into \( \mathfrak{A} \), then \( \mathfrak{B} \in \mathcal{K} \).
Joint embedding property (JEP): For any \( \mathfrak{A}, \mathfrak{B} \in \mathcal{K} \), there exists \( \mathfrak{C} \in \mathcal{K} \) into which both embed.
Amalgamation property (AP): For any \( \mathfrak{A}, \mathfrak{B}, \mathfrak{C} \in \mathcal{K} \) and embeddings \( f : \mathfrak{A} \to \mathfrak{B} \), \( g : \mathfrak{A} \to \mathfrak{C} \), there exists \( \mathfrak{D} \in \mathcal{K} \) and embeddings \( f' : \mathfrak{B} \to \mathfrak{D} \), \( g' : \mathfrak{C} \to \mathfrak{D} \) with \( f' \circ f = g' \circ g \).
Essential countability: \( \mathcal{K} \) has countably many isomorphism types.

(Fraïssé's Theorem) For every Fraïssé class \( \mathcal{K} \), there exists a unique (up to isomorphism) countable structure \( \mathfrak{F} \), called the Fraïssé limit of \( \mathcal{K} \), such that:

Every finite substructure of \( \mathfrak{F} \) is in \( \mathcal{K} \).
Every \( \mathfrak{A} \in \mathcal{K} \) embeds into \( \mathfrak{F} \).
\( \mathfrak{F} \) is homogeneous: every isomorphism between finite substructures of \( \mathfrak{F} \) extends to an automorphism of \( \mathfrak{F} \).

Examples of Fraïssé limits.

The Fraïssé limit of the class of all finite linear orders is \( (\mathbb{Q}, <) \).
The Fraïssé limit of the class of all finite graphs is the random graph (Rado graph), which is \( \aleph_0 \)-categorical.
The Fraïssé limit of the class of all finite sets (empty signature) is a countably infinite set.
The Fraïssé limit of finite triangle-free graphs is the Henson graph \( H_3 \).

Chapter 4: Set Theory

4.1 Motivation and the Cumulative Hierarchy

Cantor’s paradise of infinite sets requires an axiomatic foundation to avoid paradoxes (Russell, Burali-Forti). The Zermelo–Fraenkel axioms with Choice (ZFC) provide the standard foundation. All mathematical objects are sets; the only primitive relation is \( \in \) (membership).

The cumulative hierarchy is the universe of sets built inductively:

\[ V_0 = \emptyset, \quad V_{\alpha+1} = \mathcal{P}(V_\alpha), \quad V_\lambda = \bigcup_{\alpha < \lambda} V_\alpha \text{ (for limit } \lambda). \]

The universe of all sets is \( V = \bigcup_\alpha V_\alpha \). Every set \( x \) belongs to some \( V_\alpha \); the least such \( \alpha \) is the rank of \( x \).

4.2 The ZFC Axioms

All axioms are sentences in the language \( \{\in\} \):

1. Extensionality:

\[ \forall x\, \forall y\, (\forall z\, (z \in x \leftrightarrow z \in y) \to x = y). \]

A set is determined by its elements.

2. Empty Set:

\[ \exists x\, \forall y\, \lnot (y \in x). \]

There is an empty set \( \emptyset \).

3. Pairing:

\[ \forall x\, \forall y\, \exists z\, \forall w\, (w \in z \leftrightarrow w = x \lor w = y). \]

For any \( x, y \) there is the set \( \{x, y\} \).

4. Union:

\[ \forall \mathcal{F}\, \exists A\, \forall x\, (x \in A \leftrightarrow \exists B\, (B \in \mathcal{F} \land x \in B)). \]

For any family \( \mathcal{F} \), the union \( \bigcup \mathcal{F} \) exists.

5. Power Set:

\[ \forall x\, \exists y\, \forall z\, (z \in y \leftrightarrow z \subseteq x). \]

The power set \( \mathcal{P}(x) \) exists.

6. Separation Schema (Aussonderung): For each formula \( \varphi(x, p) \):

\[ \forall A\, \forall p\, \exists B\, \forall x\, (x \in B \leftrightarrow x \in A \land \varphi(x, p)). \]

We can form \( \{x \in A : \varphi(x, p)\} \).

7. Replacement Schema (Fraenkel): For each formula \( \varphi(x, y, p) \):

\[ \forall A\, \forall p\, \left(\forall x \in A\, \exists! y\, \varphi(x,y,p)\right) \to \exists B\, \forall x \in A\, \exists y \in B\, \varphi(x,y,p). \]

The image of a set under a definable function is a set.

8. Infinity:

\[ \exists x\, (\emptyset \in x \land \forall y \in x\, (y \cup \{y\} \in x)). \]

An inductive set exists (its smallest element is \( \omega = \mathbb{N} \)).

9. Foundation (Regularity):

\[ \forall x\, (x \neq \emptyset \to \exists y \in x\, (y \cap x = \emptyset)). \]

Every nonempty set has a \( \in \)-minimal element. Equivalently: there is no infinite descending \( \in \)-chain.

10. Choice (AC):

\[ \forall \mathcal{F}\, \left(\emptyset \notin \mathcal{F} \to \exists f\, (f : \mathcal{F} \to \bigcup \mathcal{F} \land \forall A \in \mathcal{F}\, f(A) \in A)\right). \]

Every family of nonempty sets has a choice function.

4.3 Ordinals

A set \( \alpha \) is an ordinal if:

\( \alpha \) is transitive: \( \forall x \in \alpha\, x \subseteq \alpha \).
\( \in \) is a strict linear order on \( \alpha \).

(By foundation, this order is well-founded.) We write \( \mathbf{ON} \) for the class of all ordinals.

Basic facts about ordinals:

Every element of an ordinal is an ordinal.
For ordinals \( \alpha, \beta \): \( \alpha < \beta \iff \alpha \in \beta \iff \alpha \subsetneq \beta \).
Every well-ordered set is order-isomorphic to a unique ordinal.
The smallest ordinals are \( 0 = \emptyset, 1 = \{0\}, 2 = \{0,1\}, \ldots, \omega = \{0,1,2,\ldots\}, \omega+1 = \omega \cup \{\omega\}, \ldots \)
Successor ordinals: \( \alpha + 1 = \alpha \cup \{\alpha\} \).
Limit ordinals: \( \lambda \neq 0 \) with no predecessor; \( \lambda = \sup_{\alpha < \lambda} \alpha = \bigcup \lambda \).

(Transfinite Induction) Let \( P(\alpha) \) be a property of ordinals. If \( P(0) \), and whenever \( P(\beta) \) for all \( \beta < \alpha \) implies \( P(\alpha) \), then \( P(\alpha) \) holds for all ordinals \( \alpha \).

(Transfinite Recursion) For any operation \( G : V \to V \), there is a unique function \( F : \mathbf{ON} \to V \) such that \( F(\alpha) = G(F \upharpoonright \alpha) \) for all ordinals \( \alpha \).

4.3.1 Ordinal Arithmetic

Ordinal addition, multiplication, and exponentiation are defined by transfinite recursion:

\( \alpha + 0 = \alpha \), \( \alpha + (\beta+1) = (\alpha + \beta) + 1 \), \( \alpha + \lambda = \sup_{\beta<\lambda}(\alpha+\beta) \) for limit \( \lambda \).
\( \alpha \cdot 0 = 0 \), \( \alpha \cdot (\beta+1) = (\alpha \cdot \beta) + \alpha \), \( \alpha \cdot \lambda = \sup_{\beta<\lambda}(\alpha \cdot \beta) \) for limit \( \lambda \).

Warning: Ordinal arithmetic is not commutative. For example, \( 1 + \omega = \omega \neq \omega + 1 \).

4.3.2 Cantor Normal Form

(Cantor Normal Form) Every nonzero ordinal \( \alpha \) can be written uniquely in the form \[ \alpha = \omega^{\beta_1} \cdot k_1 + \omega^{\beta_2} \cdot k_2 + \cdots + \omega^{\beta_n} \cdot k_n \]

where \( \beta_1 > \beta_2 > \cdots > \beta_n \) are ordinals and \( k_1, k_2, \ldots, k_n \) are positive integers.

4.4 Cardinals

A cardinal is an ordinal \( \kappa \) such that \( |\kappa| \neq |\alpha| \) for all \( \alpha < \kappa \) (i.e., no smaller ordinal has the same cardinality). Under AC, every set is in bijection with a unique cardinal.

The sequence of infinite cardinals: \( \aleph_0 = \omega < \aleph_1 < \aleph_2 < \cdots \). For limit ordinals \( \lambda \): \( \aleph_\lambda = \sup_{\alpha<\lambda} \aleph_\alpha \).

(Cantor's Theorem) For every set \( X \): \( |X| < |\mathcal{P}(X)| \).

There is no surjection from \( X \) onto \( \mathcal{P}(X) \). Suppose \( f : X \to \mathcal{P}(X) \) is any function. Define \( D = \{x \in X : x \notin f(x)\} \in \mathcal{P}(X) \). For any \( x \in X \): if \( x \in D \) then \( x \notin f(x) \) so \( f(x) \neq D \); if \( x \notin D \) then \( x \in f(x) \) so \( f(x) \neq D \). Thus \( D \notin \mathrm{ran}(f) \), so \( f \) is not surjective.

Corollary. The cardinals form a proper class: \( \aleph_0 < 2^{\aleph_0} < 2^{2^{\aleph_0}} < \cdots \)

4.4.1 Cardinal Arithmetic

Under AC:

\( \kappa + \lambda = \lambda + \kappa = \kappa \cdot \lambda = \kappa \cdot \lambda = \max(\kappa, \lambda) \) for infinite cardinals.
Cardinal exponentiation \( \kappa^\lambda \) is more subtle. The continuum hypothesis (CH) states \( 2^{\aleph_0} = \aleph_1 \). Gödel showed CH is consistent with ZFC (1938); Cohen showed its negation is also consistent (1963). Thus CH is independent of ZFC.

4.4.2 Cofinality

The cofinality \( \mathrm{cf}(\kappa) \) of a cardinal \( \kappa \) is the least cardinality of a cofinal subset: a subset \( X \subseteq \kappa \) such that \( \sup X = \kappa \). A cardinal is regular if \( \mathrm{cf}(\kappa) = \kappa \), and singular otherwise.

Facts: \( \aleph_0 \) and all successor cardinals \( \aleph_{\alpha+1} \) are regular. \( \aleph_\omega \) is singular with \( \mathrm{cf}(\aleph_\omega) = \aleph_0 \).

Chapter 5: Computability Theory and Gödel’s Incompleteness Theorems

5.0 Historical Context: Hilbert’s Entscheidungsproblem and Turing’s Answer

In 1928, David Hilbert posed the Entscheidungsproblem (decision problem): is there a mechanical procedure — an algorithm — that, given any first-order sentence, determines whether it is logically valid (true in all structures)? The question crystallized Hilbert’s hope that mathematics was not only complete and consistent, but also decidable: any mathematical question, in principle, could be resolved by a sufficiently systematic calculation. In the years before 1936, the question was open. There was not even a rigorous definition of “algorithm” or “mechanical procedure”. Both Alan Turing and Alonzo Church independently answered the Entscheidungsproblem in the negative in 1936 — and in doing so, gave the world a precise definition of computability.

Alan Turing’s approach was to define an abstract model of a computing machine: a finite control unit reading and writing symbols on an infinite tape, moving left or right one step at a time. The Turing machine was simple enough to reason about mathematically, yet powerful enough to simulate any algorithmic computation. Turing showed that no Turing machine can decide whether an arbitrary Turing machine halts on a given input — the Halting Problem is undecidable. As a consequence, the set of logically valid first-order sentences (the set of tautologies of predicate logic) is also undecidable: there is no algorithm that correctly classifies every first-order sentence as valid or not. Alonzo Church arrived at the same conclusion via the lambda calculus and his definition of “effectively computable” functions.

The Church–Turing thesis states that every effectively computable function (in the informal, intuitive sense) is Turing-computable. This is not a theorem — it cannot be proved from within mathematics — but it is universally accepted as capturing the true boundary of computation. Together with Gödel’s incompleteness theorems, Turing’s result permanently transformed our understanding of the limits of mathematical knowledge: not only is there no complete formal system for arithmetic (Gödel), but there is also no algorithm that can decide what is provable in even a simple formal system (Turing). The two results are deeply related, as we shall see.

5.1 Primitive Recursive and Recursive Functions

The class of primitive recursive functions \( f : \mathbb{N}^k \to \mathbb{N} \) is the smallest class containing:

The zero function \( Z(n) = 0 \).
The successor function \( S(n) = n+1 \).
The projection functions \( \pi^k_i(n_1,\ldots,n_k) = n_i \).

and closed under:

Composition: if \( g : \mathbb{N}^m \to \mathbb{N} \) and \( h_1,\ldots,h_m : \mathbb{N}^k \to \mathbb{N} \) are primitive recursive, so is \( f(\vec{n}) = g(h_1(\vec{n}),\ldots,h_m(\vec{n})) \).
Primitive recursion: if \( g : \mathbb{N}^k \to \mathbb{N} \) and \( h : \mathbb{N}^{k+2} \to \mathbb{N} \) are primitive recursive, then so is \( f \) defined by \( f(\vec{n}, 0) = g(\vec{n}) \) and \( f(\vec{n}, m+1) = h(\vec{n}, m, f(\vec{n}, m)) \).

Examples. Addition, multiplication, exponentiation, the predecessor function, and all bounded quantifications are primitive recursive. The Ackermann function is recursive but not primitive recursive.

Worked Example 5.1 (Addition is primitive recursive). We show that \( \mathrm{add}(m, n) = m + n \) is primitive recursive by exhibiting the explicit recursion schema.

Define:

\[ \mathrm{add}(m, 0) = m = \pi^1_1(m), \]\[ \mathrm{add}(m, n+1) = S(\mathrm{add}(m, n)). \]

In terms of the primitive recursion schema: \( g(m) = \pi^1_1(m) \) (the projection function, hence primitive recursive) and \( h(m, n, p) = S(\pi^3_3(m, n, p)) = S(p) \) (composition of \( S \) and a projection, hence primitive recursive). The primitive recursion schema then gives \( \mathrm{add}(m, n) \) as primitive recursive.

Multiplication builds on addition:

\[ \mathrm{mult}(m, 0) = 0 = Z(m), \]\[ \mathrm{mult}(m, n+1) = \mathrm{add}(\mathrm{mult}(m, n), m). \]

Here \( g(m) = Z(m) \) (zero function) and \( h(m, n, p) = \mathrm{add}(p, m) \) (addition applied to the two projections). Since addition is already shown primitive recursive and \( h \) is a composition of primitive recursive functions, multiplication is primitive recursive.

Exponentiation:} \( m^n \) is defined similarly:}

\[ \mathrm{exp}(m, 0) = 1, \]\[ \mathrm{exp}(m, n+1) = \mathrm{mult}(\mathrm{exp}(m, n), m). \]
This is again a primitive recursion schema, so \( m^n \) is primitive recursive.

The class of partial recursive functions (equivalently, Turing-computable functions) is the smallest class extending the primitive recursive functions and closed under the \( \mu \)-operator (minimization): given total recursive \( g : \mathbb{N}^{k+1} \to \mathbb{N} \), the function \( f(\vec{n}) = \mu m\, [g(\vec{n},m)=0] \) (the least \( m \) such that \( g(\vec{n},m)=0 \), if it exists) is partial recursive.

Church–Turing Thesis. Every effectively computable function (in the informal sense) is partial recursive. This is a philosophical thesis, not a theorem, but is universally accepted.

5.2 Gödel Numbering

A Gödel numbering is an effective bijection (or injection) from syntactic objects (terms, formulas, proofs, sequences of formulas) to \( \mathbb{N} \). Fix such a numbering \( \ulcorner \cdot \urcorner \); we write \( \ulcorner \varphi \urcorner \) for the Gödel number of \( \varphi \).

A specific numbering for formulas in signature \( \sigma_{\mathrm{arith}} = \{0, S, +, \cdot\} \): Assign codes to symbols, then encode sequences by prime factorization or by pairing functions. The key properties:

The set of Gödel numbers of well-formed formulas is primitive recursive.
The substitution operation \( (\varphi, x, t) \mapsto \varphi[t/x] \) is primitive recursive (on Gödel numbers).

Worked Example 5.2 (Decidability and undecidability of validity).
Propositional tautologies are decidable. A propositional formula \( \varphi \) with \( n \) distinct propositional variables \( p_1, \ldots, p_n \) is a tautology iff it is true under all \( 2^n \) truth-value assignments. Given \( \varphi \) (as a string), we can: (1) Count the propositional variables. (2) Enumerate all \( 2^n \) assignments. (3) Evaluate \( \varphi \) under each assignment (evaluation is a primitive recursive function of the formula and assignment). (4) Output “tautology” iff all evaluations are true. This is a terminating procedure — hence a total recursive algorithm. The set of Gödel numbers of propositional tautologies is decidable (recursive).
First-order validity is not decidable. Church (1936) and Turing (1936) independently proved that the set
\[ \mathrm{Val} = \{\ulcorner \varphi \urcorner : \varphi \text{ is a first-order sentence that is logically valid}\} \]
is not recursive. The key steps:
\( \mathrm{Val} \) is recursively enumerable (r.e.): the completeness theorem says \( \varphi \) is valid iff \( \varphi \) is provable in the Hilbert calculus. Proofs are finite sequences of formulas, and we can enumerate all proofs, outputting the last line of each valid proof.
\( \mathrm{Val} \) is not recursive: if it were, we could decide any instance of the Halting Problem. Given a Turing machine \( M \) and input \( w \), one can construct a first-order sentence \( \varphi_{M,w} \) that is valid iff \( M \) halts on \( w \). Since the Halting Problem is undecidable, so is \( \mathrm{Val} \).
The contrast between propositional and first-order validity is fundamental: adding quantifiers \( \forall \) and \( \exists \) to propositional logic takes us from a decidable to an undecidable problem. This is one precise sense in which first-order logic is “more powerful” than propositional logic — and also more computationally intractable.

5.2.1 Representability in PA

Peano Arithmetic (PA) is the theory in signature \( \{0, S, +, \cdot\} \) with the following axioms:

\( \forall x\, \lnot (S(x) = 0) \)
\( \forall x \forall y\, (S(x) = S(y) \to x = y) \)
\( \forall x\, (x + 0 = x) \), \( \forall x \forall y\, (x + S(y) = S(x+y)) \)
\( \forall x\, (x \cdot 0 = 0) \), \( \forall x \forall y\, (x \cdot S(y) = x \cdot y + x) \)
Induction schema: for each formula \( \varphi(x) \):

\[ (\varphi(0) \land \forall x\, (\varphi(x) \to \varphi(S(x)))) \to \forall x\, \varphi(x). \]
A relation \( R \subseteq \mathbb{N}^k \) is representable in PA if there is a formula \( \varphi(x_1,\ldots,x_k) \) such that for all \( n_1,\ldots,n_k \in \mathbb{N} \):
If \( R(n_1,\ldots,n_k) \), then \( PA \vdash \varphi(\overline{n_1},\ldots,\overline{n_k}) \).
If \( \lnot R(n_1,\ldots,n_k) \), then \( PA \vdash \lnot\varphi(\overline{n_1},\ldots,\overline{n_k}) \).
Every primitive recursive (indeed, every recursive) relation is representable in PA.
This is proved by showing addition and multiplication are representable, then using the Gödel \( \beta \)-function to encode sequences, allowing the simulation of recursion within PA.
5.3 Decidability and Undecidability
A theory \( T \) is decidable if the set \( \{\ulcorner \varphi \urcorner : T \vdash \varphi\} \) is a recursive set. It is undecidable otherwise.
A set \( S \subseteq \mathbb{N} \) is recursively enumerable (r.e.) (or computably enumerable, c.e.) if it is the domain of a partial recursive function, equivalently, the range of a total recursive function (or empty).
(Halting Problem) The set \( K = \{e : \phi_e(e) \downarrow\} \) (where \( \phi_e \) is the \( e \)-th partial recursive function) is r.e. but not recursive. Equivalently, the halting problem — given a Turing machine and input, does it halt? — is undecidable.
K is r.e.: The universal Turing machine enumerates \( K \) — run the \( e \)-th machine on input \( e \); if it halts, enumerate \( e \).
K is not recursive: Suppose for contradiction that \( K \) is recursive, so \( \overline{K} = \mathbb{N} \setminus K \) is also recursive. Define \( f(e) = \phi_e(e) + 1 \) if \( e \in K \), and \( f(e) = 0 \) if \( e \notin K \). Then \( f \) is total recursive. But \( f = \phi_e \) for some \( e \). If \( e \in K \), then \( \phi_e(e) \downarrow \) so \( f(e) = \phi_e(e)+1 \neq \phi_e(e) \), contradiction. If \( e \notin K \), then \( \phi_e(e) \uparrow \) so \( f(e) = 0 \), but then \( \phi_e(e) \downarrow = 0 \), so \( e \in K \), contradiction. (Alternatively, use the standard diagonalization.)
5.4 Gödel’s First Incompleteness Theorem
(Gödel's First Incompleteness Theorem, 1931) Let \( T \) be a consistent, recursively axiomatizable extension of Robinson arithmetic \( Q \). Then there exists a sentence \( G_T \) (a Gödel sentence for \( T \)) such that:
\( T \not\vdash G_T \)
\( T \not\vdash \lnot G_T \)
In other words, \( T \) is incomplete: it has a statement that is neither provable nor refutable.
5.4.1 The Diagonal Lemma
(Fixed-Point / Diagonal Lemma) Let \( T \) extend Robinson's \( Q \) and let \( \psi(y) \) be any formula with one free variable. Then there exists a sentence \( \gamma \) such that \[ T \vdash \gamma \leftrightarrow \psi(\ulcorner \gamma \urcorner). \]
Define the substitution function \( \mathrm{sub}(m, n) \) to be the Gödel number of the formula obtained by substituting the numeral \( \overline{n} \) for the free variable in the formula with Gödel number \( m \). This is a primitive recursive function, hence representable by a formula \( \delta(x,y,z) \) in \( T \) — specifically, there is a term \( \mathrm{Sub} \) such that \( T \vdash \delta(\overline{m}, \overline{n}, \overline{\mathrm{sub}(m,n)}) \).
Let \( \theta(x) = \psi(\mathrm{sub}(x,x)) \) (substitute \( x \) for the free variable in the formula coded by \( x \), then apply \( \psi \)). Let \( m = \ulcorner \theta(x) \urcorner \). Define \( \gamma = \theta(\overline{m}) \). Then:
\[ \ulcorner \gamma \urcorner = \ulcorner \theta(\overline{m}) \urcorner = \mathrm{sub}(m, m). \]
And \( \gamma = \theta(\overline{m}) = \psi(\mathrm{sub}(\overline{m}, \overline{m})) = \psi(\overline{\mathrm{sub}(m,m)}) = \psi(\ulcorner \gamma \urcorner) \). Thus \( T \vdash \gamma \leftrightarrow \psi(\ulcorner \gamma \urcorner) \).
Worked Example 5.3 (Explicit construction of the Gödel sentence). We trace through the diagonal lemma to construct the Gödel sentence for PA explicitly.
Step 1: Fix a Gödel numbering. Assign each symbol of the language of PA a code: \( \ulcorner 0 \urcorner = 3 \), \( \ulcorner S \urcorner = 5 \), \( \ulcorner + \urcorner = 7 \), \( \ulcorner \cdot \urcorner = 9 \), \( \ulcorner = \urcorner = 11 \), \( \ulcorner \lnot \urcorner = 13 \), \( \ulcorner \land \urcorner = 15 \), \( \ulcorner \forall \urcorner = 17 \), and variables \( v_k \) get code \( 2k + 4 \). A formula is encoded as the product \( 2^{a_1} \cdot 3^{a_2} \cdot 5^{a_3} \cdots p_n^{a_n} \) where \( a_1, \ldots, a_n \) are the codes of its symbols in order (prime factorization encoding). Call this encoding \( \ulcorner \cdot \urcorner \).
Step 2: The provability predicate. A proof in PA is a finite sequence of formulas, each of which is an axiom or follows from earlier lines by modus ponens or generalization. The relation “\( n \) is a PA-proof of the formula with Gödel number \( m \)” — call it \( \mathrm{Proof}(n, m) \) — is primitive recursive (checking each line in a sequence is algorithmic). Since every primitive recursive relation is representable in PA, there is a PA-formula \( \mathrm{Proof}(x, y) \) representing this relation. Define:
\[ \mathrm{Prov}(y) := \exists x\, \mathrm{Proof}(x, y). \]
This is a \( \Sigma^0_1 \) formula: \( \mathrm{Prov}(\overline{n}) \) is provable in PA iff \( n \) is the Gödel number of a PA-theorem.
Step 3: Apply the Diagonal Lemma. Set \( \psi(y) = \lnot \mathrm{Prov}(y) \). By the Diagonal Lemma, there exists a sentence \( G \) such that
\[ \mathrm{PA} \vdash G \leftrightarrow \lnot \mathrm{Prov}(\ulcorner G \urcorner). \]
The sentence \( G \) is constructed as:
\[ G = \theta(\overline{m}), \quad \text{where } \theta(x) = \lnot \mathrm{Prov}(\mathrm{Sub}(x, x)), \]
and \( m = \ulcorner \theta(v_0) \urcorner \) is the Gödel number of the formula \( \theta \) with free variable \( v_0 \), and \( \mathrm{Sub}(x, y) \) is the PA-term representing the substitution function \( \mathrm{sub}(x, y) \).
Step 4: The sentence \( G \) says “I am not provable in PA." Informally: \( G \equiv \) “the formula with Gödel number \( \ulcorner G \urcorner \) (namely, me) has no proof in PA.” This is self-referential — \( G \) refers to its own Gödel number.
Step 5: Why \( G \) is neither provable nor refutable.
If PA \( \vdash G \), then PA proves the formula with Gödel number \( \ulcorner G \urcorner \), so there exists a proof (a natural number \( n \) coding that proof). Hence \( \mathbb{N} \models \mathrm{Prov}(\ulcorner G \urcorner) \), and since PA is sound (every PA-provable arithmetic sentence is true in \( \mathbb{N} \)), PA \( \vdash \mathrm{Prov}(\ulcorner G \urcorner) \). But then PA \( \vdash \lnot G \) (by the biconditional), contradicting consistency.
If PA \( \vdash \lnot G \), then \( \mathbb{N} \models \lnot G \), meaning \( \mathbb{N} \models \mathrm{Prov}(\ulcorner G \urcorner) \), so there exists a natural number \( n \) coding a PA-proof of \( G \). But we just showed PA \( \not\vdash G \) — contradiction.
Therefore PA is incomplete. Moreover, \( G \) is true in \( \mathbb{N} \): since PA \( \not\vdash G \), there is genuinely no proof of \( G \) in PA, so \( \mathbb{N} \models \lnot \mathrm{Prov}(\ulcorner G \urcorner) \), i.e., \( \mathbb{N} \models G \).
5.4.2 Proof of the First Incompleteness Theorem
Let \( \mathrm{Prov}_T(x) \) be the formula representing the provability predicate: \( \mathrm{Prov}_T(\overline{n}) \) holds (in \( \mathbb{N} \), and is provable in \( T \)) iff \( n \) is the Gödel number of a theorem of \( T \). This predicate exists because the set of proofs in \( T \) is primitive recursive (for recursively axiomatized \( T \)).
Apply the Diagonal Lemma with \( \psi(y) = \lnot \mathrm{Prov}_T(y) \):
There exists a sentence \( G \) such that \( T \vdash G \leftrightarrow \lnot \mathrm{Prov}_T(\ulcorner G \urcorner) \).
Intuitively, \( G \) says “I am not provable in \( T \).”
Claim 1: \( T \not\vdash G \). Suppose \( T \vdash G \). Since the proof system is correct (and \( T \) is consistent), \( \mathbb{N} \models \mathrm{Prov}_T(\ulcorner G \urcorner) \), so \( T \vdash \mathrm{Prov}_T(\ulcorner G \urcorner) \). But then \( T \vdash \lnot G \) (from the biconditional), contradicting consistency.
Claim 2: \( T \not\vdash \lnot G \) (assuming \( T \) is \( \omega \)-consistent or using Rosser’s trick). Under \( \omega \)-consistency: if \( T \vdash \lnot G \), then \( \mathbb{N} \models \lnot\lnot \mathrm{Prov}_T(\ulcorner G \urcorner) \), meaning there is a proof of \( G \) in \( T \). But we showed \( T \not\vdash G \), so \( \mathbb{N} \models \forall n\, \lnot \mathrm{Proof}_T(n, \ulcorner G \urcorner) \) (where \( \mathrm{Proof}_T(n,m) \) says \( n \) codes a proof of \( m \)). Then \( T \vdash \lnot \mathrm{Proof}_T(\overline{n}, \ulcorner G \urcorner) \) for each \( n \) (by representability), contradicting \( \omega \)-consistency and \( T \vdash \lnot G \) (which says a proof exists). \(\square\)
Rosser’s improvement (1936): Replace \( \psi(y) = \lnot \mathrm{Prov}_T(y) \) with the Rosser predicate \( \psi_R(y) = \forall p\, (\mathrm{Proof}_T(p,y) \to \exists q < p\, \mathrm{Proof}_T(q, \lnot y)) \), which says “any proof of me is preceded by a shorter proof of my negation”. Then \( T \not\vdash G_R \) and \( T \not\vdash \lnot G_R \) assuming only consistency (not \( \omega \)-consistency).
5.4.3 Semantic Completeness vs. Syntactic Incompleteness
The completeness theorem (Chapter 2) says: everything that is true in all models is provable. The incompleteness theorem says: even in a single fixed intended model (\( \mathbb{N} \)), there are true statements not provable in \( T \). These are not contradictory! The Gödel sentence \( G \) is true in \( \mathbb{N} \) (since \( \mathbb{N} \models \lnot \mathrm{Prov}_T(\ulcorner G \urcorner) \) — truly there is no proof), but it is not provable in \( T \).
5.5 Gödel’s Second Incompleteness Theorem
The consistency statement for \( T \) is the sentence \[ \mathrm{Con}(T) = \lnot \mathrm{Prov}_T(\ulcorner \bot \urcorner). \]
(Gödel's Second Incompleteness Theorem) Let \( T \) be a consistent, recursively axiomatizable extension of \( Q \). Then \[ T \not\vdash \mathrm{Con}(T). \]
We showed in the First Incompleteness Theorem that \( T \not\vdash G \), where \( G \leftrightarrow \lnot \mathrm{Prov}_T(\ulcorner G \urcorner) \). The key is that the argument for "Claim 1" above can be formalized inside \( T \): \[ T \vdash \mathrm{Con}(T) \to G. \]
Specifically, the proof of “if \( T \vdash G \) then \( T \) is inconsistent” can itself be carried out in \( T \) (using the Hilbert–Bernays provability conditions, a.k.a. the Löb conditions):
(D1) If \( T \vdash \varphi \), then \( T \vdash \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \).
(D2) \( T \vdash \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \land \mathrm{Prov}_T(\ulcorner \varphi \to \psi \urcorner) \to \mathrm{Prov}_T(\ulcorner \psi \urcorner) \).
(D3) \( T \vdash \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \to \mathrm{Prov}_T(\ulcorner \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \urcorner) \).
From these: \( T \vdash \mathrm{Con}(T) \to G \). If \( T \vdash \mathrm{Con}(T) \), then \( T \vdash G \), contradicting the First Incompleteness Theorem.
Informal proof sketch via the First Incompleteness Theorem. The argument can be understood informally as follows. The First Incompleteness Theorem shows that the reasoning "if \( T \) is consistent, then \( T \not\vdash G \)" is itself formalizable in \( T \). That is, \( T \) can prove the conditional \( \mathrm{Con}(T) \to G \). Now suppose for contradiction that \( T \vdash \mathrm{Con}(T) \). Combining with \( T \vdash \mathrm{Con}(T) \to G \), modus ponens gives \( T \vdash G \). But the First Incompleteness Theorem says \( T \not\vdash G \) (assuming \( T \) is consistent). Contradiction. Therefore \( T \not\vdash \mathrm{Con}(T) \).
The crucial subtlety is that the implication \( \mathrm{Con}(T) \to G \) must be provable inside \( T \) (not just true in the metatheory). This is what the Löb conditions (D1)–(D3) ensure: they allow the informal soundness argument to be replicated as a formal PA-derivation.
Philosophical significance. Hilbert’s program sought a finitary consistency proof for all of mathematics. The Second Incompleteness Theorem shows that no consistent system powerful enough to encode arithmetic can prove its own consistency. In particular, ZFC cannot prove \( \mathrm{Con}(\text{ZFC}) \) (assuming ZFC is consistent). A stronger system can prove the consistency of a weaker one — but one always needs to “stand outside” the system.
5.5.1 Löb’s Theorem
(Löb's Theorem, 1955) For a theory \( T \) satisfying the Hilbert–Bernays conditions: if \( T \vdash \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \to \varphi \), then \( T \vdash \varphi \).
Apply the Diagonal Lemma to \( \psi(y) = \mathrm{Prov}_T(y) \to \varphi \): get \( \lambda \) with \( T \vdash \lambda \leftrightarrow (\mathrm{Prov}_T(\ulcorner \lambda \urcorner) \to \varphi) \).
Assuming \( T \vdash \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \to \varphi \):
By (D1) applied to \( T \vdash \lambda \leftrightarrow \ldots \): \( T \vdash \mathrm{Prov}_T(\ulcorner\lambda\urcorner) \).
By (D2) and (D3): \( T \vdash \mathrm{Prov}_T(\ulcorner \mathrm{Prov}_T(\ulcorner\lambda\urcorner) \urcorner) \).
Then \( T \vdash \mathrm{Prov}_T(\ulcorner \lambda \urcorner) \to \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \) (since \( T \vdash \lambda \to (\mathrm{Prov}_T(\ulcorner\lambda\urcorner) \to \varphi) \)).
Combining: \( T \vdash \mathrm{Prov}_T(\ulcorner\lambda\urcorner) \to \varphi \), hence \( T \vdash \lambda \).
By (D1): \( T \vdash \mathrm{Prov}_T(\ulcorner\lambda\urcorner) \), and then \( T \vdash \varphi \).
Corollary. The Gödel sentence \( G \) is not provable in \( T \): if it were, then \( T \vdash \mathrm{Prov}_T(\ulcorner G \urcorner) \to G \), but by Löb’s theorem this would give \( T \vdash G \) and hence inconsistency.
5.6 Tarski’s Undefinability of Truth
(Tarski's Undefinability Theorem) Let \( T \) be any consistent extension of \( Q \). There is no formula \( \mathrm{True}(x) \) in the language of arithmetic such that for all sentences \( \varphi \): \[ T \vdash \mathrm{True}(\ulcorner \varphi \urcorner) \leftrightarrow \varphi. \]
In other words, arithmetic truth is not arithmetically definable.
Suppose such \( \mathrm{True}(x) \) existed. Apply the Diagonal Lemma with \( \psi(y) = \lnot \mathrm{True}(y) \): get \( \lambda \) with \( T \vdash \lambda \leftrightarrow \lnot \mathrm{True}(\ulcorner \lambda \urcorner) \). If \( \mathrm{True}(\ulcorner \lambda \urcorner) \) is provable, then \( \lambda \) is provable (from the biconditional), so \( \lnot \lambda \) is provable, contradiction. If \( \lnot \mathrm{True}(\ulcorner \lambda \urcorner) \), then \( \lambda \) holds in \( \mathbb{N} \), so \( \mathrm{True}(\ulcorner \lambda \urcorner) \) holds (by assumption), contradiction. Both cases give contradictions, so no such formula exists.
This is the formal version of the Liar Paradox (“This sentence is false”). Tarski’s theorem is in fact stronger than Gödel’s: it shows not just that \( T \) is incomplete, but that truth in \( \mathbb{N} \) is not even definable in the language of arithmetic, let alone provable.
Chapter 6: Deeper Model Theory
6.1 Ultraproducts and an Alternative Proof of Compactness
A filter \( \mathcal{F} \) on a set \( I \) is a collection of subsets of \( I \) such that:
\( I \in \mathcal{F} \) and \( \emptyset \notin \mathcal{F} \).
If \( A \in \mathcal{F} \) and \( A \subseteq B \), then \( B \in \mathcal{F} \).
If \( A, B \in \mathcal{F} \), then \( A \cap B \in \mathcal{F} \).
A filter \( \mathcal{U} \) is an ultrafilter if additionally for every \( A \subseteq I \): either \( A \in \mathcal{U} \) or \( I \setminus A \in \mathcal{U} \).
Let \( \{(\mathfrak{A}_i)\}_{i \in I} \) be a family of \( \sigma \)-structures and \( \mathcal{U} \) an ultrafilter on \( I \). The ultraproduct \( \prod_{\mathcal{U}} \mathfrak{A}_i \) has:
Universe: \( \left(\prod_{i \in I} A_i\right) / \sim_{\mathcal{U}} \) where \( (a_i) \sim_{\mathcal{U}} (b_i) \) iff \( \{i : a_i = b_i\} \in \mathcal{U} \).
Functions and relations interpreted coordinatewise modulo \( \mathcal{U} \).
If all \( \mathfrak{A}_i = \mathfrak{A} \), this is the ultrapower \( \mathfrak{A}^I / \mathcal{U} \).
(Łoś's Theorem) For any formula \( \varphi(x_1,\ldots,x_n) \) and any elements \( [(a^1_i)], \ldots, [(a^n_i)] \) of the ultraproduct: \[ \prod_{\mathcal{U}} \mathfrak{A}_i \models \varphi\left[[(a^1_i)],\ldots,[(a^n_i)]\right] \iff \{i \in I : \mathfrak{A}_i \models \varphi[a^1_i,\ldots,a^n_i]\} \in \mathcal{U}. \]
By induction on \( \varphi \). Atomic: immediate from the definitions. Negation: \( \{i : \mathfrak{A}_i \models \lnot \varphi[\vec{a}_i]\} = I \setminus \{i : \mathfrak{A}_i \models \varphi[\vec{a}_i]\} \), and \( \mathcal{U} \) is an ultrafilter so exactly one of a set and its complement is in \( \mathcal{U} \). Conjunction: \( \{i : \mathfrak{A}_i \models (\varphi \land \psi)[\vec{a}_i]\} = \{i : \mathfrak{A}_i \models \varphi[\vec{a}_i]\} \cap \{i : \mathfrak{A}_i \models \psi[\vec{a}_i]\} \). Existential: if \( \{i : \mathfrak{A}_i \models \exists x\, \varphi[x,\vec{a}_i]\} \in \mathcal{U} \), use AC to pick witnesses \( b_i \in A_i \) for \( i \) in this set; then \( [(b_i)] \) witnesses the existential in the ultraproduct.
Proof of Compactness via Ultraproducts. Let \( \Phi = \{\varphi_j : j \in J\} \) be finitely satisfiable. Index models: for each finite \( F \subseteq J \), let \( \mathfrak{A}_F \models \Phi_F = \{\varphi_j : j \in F\} \). Set \( I = \{F \subseteq J : F \text{ finite}\} \). For each \( j \in J \), let \( X_j = \{F \in I : j \in F\} \). The family \( \{X_j : j \in J\} \) has the finite intersection property (any finite intersection \( X_{j_1} \cap \cdots \cap X_{j_n} \supseteq \{j_1,\ldots,j_n\} \neq \emptyset \)), so it extends to an ultrafilter \( \mathcal{U} \) on \( I \). For each \( j \in J \): \( \{F : \mathfrak{A}_F \models \varphi_j\} \supseteq X_j \in \mathcal{U} \). By Łoś, \( \prod_\mathcal{U} \mathfrak{A}_F \models \varphi_j \) for all \( j \). Hence \( \prod_\mathcal{U} \mathfrak{A}_F \models \Phi \).
6.2 The Omitting Types Theorem
(Omitting Types Theorem) Let \( T \) be a complete theory over a countable language, and let \( \{p_n(x_n)\}_{n \in \omega} \) be a countable collection of non-isolated types (types \( p \in S_n(T) \) that are not generated by a single formula). Then there exists a countable model of \( T \) omitting all \( p_n \).
A type is isolated by \( \varphi \) if \( T \vdash \varphi \to \psi \) for all \( \psi \in p \). The theorem says that types which are not forced must be avoidable.
6.3 Categoricity and Morley’s Theorem
A theory \( T \) is \( \kappa \)-categorical if it has exactly one model of cardinality \( \kappa \) (up to isomorphism).
(Morley's Theorem, 1965) If a countable theory \( T \) is \( \kappa \)-categorical for some uncountable cardinal \( \kappa \), then it is \( \kappa \)-categorical for every uncountable cardinal \( \kappa \).
This deep theorem — Morley’s first major result in model theory — launched the program of stability theory and classification theory, developed by Shelah in the 1970s. The key tools are:
Strongly minimal sets: a definable set \( D \) in \( M \models T \) is strongly minimal if it is infinite and for every definable subset \( X \subseteq D \), either \( X \) or \( D \setminus X \) is finite.
Algebraic independence and Morley rank: a notion of dimension analogous to transcendence degree in field theory.
Uncountably categorical theories have a strongly minimal set and are “controlled” by this set.
6.4 Stability and Classification
Shelah’s monumental classification program addresses: for which theories \( T \) and cardinals \( \kappa \) is the number of models of \( T \) of cardinality \( \kappa \) small?
A complete theory \( T \) is stable if for every cardinal \( \kappa \geq |T| + \aleph_0 \) and every \( M \models T \) with \( |M| = \kappa \): \( |S_n(M)| = \kappa \) for all \( n \). (Many equivalent characterizations exist.)
Stability spectrum: a theory is:
\( \omega \)-stable (totally transcendental) if \( |S_1(M)| \leq \aleph_0 \) for all \( M \models T \) with \( |M| = \aleph_0 \). Examples: ACF, the theory of \( (\mathbb{Q}, +) \).
Superstable if stable in all \( \kappa \geq 2^{|T|} \).
Stable (but not superstable): DLO is not stable!
Unstable (has the order property): real closed fields, linear orders.
Chapter 7: Connections and Advanced Topics
7.1 Zermelo–Fraenkel Set Theory and Gödel’s Constructible Universe
Gödel’s L (the Constructible Universe) is built by:
\[ L_0 = \emptyset, \quad L_{\alpha+1} = \mathrm{Def}(L_\alpha) = \{X \subseteq L_\alpha : X \text{ is first-order definable over } (L_\alpha, \in)\}, \]\[ L_\lambda = \bigcup_{\alpha < \lambda} L_\alpha \text{ for limit } \lambda, \quad L = \bigcup_\alpha L_\alpha. \]
(Gödel, 1938) \( L \) is a model of ZFC. Moreover, in \( L \):
The Axiom of Choice holds (there is a definable global well-ordering of \( L \)).
The Generalized Continuum Hypothesis (GCH) holds: \( 2^{\aleph_\alpha} = \aleph_{\alpha+1} \) for all \( \alpha \).
In particular, if ZF is consistent, then so is ZFC + GCH.
Cohen’s Forcing (1963). Cohen invented the method of forcing to build models of ZFC in which CH fails. One starts with a countable transitive model \( M \) of ZFC and adds a generic set \( G \subseteq \omega \times \omega_2 \) to get an extension \( M[G] \) in which \( 2^{\aleph_0} \geq \aleph_2 \). The key is that the forcing relation “\( p \Vdash \varphi \)” (a condition \( p \) forces \( \varphi \)) is definable in \( M \), so truth in \( M[G] \) can be controlled from within \( M \). This established the independence of CH from ZFC.
7.2 Large Cardinals
The consistency strength of set-theoretic principles is measured by large cardinals:
Inaccessible cardinals: \( \kappa > \omega \) is inaccessible if it is regular and for all \( \lambda < \kappa \), \( 2^\lambda < \kappa \). ZFC cannot prove inaccessibles exist.
Measurable cardinals: \( \kappa \) is measurable if there is a \( \kappa \)-complete ultrafilter on \( \kappa \). The ultrapower of \( V \) by this ultrafilter gives an elementary embedding \( j : V \to M \) with critical point \( \kappa \).
Woodin cardinals, supercompact cardinals, etc.: the large cardinal hierarchy extends far into the transfinite and provides consistency strength for many set-theoretic propositions.
7.3 Reflection and the Relationship Between Syntax and Semantics
(Reflection Principle) For every formula \( \varphi(x_1,\ldots,x_n) \) and every ordinal \( \alpha \), there exists \( \beta > \alpha \) such that for all \( a_1,\ldots,a_n \in V_\beta \): \[ V_\beta \models \varphi[a_1,\ldots,a_n] \iff V \models \varphi[a_1,\ldots,a_n]. \]
This shows that the axioms of ZFC cannot be finite (since any finite list of axioms has a set-sized model, contradicting the incompleteness theorem — actually, this gives another proof that ZFC cannot prove its own consistency).
7.4 Arithmetical Hierarchy
The arithmetical hierarchy classifies subsets of \( \mathbb{N} \) by the complexity of their definitions:
\( \Sigma^0_0 = \Pi^0_0 = \Delta^0_0 \): decidable (recursive) sets, defined by bounded quantification.
\( \Sigma^0_1 \): r.e. sets, defined by \( \exists x_1\, \forall x_2\, \ldots \) (one unbounded existential at front).
\( \Pi^0_1 \): co-r.e. sets, defined by \( \forall x_1\, \ldots \).
\( \Sigma^0_{n+1} \): sets defined by \( \exists \vec{x}\, \varphi \) where \( \varphi \) is \( \Pi^0_n \).
\( \Pi^0_{n+1} \): sets defined by \( \forall \vec{x}\, \varphi \) where \( \varphi \) is \( \Sigma^0_n \).
\( \Delta^0_n = \Sigma^0_n \cap \Pi^0_n \).
(Post's Theorem) A set \( A \subseteq \mathbb{N} \) is \( \Sigma^0_n \) if and only if it is computably enumerable relative to \( \emptyset^{(n-1)} \) (the \( (n-1) \)-th Turing jump). In particular, \( \mathrm{Th}(\mathbb{N}) \) is \( \Pi^0_1 \)-hard but is not arithmetical — truth in \( \mathbb{N} \) is not even at any finite level of the hierarchy.
This last fact is essentially Tarski’s undefinability theorem expressed in the language of computability.
Chapter 8: Summary of Major Theorems and Connections
8.1 The Four Pillars — A Unified View
The four main areas of PMATH 432 are deeply interconnected:
First-order logic provides the formal language (Chapter 1). Proof theory (Chapter 2) shows that syntactic deduction (\( \vdash \)) matches semantic entailment (\( \models \)) — the completeness theorem. Model theory (Chapters 3 and 6) explores the space of all structures satisfying a theory: how many are there, how do they relate, what can be expressed? Set theory (Chapter 4) provides the foundational framework and exhibits independence results. Computability (Chapter 5) bounds what can be proved or even defined.
The central theorems and their dependencies:
Theorem Requires Consequence
Completeness (Gödel 1930) Lindenbaum + Henkin Compactness
Compactness Completeness or Ultraproducts Löwenheim–Skolem (upward)
Löwenheim–Skolem (downward) Skolem functions Infinite models of any theory
Vaught’s test Löwenheim–Skolem Completeness of DLO, ACF
Gödel numbering Primitive recursion Representability in PA
Diagonal Lemma Representability Gödel sentences, Tarski
First Incompleteness Diagonal Lemma Second Incompleteness
Second Incompleteness Hilbert–Bernays conditions Limits of formal provability
8.2 Key Examples Revisited
Dense linear orders without endpoints (DLO):
Axioms: linear order + density + no endpoints.
Model: \( (\mathbb{Q}, <) \) — the unique countable model up to isomorphism.
Complete by Vaught’s test (\( \aleph_0 \)-categorical).
Admits quantifier elimination (definable sets are Boolean combinations of intervals).
Unstable (the order relation witnesses the order property).
Algebraically closed fields of characteristic \( p \) (\( \mathrm{ACF}_p \)):
Complete, \( \kappa \)-categorical for all uncountable \( \kappa \).
Admits quantifier elimination (definable sets = constructible sets).
Strongly minimal (algebraic closure provides a well-behaved pregeometry).
Stable (in fact \( \omega \)-stable).
Peano Arithmetic (PA):
Intended model: \( (\mathbb{N}, 0, S, +, \cdot) \).
Has many non-isomorphic countable models (non-standard models).
Incomplete (Gödel) and undecidable.
Not finitely axiomatizable.
ZFC:
Axiomatizes set theory. Has models of all infinite cardinalities (Löwenheim–Skolem), including countable ones (Skolem’s paradox).
Cannot prove its own consistency (Second Incompleteness Theorem).
Independent statements: CH, large cardinal axioms, many others.
8.3 Proof-Theoretic Ordinals and Gentzen’s Consistency Proof
While PA cannot prove \( \mathrm{Con}(\text{PA}) \), Gentzen (1936) gave a finitistic proof of PA’s consistency using transfinite induction up to the ordinal \( \epsilon_0 = \omega^{\omega^{\omega^{\cdots}}} \) (the least ordinal \( \alpha \) with \( \omega^\alpha = \alpha \)). This shows that the “proof-theoretic ordinal” of PA is \( \epsilon_0 \). The proof-theoretic ordinals of stronger theories are correspondingly larger.
Chapter 9: Selected Worked Problems
9.1 Proving Elementary Equivalence
Problem. Show that \( (\mathbb{Q}, <) \equiv (\mathbb{Q}+\mathbb{Q}, <) \) where \( \mathbb{Q} + \mathbb{Q} \) denotes two disjoint copies of \( \mathbb{Q} \), ordered so all elements of the first copy precede all elements of the second.
Solution. Both structures are dense linear orders without endpoints: \( \mathbb{Q} + \mathbb{Q} \) is a dense linear order (between any two points in the same copy or across copies there are intermediate points) without endpoints. Since \( \mathrm{DLO} \) is complete, all its models are elementarily equivalent. So \( (\mathbb{Q}, <) \equiv (\mathbb{Q}+\mathbb{Q}, <) \). Note they are not isomorphic: \( \mathbb{Q} \) has countably many points and is connected in the Dedekind sense, while \( \mathbb{Q}+\mathbb{Q} \) has a “gap”. But first-order logic cannot detect such a gap.
9.2 Applying Compactness
Problem. Let \( \{G_n : n \geq 1\} \) be groups where each \( G_n \) has an element of order exactly \( n \). Show that there is a group \( G \) in which every element has infinite order.
Solution. Let \( T_{\mathrm{grp}} \) be the group axioms and for each \( n \geq 2 \), let \( \varphi_n = \forall x\, (x^n \neq e) \) (no element of order dividing \( n \)). Let \( \Phi = T_{\mathrm{grp}} \cup \{\varphi_n : n \geq 2\} \). Given any finite subset \( \Phi_0 \), the largest \( n \) appearing is some \( N \). Take the free abelian group \( \mathbb{Z} \); every nonidentity element has infinite order, so \( \mathbb{Z} \models \Phi_0 \). By compactness, \( \Phi \) is satisfiable. Any model \( G \models \Phi \) has no element of any finite order \( \geq 2 \), and since \( \varphi_1 \) (no element of order 1 except the identity) follows from group axioms, every nonidentity element has infinite order.
9.3 Non-Standard Analysis
Problem. Using compactness, show the existence of a proper extension \( {}^*\mathbb{R} \) of \( \mathbb{R} \) in the language of ordered fields with a new constant \( c \) satisfying \( c > 0 \) and \( c < \overline{r} \) for every \( r > 0 \) in \( \mathbb{R} \).
Solution. Let \( T = \mathrm{Th}(\mathbb{R}) \cup \{c > 0\} \cup \{c < \overline{r} : r \in \mathbb{R}, r > 0\} \). For any finite subset, take \( r_{\min} = \min\{r : c < \overline{r} \in \Phi_0\} > 0 \), and interpret \( c \) as \( r_{\min}/2 \). By compactness, \( T \) has a model \( {}^*\mathbb{R} \). The element \( c^{{}^*\mathbb{R}} \) is an infinitesimal — positive but smaller than every positive real. Since \( {}^*\mathbb{R} \models \mathrm{Th}(\mathbb{R}) \), it satisfies all first-order truths about \( \mathbb{R} \) (Transfer Principle), providing the foundation for Abraham Robinson’s non-standard analysis.
9.4 A Fixed Point for the Diagonal Lemma
Problem. Exhibit concretely the Gödel sentence for the theory \( T = \mathrm{PA} \).
Solution. We outline the construction:
Fix a Gödel numbering of PA-formulas. Let \( \mathrm{Prov}(x) \) be the PA-formula representing provability (a \( \Sigma^0_1 \) formula).
Consider \( \psi(y) = \lnot \mathrm{Prov}(y) \). Let \( \theta(x) = \lnot \mathrm{Prov}(\mathrm{Sub}(x,x)) \) where \( \mathrm{Sub}(x,x) \) represents the substitution operation on Gödel numbers.
Let \( m = \ulcorner \theta(v_0) \urcorner \) (the Gödel number of \( \theta \) with free variable \( v_0 \)).
The Gödel sentence is \( G = \theta(\overline{m}) = \lnot \mathrm{Prov}(\overline{\mathrm{sub}(m,m)}) \).
Since \( \mathrm{sub}(m,m) = \ulcorner G \urcorner \), we have \( G = \lnot \mathrm{Prov}(\ulcorner G \urcorner) \).
\( G \) is true in \( \mathbb{N} \): if PA \( \vdash G \), then \( \mathbb{N} \models \mathrm{Prov}(\ulcorner G \urcorner) \), giving \( \mathbb{N} \models \lnot G \), contradiction. So PA \( \not\vdash G \), and this is exactly what \( G \) says.
9.5 Uncountable Models of Complete Theories
Problem. Show that every complete theory with an infinite model has models of every infinite cardinality.
Solution. Let \( T \) be a complete theory with infinite model \( \mathfrak{A} \). Since \( \mathfrak{A} \) is infinite, it has cardinality \( \kappa \geq \aleph_0 \). By the upward Löwenheim–Skolem theorem, for any \( \lambda > \kappa \), \( T \) has a model of cardinality \( \lambda \). (The argument: add \( \lambda \) new constants, the resulting theory is satisfiable by compactness and the infiniteness of \( \mathfrak{A} \).) For \( \aleph_0 \leq \lambda < \kappa \): by the downward Löwenheim–Skolem theorem applied to \( \mathfrak{A} \), there is an elementary substructure of cardinality \( \lambda \) (choose any \( \lambda \) elements as a base set and take its Skolem hull). This elementary substructure also models \( T \).
Appendix A: Background on Orders and Algebra
A.1 Well-Orderings and Zorn’s Lemma
(Zorn's Lemma) If \( (P, \leq) \) is a nonempty partially ordered set in which every chain (totally ordered subset) has an upper bound in \( P \), then \( P \) has a maximal element.
Zorn’s Lemma is equivalent to the Axiom of Choice and to the Well-Ordering Theorem: every set can be well-ordered. It is used critically in the proof of the Completeness Theorem (extending consistent sets to maximally consistent sets).
A.2 Boolean Algebras and Stone Duality
A Boolean algebra is a set \( B \) with operations \( \land, \lor, \lnot \) and constants \( 0, 1 \) satisfying the usual Boolean laws. A Stone space is a compact, totally disconnected Hausdorff topological space.
(Stone Representation Theorem) Every Boolean algebra is isomorphic to the algebra of clopen (closed-and-open) sets of a compact totally disconnected Hausdorff space. Conversely, every such space arises this way.
Application. The Lindenbaum–Tarski algebra of a theory \( T \) (formulas modulo \( T \)-provable equivalence) is a Boolean algebra. The corresponding Stone space is the type space \( S_0(T) \) — the space of complete types (= complete theories extending \( T \)). A type is isolated iff the corresponding point is isolated in the Stone space. The Omitting Types Theorem says that non-isolated types need not be realized in the prime model.
A.3 Ramsey Theory and 0-1 Laws
(Fagin's 0-1 Law, 1976) For any first-order sentence \( \varphi \) in the language of graphs, as \( n \to \infty \): \[ \Pr[\text{random graph on } n \text{ vertices satisfies } \varphi] \to 0 \text{ or } 1. \]
Moreover, the sentences with limiting probability 1 form a complete theory — the theory of the random (Rado) graph.
This connects model theory with combinatorics and probability: the Fraïssé limit of the class of all finite graphs (the Rado graph) is the almost-sure limit of the random graph model.
Appendix B: Quick Reference
B.1 Key Definitions at a Glance
Term Definition
\( \sigma \)-structure Universe + interpretations of all symbols
\( \mathfrak{A} \models \varphi \) \( \varphi \) is true in \( \mathfrak{A} \) (for sentences)
\( T \models \varphi \) \( \varphi \) is true in all models of \( T \)
\( T \vdash \varphi \) \( \varphi \) is derivable from \( T \)
Consistent \( T \not\vdash \bot \)
Complete (theory) For all \( \varphi \): \( T \vdash \varphi \) or \( T \vdash \lnot\varphi \)
Elementary substructure \( \mathfrak{A} \preccurlyeq \mathfrak{B} \): same truth for all formulas
Compactness \( T \) satisfiable iff every finite subset satisfiable
\( \kappa \)-categorical Unique model of size \( \kappa \) (up to isomorphism)
Type \( p(\vec{x}) \) Maximal consistent set of formulas over a base
Ordinal Transitive set well-ordered by \( \in \)
Cardinal Ordinal not in bijection with any smaller ordinal
B.2 Standard Theories and Their Properties
Theory Language Complete? Cat.? QE? Stable?
DLO \( < \) Yes \( \aleph_0 \) only Yes No (unstable)
ACF\(_0\) \( +,\cdot,0,1 \) Yes All uncountable Yes Yes (\( \omega \)-stable)
ACF\(_p\) \( +,\cdot,0,1 \) Yes All uncountable Yes Yes (\( \omega \)-stable)
RCF \( +,\cdot,0,1,< \) Yes No (many sizes) Yes No
PA \( 0,S,+,\cdot \) No No No N/A
ZFC \( \in \) No No No N/A
B.3 The Gödel Sentences — Summary
Let \( T \) be a consistent recursively axiomatizable extension of \( Q \).
\( G_T \): the Gödel sentence, with \( T \vdash G_T \leftrightarrow \lnot \mathrm{Prov}_T(\ulcorner G_T \urcorner) \).
\( T \not\vdash G_T \) and \( T \not\vdash \lnot G_T \) (First Incompleteness).
\( \mathbb{N} \models G_T \) (the standard model sees \( G_T \) as true).
\( \mathrm{Con}(T) = \lnot \mathrm{Prov}_T(\ulcorner \bot \urcorner) \): consistency statement.
\( T \not\vdash \mathrm{Con}(T) \) (Second Incompleteness).
\( T + \mathrm{Con}(T) \) is a strictly stronger consistent theory.
Relationship: \( T \vdash G_T \leftrightarrow \mathrm{Con}(T) \) (the Gödel sentence is equivalent to consistency over \( T \), essentially).
Appendix C: Computability Theory Primer
This appendix develops the computability-theoretic background that underpins Chapters 5 and 7 in more detail. The reader who has seen the informal account in Chapter 5 will benefit from the formal definitions and classical theorems collected here.
C.1 Primitive Recursive Functions — Complete Development
The primitive recursive functions constitute the “safe” core of computability: every primitive recursive function terminates, and all basic arithmetic operations lie within this class. They are built from three base functions by two closure operations.
Base functions:
Zero function: \( Z : \mathbb{N} \to \mathbb{N} \), \( Z(n) = 0 \).
Successor function: \( S : \mathbb{N} \to \mathbb{N} \), \( S(n) = n + 1 \).
Projection functions: \( \pi^k_i : \mathbb{N}^k \to \mathbb{N} \), \( \pi^k_i(n_1, \ldots, n_k) = n_i \) for \( 1 \leq i \leq k \).
Closure operations:
Composition: If \( g : \mathbb{N}^m \to \mathbb{N} \) and \( h_1, \ldots, h_m : \mathbb{N}^k \to \mathbb{N} \) are primitive recursive, then so is \( f(\vec{n}) = g(h_1(\vec{n}), \ldots, h_m(\vec{n})) \).
Primitive recursion: If \( g : \mathbb{N}^k \to \mathbb{N} \) and \( h : \mathbb{N}^{k+2} \to \mathbb{N} \) are primitive recursive, then so is \( f \) defined by:
\[ f(\vec{n}, 0) = g(\vec{n}), \qquad f(\vec{n}, m+1) = h(\vec{n}, m, f(\vec{n}, m)). \]
Primitive recursive examples in detail.
Predecessor: Define \( \mathrm{pred}(0) = 0 \) and \( \mathrm{pred}(n+1) = n \). In the recursion schema: \( g = Z \) (the zero function on zero arguments, i.e., just the constant 0) and \( h(n, p) = \pi^2_1(n, p) = n \). This is primitive recursive.
Monus (cut-off subtraction): \( m \dot{-} n = \max(m - n, 0) \). Define:
\[ m \dot{-} 0 = m, \qquad m \dot{-} (n+1) = \mathrm{pred}(m \dot{-} n). \]
Both steps use already-primitive-recursive functions, so \( \dot{-} \) is primitive recursive.
Absolute value of difference: \( |m - n| = (m \dot{-} n) + (n \dot{-} m) \). A composition of primitive recursive functions, hence primitive recursive.
Characteristic function of equality: Define \( \mathrm{eq}(m, n) = 1 \) if \( m = n \), else \( 0 \). Note \( m = n \iff |m - n| = 0 \). Define \( \mathrm{iszero}(k) = 1 \dot{-} \min(k, 1) \). Then \( \mathrm{eq}(m, n) = \mathrm{iszero}(|m - n|) \). This is primitive recursive.
Bounded search: If \( R(\vec{n}, m) \) is a primitive recursive relation (its characteristic function is primitive recursive), then the function \( \mu m \leq k\, [R(\vec{n}, m)] \) (least \( m \leq k \) with \( R(\vec{n}, m) \), or \( k+1 \) if none exists) is primitive recursive. Unbounded search is what requires the \( \mu \)-operator and takes us beyond primitive recursion.
Every primitive recursive function is total — it terminates on all inputs. The Ackermann function \( A(m, n) \), defined by \( A(0, n) = n+1 \), \( A(m+1, 0) = A(m, 1) \), \( A(m+1, n+1) = A(m, A(m+1, n)) \), grows faster than any primitive recursive function, yet is still total recursive (Turing-computable). This shows that primitive recursion does not capture all of computable mathematics — only the "elementary" part.
C.2 Partial Recursive Functions and the \( \mu \)-Operator
The partial recursive functions (equivalently, Turing-computable partial functions) are the smallest class containing all primitive recursive functions and closed under the unbounded \( \mu \)-operator: given \( g : \mathbb{N}^{k+1} \to \mathbb{N} \) total recursive, the function \[ f(\vec{n}) = \mu m\, [g(\vec{n}, m) = 0] \]
(the least \( m \) such that \( g(\vec{n}, m) = 0 \), if such \( m \) exists; undefined otherwise) is partial recursive.
The \( \mu \)-operator introduces potential non-termination: if \( g(\vec{n}, m) \neq 0 \) for all \( m \), then \( f(\vec{n}) \) is undefined. This is the key distinction between primitive and general recursive functions.
(Kleene's Normal Form Theorem) There is a primitive recursive predicate \( T(e, \vec{n}, s) \) (the Kleene T-predicate) and a primitive recursive function \( U(s) \) (the result extraction function) such that for every partial recursive function \( f : \mathbb{N}^k \to \mathbb{N} \): \[ f(\vec{n}) = U(\mu s\, [T(e, \vec{n}, s)]) \]
for some fixed index \( e \in \mathbb{N} \). Here \( T(e, \vec{n}, s) \) means “s codes a halting computation of the e-th Turing machine on input \( \vec{n} \).”
The Normal Form Theorem is a fundamental uniformization result: every partial recursive function has a “standard” representation using a single application of \( \mu \) to a primitive recursive predicate. This allows the set of partial recursive functions to be indexed by natural numbers (the index \( e \) is an “program” or “Gödel number” for \( f \)).
Consequences:
Every partial recursive function \( f \) has an index \( e \) with \( f = \phi_e \) (the \( e \)-th partial recursive function under the standard indexing).
The function \( (e, \vec{n}) \mapsto \phi_e(\vec{n}) \) is partial recursive (computable by the universal Turing machine).
There are only countably many partial recursive functions (indexed by \( \mathbb{N} \)).
C.3 The \( s\text{-}m\text{-}n \) Theorem (Parameter Theorem)
(Kleene's \( s\text{-}m\text{-}n \) Theorem) For each \( m, n \geq 1 \), there is a primitive recursive function \( s^m_n : \mathbb{N}^{m+1} \to \mathbb{N} \) such that for all \( e, x_1, \ldots, x_m, y_1, \ldots, y_n \in \mathbb{N} \): \[ \phi_{s^m_n(e, x_1, \ldots, x_m)}(y_1, \ldots, y_n) = \phi_e(x_1, \ldots, x_m, y_1, \ldots, y_n). \]
Informally: given a program \( e \) for an \( (m+n) \)-ary function, and given \( m \) fixed inputs \( x_1, \ldots, x_m \), the \( s\text{-}m\text{-}n \) theorem produces the index of an \( n \)-ary program that runs \( e \) with the first \( m \) inputs already “baked in.” This is partial function application or currying — a fundamental operation in programming languages.
Example. Fix \( e \) and \( x \). The function \( y \mapsto \phi_e(x, y) \) is a partial recursive function of \( y \) alone. By \( s\text{-}1\text{-}1 \), its index is \( s^1_1(e, x) \), computable from \( e \) and \( x \) by a primitive recursive procedure. This is the theoretical foundation for closures in functional programming.
Application to the Fixed-Point (Recursion) Theorem. The \( s\text{-}m\text{-}n \) theorem implies the Kleene Recursion Theorem: for every total recursive function \( f \), there exists \( n \) with \( \phi_{f(n)} = \phi_n \). That is, every computable transformation of programs has a “fixed point” program. The Diagonal Lemma in logic is the syntactic analogue of this theorem.
C.4 Rice’s Theorem
One of the most powerful undecidability results is Rice’s theorem, which says that no non-trivial behavioral property of programs is decidable.
A set \( A \subseteq \mathbb{N} \) is an index set if: whenever \( \phi_e = \phi_{e'} \) (i.e., programs \( e \) and \( e' \) compute the same function), we have \( e \in A \iff e' \in A \). Index sets describe properties of the behavior (input-output function) of programs, not of their syntactic code.
(Rice's Theorem, 1953) Let \( A \) be an index set. Then \( A \) is recursive if and only if \( A = \emptyset \) or \( A = \mathbb{N} \). In other words, every non-trivial index set is undecidable.
Suppose \( A \) is a non-trivial index set, so there exist \( a \in A \) and \( b \notin A \). Let \( \phi_a \) be a program in \( A \) and \( \phi_b \notin A \). We reduce the Halting Problem to membership in \( A \).
Given any program index \( e \) and input \( n \), define the function:
\[ f_{e,n}(x) = \begin{cases} \phi_a(x) & \text{if } \phi_e(n) \downarrow, \\ \text{undefined} & \text{otherwise.} \end{cases} \]
This \( f_{e,n} \) is partial recursive: it runs \( \phi_e(n) \) and, if it halts, then outputs \( \phi_a(x) \). By the \( s\text{-}m\text{-}n \) theorem, the index \( g(e, n) \) of \( f_{e,n} \) is computable from \( e, n \).
Now: if \( \phi_e(n) \downarrow \), then \( f_{e,n} = \phi_a \), so \( g(e,n) \in A \). If \( \phi_e(n) \uparrow \), then \( f_{e,n} \) is totally undefined (the zero function if we set \( b \) to index the empty function), so \( g(e,n) \in \overline{A} \) (assuming \( b \) indexes the everywhere-undefined function). Thus \( \phi_e(n) \downarrow \iff g(e,n) \in A \), giving a reduction from the Halting Problem to \( A \). If \( A \) were recursive, the Halting Problem would be recursive — contradiction. Hence \( A \) is not recursive.
Rice's theorem has immediate practical consequences: no program can decide, for an arbitrary input program, whether that program:
halts on all inputs (total),
outputs an even number on some input,
computes the constant zero function,
is equivalent to another given program.
All such properties are undecidable. This is the theoretical explanation for why general program verification and static analysis are fundamentally limited — they cannot be both sound, complete, and terminating.
Appendix D: Background on Orders and Algebra
D.1 Well-Orderings and Zorn’s Lemma
(See Appendix A.1 above.)
Problem Set: PMATH 432 Comprehensive Exercises
The following twenty problems are organized into five parts, covering the full scope of the course. Problems are designed to build understanding progressively within each part.
Part A: Syntax — Parsing and Classifying Formulas
A1. Consider the signature \( \sigma = (\{f, c\}, \{R\}, \mathrm{ar}) \) where \( \mathrm{ar}(f) = 2 \), \( \mathrm{ar}(c) = 0 \), \( \mathrm{ar}(R) = 2 \). Determine whether each of the following is a term, an atomic formula, a formula but not atomic, or not a well-formed expression. Justify each answer by reference to the inductive definition.
\( f(c, x) \)
\( R(f(c,c), x) \)
\( \forall x\, R(x, f(x,c)) \)
\( R(x) \)
\( f(R(x,y), c) \)
A2. For each formula below, compute \( \mathrm{free}(\varphi) \) (the set of free variables) using the formal inductive definition. Then classify the formula as a sentence or non-sentence, and if non-sentence, state what mathematical claim it makes about the free variables.
\( \varphi_1 = \exists x\, (x < y \land \forall z\, (z < x \to z < y)) \) (in the signature of linear orders).
\( \varphi_2 = \forall x\, \forall y\, (x \cdot y \doteq y \cdot x) \) (in the signature of rings).
\( \varphi_3 = \exists z\, (x \cdot z \doteq y) \) (in the signature of groups).
\( \varphi_4 = \forall x\, (P(x) \to \exists y\, (Q(x,y) \land \lnot Q(y,x))) \) (in a relational signature).
A3. Let \( \sigma_{\mathrm{grp}} = (\{\cdot, e, {}^{-1}\}, \emptyset) \) be the group signature. Write the following group-theoretic statements as first-order \( \sigma_{\mathrm{grp}} \)-sentences:
The group is abelian.
Every element has order dividing 6 (i.e., \( x^6 = e \) for all \( x \)).
There exists an element of order exactly 2.
The group has no element of finite order except the identity (a torsion-free group).
For (d), explain why you need infinitely many sentences, and argue that no single first-order sentence can express "torsion-free."
A4. Suppose \( \varphi(x, y) \) is a formula with exactly \( \mathrm{free}(\varphi) = \{x, y\} \). Define:
\[ \psi = \forall x\, \exists y\, \varphi(x,y). \]
Compute \( \mathrm{free}(\psi) \).
Suppose we attempt the substitution \( \psi[y/x] \) (replacing \( x \) by \( y \)). Explain what variable capture might occur and how to avoid it.
State the formal definition of "\( t \) is free for \( x \) in \( \varphi \)" and check whether \( y \) is free for \( x \) in \( \psi \).
Perform the substitution \( \varphi(x,y)[z/x] \) (replacing \( x \) by the fresh variable \( z \)) and state the result.
Part B: Semantics — Computing Truth in Specific Structures
B1. Let \( \mathfrak{A} = (\{0,1,2\}, <^{\mathfrak{A}}) \) be the structure with universe \( \{0,1,2\} \) and \( <^{\mathfrak{A}} = \{(0,1),(0,2),(1,2)\} \) (the standard ordering on \( \{0,1,2\} \)). Evaluate each sentence as true or false in \( \mathfrak{A} \), writing out the semantic computation step by step.
\( \forall x\, \exists y\, (x < y) \).
\( \exists x\, \forall y\, (y < x \lor y \doteq x) \).
\( \forall x\, \forall y\, (x < y \to \exists z\, (x < z \land z < y)) \).
\( \forall x\, \forall y\, (x < y \lor y < x \lor x \doteq y) \).
B2. Let \( \mathfrak{Z} = (\mathbb{Z}, 0, S, +, \cdot) \) where \( S(n) = n+1 \) (the integers as a model of the arithmetic signature \( \{0, S, +, \cdot\} \)). Note that \( \mathfrak{Z} \) is NOT a model of PA (the induction axiom fails). For each sentence:
\( \forall x\, \exists y\, (y + y \doteq x) \) — does this hold in \( \mathfrak{Z} \)? Does it hold in \( \mathfrak{N} \)?
\( \exists x\, (x + x \doteq S(0)) \) — in \( \mathfrak{Z} \)? In \( \mathfrak{N} \)?
\( \forall x\, \exists y\, (x + y \doteq 0) \) — in \( \mathfrak{Z} \)? In \( \mathfrak{N} \)?
Use your answers to show that \( \mathfrak{Z} \not\equiv \mathfrak{N} \) (they are not elementarily equivalent).
B3. Let \( T = \mathrm{DLO} \) (the theory of dense linear orders without endpoints). Determine whether each of the following is a consequence of \( T \) (true in all models of \( T \)), consistent with \( T \) (true in some model), or inconsistent with \( T \) (false in all models).
\( \exists x\, \exists y\, (x < y \land \lnot \exists z\, (x < z \land z < y)) \).
\( \forall x\, \forall y\, (x < y \to \exists z\, (x < z \land z < y)) \).
\( \forall x\, \exists y\, (y < x) \land \forall x\, \exists y\, (x < y) \).
\( \exists x\, \forall y\, (x < y) \) (there is a minimum element).
B4. Let \( \mathrm{ACF}_0 \) be the theory of algebraically closed fields of characteristic 0. For each sentence, determine whether it is a consequence of \( \mathrm{ACF}_0 \) (provable from \( \mathrm{ACF}_0 \)), its negation is a consequence, or it is independent (neither provable nor refutable). Justify using the completeness of \( \mathrm{ACF}_0 \) and properties of \( \mathbb{C} \).
\( \forall x\, \exists y\, (y \cdot y \doteq x) \) (every element has a square root).
\( \forall x\, (x \cdot x \doteq x \to x \doteq 0 \lor x \doteq 1) \).
\( \exists x\, (x \cdot x \doteq -1) \) (where \( -1 \) abbreviates \( 0 - 1 \)).
The field has exactly 4 elements (a sentence asserting \( |F| = 4 \)).
Part C: Proofs — Derivations in the Proof System
C1. Using the Hilbert calculus (with the deduction theorem available as a metatheorem), give a derivation (or derivation sketch with all key steps justified) of each of the following:
\( \vdash \varphi \to \varphi \) (reflexivity of implication).
\( \varphi \to \psi, \psi \to \chi \vdash \varphi \to \chi \) (hypothetical syllogism).
\( \vdash \lnot \lnot \varphi \to \varphi \) (double negation elimination — use the contrapositive axiom).
\( \vdash (\varphi \to \psi) \to (\lnot \psi \to \lnot \varphi) \) (contrapositive).
C2. Using the quantifier axioms and the deduction theorem, derive:
\( \forall x\, \forall y\, \varphi(x,y) \vdash \forall y\, \forall x\, \varphi(x,y) \) (swapping the order of universal quantifiers).
\( \exists x\, \varphi(x) \vdash \exists x\, (\varphi(x) \lor \psi(x)) \) for any formula \( \psi \).
Explain why \( \exists x\, \varphi(x) \lor \exists x\, \psi(x) \vdash \exists x\, (\varphi(x) \lor \psi(x)) \), and whether the converse holds.
C3. Let \( T \) be a theory and \( \varphi, \psi \) sentences. Prove each claim from the semantic and syntactic definitions:
If \( T \vdash \varphi \) and \( T \vdash \varphi \to \psi \), then \( T \vdash \psi \). (Modus Ponens is admissible from premises.)
If \( T \cup \{\varphi\} \vdash \bot \), then \( T \vdash \lnot \varphi \). (Proof by contradiction.)
If \( T \vdash \varphi \) and \( T' \supseteq T \), then \( T' \vdash \varphi \). (Monotonicity of provability.)
\( T \models \varphi \iff T \cup \{\lnot \varphi\} \) is unsatisfiable. (The semantic version of "proof by contradiction.")
C4. The Deduction Theorem states: \( \Phi \cup \{\psi\} \vdash \varphi \iff \Phi \vdash \psi \to \varphi \) (under the appropriate conditions on generalization). Use the Deduction Theorem to:
Show that \( \varphi \to \psi, \lnot \varphi \to \psi \vdash \psi \) (proof by cases).
Show that if \( \varphi \vdash \psi \) and \( \varphi \vdash \lnot \psi \), then \( \vdash \lnot \varphi \).
State carefully the condition on the generalization rule that must hold for the Deduction Theorem to apply. Give an example where the theorem fails if this condition is violated.
Part D: Completeness and Compactness — Applications
D1. Use the Compactness Theorem to prove each of the following:
If \( \Phi \models \varphi \), then some finite \( \Phi_0 \subseteq \Phi \) satisfies \( \Phi_0 \models \varphi \). (Compactness for entailment.)
If every finite subgraph of an infinite graph \( G \) is 3-colorable, then \( G \) is 3-colorable. (Graph coloring, see Worked Example 2.3.)
If a sentence \( \varphi \) is true in all finite structures of signature \( \sigma \), is \( \varphi \) necessarily true in all infinite structures? Prove or give a counterexample.
D2. Let \( T \) be the theory of torsion-free abelian groups (groups satisfying \( \forall x\, (x^n \neq e) \) for all \( n \geq 1 \), written as an infinite set of sentences). Use compactness to show:
\( T \) has no finite models.
There is no single first-order sentence \( \varphi \) such that a group satisfies \( \varphi \) iff it is torsion-free.
Every model of \( T \) contains an isomorphic copy of \( \mathbb{Z} \).
D3. Recall that the Löwenheim–Skolem theorems are consequences of compactness.
Using compactness directly (not the Downward Löwenheim–Skolem theorem), show that if a theory \( T \) has an infinite model, it has a model of cardinality at least \( \kappa \) for every infinite cardinal \( \kappa \).
The theory \( \mathrm{PA} \) (Peano arithmetic) has its intended model \( \mathfrak{N} = (\mathbb{N}, 0, S, +, \cdot) \). Using compactness, show that PA has a model containing an element \( c \) greater than \( \overline{n} = S^n(0) \) for every standard natural number \( n \). Such an element is called non-standard.
Show that any non-standard model of PA also contains non-standard elements below any given non-standard element (i.e., non-standard numbers are "dense" in a suitable sense).
D4. The Transfer Principle in non-standard analysis says: a first-order sentence is true in \( \mathbb{R} \) iff it is true in \( {}^*\!\mathbb{R} \). Use this to give a non-standard proof sketch of:
The intermediate value theorem: if \( f : [a,b] \to \mathbb{R} \) is continuous and \( f(a) < 0 < f(b) \), then \( f(c) = 0 \) for some \( c \in (a,b) \). (Sketch: use infinitesimals to argue about the "first sign change.")
Explain what "continuity" means in the language of non-standard analysis (using infinitesimals) and why this formulation is equivalent to the standard \( \varepsilon\text{-}\delta \) definition for first-order expressible properties.
What is a fundamental obstacle to using non-standard analysis to prove results that are not first-order expressible? Give an example.
Part E: Computability and Gödel — Challenge Problems
E1. Let \( \mathrm{Prov}_T(x) \) be the provability predicate for a recursively axiomatizable theory \( T \) extending PA. The Löb conditions are:
(D1) If \( T \vdash \varphi \), then \( T \vdash \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \).
(D2) \( T \vdash \mathrm{Prov}_T(\ulcorner \varphi \to \psi \urcorner) \land \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \to \mathrm{Prov}_T(\ulcorner \psi \urcorner) \).
(D3) \( T \vdash \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \to \mathrm{Prov}_T(\ulcorner \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \urcorner) \).
Show that (D1) implies: if \( T \vdash \varphi \land \psi \), then \( T \vdash \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \land \mathrm{Prov}_T(\ulcorner \psi \urcorner) \).
Using the Löb conditions and the Diagonal Lemma, give a complete proof of Löb's Theorem: if \( T \vdash \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \to \varphi \), then \( T \vdash \varphi \).
Deduce the Second Incompleteness Theorem from Löb's Theorem by showing \( T \vdash \mathrm{Con}(T) \to G_T \) where \( G_T \) is the Gödel sentence.
E2. The Arithmetical Hierarchy classifies sets by their logical complexity.
Show that the set \( \mathrm{Halt} = \{(e, n) : \phi_e(n) \downarrow\} \) is \( \Sigma^0_1 \) (computably enumerable) but not \( \Pi^0_1 \) (not co-computably enumerable). Use this to give a direct proof that \( \mathrm{Halt} \) is not decidable.
Show that the set of indices of total functions, \( \mathrm{Tot} = \{e : \phi_e \text{ is a total function}\} \), is \( \Pi^0_2 \) (of the form \( \forall n\, \exists s\, R(e, n, s) \) for a decidable \( R \)) but not \( \Sigma^0_2 \).
State Rice's Theorem and use it to prove that \( \mathrm{Tot} \) is not decidable (without computing its arithmetical complexity).
E3. This problem concerns the relationship between syntax (provability) and semantics (truth).
The Completeness Theorem says \( T \models \varphi \iff T \vdash \varphi \). The First Incompleteness Theorem says: for consistent, recursively axiomatizable \( T \supseteq Q \), there exists \( \varphi \) with \( \mathbb{N} \models \varphi \) but \( T \not\vdash \varphi \). Reconcile these two results — are they contradictory?
Exhibit a sentence \( \varphi \) such that \( \mathrm{PA} \not\vdash \varphi \) and \( \mathrm{PA} \not\vdash \lnot \varphi \), yet \( \varphi \) is true in \( \mathbb{N} \). (The Gödel sentence works — verify that \( G \) is true in \( \mathbb{N} \) by the argument in Section 5.4.2.)
A non-standard model \( \mathfrak{M} \) of PA (i.e., \( \mathfrak{M} \models \mathrm{PA} \) but \( \mathfrak{M} \not\cong \mathfrak{N} \)) satisfies all PA-theorems but may fail true arithmetic sentences. Explain why \( \mathfrak{M} \models \lnot G_{\mathrm{PA}} \) for a non-standard \( \mathfrak{M} \) that happens to be a model of \( \mathrm{PA} + \lnot G_{\mathrm{PA}} \) (the existence of which follows from Gödel's theorems).
E4. This problem concerns Tarski’s undefinability theorem and its relationship to the Liar Paradox.
State Tarski's Undefinability Theorem precisely and give its proof via the Diagonal Lemma.
The **Liar sentence** in natural language is "This sentence is false." Explain how Tarski's theorem is the formal mathematical version of this paradox, and why the resolution is not a contradiction in mathematics but rather a theorem about the limits of definability.
Is there a formula \( \mathrm{True}_{\mathrm{prop}}(x) \) in propositional logic that defines truth for propositional formulas? Explain why this case is different from the first-order arithmetic case, using the fact that propositional tautologies are decidable.
The **T-schema** (Tarski's truth condition) says: \( \ulcorner \varphi \urcorner \) is true iff \( \varphi \). Explain why adopting the T-schema for all \( \varphi \) in the same language leads to inconsistency (the Liar paradox), and what Tarski's hierarchical solution is.
End of Problem Set.
Appendix E: Quick Reference
B.1 Key Definitions at a Glance
Term Definition
\( \sigma \)-structure Universe + interpretations of all symbols
\( \mathfrak{A} \models \varphi \) \( \varphi \) is true in \( \mathfrak{A} \) (for sentences)
\( T \models \varphi \) \( \varphi \) is true in all models of \( T \)
\( T \vdash \varphi \) \( \varphi \) is derivable from \( T \)
Consistent \( T \not\vdash \bot \)
Complete (theory) For all \( \varphi \): \( T \vdash \varphi \) or \( T \vdash \lnot\varphi \)
Elementary substructure \( \mathfrak{A} \preccurlyeq \mathfrak{B} \): same truth for all formulas
Compactness \( T \) satisfiable iff every finite subset satisfiable
\( \kappa \)-categorical Unique model of size \( \kappa \) (up to isomorphism)
Type \( p(\vec{x}) \) Maximal consistent set of formulas over a base
Ordinal Transitive set well-ordered by \( \in \)
Cardinal Ordinal not in bijection with any smaller ordinal
Primitive recursive Built from \( Z, S, \pi^k_i \) by composition and primitive recursion
Partial recursive Primitive recursive + \( \mu \)-operator
r.e. (c.e.) Domain of a partial recursive function
Index set Set of program indices closed under extensional equality
B.2 Standard Theories and Their Properties
Theory Language Complete? Cat.? QE? Stable?
DLO \( < \) Yes \( \aleph_0 \) only Yes No (unstable)
ACF\(_0\) \( +,\cdot,0,1 \) Yes All uncountable Yes Yes (\( \omega \)-stable)
ACF\(_p\) \( +,\cdot,0,1 \) Yes All uncountable Yes Yes (\( \omega \)-stable)
RCF \( +,\cdot,0,1,< \) Yes No (many sizes) Yes No
PA \( 0,S,+,\cdot \) No No No N/A
ZFC \( \in \) No No No N/A
B.3 The Gödel Sentences — Summary
Let \( T \) be a consistent recursively axiomatizable extension of \( Q \).
\( G_T \): the Gödel sentence, with \( T \vdash G_T \leftrightarrow \lnot \mathrm{Prov}_T(\ulcorner G_T \urcorner) \).
\( T \not\vdash G_T \) and \( T \not\vdash \lnot G_T \) (First Incompleteness).
\( \mathbb{N} \models G_T \) (the standard model sees \( G_T \) as true).
\( \mathrm{Con}(T) = \lnot \mathrm{Prov}_T(\ulcorner \bot \urcorner) \): consistency statement.
\( T \not\vdash \mathrm{Con}(T) \) (Second Incompleteness).
\( T + \mathrm{Con}(T) \) is a strictly stronger consistent theory.
Relationship: \( T \vdash G_T \leftrightarrow \mathrm{Con}(T) \) (the Gödel sentence is equivalent to consistency over \( T \), essentially).
Appendix F: Supplementary Enrichment
This appendix collects enrichment material filling gaps identified after the main chapters were written. It is self-contained and cross-references the relevant sections above.
F.1 Parse Tree Example: A More Complex Formula
The worked example in Section 1.2.3 parsed the relatively simple sentence \( \forall x\, \exists y\, (x < y) \). Here we work through a more challenging formula that arises naturally in the study of ordered fields and dense orders.
Consider the formula
\[ \varphi = \forall x\, \exists y\, (x < y \,\land\, y < x + 1) \]
in the signature \( \sigma_{\mathrm{of}} = (\{+, \cdot, -, 0, 1\}, \{<\}) \) of ordered fields, where \( x + 1 \) abbreviates \( +(x, 1) \) and \( x < y \) abbreviates the relation \( <(x, y) \). This sentence expresses “between every element and its successor there exists another element” — the characteristic property of dense orders.
Worked Example F.1 (Full parse tree for \( \forall x\, \exists y\, (x < y \land y < x+1) \)).
Step 1: Identify the terms. The terms appearing in \( \varphi \) are:
\( x \) — a variable (term by clause 1 of the term definition).
\( y \) — a variable.
\( 1 \) — a constant symbol (arity-0 function symbol, term by clause 2).
\( x + 1 \) — formed by applying the binary function symbol \( + \) to terms \( x \) and \( 1 \); a term by clause 3.
Step 2: Identify the atomic subformulas.
\( x < y \) — an atomic formula \( R(t_1, t_2) \) with \( R = {<} \), \( t_1 = x \), \( t_2 = y \).
\( y < x + 1 \) — an atomic formula with \( t_1 = y \), \( t_2 = x + 1 \).
Step 3: Build the parse tree bottom-up.
forall x | exists y | (wedge) / \ x < y y < (x+1) | R(<, y, +(x,1)) | +(x, 1) / \ x 1
Reading the tree: the entire formula is \( \forall x \) applied to \( \exists y \) applied to a conjunction. The left conjunct \( x < y \) is atomic. The right conjunct \( y < x+1 \) is also atomic, but its second argument \( x+1 \) is a compound term built from the binary function \( + \), the variable \( x \), and the constant \( 1 \).
Step 4: List all subformulas (every syntactic subpart constituting a formula):
\( x < y \) — atomic formula.
\( y < x + 1 \) — atomic formula.
\( (x < y) \,\land\, (y < x+1) \) — conjunction of the two atomic formulas.
\( \exists y\, (x < y \,\land\, y < x+1) \) — existential quantification; \( y \) is bound, \( x \) remains free.
\( \forall x\, \exists y\, (x < y \,\land\, y < x+1) \) — the full sentence; both \( x \) and \( y \) are bound.
Step 5: Compute free variables formally.
\[ \mathrm{free}(x < y) = \{x, y\}. \]\[ \mathrm{free}(y < x+1) = \{y, x\} = \{x, y\}. \]
(The term \( x+1 \) contains the variable \( x \); the constant \( 1 \) introduces no variables.)
\[ \mathrm{free}((x < y) \,\land\, (y < x+1)) = \{x,y\} \cup \{x,y\} = \{x,y\}. \]\[ \mathrm{free}(\exists y\,(x < y \,\land\, y < x+1)) = \{x,y\} \setminus \{y\} = \{x\}. \]\[ \mathrm{free}(\forall x\, \exists y\,(x < y \,\land\, y < x+1)) = \{x\} \setminus \{x\} = \emptyset. \]
Therefore \( \varphi \) is a sentence (no free variables). Both \( x \) and \( y \) are bound.
Step 6: Evaluate in structures.
In \( \mathfrak{Q} = (\mathbb{Q}, <, +, \cdot, 0, 1) \): for any rational \( x \), take \( y = x + \tfrac{1}{2} \). Then \( x < x + \tfrac{1}{2} \) and \( x + \tfrac{1}{2} < x + 1 \). So \( \mathfrak{Q} \models \varphi \).
In \( \mathfrak{Z} = (\mathbb{Z}, <, +, \cdot, 0, 1) \): take \( x = 0 \). We need \( y \in \mathbb{Z} \) with \( 0 < y < 1 \). No integer satisfies this. So \( \mathfrak{Z} \not\models \varphi \).
Therefore \( \mathfrak{Q} \not\equiv \mathfrak{Z} \) — they are distinguished by the sentence \( \varphi \).
Bridge remark. The sentence \( \varphi \) is precisely the density axiom of DLO (in the language of ordered fields, it asserts that between \( x \) and \( x+1 \) there is always a \( y \)). This parse-tree analysis grounds the completeness of DLO (proved abstractly in Chapter 3 via Vaught’s test) in concrete syntactic computation: completeness of DLO ultimately rests on the fact that density is expressible as a first-order sentence.
F.2 The Lowenheim-Skolem Paradox — Extended Discussion
Section 3.2 stated the Skolem paradox as a brief corollary. The phenomenon is philosophically important and deserves fuller treatment, as it is one of the most striking consequences of the Lowenheim-Skolem theorems.
The downward Lowenheim-Skolem theorem applied to ZFC says: if ZFC is consistent, it has a countable model \( \mathfrak{M} = (M, E) \). There exists a countable set \( M \) and a binary relation \( E \subseteq M \times M \) such that \( (M, E) \models \mathrm{ZFC} \). Since ZFC proves “there exist uncountable sets,” we have \( (M, E) \models \exists x\, \lnot \exists f\, (f \text{ is a bijection from } \omega \text{ to } x) \). But \( M \) is countable. This is Skolem’s Paradox.
Resolution of Skolem's Paradox. The paradox dissolves by distinguishing internal and external properties.
The statement “\( \omega_1 \) is uncountable” means, inside the model \( (M, E) \): there is no element of \( M \) that the model sees as a bijection from \( \omega^M \) to \( \omega_1^M \). The existential quantifier \( \exists f \) ranges over \( M \) only. If a bijection \( f : \omega^M \to \omega_1^M \) exists somewhere — as a set-theoretic function — but \( f \notin M \), then the model simply cannot see it.
From outside (in the metatheory): \( M \) is a countable set. The “extension” of \( \omega_1^M \) in the model is the set \( \{a \in M : (a, \omega_1^M) \in E\} \), which is a subset of the countable set \( M \) and hence countable from outside. A bijection \( \mathbb{N} \to \{a \in M : (a, \omega_1^M) \in E\} \) exists in the metatheory — but it is a function in the metatheory, not an element of \( M \), so the model cannot use it to conclude \( \omega_1^M \) is countable.
Formal statement. A set \( x \in M \) is:
Internally countable if \( (M, E) \models \exists f\, (f \text{ is a bijection from } \omega \text{ to } x) \).
Externally countable if there is a bijection (in the real world) from \( \omega \) to \( \{a \in M : (a, x) \in E\} \).
The element \( \omega_1^M \in M \) is internally uncountable (the model proves so, since \( (M,E) \models \mathrm{ZFC} \) and ZFC proves \( \omega_1 \) is uncountable) but externally countable (a countable set \( M \) has all subsets countable from outside).
Conclusion. First-order logic cannot characterize cardinality in an absolute sense. “Uncountable” in the model means “no internal bijection exists,” and the model’s internal world may lack many bijections that exist externally. This is a fundamental limitation of first-order expressibility, not a mathematical paradox.
Bridge remark. The Löwenheim–Skolem paradox is directly related to the independence of CH. In Gödel’s \( L \), \( 2^{\aleph_0} = \aleph_1 \); in Cohen’s forcing extensions, \( 2^{\aleph_0} \geq \aleph_2 \). Both models satisfy all ZFC axioms but disagree on cardinal arithmetic because the bijections certifying various equalities are present in one model and absent in another. The mechanism is the same as Skolem’s paradox: different models have different sets of functions available internally.
F.3 Turing Machines — Formal Definition and Worked Example
Chapter 5 used Turing machines informally. Here we give the formal definition and trace through the canonical worked example of unary incrementing.
F.3.1 Formal Definition
A Turing machine is a tuple \( \mathcal{M} = (Q, \Gamma, \Sigma, \delta, q_0, q_{\mathrm{acc}}, q_{\mathrm{rej}}) \) where:
\( Q \) is a finite set of states.
\( q_0 \in Q \) is the start state.
\( q_{\mathrm{acc}}, q_{\mathrm{rej}} \in Q \) are distinct halting states (accept and reject).
\( \Gamma \) is a finite tape alphabet containing a designated blank symbol \( \sqcup \).
\( \Sigma \subseteq \Gamma \setminus \{\sqcup\} \) is the input alphabet.
\( \delta : (Q \setminus \{q_{\mathrm{acc}}, q_{\mathrm{rej}}\}) \times \Gamma \to Q \times \Gamma \times \{L, R\} \) is the transition function. On state \( q \) reading symbol \( a \), the machine: (i) transitions to state \( \delta(q,a)_1 \); (ii) overwrites the current cell with \( \delta(q,a)_2 \); (iii) moves the head left (\( L \)) or right (\( R \)) per \( \delta(q,a)_3 \).
A configuration is a triple \( (q, w, i) \) where \( q \in Q \), \( w : \mathbb{Z} \to \Gamma \) is the tape content (all but finitely many cells blank), and \( i \in \mathbb{Z} \) is the head position. Computation is a sequence of configurations generated by applying \( \delta \), halting when \( q_{\mathrm{acc}} \) or \( q_{\mathrm{rej}} \) is reached.
For computing \( \mathbb{N} \to \mathbb{N} \), we use unary encoding: \( n \) is encoded as \( 1^n \) (a run of \( n \) ones). The tape alphabet is \( \{1, \sqcup\} \).
F.3.2 Worked Example: Unary Increment (\( n \mapsto n+1 \))
Worked Example F.2 (Turing machine for \( n \mapsto n+1 \)).
Design: Scan right past all input ones (without changing them) until reaching the first blank, then write \( 1 \) there and accept.
States: \( Q = \{q_0, q_{\mathrm{acc}}, q_{\mathrm{rej}}\} \). Tape alphabet: \( \Gamma = \{1, \sqcup\} \), \( \Sigma = \{1\} \).
Transition table:
State Read Write Move Next state
\( q_0 \) \( 1 \) \( 1 \) R \( q_0 \)
\( q_0 \) \( \sqcup \) \( 1 \) R \( q_{\mathrm{acc}} \)
Trace on input \( n = 3 \) (tape: \( 1\;1\;1\;\sqcup\;\sqcup\;\ldots \), head at cell 0):
Step 0: q₀ at cell 0. Read 1 → write 1, R → q₀. Tape: 1 1 1 □ □ … Step 1: q₀ at cell 1. Read 1 → write 1, R → q₀. Tape: 1 1 1 □ □ … Step 2: q₀ at cell 2. Read 1 → write 1, R → q₀. Tape: 1 1 1 □ □ … Step 3: q₀ at cell 3. Read □ → write 1, R → q_acc. Tape: 1 1 1 1 □ … [ACCEPT]
Output tape: \( 1^4 \) — the unary encoding of \( n+1 = 4 \). Correct.
Loop invariant. At the start of step \( k \) (\( 0 \leq k \leq n \)): state is \( q_0 \), head is at cell \( k \), tape unchanged. At step \( n \): read \( \sqcup \), write \( 1 \), accept. Output is \( 1^{n+1} \).
Time complexity: \( n + 1 \) steps, optimal (the machine must read all \( n \) input symbols at least once).
Bridge remark. This machine is typical of Turing machine constructions: simple machines require few states; complex machines (e.g., computing \( n \mapsto n^2 \) in unary) require more states and a more elaborate transition function. The formal definition above makes rigorous the informal statement “there is an algorithm for \( f \)” — it means precisely that a Turing machine exists whose transition function implements \( f \).
F.3.3 Church-Turing Equivalence
(Church-Turing Equivalence) A function \( f : \mathbb{N}^k \to \mathbb{N} \) is partial recursive (Appendix C.2) if and only if it is computable by a Turing machine.
This equivalence (proved by Kleene, Turing, and Church in 1935–1936) justifies treating the two notions interchangeably. The Church-Turing Thesis extends this: every effectively computable function (in the informal sense) is Turing-computable. This thesis is a philosophical principle, not a theorem, but is universally accepted.
The thesis plays a crucial role in mathematical logic: the provability predicate \( \mathrm{Prov}_T(x) \) (Chapter 5) is representable in PA because “checking whether a finite sequence of formulas is a valid proof” is effectively computable — by the Church-Turing thesis it is Turing-computable, hence partial recursive, hence representable.
F.4 Historical Note: Godel in 1930-1931
Kurt Godel was born on April 28, 1906, in Brunn (now Brno, Czech Republic). He completed his doctoral dissertation at the University of Vienna in 1929 at age 23, proving the Completeness Theorem. The following year, at age 24, he announced and published the incompleteness theorems.
He made the announcement at the Second Conference on the Epistemology of the Exact Sciences in Konigsberg, September 1930. Hilbert gave a triumphant address asserting the achievability of mathematical completeness; shortly after, Godel quietly stated he had found a true arithmetic sentence unprovable in Principia Mathematica. John von Neumann, present at the conference, immediately grasped the result’s significance and within weeks independently derived the Second Incompleteness Theorem — only to learn that Godel had already done so.
The full paper, “Uber formal unentscheidbare Satze der Principia Mathematica und verwandter Systeme I,” was published in the Monatshefte fur Mathematik und Physik in 1931. It is widely regarded as the most important single paper in mathematical logic.
What the theorems do and do not imply.
What they do imply:
No consistent, recursively axiomatizable extension of \( Q \) is complete. There will always be true arithmetic sentences it cannot decide.
No such theory can prove its own consistency from within.
The set \( \mathrm{Th}(\mathbb{N}) \) of all first-order arithmetic truths is not recursively enumerable — it transcends every single formal system.
What they do not imply:
They do not establish a mathematical inconsistency or contradiction.
They do not show that mathematical truth is unknowable. Additional axioms can always be added (\( G \) becomes provable in \( T + G \)), and the extended system has its own Godel sentence.
They do not imply that human mathematical reasoning transcends computation. The Lucas-Penrose argument for this conclusion is not accepted by most logicians.
The incompleteness theorems did not end Hilbert’s program — they refined it. Gentzen’s 1936 consistency proof of PA (using transfinite induction up to \( \varepsilon_0 \)) showed that a modest extension of finitistic methods is sufficient. The program of reverse mathematics (Friedman, Simpson) systematically determines which axioms are needed to prove which theorems — a far more refined foundational investigation than anything Hilbert originally envisioned.
End of Appendix F.

Theorem	Requires	Consequence
Completeness (Gödel 1930)	Lindenbaum + Henkin	Compactness
Compactness	Completeness or Ultraproducts	Löwenheim–Skolem (upward)
Löwenheim–Skolem (downward)	Skolem functions	Infinite models of any theory
Vaught’s test	Löwenheim–Skolem	Completeness of DLO, ACF
Gödel numbering	Primitive recursion	Representability in PA
Diagonal Lemma	Representability	Gödel sentences, Tarski
First Incompleteness	Diagonal Lemma	Second Incompleteness
Second Incompleteness	Hilbert–Bernays conditions	Limits of formal provability

Term	Definition
\( \sigma \)-structure	Universe + interpretations of all symbols
\( \mathfrak{A} \models \varphi \)	\( \varphi \) is true in \( \mathfrak{A} \) (for sentences)
\( T \models \varphi \)	\( \varphi \) is true in all models of \( T \)
\( T \vdash \varphi \)	\( \varphi \) is derivable from \( T \)
Consistent	\( T \not\vdash \bot \)
Complete (theory)	For all \( \varphi \): \( T \vdash \varphi \) or \( T \vdash \lnot\varphi \)
Elementary substructure	\( \mathfrak{A} \preccurlyeq \mathfrak{B} \): same truth for all formulas
Compactness	\( T \) satisfiable iff every finite subset satisfiable
\( \kappa \)-categorical	Unique model of size \( \kappa \) (up to isomorphism)
Type \( p(\vec{x}) \)	Maximal consistent set of formulas over a base
Ordinal	Transitive set well-ordered by \( \in \)
Cardinal	Ordinal not in bijection with any smaller ordinal

Theory	Language	Complete?	Cat.?	QE?	Stable?
DLO	\( < \)	Yes	\( \aleph_0 \) only	Yes	No (unstable)
ACF\(_0\)	\( +,\cdot,0,1 \)	Yes	All uncountable	Yes	Yes (\( \omega \)-stable)
ACF\(_p\)	\( +,\cdot,0,1 \)	Yes	All uncountable	Yes	Yes (\( \omega \)-stable)
RCF	\( +,\cdot,0,1,< \)	Yes	No (many sizes)	Yes	No
PA	\( 0,S,+,\cdot \)	No	No	No	N/A
ZFC	\( \in \)	No	No	No	N/A

Term	Definition
\( \sigma \)-structure	Universe + interpretations of all symbols
\( \mathfrak{A} \models \varphi \)	\( \varphi \) is true in \( \mathfrak{A} \) (for sentences)
\( T \models \varphi \)	\( \varphi \) is true in all models of \( T \)
\( T \vdash \varphi \)	\( \varphi \) is derivable from \( T \)
Consistent	\( T \not\vdash \bot \)
Complete (theory)	For all \( \varphi \): \( T \vdash \varphi \) or \( T \vdash \lnot\varphi \)
Elementary substructure	\( \mathfrak{A} \preccurlyeq \mathfrak{B} \): same truth for all formulas
Compactness	\( T \) satisfiable iff every finite subset satisfiable
\( \kappa \)-categorical	Unique model of size \( \kappa \) (up to isomorphism)
Type \( p(\vec{x}) \)	Maximal consistent set of formulas over a base
Ordinal	Transitive set well-ordered by \( \in \)
Cardinal	Ordinal not in bijection with any smaller ordinal
Primitive recursive	Built from \( Z, S, \pi^k_i \) by composition and primitive recursion
Partial recursive	Primitive recursive + \( \mu \)-operator
r.e. (c.e.)	Domain of a partial recursive function
Index set	Set of program indices closed under extensional equality

Theory	Language	Complete?	Cat.?	QE?	Stable?
DLO	\( < \)	Yes	\( \aleph_0 \) only	Yes	No (unstable)
ACF\(_0\)	\( +,\cdot,0,1 \)	Yes	All uncountable	Yes	Yes (\( \omega \)-stable)
ACF\(_p\)	\( +,\cdot,0,1 \)	Yes	All uncountable	Yes	Yes (\( \omega \)-stable)
RCF	\( +,\cdot,0,1,< \)	Yes	No (many sizes)	Yes	No
PA	\( 0,S,+,\cdot \)	No	No	No	N/A
ZFC	\( \in \)	No	No	No	N/A

State	Read	Write	Move	Next state
\( q_0 \)	\( 1 \)	\( 1 \)	R	\( q_0 \)
\( q_0 \)	\( \sqcup \)	\( 1 \)	R	\( q_{\mathrm{acc}} \)

Back to top