PMATH 465/665: Smooth Manifolds
Spiro Karigiannis
These notes develop the foundational theory of smooth manifolds — the natural setting for modern differential geometry, geometric analysis, and mathematical physics. The primary reference is John M. Lee’s Introduction to Smooth Manifolds (2nd edition, Springer GTM 218), with supplementary material drawn from Boothby’s An Introduction to Differentiable Manifolds and Riemannian Geometry. Topics begin with topological manifolds and smooth structures, proceeding through the calculus of smooth maps, and culminating in the theory of tangent vectors, the tangent bundle, and vector fields with their Lie-algebraic structure.
Chapter 1: Topological Manifolds and Smooth Structures
1.1 Topological Manifolds
The starting point for the theory of smooth manifolds is the purely topological notion of a space that “locally looks like Euclidean space.” The idea is simple and powerful: while a manifold may have complicated global topology (the sphere, the torus, projective space), every sufficiently small piece of it is indistinguishable from an open subset of \(\mathbb{R}^n\). This local Euclidean character is what allows us to do calculus on manifolds, by transferring problems to \(\mathbb{R}^n\) via coordinate charts. Before we can discuss smoothness, however, we must pin down the correct topological hypotheses.
A topological manifold of dimension \(n\) is a topological space \(M\) with the following three properties:
- \(M\) is Hausdorff: for every pair of distinct points \(p, q \in M\), there exist disjoint open sets \(U \ni p\) and \(V \ni q\).
- \(M\) is second countable: there exists a countable basis for the topology of \(M\).
- \(M\) is locally Euclidean of dimension \(n\): every point \(p \in M\) has a neighbourhood \(U\) that is homeomorphic to an open subset of \(\mathbb{R}^n\).
Each of these three conditions plays an essential role and deserves comment.
The Hausdorff condition excludes pathological examples such as the “line with two origins,” obtained by taking two copies of \(\mathbb{R}\) and identifying all points except the origin. The resulting space is locally Euclidean and second countable, but the two origins cannot be separated by disjoint open sets. Without the Hausdorff condition, many basic results of analysis and geometry would fail.
Second countability ensures that the topology of \(M\) is not too large. It guarantees the existence of partitions of unity (which we discuss in Chapter 2), the embeddability of \(M\) into Euclidean space (by the Whitney embedding theorem), and the existence of Riemannian metrics. It also implies that \(M\) is paracompact and has at most countably many connected components.
The locally Euclidean condition is the heart of the definition. The integer \(n\) is called the dimension of \(M\). By the theorem on invariance of domain (a deep result in algebraic topology), if a nonempty topological space is locally homeomorphic to both \(\mathbb{R}^m\) and \(\mathbb{R}^n\), then \(m = n\), so the dimension is well-defined on each connected component.
1.2 Coordinate Charts and Atlases
The homeomorphisms witnessing the locally Euclidean property are the fundamental tools that allow us to transfer calculus from \(\mathbb{R}^n\) to a manifold. We formalize them as follows.
A chart (or coordinate chart) on \(M\) is a pair \((U, \varphi)\), where \(U \subseteq M\) is open and \(\varphi \colon U \to \varphi(U) \subseteq \mathbb{R}^n\) is a homeomorphism onto an open subset of \(\mathbb{R}^n\). Writing \(\varphi = (x^1, \ldots, x^n)\), the coordinates of a point \(p \in U\) are the \(n\) real numbers \((x^1(p), \ldots, x^n(p)) \in \mathbb{R}^n\). A chart \((U, \varphi)\) is said to be centred at \(p\) if \(\varphi(p) = 0\).
When two charts overlap, we need to understand how their coordinate systems are related. This leads to the notion of transition maps.
Given two charts \((U, \varphi)\) and \((V, \psi)\) with \(U \cap V \neq \emptyset\), the transition map from \(\varphi\) to \(\psi\) is
\[ \psi \circ \varphi^{-1} \colon \varphi(U \cap V) \to \psi(U \cap V). \]
This is a homeomorphism between open subsets of \(\mathbb{R}^n\).
Transition maps are the key to defining smoothness on manifolds. Since \(\psi \circ \varphi^{-1}\) is a map between open subsets of \(\mathbb{R}^n\), it makes sense to ask whether it is smooth (\(C^\infty\)) in the ordinary multivariable calculus sense.
Two charts \((U, \varphi)\) and \((V, \psi)\) are smoothly compatible if either \(U \cap V = \emptyset\) or the transition map \(\psi \circ \varphi^{-1}\) is a diffeomorphism between open subsets of \(\mathbb{R}^n\). A smooth atlas for \(M\) is a collection of charts \(\mathcal{A} = \{(U_\alpha, \varphi_\alpha)\}_{\alpha \in A}\) such that:
- The coordinate domains cover \(M\): \(\bigcup_{\alpha \in A} U_\alpha = M\).
- Any two charts in \(\mathcal{A}\) are smoothly compatible.
1.3 Smooth Structures and Smooth Manifolds
A smooth atlas provides enough structure to do calculus, but there is an aesthetic and technical issue: many different atlases can give rise to the “same” smooth structure. Two atlases \(\mathcal{A}\) and \(\mathcal{B}\) on \(M\) are said to be compatible if their union \(\mathcal{A} \cup \mathcal{B}\) is again a smooth atlas. Compatibility is an equivalence relation on the set of all smooth atlases on \(M\). Rather than working with equivalence classes, it is more convenient to single out a canonical representative.
Every smooth atlas \(\mathcal{A}\) is contained in a unique maximal atlas — simply take the collection of all charts on \(M\) that are smoothly compatible with every chart in \(\mathcal{A}\). Thus, specifying a smooth structure is equivalent to specifying any smooth atlas; the maximal atlas it generates is the smooth structure.
In practice, one almost never works with the full maximal atlas. Instead, one specifies a small atlas (often just two or three charts) and declares that the smooth structure is the one generated by that atlas. The key point is that the maximal atlas exists and is unique, so there is no ambiguity.
1.4 Examples of Smooth Manifolds
We now present the most important examples of smooth manifolds. These examples will serve as a testing ground for all the theory that follows.
The unit sphere
\[ S^n = \{x \in \mathbb{R}^{n+1} : |x| = 1\} \]
is a topological \(n\)-manifold in the subspace topology inherited from \(\mathbb{R}^{n+1}\). It is Hausdorff and second countable because \(\mathbb{R}^{n+1}\) is. To show it is locally Euclidean and to exhibit a smooth atlas, we use stereographic projection.
Let \(N = (0, \ldots, 0, 1)\) and \(S = (0, \ldots, 0, -1)\) denote the north and south poles of \(S^n\). Define
\[ U_N = S^n \setminus \{N\}, \qquad U_S = S^n \setminus \{S\}. \]The stereographic projection from the north pole is the map \(\sigma_N \colon U_N \to \mathbb{R}^n\) defined by
\[ \sigma_N(x^1, \ldots, x^{n+1}) = \frac{1}{1 - x^{n+1}}(x^1, \ldots, x^n). \]Geometrically, \(\sigma_N(x)\) is the point where the line from \(N\) through \(x\) meets the equatorial hyperplane \(\{x^{n+1} = 0\}\).
Similarly, the stereographic projection from the south pole is \(\sigma_S \colon U_S \to \mathbb{R}^n\) defined by
\[ \sigma_S(x^1, \ldots, x^{n+1}) = \frac{1}{1 + x^{n+1}}(x^1, \ldots, x^n). \]
Both maps are homeomorphisms onto \(\mathbb{R}^n\), so \(\{(U_N, \sigma_N), (U_S, \sigma_S)\}\) is an atlas for \(S^n\).
The transition map for the stereographic atlas is a central computation that we carry out in full detail, since it illustrates the general technique and will be useful later.
\[ \sigma_S \circ \sigma_N^{-1} \colon \mathbb{R}^n \setminus \{0\} \to \mathbb{R}^n \setminus \{0\}, \qquad \sigma_S \circ \sigma_N^{-1}(u) = \frac{u}{|u|^2}, \]
which is the inversion in the unit sphere in \(\mathbb{R}^n\).
To verify this, we first compute \(\sigma_N^{-1}\). Write \(u = \sigma_N(x)\), so that \(u^i = x^i/(1 - x^{n+1})\), and hence \(x^i = u^i(1 - x^{n+1})\). From the constraint \(|x|^2 = 1\), we get
\[ \sum_{i=1}^n (u^i)^2 (1 - x^{n+1})^2 + (x^{n+1})^2 = 1. \]Setting \(|u|^2 = \sum (u^i)^2\) and \(t = x^{n+1}\), this gives \(|u|^2(1-t)^2 + t^2 = 1\). Expanding:
\[ |u|^2 - 2|u|^2 t + |u|^2 t^2 + t^2 = 1, \]\[ (|u|^2 + 1)t^2 - 2|u|^2 t + (|u|^2 - 1) = 0. \]Using the quadratic formula (or factoring):
\[ t = \frac{2|u|^2 \pm \sqrt{4|u|^4 - 4(|u|^2+1)(|u|^2-1)}}{2(|u|^2+1)} = \frac{2|u|^2 \pm 2}{2(|u|^2+1)}. \]The solution \(t = 1\) corresponds to the north pole, which we exclude. So
\[ x^{n+1} = t = \frac{|u|^2 - 1}{|u|^2 + 1}, \qquad 1 - t = \frac{2}{|u|^2 + 1}, \]and therefore
\[ x^i = u^i \cdot \frac{2}{|u|^2 + 1}. \]The inverse is thus
\[ \sigma_N^{-1}(u) = \frac{1}{|u|^2 + 1}\left(2u^1, \ldots, 2u^n, |u|^2 - 1\right). \]Now we apply \(\sigma_S\):
\[ \sigma_S(\sigma_N^{-1}(u)) = \frac{1}{1 + x^{n+1}}(x^1, \ldots, x^n) = \frac{1}{1 + \frac{|u|^2 - 1}{|u|^2 + 1}} \cdot \frac{2u}{|u|^2 + 1}. \]We have
\[ 1 + \frac{|u|^2 - 1}{|u|^2 + 1} = \frac{|u|^2 + 1 + |u|^2 - 1}{|u|^2 + 1} = \frac{2|u|^2}{|u|^2 + 1}. \]Therefore
\[ \sigma_S \circ \sigma_N^{-1}(u) = \frac{|u|^2 + 1}{2|u|^2} \cdot \frac{2u}{|u|^2 + 1} = \frac{u}{|u|^2}. \]This is a smooth map on \(\mathbb{R}^n \setminus \{0\}\), and it is its own inverse (applying the formula twice gives back \(u\)). Hence both transition maps are smooth, and the stereographic atlas defines a smooth structure on \(S^n\). \(\blacksquare\)
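The computation above is easy to corroborate numerically. The following NumPy sketch (an illustration, not part of the notes; here \(n = 3\), so \(S^3 \subset \mathbb{R}^4\)) checks that \(\sigma_N^{-1}(u)\) lands on the unit sphere and that the transition map is inversion.

```python
# Numerical check of the stereographic computation (illustration; n = 3).
import numpy as np

def sigma_N_inv(u):
    """Inverse stereographic projection from the north pole."""
    s = np.dot(u, u)
    return np.append(2 * u, s - 1) / (s + 1)

def sigma_S(x):
    """Stereographic projection from the south pole."""
    return x[:-1] / (1 + x[-1])

u = np.array([0.3, -1.2, 2.0])
x = sigma_N_inv(u)

# sigma_N_inv(u) really lands on the unit sphere S^3 in R^4
assert abs(np.dot(x, x) - 1.0) < 1e-12

# The transition map is inversion in the unit sphere: u -> u / |u|^2
v = sigma_S(x)
assert np.allclose(v, u / np.dot(u, u))

# Inversion is an involution: applying it twice returns u
assert np.allclose(v / np.dot(v, v), u)
```

The same check works for any \(n\) by changing the length of `u`.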
Real projective space \(\mathbb{R}P^n\) is the quotient
\[ \mathbb{R}P^n = (\mathbb{R}^{n+1} \setminus \{0\}) / \sim, \]
where \(x \sim y\) if and only if \(x = \lambda y\) for some \(\lambda \in \mathbb{R} \setminus \{0\}\). The equivalence class of a point \((x^0, x^1, \ldots, x^n)\) is denoted \([x^0 : x^1 : \cdots : x^n]\) and called homogeneous coordinates.
For each \(i = 0, 1, \ldots, n\), let \(U_i = \{[x^0 : \cdots : x^n] \in \mathbb{R}P^n : x^i \neq 0\}\). Define \(\varphi_i \colon U_i \to \mathbb{R}^n\) by
\[ \varphi_i([x^0 : \cdots : x^n]) = \left(\frac{x^0}{x^i}, \ldots, \widehat{\frac{x^i}{x^i}}, \ldots, \frac{x^n}{x^i}\right), \]where the hat means that the \(i\)-th entry is omitted. The sets \(U_0, \ldots, U_n\) cover \(\mathbb{R}P^n\), each \(\varphi_i\) is a homeomorphism, and the transition maps are smooth rational functions. Thus \(\mathbb{R}P^n\) is a smooth \(n\)-manifold. Note that \(\mathbb{R}P^n\) is compact (it is the continuous image of \(S^n\) under the quotient map).
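To see concretely that the transition maps are rational, consider \(\mathbb{R}P^2\): since \(\varphi_0^{-1}(a, b) = [1 : a : b]\), we get \(\varphi_1 \circ \varphi_0^{-1}(a, b) = (1/a,\ b/a)\). A NumPy sketch (illustration only) checks this, including independence of the representative:

```python
# Transition map phi_1 o phi_0^{-1} on RP^2, checked numerically (sketch).
import numpy as np

def phi(i, x):
    """Chart phi_i on RP^2: divide by x[i] and omit the i-th entry."""
    y = x / x[i]
    return np.delete(y, i)

def phi0_inv(a):
    """phi_0^{-1}(a, b) = [1 : a : b]; any representative will do."""
    return np.array([1.0, a[0], a[1]])

a = np.array([2.0, -0.5])
x = phi0_inv(a)

# phi_1([1 : a : b]) = (1/a, b/a), a rational map where a != 0
out = phi(1, x)
assert np.allclose(out, [1 / a[0], a[1] / a[0]])

# Homogeneous coordinates: rescaling the representative changes nothing
assert np.allclose(phi(1, 7.3 * x), out)
```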
If \(M\) and \(N\) are smooth manifolds of dimensions \(m\) and \(n\), with smooth atlases \(\{(U_\alpha, \varphi_\alpha)\}\) and \(\{(V_\beta, \psi_\beta)\}\) respectively, then \(\{(U_\alpha \times V_\beta,\ \varphi_\alpha \times \psi_\beta)\}\) is a smooth atlas for \(M \times N\), where \((\varphi_\alpha \times \psi_\beta)(p, q) = (\varphi_\alpha(p), \psi_\beta(q))\). The transition maps are products of the individual transition maps and hence smooth, so \(M \times N\) is a smooth \((m+n)\)-manifold.
The \(n\)-torus is the \(n\)-fold product \(T^n = S^1 \times \cdots \times S^1\). It is a compact smooth \(n\)-manifold. The \(2\)-torus \(T^2 = S^1 \times S^1\) is the familiar “doughnut” surface that can be embedded in \(\mathbb{R}^3\).
The next class of examples is among the most important in all of mathematics and physics.
A Lie group is a smooth manifold \(G\) that is simultaneously a group, for which the multiplication map \(G \times G \to G\) and the inversion map \(G \to G\) are smooth. Important examples include:
- \(\mathrm{GL}(n, \mathbb{R})\): the general linear group of invertible \(n \times n\) real matrices. This is an open subset of \(\mathbb{R}^{n^2}\) (where the determinant is nonzero), hence a smooth manifold of dimension \(n^2\).
- \(\mathrm{GL}(n, \mathbb{C})\): the general linear group of invertible \(n \times n\) complex matrices, a smooth manifold of dimension \(2n^2\).
- \(\mathrm{O}(n)\): the orthogonal group \(\{A \in \mathrm{GL}(n, \mathbb{R}) : A^T A = I\}\), a smooth manifold of dimension \(n(n-1)/2\).
- \(\mathrm{SO}(n) = \{A \in \mathrm{O}(n) : \det A = 1\}\): the special orthogonal group, the connected component of the identity in \(\mathrm{O}(n)\).
- \(\mathrm{U}(n)\): the unitary group \(\{A \in \mathrm{GL}(n, \mathbb{C}) : A^* A = I\}\), dimension \(n^2\).
- \(\mathrm{SU}(n) = \{A \in \mathrm{U}(n) : \det A = 1\}\): the special unitary group, dimension \(n^2 - 1\).
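The dimension counts above can be sanity-checked numerically. The following NumPy sketch (an illustration, not part of the notes) verifies that the matrix exponential of a skew-symmetric matrix lies in \(\mathrm{O}(n)\); the space of skew \(n \times n\) matrices has dimension \(n(n-1)/2\), matching \(\dim \mathrm{O}(n)\).

```python
# Sketch: curves through I in O(n) have skew-symmetric velocity, and
# exp(skew) is orthogonal; skew matrices form an n(n-1)/2-dim space.
import numpy as np

def expm(A, terms=30):
    """Matrix exponential via its power series (enough terms for small A)."""
    out, term = np.eye(A.shape[0]), np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

n = 4
rng = np.random.default_rng(0)
B = rng.standard_normal((n, n))
S = B - B.T                      # a skew-symmetric matrix: S^T = -S

Q = expm(0.1 * S)
assert np.allclose(Q.T @ Q, np.eye(n))   # exp(tS) lies in O(n)

# Skew matrices are determined by their strictly upper-triangular entries
assert n * (n - 1) // 2 == 6
```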
1.5 Exotic Smooth Structures
A natural question arises: can a given topological manifold admit more than one smooth structure? The answer, surprisingly, is yes, and this phenomenon has deep implications. Milnor (1956) constructed exotic \(7\)-spheres: smooth manifolds homeomorphic to \(S^7\) but not diffeomorphic to it. Even more strikingly, \(\mathbb{R}^4\) admits uncountably many pairwise non-diffeomorphic smooth structures, whereas \(\mathbb{R}^n\) for \(n \neq 4\) admits exactly one up to diffeomorphism.
1.6 Manifolds with Boundary
For many applications — particularly in integration theory and Stokes’ theorem — it is necessary to consider manifolds that have edges. The model for such spaces is the upper half-space.
The upper half-space is
\[ \mathbb{H}^n = \{x = (x^1, \ldots, x^n) \in \mathbb{R}^n : x^n \geq 0\}. \]
Its boundary is \(\partial \mathbb{H}^n = \{x \in \mathbb{R}^n : x^n = 0\} \cong \mathbb{R}^{n-1}\), and its interior is \(\mathrm{Int}(\mathbb{H}^n) = \{x \in \mathbb{R}^n : x^n > 0\}\). A topological \(n\)-manifold with boundary is a Hausdorff, second countable space \(M\) in which every point has a neighbourhood homeomorphic to an open subset of \(\mathbb{H}^n\).
A smooth structure on \(M\) is defined exactly as before, using charts taking values in open subsets of \(\mathbb{H}^n\) and requiring smooth compatibility. A manifold with boundary together with a smooth structure is a smooth manifold with boundary.
Chapter 2: Smooth Maps and Partitions of Unity
2.1 Smooth Functions on Manifolds
Having established the notion of a smooth manifold, we now develop the theory of smooth maps between manifolds. The key idea is simple: a map between manifolds is smooth if and only if its local coordinate representations (obtained via charts) are smooth maps between open subsets of Euclidean space. This is precisely why we needed the transition maps to be smooth — it ensures that the notion of smoothness does not depend on the choice of charts.
A function \(f \colon M \to \mathbb{R}\) is smooth if for every \(p \in M\) there is a smooth chart \((U, \varphi)\) with \(p \in U\) such that
\[ f \circ \varphi^{-1} \colon \varphi(U) \to \mathbb{R} \]
is a smooth function on the open set \(\varphi(U) \subseteq \mathbb{R}^n\). The set of all smooth functions on \(M\) is denoted \(C^\infty(M)\); it is a commutative \(\mathbb{R}\)-algebra under pointwise addition and multiplication.
The function \(f \circ \varphi^{-1}\) is called the coordinate representation of \(f\) in the chart \((U, \varphi)\). We write \(\hat{f} = f \circ \varphi^{-1}\) when the chart is understood. Note that if the condition holds for one chart at \(p\), it holds for every chart at \(p\), because for any other chart \((V, \psi)\) with \(p \in V\), we have
\[ f \circ \psi^{-1} = (f \circ \varphi^{-1}) \circ (\varphi \circ \psi^{-1}), \]and this is a composition of smooth maps (since \(\varphi \circ \psi^{-1}\) is a smooth transition map).
The algebra \(C^\infty(M)\) is a fundamental invariant of the smooth manifold \(M\). In fact, two smooth manifolds \(M\) and \(N\) are diffeomorphic if and only if \(C^\infty(M)\) and \(C^\infty(N)\) are isomorphic as \(\mathbb{R}\)-algebras (this is a consequence of Milnor’s exercise; see Lee, Problem 2-2).
2.2 Smooth Maps Between Manifolds
We now extend the notion of smoothness from functions to maps between manifolds.
A map \(F \colon M \to N\) between smooth manifolds is smooth if for every \(p \in M\) there exist smooth charts \((U, \varphi)\) around \(p\) and \((V, \psi)\) around \(F(p)\) with \(F(U) \subseteq V\) such that the coordinate representation
\[ \psi \circ F \circ \varphi^{-1} \colon \varphi(U) \to \psi(V) \]
is a smooth map between open subsets of Euclidean spaces.
Again, the smoothness of transition maps ensures that if this condition holds for one pair of charts, it holds for every pair. The set of smooth maps from \(M\) to \(N\) is denoted \(C^\infty(M, N)\).
Compositions of smooth maps are smooth: if \(F \colon M \to N\) and \(G \colon N \to P\) are smooth, then choosing charts \((U, \varphi)\), \((V, \eta)\), \((W, \psi)\) with \(F(U) \subseteq V\) and \(G(V) \subseteq W\), the coordinate representation
\[ \psi \circ (G \circ F) \circ \varphi^{-1} = (\psi \circ G \circ \eta^{-1}) \circ (\eta \circ F \circ \varphi^{-1}) \]
is a composition of smooth maps between open subsets of Euclidean spaces, hence smooth. The identity map is smooth because its coordinate representation in any chart is the identity. \(\blacksquare\)
Thus smooth manifolds and smooth maps form a category, the smooth category, often denoted \(\mathbf{Man}^\infty\) or \(\mathbf{Diff}\). The isomorphisms in this category are the diffeomorphisms: smooth bijections \(F \colon M \to N\) whose inverses are also smooth.
Diffeomorphism is the natural notion of equivalence in the smooth category. The central question of differential topology is to classify smooth manifolds up to diffeomorphism.
2.3 Bump Functions and Cutoff Functions
Before discussing partitions of unity, we need to establish the existence of smooth functions with special localization properties. These are possible because, unlike analytic functions, smooth functions can have compact support.
The support of a function \(f \colon M \to \mathbb{R}\) is
\[ \mathrm{supp}(f) = \overline{\{p \in M : f(p) \neq 0\}}, \]
the closure of the set where \(f\) is nonzero.
The fundamental building block is the following function on \(\mathbb{R}\).
The function \(f \colon \mathbb{R} \to \mathbb{R}\) defined by
\[ f(t) = \begin{cases} e^{-1/t}, & t > 0, \\ 0, & t \leq 0, \end{cases} \]
is smooth on all of \(\mathbb{R}\): every derivative of \(e^{-1/t}\) on \((0, \infty)\) is a polynomial in \(1/t\) times \(e^{-1/t}\), and all of these tend to \(0\) as \(t \to 0^+\).
From this, we can construct smooth functions with prescribed support.
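One common construction (a sketch, under the assumption that \(f\) is the function above) is the smooth cutoff \(h(t) = f(t)/\big(f(t) + f(1-t)\big)\), which equals \(0\) for \(t \leq 0\) and \(1\) for \(t \geq 1\):

```python
# Smooth cutoff built from f(t) = exp(-1/t) (t > 0), 0 otherwise (sketch).
import math

def f(t):
    return math.exp(-1.0 / t) if t > 0 else 0.0

def h(t):
    """Smooth cutoff: h = 0 on (-inf, 0], h = 1 on [1, inf)."""
    return f(t) / (f(t) + f(1.0 - t))

assert h(-2.0) == 0.0 and h(0.0) == 0.0
assert h(1.0) == 1.0 and h(2.0) == 1.0
assert 0.0 < h(0.5) < 1.0          # strictly between 0 and 1 on (0, 1)
assert abs(h(0.5) - 0.5) < 1e-12   # symmetric about t = 1/2
```

The denominator never vanishes because \(f(t)\) and \(f(1-t)\) are never simultaneously zero.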
Bump functions and cutoff functions are essential tools for transferring local constructions to global ones. They will be used repeatedly throughout the course, most immediately in the construction of partitions of unity.
2.4 Partitions of Unity
Partitions of unity are the primary technical tool for passing from local to global constructions on manifolds. Many objects — Riemannian metrics, differential forms, connections — are easy to define locally in coordinates, but assembling them into a globally defined, smooth object requires a way to “glue” using smooth weights. This is precisely what partitions of unity provide.
- \(\mathrm{supp}(\psi_\alpha) \subseteq U_\alpha\) for each \(\alpha\).
- The collection \(\{\mathrm{supp}(\psi_\alpha)\}\) is locally finite: every point of \(M\) has a neighbourhood that intersects only finitely many of the supports.
- \(\sum_{\alpha \in A} \psi_\alpha(p) = 1\) for every \(p \in M\). (This sum is well-defined because of local finiteness.)
The existence of partitions of unity on smooth manifolds is one of the most important facts in the subject, and it is here that the Hausdorff and second countable conditions in the definition of a manifold earn their keep.
Theorem (Existence of partitions of unity). Let \(M\) be a smooth manifold and let \(\{U_\alpha\}_{\alpha \in A}\) be any open cover of \(M\). Then there exists a smooth partition of unity subordinate to \(\{U_\alpha\}\). The proof proceeds in three steps.
Step 1: Paracompactness. Since \(M\) is second countable and locally compact Hausdorff, it is paracompact: every open cover has a locally finite refinement. In fact, \(M\) admits a countable, locally finite refinement by precompact open sets (sets whose closures are compact).
Step 2: Subordinate bump functions. For each set in the locally finite refinement, we use bump functions (Proposition 2.8) to construct a smooth function supported in that set. More precisely, for each element \(V_i\) of the refinement (with \(V_i \subseteq U_{\alpha(i)}\) for some \(\alpha(i)\)), we can find a smooth function \(\rho_i \geq 0\) with \(\mathrm{supp}(\rho_i) \subseteq V_i\) and \(\rho_i > 0\) on some neighbourhood of a carefully chosen compact subset.
Step 3: Normalization. The function \(\rho = \sum_i \rho_i\) is well-defined and smooth (by local finiteness) and everywhere positive (since every point of \(M\) is in the support of some \(\rho_i\)). Setting \(\psi_\alpha = \sum_{i : \alpha(i) = \alpha} \rho_i / \rho\) gives the desired partition of unity. \(\blacksquare\)
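The three steps can be carried out explicitly on \(\mathbb{R}\). The following Python sketch (an illustration, not from the notes) builds a smooth partition of unity subordinate to the cover \(U_k = (k-1, k+1)\), \(k \in \mathbb{Z}\), using bump functions and the normalization of Step 3:

```python
# Smooth partition of unity on R subordinate to U_k = (k-1, k+1) (sketch).
import math

def bump(t):
    """Smooth bump supported in (-1, 1), positive at 0."""
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def psi(k, t, K=range(-5, 6)):
    """Normalized bump centred at k: the rho_i / rho of Step 3."""
    rho = sum(bump(t - j) for j in K)   # locally finite: <= 2 nonzero terms
    return bump(t - k) / rho

t = 0.37
total = sum(psi(k, t) for k in range(-5, 6))
assert abs(total - 1.0) < 1e-12                    # the psi_k sum to 1
assert psi(3, t) == 0.0                            # supp(psi_3) inside (2, 4)
assert all(psi(k, t) >= 0.0 for k in range(-5, 6))
```

Here the index set is truncated to \(k \in \{-5, \ldots, 5\}\) so the check stays finite; local finiteness means only the two bumps nearest \(t\) contribute.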
2.5 Applications of Partitions of Unity
Partitions of unity have numerous applications. We mention several of the most important.
A first application: partitions of unity yield the existence of Riemannian metrics. Choose an inner product on each chart domain (for instance by pulling back the Euclidean metric) and glue the local choices with the weights \(\psi_\alpha\); the result is a globally defined smooth Riemannian metric. Another important application is the embedding of manifolds into Euclidean space.
The Whitney embedding theorem states that every smooth \(n\)-manifold admits a smooth embedding into \(\mathbb{R}^{2n+1}\) (and, with considerably more work, into \(\mathbb{R}^{2n}\)). The proof of the full theorem is beyond our scope, but partitions of unity are an essential ingredient.
Chapter 3: Tangent Vectors and the Tangent Bundle
3.1 Motivation: What Is a Tangent Vector?
If \(M\) is a smooth submanifold of \(\mathbb{R}^N\), we have an intuitive picture of tangent vectors: they are velocity vectors of curves lying in \(M\), or equivalently, arrows in the ambient space that are “tangent” to \(M\) at a point. However, for abstract smooth manifolds not given as subsets of any Euclidean space, we need an intrinsic definition that does not rely on an ambient space. There are several equivalent approaches:
- Geometric: tangent vectors as equivalence classes of curves.
- Algebraic: tangent vectors as derivations on germs of smooth functions.
- Physicist’s: tangent vectors as gadgets that transform by the Jacobian matrix under coordinate changes.
We follow Lee’s approach and adopt the algebraic definition, then show it is equivalent to the others. The key insight is that a tangent vector should be determined by how it acts on smooth functions — that is, by the directional derivatives it computes.
Why adopt the algebraic definition? On a submanifold of \(\mathbb{R}^N\) we can identify a tangent vector at \(p\) with an element of \(\mathbb{R}^N\) — but an abstract manifold comes with no ambient space to house such arrows. The geometric definition (equivalence classes of curves) is appealingly concrete, but working with equivalence classes is cumbersome. The derivation definition sidesteps both difficulties: it is purely intrinsic, works on any smooth manifold, and immediately gives the tangent space the structure of a vector space in a clean way. The price is a moment of abstraction; the reward is a definition that generalises effortlessly to the tangent bundle, to pushforwards, and eventually to the entire tensor calculus.
To see why derivations capture the right concept, think about what a tangent vector \(v\) at \(p \in \mathbb{R}^n\) actually does to a smooth function \(f\): it computes the directional derivative \(D_v f(p) = \sum_i v^i \frac{\partial f}{\partial x^i}(p)\). This operation is \(\mathbb{R}\)-linear in \(f\) and satisfies the Leibniz rule \(D_v(fg)(p) = f(p)D_v g + g(p) D_v f\). Crucially, this Leibniz rule — and nothing else — distinguishes first-order differential operators (directional derivatives) from higher-order ones (Laplacian, etc.). On an abstract manifold, the Leibniz rule is exactly the right axiom to impose on a “tangent vector acting on functions,” because it forces the operator to be local (its value at \(p\) depends only on the germ of \(f\) at \(p\)) and to compute a genuine first-order directional derivative in any coordinate system.
3.2 Derivations and the Tangent Space
A derivation at \(p\) is a linear map \(v \colon C^\infty(M) \to \mathbb{R}\) satisfying the Leibniz rule
\[ v(fg) = f(p)\, v(g) + g(p)\, v(f) \]
for all \(f, g \in C^\infty(M)\). The tangent space to \(M\) at \(p\), denoted \(T_pM\), is the set of all derivations at \(p\); it is a real vector space under the pointwise operations \((v + w)(f) = v(f) + w(f)\) and \((cv)(f) = c\, v(f)\).
Elements of \(T_pM\) are called tangent vectors at \(p\).
Before computing the dimension of the tangent space, we establish some useful consequences of the Leibniz rule.
Let \(v \in T_pM\) and \(f, g \in C^\infty(M)\). Then:
- If \(c \in \mathbb{R}\) is a constant function, then \(v(c) = 0\).
- If \(f(p) = g(p) = 0\), then \(v(fg) = 0\).
- If \(f \equiv g\) on some neighbourhood of \(p\), then \(v(f) = v(g)\).
(1) By linearity it suffices to show \(v(1) = 0\), where \(1\) denotes the constant function. The Leibniz rule gives \(v(1) = v(1 \cdot 1) = 1 \cdot v(1) + 1 \cdot v(1) = 2v(1)\), so \(v(1) = 0\).
(2) This follows directly from the Leibniz rule: \(v(fg) = f(p)\, v(g) + g(p)\, v(f) = 0\).
(3) We need bump functions for this. Let \(\psi\) be a bump function with \(\psi \equiv 1\) near \(p\) and \(\mathrm{supp}(\psi)\) contained in the neighbourhood where \(f = g\). Then \(\psi f = \psi g\) everywhere on \(M\), so \(v(\psi f) = v(\psi g)\). But \(v(\psi f) = \psi(p) v(f) + f(p) v(\psi) = v(f) + f(p) v(\psi)\), and similarly \(v(\psi g) = v(g) + g(p) v(\psi)\). Since \(f(p) = g(p)\), we conclude \(v(f) = v(g)\). \(\blacksquare\)
Property (3) shows that derivations are local: the value of \(v(f)\) depends only on the germ of \(f\) at \(p\). This is what one expects of a “directional derivative.”
3.3 Coordinate Bases and the Dimension of \(T_pM\)
Let \((U, \varphi)\) be a smooth chart around \(p\) with local coordinates \(x^1, \ldots, x^n\). For each \(i = 1, \ldots, n\), we define a derivation at \(p\) by
\[ \left.\frac{\partial}{\partial x^i}\right|_p (f) = \frac{\partial (f \circ \varphi^{-1})}{\partial r^i}\bigg|_{\varphi(p)}, \]where \(r^1, \ldots, r^n\) are the standard coordinates on \(\mathbb{R}^n\). In other words, \(\frac{\partial}{\partial x^i}\big|_p\) computes the \(i\)-th partial derivative of the coordinate representation of \(f\). One checks directly that this satisfies the Leibniz rule and is thus an element of \(T_pM\).
The coordinate derivations
\[ \frac{\partial}{\partial x^1}\bigg|_p, \ldots, \frac{\partial}{\partial x^n}\bigg|_p \]
form a basis for \(T_pM\). In particular, \(\dim T_pM = n = \dim M\).
We may assume the chart is centred at \(p\) (replacing \(\varphi\) by \(\varphi - \varphi(p)\)), so that \(\varphi(p) = 0\). Linear independence. We have \(\frac{\partial}{\partial x^i}\big|_p(x^j) = \frac{\partial r^j}{\partial r^i}\big|_0 = \delta^j_i\). If \(\sum_i a^i \frac{\partial}{\partial x^i}\big|_p = 0\), then applying this derivation to \(x^j\) gives \(a^j = 0\) for each \(j\).
Spanning. Let \(v \in T_pM\) be any derivation. Set \(v^i = v(x^i)\). We claim that \(v = \sum_i v^i \frac{\partial}{\partial x^i}\big|_p\).
By Taylor’s theorem with remainder in \(\mathbb{R}^n\), for any \(\hat{f} \in C^\infty\) near \(0\) we can write
\[ \hat{f}(x) = \hat{f}(0) + \sum_{i=1}^n x^i \, g_i(x), \]where \(g_i\) are smooth functions satisfying \(g_i(0) = \frac{\partial \hat{f}}{\partial r^i}(0)\). (One obtains this from \(\hat{f}(x) - \hat{f}(0) = \int_0^1 \frac{d}{dt}\hat{f}(tx)\,dt = \sum_i x^i \int_0^1 \frac{\partial \hat{f}}{\partial r^i}(tx)\,dt\).)
Pulling back to \(M\): \(f = f(p) + \sum_i x^i \cdot (g_i \circ \varphi)\) near \(p\). Applying \(v\):
\[ v(f) = v(f(p)) + \sum_i v(x^i \cdot (g_i \circ \varphi)) = 0 + \sum_i \left[x^i(p)\, v(g_i \circ \varphi) + (g_i \circ \varphi)(p)\, v(x^i)\right]. \]Since \(\varphi(p) = 0\), we have \(x^i(p) = 0\), and \((g_i \circ \varphi)(p) = g_i(0) = \frac{\partial \hat{f}}{\partial r^i}(0) = \frac{\partial}{\partial x^i}\big|_p(f)\). Therefore
\[ v(f) = \sum_i v^i \frac{\partial}{\partial x^i}\bigg|_p(f). \]Since this holds for every \(f\), we have \(v = \sum_i v^i \frac{\partial}{\partial x^i}\big|_p\). \(\blacksquare\)
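The integral formula for the \(g_i\) in the proof can be checked numerically for a concrete function. In this Python sketch (an illustration; the function \(f\) is chosen arbitrarily), \(g_i(x) = \int_0^1 \frac{\partial f}{\partial r^i}(tx)\,dt\) is computed by the trapezoidal rule:

```python
# Check of Hadamard's decomposition f(x) - f(0) = sum_i x^i g_i(x) (sketch).
import numpy as np

def f(x):
    return np.sin(x[0]) * np.exp(x[1])

def grad_f(x):
    return np.array([np.cos(x[0]) * np.exp(x[1]),
                     np.sin(x[0]) * np.exp(x[1])])

def g(x, n=4000):
    """g_i(x) = integral_0^1 (df/dr^i)(t x) dt, by the trapezoidal rule."""
    ts = np.linspace(0.0, 1.0, n + 1)
    vals = np.array([grad_f(t * x) for t in ts])   # shape (n + 1, 2)
    return (vals[0] / 2 + vals[1:-1].sum(axis=0) + vals[-1] / 2) / n

x = np.array([0.7, -0.4])
# Hadamard's decomposition holds at x ...
assert abs(f(x) - f(np.zeros(2)) - x @ g(x)) < 1e-6
# ... and g_i(0) recovers the partial derivatives of f at 0
assert np.allclose(g(np.zeros(2)), grad_f(np.zeros(2)))
```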
3.4 The Pushforward (Differential)
One of the most important operations in differential geometry is the pushforward (or differential) of a smooth map. Given a smooth map \(F \colon M \to N\), we want to associate to it a linear map on tangent spaces that captures the “linear approximation” of \(F\) at each point. In \(\mathbb{R}^n\), this role is played by the Jacobian matrix; the pushforward generalizes this to manifolds.
The pushforward of \(F\) at \(p\) is the linear map
\[ F_{*,p} \colon T_pM \to T_{F(p)}N \]
(also written \(dF_p\), \(DF_p\), or \((F_*)_p\)) defined by
\[ (F_{*,p}(v))(g) = v(g \circ F) \]for all \(v \in T_pM\) and \(g \in C^\infty(N)\).
We must verify that \(F_{*,p}(v)\) is indeed a derivation at \(F(p)\). Linearity is clear: \(F_{*,p}(v)(g_1 + g_2) = v((g_1+g_2) \circ F) = v(g_1 \circ F) + v(g_2 \circ F)\). For the Leibniz rule:
\[ F_{*,p}(v)(g_1 g_2) = v((g_1 g_2) \circ F) = v((g_1 \circ F)(g_2 \circ F)), \]and since \(v\) is a derivation at \(p\),
\[ = (g_1 \circ F)(p) \, v(g_2 \circ F) + (g_2 \circ F)(p) \, v(g_1 \circ F) = g_1(F(p)) \, F_{*,p}(v)(g_2) + g_2(F(p)) \, F_{*,p}(v)(g_1). \]So \(F_{*,p}(v) \in T_{F(p)}N\).
3.5 Pushforward in Local Coordinates
The power of the pushforward becomes concrete when we compute it in coordinates. The result is exactly the Jacobian matrix, confirming that the pushforward is the correct generalization of the total derivative.
Before stating the proposition, let us make the connection vivid with a simple example. Consider \(F \colon \mathbb{R}^2 \to \mathbb{R}^2\) defined by \(F(x,y) = (x^2 - y^2, 2xy)\) (which is complex squaring under the identification \(\mathbb{R}^2 \cong \mathbb{C}\)). In the standard coordinates, the pushforward at a point \((x,y)\) is represented by the Jacobian:
\[ [F_{*,(x,y)}] = \begin{pmatrix} 2x & -2y \\ 2y & 2x \end{pmatrix}. \]The basis vector \(\frac{\partial}{\partial x}\big|_{(x,y)}\) maps to \(2x\frac{\partial}{\partial u}\big|_{F(x,y)} + 2y\frac{\partial}{\partial v}\big|_{F(x,y)}\), and \(\frac{\partial}{\partial y}\big|_{(x,y)}\) maps to \(-2y\frac{\partial}{\partial u}\big|_{F(x,y)} + 2x\frac{\partial}{\partial v}\big|_{F(x,y)}\). This is the linear approximation to \(F\) at \((x,y)\) — the best linear map from \(T_{(x,y)}\mathbb{R}^2\) to \(T_{F(x,y)}\mathbb{R}^2\) approximating \(F\) near that point.
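The Jacobian claim is easy to corroborate with central differences. The following NumPy sketch (illustration only) compares the finite-difference derivative of \(F\) with the matrix above:

```python
# Finite-difference check of the Jacobian of F(x, y) = (x^2 - y^2, 2xy).
import numpy as np

def F(p):
    x, y = p
    return np.array([x * x - y * y, 2 * x * y])

def jacobian(p):
    x, y = p
    return np.array([[2 * x, -2 * y],
                     [2 * y,  2 * x]])

p = np.array([1.5, -0.8])
h = 1e-6
num = np.column_stack([(F(p + h * e) - F(p - h * e)) / (2 * h)
                       for e in np.eye(2)])   # central differences
assert np.allclose(num, jacobian(p), atol=1e-6)

# The pushforward of d/dx|_p has components given by the first column
v = jacobian(p) @ np.array([1.0, 0.0])
assert np.allclose(v, [2 * p[0], 2 * p[1]])
```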
Let \((U, \varphi) = (U, x^1, \ldots, x^n)\) be a smooth chart around \(p\) and \((V, \psi) = (V, y^1, \ldots, y^n)\) a smooth chart around \(F(p)\), and write \(\hat{F} = \psi \circ F \circ \varphi^{-1}\) for the coordinate representation of \(F\), with component functions \(\hat{F}^j\). (For notational simplicity we take \(\dim N = n\); the general case is identical.) Then
\[ F_{*,p}\left(\frac{\partial}{\partial x^i}\bigg|_p\right) = \sum_{j=1}^n \frac{\partial \hat{F}^j}{\partial x^i}(\varphi(p)) \, \frac{\partial}{\partial y^j}\bigg|_{F(p)}. \]In other words, the matrix of \(F_{*,p}\) with respect to the coordinate bases is the Jacobian matrix of \(\hat{F}\) at \(\varphi(p)\).
To prove this, apply both sides to a test function \(g \in C^\infty(N)\); by definition, the left-hand side applied to \(g\) equals \(\frac{\partial (g \circ F \circ \varphi^{-1})}{\partial r^i}\big|_{\varphi(p)}\). Now \(g \circ F \circ \varphi^{-1} = (g \circ \psi^{-1}) \circ (\psi \circ F \circ \varphi^{-1}) = \hat{g} \circ \hat{F}\). By the chain rule in \(\mathbb{R}^n\):
\[ \frac{\partial (\hat{g} \circ \hat{F})}{\partial r^i}\bigg|_{\varphi(p)} = \sum_{j=1}^n \frac{\partial \hat{g}}{\partial r^j}\bigg|_{\hat{F}(\varphi(p))} \cdot \frac{\partial \hat{F}^j}{\partial r^i}\bigg|_{\varphi(p)} = \sum_j \frac{\partial \hat{F}^j}{\partial x^i}(\varphi(p)) \cdot \frac{\partial}{\partial y^j}\bigg|_{F(p)}(g). \]Since this holds for all \(g\), the result follows. \(\blacksquare\)
3.6 The Chain Rule and Functoriality
The chain rule from multivariable calculus generalizes beautifully to manifolds.
If \(F \colon M \to N\) and \(G \colon N \to P\) are smooth maps, then for every \(p \in M\),
\[ (G \circ F)_{*,p} = G_{*,F(p)} \circ F_{*,p}. \]
Moreover, \((\mathrm{id}_M)_{*,p} = \mathrm{id}_{T_pM}\).
For \(v \in T_pM\) and \(h \in C^\infty(P)\), we compute \(((G \circ F)_{*,p}\, v)(h) = v(h \circ G \circ F) = (F_{*,p}\, v)(h \circ G) = (G_{*,F(p)}\, F_{*,p}\, v)(h)\). The identity statement is immediate: \((\mathrm{id}_M)_{*,p}(v)(f) = v(f \circ \mathrm{id}_M) = v(f)\). \(\blacksquare\)
As an example, consider the inclusion \(\iota \colon S^n \hookrightarrow \mathbb{R}^{n+1}\). Working in the stereographic chart \((U_N, \sigma_N)\) around a point \(p \neq N\), with coordinates \(u = (u^1, \ldots, u^n)\), the inclusion in these coordinates is the map \(\hat{\iota} = \iota \circ \sigma_N^{-1} \colon \mathbb{R}^n \to \mathbb{R}^{n+1}\),
\[ \hat{\iota}(u) = \sigma_N^{-1}(u) = \frac{1}{|u|^2+1}(2u^1, \ldots, 2u^n, |u|^2 - 1). \]The Jacobian of \(\hat{\iota}\) at \(u\) is an \((n+1) \times n\) matrix with full rank \(n\) (since \(\hat{\iota}\) is an immersion). Each column \(\frac{\partial \hat{\iota}}{\partial u^k}\) is a tangent vector to \(S^n\) in the ambient \(\mathbb{R}^{n+1}\).
More conceptually: by the regular level set theorem (Example 6.20), \(T_p S^n = p^\perp\), the orthogonal complement of \(p\) in \(\mathbb{R}^{n+1}\). The pushforward \(\iota_{*,p}\) is simply the inclusion of this subspace into \(\mathbb{R}^{n+1}\). Concretely, if \(v \in T_pS^n\) is the velocity at \(p\) of a curve \(\gamma\) on \(S^n\) with \(\gamma(0) = p\), then \(\iota_{*,p}(v) = \gamma'(0)\) regarded as a vector in \(\mathbb{R}^{n+1}\). The condition \(\gamma(t) \in S^n\) forces \(\gamma(t) \cdot \gamma(t) = 1\); differentiating at \(t = 0\) gives \(2\gamma(0) \cdot \gamma'(0) = 0\), so \(\gamma'(0) \perp p\). Thus \(\iota_{*,p}(T_pS^n) = p^\perp \subset \mathbb{R}^{n+1}\), confirming that the differential of the inclusion is injective and identifies \(T_pS^n\) with the hyperplane perpendicular to the position vector.
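This picture can be verified numerically. In the NumPy sketch below (illustration only; \(n = 2\), so \(S^2 \subset \mathbb{R}^3\)), the columns of the Jacobian of \(\sigma_N^{-1}\) are computed by central differences; each is orthogonal to the position vector \(p\), and together they have full rank:

```python
# Columns of the Jacobian of sigma_N^{-1} span T_p S^2 = p^perp (sketch).
import numpy as np

def sigma_N_inv(u):
    s = np.dot(u, u)
    return np.append(2 * u, s - 1) / (s + 1)

u = np.array([0.9, -1.7])
p = sigma_N_inv(u)
h = 1e-6
cols = [(sigma_N_inv(u + h * e) - sigma_N_inv(u - h * e)) / (2 * h)
        for e in np.eye(2)]   # Jacobian columns: tangent vectors at p

for c in cols:
    assert abs(np.dot(p, c)) < 1e-6      # each column is perpendicular to p
assert np.linalg.matrix_rank(np.array(cols)) == 2   # full rank: an immersion
```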
In categorical language, the tangent space construction is a functor from the category of pointed smooth manifolds to the category of real vector spaces: it sends \((M, p)\) to \(T_pM\) and a smooth map \(F\) to the linear map \(F_{*,p}\), respecting composition and identities.
3.7 Change of Coordinates for Tangent Vectors
The coordinate basis vectors \(\frac{\partial}{\partial x^i}\big|_p\) depend on the choice of chart. When we change from one chart to another, the basis vectors transform by the Jacobian of the transition map. This gives the classical “transformation law” for tangent vectors.
If \((U, \varphi) = (U, x^1, \ldots, x^n)\) and \((V, \psi) = (V, y^1, \ldots, y^n)\) are two smooth charts around \(p\), then
\[ \frac{\partial}{\partial x^i}\bigg|_p = \sum_{j=1}^n \frac{\partial y^j}{\partial x^i}(p) \, \frac{\partial}{\partial y^j}\bigg|_p, \]
where \(\frac{\partial y^j}{\partial x^i}(p)\) denotes the \((j,i)\)-entry of the Jacobian matrix of the transition map \(\psi \circ \varphi^{-1}\) evaluated at \(\varphi(p)\).
Consequently, if \(v = \sum_i v^i \frac{\partial}{\partial x^i}\big|_p = \sum_j w^j \frac{\partial}{\partial y^j}\big|_p\), then
\[ w^j = \sum_i \frac{\partial y^j}{\partial x^i}(p) \, v^i. \]
3.8 Tangent Vectors as Velocity Vectors of Curves
We now connect the algebraic definition of tangent vectors to the geometric picture of velocity vectors.
Let \(J \subseteq \mathbb{R}\) be an interval and \(\gamma \colon J \to M\) a smooth curve. The velocity vector of \(\gamma\) at \(t_0 \in J\) is the tangent vector \(\gamma'(t_0) = \gamma_{*,t_0}\big(\frac{d}{dt}\big|_{t_0}\big) \in T_{\gamma(t_0)}M\). Explicitly, \(\gamma'(t_0)\) acts on a smooth function \(f\) by
\[ \gamma'(t_0)(f) = \frac{d}{dt}\bigg|_{t_0}(f \circ \gamma) = (f \circ \gamma)'(t_0). \]In local coordinates \((U, x^1, \ldots, x^n)\) around \(\gamma(t_0)\), if we write \(\gamma(t) = (\gamma^1(t), \ldots, \gamma^n(t))\) (meaning \(x^i(\gamma(t)) = \gamma^i(t)\)), then
\[ \gamma'(t_0) = \sum_{i=1}^n \dot{\gamma}^i(t_0) \, \frac{\partial}{\partial x^i}\bigg|_{\gamma(t_0)}, \]where \(\dot{\gamma}^i(t_0) = \frac{d\gamma^i}{dt}(t_0)\). This is exactly what one would expect from the chain rule.
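The identity \(\gamma'(t_0)(f) = (f \circ \gamma)'(t_0) = \sum_i \dot{\gamma}^i(t_0)\, \frac{\partial f}{\partial x^i}\) can be checked numerically for a concrete curve and function on \(\mathbb{R}^2\) (a sketch, with the curve and \(f\) chosen arbitrarily):

```python
# Velocity vectors: gamma'(t0)(f) = (f o gamma)'(t0) = sum_i gdot^i df/dx^i.
import numpy as np

def gamma(t):
    return np.array([np.cos(t), np.sin(2 * t)])   # a curve in R^2

def f(p):
    return p[0] ** 2 + np.exp(p[1])

t0, h = 0.6, 1e-6
# Left side: derivative of f o gamma at t0, by central differences
lhs = (f(gamma(t0 + h)) - f(gamma(t0 - h))) / (2 * h)

# Right side: sum_i gammadot^i(t0) * (df/dx^i)(gamma(t0))
p = gamma(t0)
gdot = np.array([-np.sin(t0), 2 * np.cos(2 * t0)])   # velocity components
grad = np.array([2 * p[0], np.exp(p[1])])            # gradient of f at p
rhs = gdot @ grad
assert abs(lhs - rhs) < 1e-6
```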
The next result shows that the geometric and algebraic definitions of tangent vectors are equivalent: every tangent vector \(v = \sum_i v^i \frac{\partial}{\partial x^i}\big|_p \in T_pM\) is the velocity vector of some smooth curve through \(p\). Indeed, in a chart centred at \(p\), the coordinate line \(\gamma(t) = \varphi^{-1}\big(t\,(v^1, \ldots, v^n)\big)\) (defined for \(t\) near \(0\)) satisfies \(\gamma(0) = p\) and \(\gamma'(0) = v\).
3.9 The Tangent Bundle
Having constructed the tangent space \(T_pM\) at each point, we now assemble all these vector spaces into a single smooth manifold: the tangent bundle.
As a set, the tangent bundle of \(M\) is the disjoint union
\[ TM = \bigsqcup_{p \in M} T_pM = \{(p, v) : p \in M,\ v \in T_pM\}. \]
The natural projection is the map \(\pi \colon TM \to M\) defined by \(\pi(p, v) = p\).
The tangent bundle \(TM\) carries a natural topology and smooth structure making it a smooth \(2n\)-dimensional manifold. The construction proceeds as follows.
- The projection \(\pi \colon TM \to M\) is a smooth surjection.
- For each smooth chart \((U, \varphi) = (U, x^1, \ldots, x^n)\) on \(M\), the map \(\tilde{\varphi} \colon \pi^{-1}(U) \to \varphi(U) \times \mathbb{R}^n\) defined by
\[
\tilde{\varphi}\left(p,\, v^i \frac{\partial}{\partial x^i}\bigg|_p\right) = (x^1(p), \ldots, x^n(p), v^1, \ldots, v^n)
\]
is a diffeomorphism onto an open subset of \(\mathbb{R}^{2n}\).
We need to check that the transition maps are smooth. On \(\pi^{-1}(U_\alpha \cap U_\beta)\), the transition map \(\tilde{\varphi}_\beta \circ \tilde{\varphi}_\alpha^{-1}\) acts on \((x, v) \in \varphi_\alpha(U_\alpha \cap U_\beta) \times \mathbb{R}^n\) by
\[ \tilde{\varphi}_\beta \circ \tilde{\varphi}_\alpha^{-1}(x, v) = \left(\tau(x),\; J_\tau(x) \cdot v\right), \]where \(\tau = \varphi_\beta \circ \varphi_\alpha^{-1}\) is the transition map on \(M\) and \(J_\tau(x)\) is its Jacobian matrix. This is smooth (since \(\tau\) is a diffeomorphism and the Jacobian depends smoothly on \(x\)), so the charts \(\{(\pi^{-1}(U_\alpha), \tilde{\varphi}_\alpha)\}\) form a smooth atlas on \(TM\).
We topologize \(TM\) by declaring \(W \subseteq TM\) to be open if \(\tilde{\varphi}_\alpha(W \cap \pi^{-1}(U_\alpha))\) is open in \(\mathbb{R}^{2n}\) for every \(\alpha\). This makes \(TM\) Hausdorff and second countable (because \(M\) is), and the maps \(\tilde{\varphi}_\alpha\) become homeomorphisms. \(\blacksquare\)
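The fibrewise-linear form of the transition map \((x, v) \mapsto (\tau(x),\ J_\tau(x)\,v)\) can be made concrete for the stereographic charts on \(S^2\), where \(\tau(u) = u/|u|^2\). The NumPy sketch below (illustration only) checks the closed-form Jacobian of inversion against central differences:

```python
# TM transition map (x, v) -> (tau(x), J_tau(x) v) for tau(u) = u/|u|^2.
import numpy as np

def tau(u):
    return u / np.dot(u, u)

def J_tau(u):
    """Jacobian of inversion: (|u|^2 I - 2 u u^T) / |u|^4."""
    s = np.dot(u, u)
    return (s * np.eye(len(u)) - 2 * np.outer(u, u)) / s ** 2

u = np.array([1.3, -0.4])
v = np.array([0.5, 2.0])

# Finite-difference check of the Jacobian formula
h = 1e-6
num = np.column_stack([(tau(u + h * e) - tau(u - h * e)) / (2 * h)
                       for e in np.eye(2)])
assert np.allclose(num, J_tau(u), atol=1e-6)

# The fibre part of the transition map is linear in v
w = J_tau(u) @ v
assert w.shape == (2,)
```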
A vector bundle is trivial if it is isomorphic to a product bundle, and one can show that \(TS^2\) is not trivial. This non-triviality has a direct physical consequence via the hairy ball theorem: you cannot comb a hairy ball flat. More precisely, there is no continuous, nowhere-vanishing tangent vector field on \(S^2\). Applied to fluid dynamics on the Earth (modelled as \(S^2\)), this means every steady-state wind pattern on the globe must have at least one point where the wind speed is zero (an eye of a storm, or a still point).
Chapter 4: Vector Fields on Manifolds
4.1 Sections of the Tangent Bundle
Chapter 3 established the tangent bundle \(\pi \colon TM \to M\) as a smooth \(2n\)-manifold fibred over \(M\). We now study the global objects that live on this bundle: vector fields. Informally, a vector field is a rule that assigns to each point of \(M\) a tangent vector at that point, varying smoothly from point to point. Such objects are fundamental to differential geometry and physics — they describe flows, infinitesimal symmetries, and dynamical systems.
In local coordinates \((U, x^1, \ldots, x^n)\), a vector field \(X\) has the form
\[ X = \sum_{i=1}^n X^i \frac{\partial}{\partial x^i}, \]where the component functions \(X^i \colon U \to \mathbb{R}\) are given by \(X^i(p) = (X_p)(x^i)\). The vector field \(X\) is smooth if and only if all component functions \(X^i\) are smooth in every coordinate chart.
Moreover, \(\Gamma(TM)\) is a module over \(C^\infty(M)\): for any \(f \in C^\infty(M)\) and \(X \in \Gamma(TM)\), the product \(fX\) defined by \((fX)_p = f(p) X_p\) is again a smooth vector field. The module axioms are easily verified.
4.2 Vector Fields as Derivations of \(C^\infty(M)\)
Each tangent vector \(v \in T_pM\) acts as a derivation on \(C^\infty(M)\) at the point \(p\). When we let \(p\) vary, a vector field \(X\) defines a derivation of the entire algebra \(C^\infty(M)\).
The resulting function \(Xf \colon M \to \mathbb{R}\) is smooth (one checks this in coordinates), so \(X\) defines a map \(X \colon C^\infty(M) \to C^\infty(M)\) satisfying:
- \(\mathbb{R}\)-linearity: \(X(af + bg) = a\,Xf + b\,Xg\) for all \(a, b \in \mathbb{R}\).
- Leibniz rule: \(X(fg) = f \cdot Xg + g \cdot Xf\).
In coordinates, if \(X = \sum_i X^i \frac{\partial}{\partial x^i}\), then
\[ Xf = \sum_{i=1}^n X^i \frac{\partial f}{\partial x^i}, \]which is a first-order linear partial differential operator.
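Although these notes are otherwise pure prose, the action of \(X\) as a first-order operator is easy to check with a computer algebra system. The following Python/sympy sketch (the helper name `apply_vf` is our own, not standard notation) applies the radial field \(x\,\partial/\partial x + y\,\partial/\partial y\) to \(f = x^2 y\):

```python
import sympy as sp

x, y = sp.symbols('x y')

def apply_vf(components, f, coords):
    # Xf = sum_i X^i * (partial f / partial x^i): a first-order linear operator.
    return sum(Xi * sp.diff(f, xi) for Xi, xi in zip(components, coords))

# Radial field X = x d/dx + y d/dy applied to f = x^2 * y.
f = x**2 * y
Xf = sp.simplify(apply_vf([x, y], f, [x, y]))
print(Xf)  # 3*x**2*y
```

The result \(Xf = 3f\) is Euler's identity for a homogeneous function of degree 3, since the radial field is the generator of dilations.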
4.3 \(F\)-Related Vector Fields
When we have a smooth map \(F \colon M \to N\), we generally cannot “push forward” a vector field on \(M\) to a vector field on \(N\), because \(F\) may not be surjective (there may be points of \(N\) not in the image) or not injective (distinct points may map to the same point, giving ambiguous values). Instead, we have the weaker notion of \(F\)-relatedness.
Equivalently, \(F_* \circ X = Y \circ F\) as maps \(M \to TN\).
These are equal for all \(g\) and \(p\) if and only if \(F_{*,p} X_p = Y_{F(p)}\) for all \(p\). \(\blacksquare\)
If \(F\) is a diffeomorphism, then for any \(X \in \Gamma(TM)\), there is a unique vector field \(F_*X \in \Gamma(TN)\) that is \(F\)-related to \(X\), called the pushforward of \(X\): it is defined by \((F_*X)_q = F_{*,F^{-1}(q)}(X_{F^{-1}(q)})\).
4.4 The Lie Bracket
We now come to one of the most important algebraic structures in differential geometry: the Lie bracket of vector fields. The Lie bracket measures the failure of two flows to commute and provides the tangent bundle with a rich algebraic structure that goes far beyond its mere vector space or module structure.
To motivate the definition geometrically: suppose we flow along \(X\) for time \(\sqrt{t}\), then along \(Y\) for time \(\sqrt{t}\), then backward along \(X\) for time \(\sqrt{t}\), and finally backward along \(Y\) for time \(\sqrt{t}\). If the two flows commuted, we would return exactly to the starting point for every \(t\). In fact, the displacement from the starting point is approximately \(t \cdot [X, Y]_p\) as \(t \to 0\). The Lie bracket thus measures, to first order, the failure of the two flows to trace out a closed parallelogram. This geometric interpretation will be made precise when we discuss flows and the Lie derivative in Chapter 12.
Given two vector fields \(X, Y \in \Gamma(TM)\), each acts as a derivation of \(C^\infty(M)\). The composition \(X \circ Y\) (meaning \(f \mapsto X(Yf)\)) is, in general, not a derivation because it involves second-order derivatives. However, the commutator \(XY - YX\) miraculously cancels the second-order terms and produces a derivation.
We must verify that \([X, Y]\) is indeed a derivation (and hence corresponds to a vector field by Proposition 4.4).
Similarly,
\[ Y(X(fg)) = Yf \cdot Xg + f \cdot Y(Xg) + Yg \cdot Xf + g \cdot Y(Xf). \]Subtracting:
\[ [X,Y](fg) = f \cdot (X(Yg) - Y(Xg)) + g \cdot (X(Yf) - Y(Xf)) = f \cdot [X,Y]g + g \cdot [X,Y]f. \quad \blacksquare \]Now we compute the Lie bracket in local coordinates.
Similarly,
\[ Y(Xf) = \sum_{i,j} Y^i \frac{\partial X^j}{\partial x^i}\frac{\partial f}{\partial x^j} + \sum_{i,j} Y^i X^j \frac{\partial^2 f}{\partial x^i \partial x^j}. \]The second-order terms cancel (by equality of mixed partials), and we are left with
\[ [X,Y](f) = \sum_{i,j} \left(X^i \frac{\partial Y^j}{\partial x^i} - Y^i \frac{\partial X^j}{\partial x^i}\right) \frac{\partial f}{\partial x^j}. \]Relabelling the summation index \(j\) as \(k\) gives the stated formula. \(\blacksquare\)
To compute \([X, Y]\) we apply the coordinate formula. Here \(X^1 = 1, X^2 = 0\) and \(Y^1 = 0, Y^2 = x\). The only nonzero partial derivative of the components of \(Y\) is \(\frac{\partial Y^2}{\partial x} = 1\); the components of \(X\) are constant, so all of their partial derivatives vanish. Thus
\[ [X, Y]^1 = X^1 \frac{\partial Y^1}{\partial x} + X^2 \frac{\partial Y^1}{\partial y} - Y^1 \frac{\partial X^1}{\partial x} - Y^2 \frac{\partial X^1}{\partial y} = 1 \cdot 0 + 0 - 0 - 0 = 0, \]\[ [X, Y]^2 = X^1 \frac{\partial Y^2}{\partial x} + X^2 \frac{\partial Y^2}{\partial y} - Y^1 \frac{\partial X^2}{\partial x} - Y^2 \frac{\partial X^2}{\partial y} = 1 \cdot 1 + 0 - 0 - 0 = 1. \]Hence \([X, Y] = \frac{\partial}{\partial y}\). We can verify this directly: for any \(f \in C^\infty(\mathbb{R}^2)\),
\[ [X, Y](f) = X(Y(f)) - Y(X(f)) = \frac{\partial}{\partial x}\!\left(x \frac{\partial f}{\partial y}\right) - x\frac{\partial}{\partial y}\!\left(\frac{\partial f}{\partial x}\right) = \frac{\partial f}{\partial y} + x\frac{\partial^2 f}{\partial x \partial y} - x\frac{\partial^2 f}{\partial y \partial x} = \frac{\partial f}{\partial y}. \]The second-order terms cancel, leaving \(\frac{\partial}{\partial y}\), as expected.
Now consider the rotation vector field on \(\mathbb{R}^2\): \(R = -y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}\), and the radial dilation \(D = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}\). Applying the formula:
\[ [R, D]^1 = R^1 \frac{\partial D^1}{\partial x} + R^2 \frac{\partial D^1}{\partial y} - D^1 \frac{\partial R^1}{\partial x} - D^2 \frac{\partial R^1}{\partial y} = (-y)(1) + (x)(0) - (x)(0) - (y)(-1) = -y + y = 0. \]Similarly \([R, D]^2 = 0\), so \([R, D] = 0\). This makes geometric sense: the flow of \(D\) is radial dilation \((x,y) \mapsto (e^t x, e^t y)\), the flow of \(R\) is rotation, and radial dilations and rotations commute.
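Both bracket computations above are mechanical enough to automate. Here is a minimal sympy sketch (the function name `lie_bracket` is our own) implementing the coordinate formula and reproducing the two examples:

```python
import sympy as sp

x, y = sp.symbols('x y')
coords = [x, y]

def lie_bracket(X, Y):
    # Coordinate formula: [X,Y]^k = sum_i (X^i dY^k/dx^i - Y^i dX^k/dx^i).
    return [sp.simplify(sum(X[i]*sp.diff(Y[k], coords[i])
                            - Y[i]*sp.diff(X[k], coords[i])
                            for i in range(len(coords))))
            for k in range(len(coords))]

# Example 1: X = d/dx, Y = x d/dy gives [X, Y] = d/dy.
print(lie_bracket([sp.Integer(1), sp.Integer(0)], [sp.Integer(0), x]))  # [0, 1]

# Example 2: rotation R = -y d/dx + x d/dy and dilation D = x d/dx + y d/dy commute.
print(lie_bracket([-y, x], [x, y]))  # [0, 0]
```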
4.5 Properties of the Lie Bracket
The Lie bracket satisfies three fundamental algebraic properties that make \(\Gamma(TM)\) into a Lie algebra.
- Bilinearity: \([aX + bY, Z] = a[X,Z] + b[Y,Z]\) and \([Z, aX+bY] = a[Z,X] + b[Z,Y]\).
- Skew-symmetry (antisymmetry): \([X, Y] = -[Y, X]\).
- Jacobi identity: \[ [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0. \]
(3) We verify the Jacobi identity by direct computation. For any \(f \in C^\infty(M)\):
\[ [X,[Y,Z]](f) = X([Y,Z]f) - [Y,Z](Xf) = X(Y(Zf)) - X(Z(Yf)) - Y(Z(Xf)) + Z(Y(Xf)). \]Similarly,
\[ [Y,[Z,X]](f) = Y(Z(Xf)) - Y(X(Zf)) - Z(X(Yf)) + X(Z(Yf)), \]\[ [Z,[X,Y]](f) = Z(X(Yf)) - Z(Y(Xf)) - X(Y(Zf)) + Y(X(Zf)). \]Adding all three expressions, every term cancels in pairs, giving \(0\). \(\blacksquare\)
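The pairwise cancellation in the proof can be confirmed symbolically for concrete vector fields. In this sketch the three polynomial fields on \(\mathbb{R}^2\) are arbitrary choices of ours; the Jacobi sum vanishes component by component, as the theorem guarantees:

```python
import sympy as sp

x, y = sp.symbols('x y')
coords = [x, y]

def bracket(X, Y):
    # Coordinate formula for the Lie bracket on R^2.
    return [sum(X[i]*sp.diff(Y[k], coords[i]) - Y[i]*sp.diff(X[k], coords[i])
                for i in range(2)) for k in range(2)]

# Three arbitrarily chosen polynomial vector fields on R^2.
X = [x*y, sp.Integer(0)]
Y = [sp.Integer(0), x]
Z = [y, x**2]

# [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] vanishes component-wise.
jacobi = [sp.expand(a + b + c)
          for a, b, c in zip(bracket(X, bracket(Y, Z)),
                             bracket(Y, bracket(Z, X)),
                             bracket(Z, bracket(X, Y)))]
print(jacobi)  # [0, 0]
```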
For \(f, g \in C^\infty(M)\), a direct computation gives \([fX, gY] = fg\,[X,Y] + f(Xg)\,Y - g(Yf)\,X\). This shows that the Lie bracket is not \(C^\infty(M)\)-bilinear — it is only \(\mathbb{R}\)-bilinear. The “error terms” involve derivatives of \(f\) and \(g\), reflecting the fact that the Lie bracket is a first-order differential operator in each argument.
4.6 The Lie Bracket and \(F\)-Relatedness
A crucial property of the Lie bracket is its naturality with respect to smooth maps: the bracket is preserved by \(F\)-relatedness.
By Proposition 4.6 again, \([X_1,X_2]\) is \(F\)-related to \([Y_1,Y_2]\). \(\blacksquare\)
4.7 Lie Groups and Lie Algebras
We now develop one of the most beautiful applications of vector fields: the Lie algebra of a Lie group. The interplay between the group structure and the manifold structure gives rise to a finite-dimensional Lie algebra that encodes much of the group’s structure.
Each \(L_g\) is a diffeomorphism with inverse \(L_{g^{-1}}\).
Since each \(L_g\) is a diffeomorphism, it induces an isomorphism \((L_g)_{*,h} \colon T_hG \to T_{gh}G\) at every point. This allows us to “translate” tangent vectors from one point to another, giving rise to a special class of vector fields.
Equivalently, \((L_g)_{*,h}(X_h) = X_{gh}\) for all \(g, h \in G\).
The condition of left-invariance means that \(X\) is \(L_g\)-related to itself for every \(g\). This is an extremely rigid condition — it means the vector field is completely determined by its value at any single point.
By Proposition 4.15, \(\mathfrak{g} \cong T_eG\) as a vector space, so \(\dim \mathfrak{g} = \dim G\). The key point is that the Lie bracket of two left-invariant vector fields is again left-invariant, so the bracket operation restricts to \(\mathfrak{g}\).
- For \(G = \mathrm{GL}(n, \mathbb{R})\), we have \(\mathfrak{gl}(n, \mathbb{R}) = M_{n \times n}(\mathbb{R})\) (all \(n \times n\) matrices), with bracket \([A, B] = AB - BA\).
- For \(G = \mathrm{O}(n)\), the Lie algebra is \(\mathfrak{o}(n) = \{A \in M_{n \times n}(\mathbb{R}) : A^T + A = 0\}\), the space of skew-symmetric matrices, with dimension \(n(n-1)/2\).
- For \(G = \mathrm{SO}(n)\), since \(\mathrm{SO}(n)\) is the identity component of \(\mathrm{O}(n)\), we have \(\mathfrak{so}(n) = \mathfrak{o}(n)\).
- For \(G = \mathrm{U}(n)\), we have \(\mathfrak{u}(n) = \{A \in M_{n \times n}(\mathbb{C}) : A^* + A = 0\}\), the space of skew-Hermitian matrices, with real dimension \(n^2\).
- For \(G = \mathrm{SU}(n)\), we have \(\mathfrak{su}(n) = \{A \in \mathfrak{u}(n) : \mathrm{tr}(A) = 0\}\), with dimension \(n^2 - 1\).
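A quick numerical sanity check (using numpy, our illustrative choice) confirms that \(\mathfrak{o}(n)\) is closed under the matrix commutator, which is the bracket inherited from \(\mathfrak{gl}(n, \mathbb{R})\):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_skew(n):
    M = rng.standard_normal((n, n))
    return M - M.T  # an element of o(n): skew-symmetric

A, B = random_skew(4), random_skew(4)
C = A @ B - B @ A  # the matrix commutator [A, B]

# o(n) is closed under the bracket: (AB - BA)^T = BA - AB = -(AB - BA),
# using A^T = -A and B^T = -B.
print(np.allclose(C.T, -C))  # True
```

Counting free entries above the diagonal recovers \(\dim \mathfrak{o}(4) = 4 \cdot 3 / 2 = 6\), consistent with the formula in the list above.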
4.8 Lie Group Homomorphisms and Lie Algebra Homomorphisms
The correspondence between Lie groups and Lie algebras is functorial: smooth group homomorphisms induce Lie algebra homomorphisms.
is a Lie algebra homomorphism from \(\mathfrak{g}\) to \(\mathfrak{h}\) (under the identifications \(\mathfrak{g} \cong T_eG\) and \(\mathfrak{h} \cong T_eH\)). This homomorphism is often denoted \(\mathrm{Lie}(\Phi)\) or \(d\Phi_e\).
The key observation is that because \(\Phi\) is a group homomorphism, we have \(\Phi \circ L_g = L_{\Phi(g)} \circ \Phi\) for all \(g \in G\). Taking the pushforward at \(h \in G\):
\[ \Phi_{*,gh} \circ (L_g)_{*,h} = (L_{\Phi(g)})_{*,\Phi(h)} \circ \Phi_{*,h}. \]In particular, if \(X\) is left-invariant on \(G\) and we define \(\tilde{X}\) on \(H\) by \(\tilde{X}_{\Phi(g)} = \Phi_{*,g}(X_g)\), then for any \(k = \Phi(g) \in \Phi(G)\):
\[ (L_k)_{*,e}(\Phi_{*,e}(X_e)) = (L_{\Phi(g)})_{*,e}(\Phi_{*,e}(X_e)) = \Phi_{*,g}((L_g)_{*,e}(X_e)) = \Phi_{*,g}(X_g). \]This shows \(X\) is \(\Phi\)-related to the left-invariant vector field \(\tilde{X}\) on \(H\) determined by \(\tilde{X}_e = \Phi_{*,e}(X_e)\). Similarly, \(Y\) is \(\Phi\)-related to \(\tilde{Y}\). By Theorem 4.10, \([X,Y]\) is \(\Phi\)-related to \([\tilde{X}, \tilde{Y}]\). Evaluating at \(e\):
\[ \Phi_{*,e}([X,Y]_e) = [\tilde{X}, \tilde{Y}]_e. \]This is exactly the statement that \(\Phi_{*,e}\) preserves Lie brackets. \(\blacksquare\)
This concludes our development of the foundational machinery of smooth manifolds through Chapter 4. We have built up from the topological definition of a manifold, through smooth structures and maps, to the tangent bundle and vector fields, arriving at the Lie bracket and the Lie algebra of a Lie group. These structures form the backbone upon which all further topics in the course — submanifolds, immersions and submersions, flows, differential forms, integration, and de Rham cohomology — will be built.
Chapter 5: Covectors and the Cotangent Bundle
5.1 Covectors and the Dual Space
Having developed the theory of tangent vectors and the tangent bundle in previous chapters, we now turn to the dual picture. In linear algebra, every vector space \( V \) has an associated dual space \( V^* \) consisting of all linear functionals on \( V \). When we apply this construction fibrewise to the tangent bundle of a smooth manifold, we obtain the cotangent bundle, whose sections are the covector fields or 1-forms. This dual perspective is not merely an abstract curiosity: covector fields arise naturally whenever we take the differential of a smooth function, and they are the objects that can be integrated along curves. The interplay between vectors and covectors is one of the central themes of differential geometry.
Let \( V \) be a finite-dimensional real vector space. Recall from linear algebra:
If \( V \) has dimension \( n \), then \( V^* \) also has dimension \( n \), though the two spaces are not canonically isomorphic (absent additional structure such as an inner product). Given a basis \( (e_1, \ldots, e_n) \) for \( V \), there is a uniquely determined dual basis \( (\varepsilon^1, \ldots, \varepsilon^n) \) for \( V^* \), characterised by the property
\[ \varepsilon^i(e_j) = \delta^i_j, \]where \( \delta^i_j \) is the Kronecker delta. Any covector \( \omega \in V^* \) can be expanded in the dual basis as \( \omega = \omega_i \, \varepsilon^i \) (using the Einstein summation convention), where \( \omega_i = \omega(e_i) \).
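In concrete terms, if the basis vectors \(e_j\) are the columns of a matrix \(E\), then the dual basis covectors \(\varepsilon^i\) are the rows of \(E^{-1}\), since \((E^{-1}E)_{ij} = \varepsilon^i(e_j) = \delta^i_j\). A short numpy sketch (the specific basis is our choice):

```python
import numpy as np

# Basis (e_1, e_2, e_3) of R^3 as the COLUMNS of E; dual basis as ROWS of E^{-1}.
E = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [0., 0., 1.]])
dual = np.linalg.inv(E)
print(np.allclose(dual @ E, np.eye(3)))  # True: eps^i(e_j) = delta^i_j

# Components of a covector omega in the dual basis: omega_i = omega(e_i),
# i.e. apply the row vector omega to each basis column.
omega = np.array([2., -1., 3.])  # omega in the standard dual basis
comps = omega @ E                # omega_i = omega(e_i)
print(np.allclose(comps @ dual, omega))  # True: omega = omega_i eps^i
```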
Now we bring this construction to manifolds. At each point \( p \) of a smooth manifold \( M \), the tangent space \( T_pM \) is a finite-dimensional real vector space, so we may form its dual.
Elements of \( T_p^*M \) are called tangent covectors at \( p \).
If \( (U, \varphi) \) is a smooth chart around \( p \) with coordinate functions \( (x^1, \ldots, x^n) \), then the coordinate vectors \( \left( \frac{\partial}{\partial x^1}\big|_p, \ldots, \frac{\partial}{\partial x^n}\big|_p \right) \) form a basis for \( T_pM \). The corresponding dual basis for \( T_p^*M \) is denoted \( (dx^1|_p, \ldots, dx^n|_p) \) and is characterised by
\[ dx^i|_p \left( \frac{\partial}{\partial x^j}\bigg|_p \right) = \delta^i_j. \]We shall see shortly that each \( dx^i|_p \) is in fact the differential of the coordinate function \( x^i \) at the point \( p \), which justifies the notation. Any covector \( \omega \in T_p^*M \) can be written as
\[ \omega = \omega_i \, dx^i|_p, \qquad \text{where } \omega_i = \omega\!\left( \frac{\partial}{\partial x^i}\bigg|_p \right). \]The components \( \omega_i \) carry a lower index, in contrast to the upper-indexed components \( v^i \) of a tangent vector \( v = v^i \frac{\partial}{\partial x^i}\big|_p \). This notational convention is deliberate and reflects the transformation laws: under a change of coordinates \( \tilde{x}^j = \tilde{x}^j(x^1, \ldots, x^n) \), the covector components transform covariantly:
\[ \tilde{\omega}_j = \frac{\partial x^i}{\partial \tilde{x}^j} \, \omega_i, \]whereas the tangent vector components transform contravariantly. The natural pairing \( \omega(v) = \omega_i v^i \) is coordinate-independent precisely because these two transformation laws compensate one another.
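The compensation between the two transformation laws can be verified symbolically. The sketch below (sympy, with the polar-to-Cartesian change of coordinates as our concrete example) transforms vector components contravariantly and covector components covariantly, and checks that the pairing is unchanged:

```python
import sympy as sp

r, th = sp.symbols('r theta', positive=True)
v1, v2, w1, w2 = sp.symbols('v1 v2 w1 w2')

# Change of coordinates (r, theta) -> (x, y) on the punctured plane.
x, y = r*sp.cos(th), r*sp.sin(th)
J = sp.Matrix([[sp.diff(x, r), sp.diff(x, th)],
               [sp.diff(y, r), sp.diff(y, th)]])  # J^i_j = dx^i / d(tilde x)^j

v_cart = sp.Matrix([v1, v2])     # vector components (contravariant)
w_cart = sp.Matrix([[w1, w2]])   # covector components (covariant)

v_polar = J.inv() * v_cart       # tilde v = J^{-1} v
w_polar = w_cart * J             # tilde w_j = w_i J^i_j

# The pairing omega(v) = omega_i v^i agrees in both coordinate systems.
diff = sp.simplify((w_polar * v_polar)[0, 0] - (w_cart * v_cart)[0, 0])
print(diff)  # 0
```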
5.2 The Cotangent Bundle
Just as we assembled all tangent spaces into the tangent bundle \( TM \), we now assemble all cotangent spaces into a single object.
It carries a natural projection \( \pi \colon T^*M \to M \) defined by \( \pi(p, \omega) = p \).
The smooth structure is constructed exactly as for the tangent bundle. Given a smooth chart \( (U, \varphi) \) for \( M \) with coordinates \( (x^1, \ldots, x^n) \), we define a chart \( (\pi^{-1}(U), \tilde{\varphi}) \) for \( T^*M \) by
\[ \tilde{\varphi}(p, \omega_i \, dx^i|_p) = (x^1(p), \ldots, x^n(p), \omega_1, \ldots, \omega_n) \in \mathbb{R}^{2n}. \]The first \( n \) coordinates locate the base point \( p \) in \( M \), while the last \( n \) coordinates record the components of the covector in the dual basis. The transition functions between overlapping charts are smooth, yielding the desired smooth structure on \( T^*M \).
The cotangent bundle plays a fundamental role in Hamiltonian mechanics, where the phase space of a classical mechanical system is naturally \( T^*M \) rather than \( TM \). It also carries a canonical symplectic structure (the canonical 2-form), though we will not develop this in the present course.
To see why \(T^*M\) is the natural home for Hamiltonian mechanics rather than \(TM\): the momentum of a particle is not a velocity vector (a contravariant object) but rather a linear functional on velocity space; pairing it with a velocity yields a scalar (for a quadratic kinetic energy, \(p(\dot{q})\) is twice the kinetic energy). More precisely, for a Lagrangian \(L \colon TM \to \mathbb{R}\), the Legendre transform sends a tangent vector \((q, \dot{q}) \in TM\) to the covector \(p = \frac{\partial L}{\partial \dot{q}} \in T^*_q M\), and the Hamiltonian \(H \colon T^*M \to \mathbb{R}\) is defined by \(H(q, p) = p(\dot{q}) - L(q, \dot{q})\). Hamilton’s equations then become the flow of the Hamiltonian vector field on \(T^*M\), which is symplectic. The canonical 1-form \(\lambda = p_i\,dq^i\) (the tautological 1-form, or Liouville form) on \(T^*M\) and the canonical symplectic form \(\omega = -d\lambda = dq^i \wedge dp_i\) are intrinsic structures that exist on any cotangent bundle and require no additional geometric input.
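The Legendre transform described above can be carried out symbolically for the standard mechanical Lagrangian \(L = \tfrac{1}{2}m\dot{q}^2 - V(q)\); the unspecified potential `V` and the one-degree-of-freedom setting are our illustrative choices:

```python
import sympy as sp

q, qdot, p = sp.symbols('q qdot p')
m = sp.symbols('m', positive=True)
V = sp.Function('V')  # an unspecified potential energy

L = sp.Rational(1, 2)*m*qdot**2 - V(q)   # L(q, qdot) = T - V

# Legendre map: p = dL/dqdot, then invert to express qdot in terms of p.
p_expr = sp.diff(L, qdot)                # m*qdot
qdot_of_p = sp.solve(sp.Eq(p, p_expr), qdot)[0]

# H(q, p) = p*qdot - L, with qdot eliminated via the Legendre map.
H = sp.simplify((p*qdot - L).subs(qdot, qdot_of_p))
print(sp.simplify(H - (p**2/(2*m) + V(q))))  # 0, i.e. H = p**2/(2m) + V(q)
```

The familiar Hamiltonian \(H = p^2/2m + V(q)\) drops out, with \(p\) now a coordinate on the cotangent fibre.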
5.3 Covector Fields and 1-Forms
In local coordinates \( (x^1, \ldots, x^n) \) on an open set \( U \subseteq M \), a 1-form \( \omega \) can be written as
\[ \omega = \omega_i \, dx^i, \]where the component functions \( \omega_i \colon U \to \mathbb{R} \) are smooth. The space \( \Omega^1(M) \) is a module over the ring \( C^\infty(M) \): we can add 1-forms and multiply them by smooth functions pointwise.
Covector fields act on vector fields to produce smooth functions. If \( \omega \in \Omega^1(M) \) and \( X \in \mathfrak{X}(M) \), then the function \( \omega(X) \colon M \to \mathbb{R} \) defined by
\[ \omega(X)(p) = \omega_p(X_p) \]is smooth. In local coordinates, if \( X = X^j \frac{\partial}{\partial x^j} \), then
\[ \omega(X) = \omega_i \, X^i. \]This pairing \( \Omega^1(M) \times \mathfrak{X}(M) \to C^\infty(M) \) is \( C^\infty(M) \)-bilinear and is sometimes called the natural pairing or contraction.
5.4 The Differential of a Smooth Function
The most natural and important source of covector fields is the differential of a smooth function. This is a coordinate-free operation that generalises the classical gradient.
for all \( p \in M \) and \( X_p \in T_pM \). Equivalently, for any smooth vector field \( X \in \mathfrak{X}(M) \),
\[ df(X) = Xf. \]In words, \( df \) at a point \( p \) eats a tangent vector and returns the directional derivative of \( f \) in that direction. This is entirely coordinate-free. Let us now compute \( df \) in local coordinates.
on \( U \).
On the other hand,
\[ \frac{\partial f}{\partial x^i} \, dx^i\!\left( \frac{\partial}{\partial x^j} \right) = \frac{\partial f}{\partial x^i} \, \delta^i_j = \frac{\partial f}{\partial x^j}. \qquad \square \]This proposition shows that the notation \( dx^i \) for the dual basis elements is consistent: \( dx^i \) is literally the differential of the coordinate function \( x^i \).
The vanishing of the \( d\phi \) component reflects the rotational symmetry of the height function about the vertical axis.
The differential satisfies the following algebraic properties, all immediate from the definition.
for all \( f, g \in C^\infty(M) \). Moreover, if \( c \in \mathbb{R} \) is a constant function, then \( dc = 0 \).
The differential \( d \colon C^\infty(M) \to \Omega^1(M) \) is a fundamental example of a derivation from the algebra \( C^\infty(M) \) to the module \( \Omega^1(M) \). Later, when we develop differential forms in full generality, we will extend \( d \) to the exterior derivative \( d \colon \Omega^k(M) \to \Omega^{k+1}(M) \).
5.5 Pullback of Covectors and 1-Forms
One of the most important features distinguishing covectors from vectors is that covectors can be pulled back by smooth maps. Recall that a smooth map \( F \colon M \to N \) induces a pushforward \( F_{*,p} \colon T_pM \to T_{F(p)}N \) on tangent vectors. In general, there is no way to push forward a vector field from \( M \) to \( N \) unless \( F \) is a diffeomorphism. However, the dual operation — the pullback — goes in the opposite direction and is always well-defined for covectors.
defined by
\[ (F^*\omega)(X_p) = \omega(F_{*,p} X_p) \]for all \( \omega \in T_{F(p)}^*N \) and \( X_p \in T_pM \).
Note the directions carefully: the pushforward goes “forward” from \( T_pM \) to \( T_{F(p)}N \), while the pullback goes “backward” from \( T_{F(p)}^*N \) to \( T_p^*M \). This is the standard contravariance of the dual space construction in linear algebra.
The pointwise pullback extends to 1-forms: if \( \omega \) is a 1-form on \( N \), we define the pullback 1-form \( F^*\omega \) on \( M \) by
\[ (F^*\omega)_p = F^*(\omega_{F(p)}) \qquad \text{for all } p \in M. \]More explicitly, for any vector field \( X \in \mathfrak{X}(M) \),
\[ (F^*\omega)(X)(p) = \omega_{F(p)}(F_{*,p} X_p). \]The following proposition lists the key properties of pullback.
- \( F^* \colon \Omega^1(N) \to \Omega^1(M) \) is \( \mathbb{R} \)-linear.
- \( F^*(h\omega) = (h \circ F)(F^*\omega) \) for all \( h \in C^\infty(N) \), \( \omega \in \Omega^1(N) \).
- Naturality with respect to \( d \): \( F^*(dh) = d(h \circ F) \) for all \( h \in C^\infty(N) \).
- If \( G \colon N \to P \) is another smooth map, then \( (G \circ F)^* = F^* \circ G^* \).
- \( (\mathrm{id}_M)^* = \mathrm{id}_{\Omega^1(M)} \).
Since this holds for all \( X_p \), we conclude \( F^*(dh) = d(h \circ F) \). The other properties are straightforward. \( \square \)
Property (3) is extremely useful in computations. It tells us that pulling back a differential is the same as first composing with \( F \) and then taking the differential. In particular, for coordinate functions \( y^j \) on \( N \) and a smooth map \( F \colon M \to N \) written in coordinates as \( y^j = F^j(x^1, \ldots, x^m) \):
\[ F^*(dy^j) = d(y^j \circ F) = dF^j = \frac{\partial F^j}{\partial x^i} \, dx^i. \]This gives a practical recipe for computing pullbacks of arbitrary 1-forms. If \( \omega = \omega_j \, dy^j \) on \( N \), then
\[ F^*\omega = (\omega_j \circ F) \frac{\partial F^j}{\partial x^i} \, dx^i. \]Therefore,
\[ F^*\omega = t^2(2s \, ds) + s^2(t \, ds + s \, dt) - st(2t \, dt) = (2st^2 + s^2 t) \, ds + (s^3 - 2st^2) \, dt. \]
5.6 Line Integrals of Covector Fields
One of the key motivations for studying covector fields is that they are the natural objects to integrate along curves. This should be contrasted with vector fields, which cannot be integrated along curves in a coordinate-independent way without additional structure (such as a Riemannian metric).
In local coordinates, if \( \omega = \omega_i \, dx^i \) and \( \gamma(t) = (x^1(t), \ldots, x^n(t)) \), then
\[ \int_\gamma \omega = \int_a^b \omega_i(\gamma(t)) \frac{dx^i}{dt} \, dt. \]This is a coordinate-independent quantity: one can verify directly that the integral does not depend on the choice of coordinates, because the transformation laws for covector components and tangent vector components cancel. This is precisely why 1-forms (not vector fields) are the correct objects to integrate along curves.
The line integral is also invariant under orientation-preserving reparametrisation of the curve, and reverses sign under orientation-reversing reparametrisation.
We compute the line integral of \(\omega\) along the unit circle \(\gamma \colon [0, 2\pi] \to \mathbb{R}^2\), \(\gamma(t) = (\cos t, \sin t)\). We have \(\gamma'(t) = (-\sin t, \cos t)\), so
\[ \omega_{\gamma(t)}(\gamma'(t)) = \frac{-\sin t \cdot (-\sin t) + \cos t \cdot \cos t}{\cos^2 t + \sin^2 t} = \frac{\sin^2 t + \cos^2 t}{1} = 1. \]Therefore
\[ \int_\gamma \omega = \int_0^{2\pi} 1\,dt = 2\pi. \]Now suppose instead we take the straight-line path \(\sigma \colon [0,1] \to \mathbb{R}^2 \setminus \{0\}\), \(\sigma(t) = (1, t)\) from \((1,0)\) to \((1,1)\). Then \(\sigma'(t) = (0,1)\), and
\[ \omega_{\sigma(t)}(\sigma'(t)) = \frac{-t \cdot 0 + 1 \cdot 1}{1 + t^2} = \frac{1}{1+t^2}, \]so \(\int_\sigma \omega = \int_0^1 \frac{dt}{1+t^2} = \arctan(1) - \arctan(0) = \pi/4\).
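Both integrals are straightforward to reproduce with sympy. The helper `line_integral` below (our own name) pulls the 1-form back along the curve and integrates, exactly as in the coordinate formula of the previous subsection:

```python
import sympy as sp

x, y, t = sp.symbols('x y t')

def line_integral(omega, curve, a, b):
    # Pull omega = w1 dx + w2 dy back along curve(t) and integrate over [a, b].
    (w1, w2), (cx, cy) = omega, curve
    sub = {x: cx, y: cy}
    integrand = sp.simplify(w1.subs(sub)*sp.diff(cx, t)
                            + w2.subs(sub)*sp.diff(cy, t))
    return sp.integrate(integrand, (t, a, b))

omega = (-y/(x**2 + y**2), x/(x**2 + y**2))  # the angle form on the punctured plane

print(line_integral(omega, (sp.cos(t), sp.sin(t)), 0, 2*sp.pi))  # 2*pi
print(line_integral(omega, (sp.Integer(1), t), 0, 1))            # pi/4
```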
These two computations illustrate an important point: the value of \(\int_\gamma \omega\) depends on the path, not just the endpoints, because \(\omega\) is closed but not exact on \(\mathbb{R}^2 \setminus \{0\}\). On the contractible domain \(\mathbb{R}^2 \setminus ((-\infty, 0] \times \{0\})\) (the plane with the nonpositive \(x\)-axis removed, for instance), \(\omega\) equals \(d(\text{arg})\) where \(\text{arg}\) is the argument function, so the integral is path-independent there. But on the full punctured plane, the winding number of the path around the origin contributes multiples of \(2\pi\) to the integral.
This is the manifold generalisation of the fundamental theorem of calculus, and it tells us that the line integral of an exact 1-form depends only on the endpoints.
5.7 Conservative Covector Fields, Exact and Closed 1-Forms
The fundamental theorem for line integrals motivates the following important notions.
- exact if \( \omega = df \) for some \( f \in C^\infty(M) \); the function \( f \) is called a potential for \( \omega \);
- closed if, in every smooth chart \( (U, (x^1, \ldots, x^n)) \), we have \[ \frac{\partial \omega_i}{\partial x^j} = \frac{\partial \omega_j}{\partial x^i} \qquad \text{for all } i, j. \]
These three properties are related as follows.
- If \( \omega \) is exact, then \( \omega \) is conservative.
- If \( \omega \) is conservative, then \( \omega \) is exact.
- If \( \omega \) is exact, then \( \omega \) is closed.
The converse of (3) — “every closed 1-form is exact” — is not true in general. Whether it holds depends on the topology of \( M \).
is closed (as one can verify by direct computation) but not exact. Indeed, integrating \( \omega \) around the unit circle gives \( 2\pi \neq 0 \). On the other hand, locally (say on the right half-plane \( \{x > 0\} \), where \( \arctan(y/x) \) is defined), \( \omega = d(\arctan(y/x)) \), so \( \omega \) is locally exact. The obstruction to global exactness is the nontrivial first de Rham cohomology group \( H^1_{\mathrm{dR}}(\mathbb{R}^2 \setminus \{0\}) \cong \mathbb{R} \).
The quotient space \( H^1_{\mathrm{dR}}(M) = \ker d / \operatorname{im} d \) (closed 1-forms modulo exact 1-forms) is the first de Rham cohomology group of \( M \). It is a topological invariant that measures the extent to which closed forms can fail to be exact. The Poincare lemma asserts that on contractible domains (such as \( \mathbb{R}^n \) or any open ball), every closed form is exact. We will return to this in a later chapter.
Chapter 6: Submersions, Immersions, and Embeddings
6.1 Maps of Constant Rank
In multivariable calculus, the behaviour of a smooth map \( F \colon \mathbb{R}^m \to \mathbb{R}^n \) near a point is governed by the rank of its Jacobian matrix. The same principle carries over to smooth manifolds via the differential \( F_{*,p} \colon T_pM \to T_{F(p)}N \). In this chapter, we study the key classes of smooth maps — submersions, immersions, and embeddings — and develop the fundamental rank theorems that reveal their local structure. These results are indispensable tools for constructing new manifolds from old ones, particularly through the regular level set theorem.
If \( \dim M = m \) and \( \dim N = n \), then \( 0 \le \operatorname{rank}_p F \le \min(m, n) \).
In local coordinates, the rank of \( F \) at \( p \) is just the rank of the Jacobian matrix \( \left( \frac{\partial F^j}{\partial x^i}(p) \right) \). The rank is a lower semicontinuous function of \( p \): if \( F \) has rank \( r \) at \( p \), then \( F \) has rank at least \( r \) at all nearby points.
We now define the three principal classes of smooth maps based on their rank.
6.2 Submersions
Submersions are the smooth analogues of “onto” linear maps. The defining condition — surjectivity of the differential everywhere — guarantees that the map locally looks like a projection. This is made precise by the following theorem.
That is, in suitable coordinates, a submersion is just a projection onto the first \( n \) coordinates.
The Jacobian of \( \Phi \) at the origin is nonsingular (it has the Jacobian of \( \hat{F} \) in its first \( n \) rows and the identity in the last \( m - n \) rows, with suitable structure), so by the inverse function theorem, \( \Phi \) is a local diffeomorphism. Replacing the chart on \( M \) by the composition with \( \Phi^{-1} \) gives the desired canonical form. \( \square \)
An important consequence of the local submersion theorem is:
6.3 Immersions
An immersion need not be injective as a map of sets — only the differential is required to be injective at every point. Immersions locally look like inclusions, as the following theorem makes precise.
That is, in suitable coordinates, an immersion is just the inclusion \( \mathbb{R}^m \hookrightarrow \mathbb{R}^n \) as the first \( m \) coordinate axes.
The Jacobian of \( \Psi \) at \( 0 \) is nonsingular, so \( \Psi \) is a local diffeomorphism by the inverse function theorem. Replacing the chart on \( N \) by the composition with \( \Psi^{-1} \) gives the desired canonical form. \( \square \)
6.4 Smooth Embeddings
The notion of immersion is too weak for many purposes. For example, we want to define submanifolds as images of maps that are both immersions and well-behaved topologically. This leads to the notion of embedding.
In other words, a smooth embedding is an immersion that is also a topological embedding. The condition of being a homeomorphism onto the image excludes self-intersections and also excludes curves that “accumulate on themselves” (such as an injective immersion of \( \mathbb{R} \) whose image is dense in the torus).
6.5 The Constant Rank Theorem
The local submersion and immersion theorems are special cases of a more general result.
When \( r = n \), this reduces to the local submersion theorem; when \( r = m \), it reduces to the local immersion theorem. The constant rank theorem is a powerful tool, but its hypothesis — that the rank is constant on all of \( M \) — is quite restrictive. In practice, one often only needs the rank to be constant on a neighbourhood of a point, or one uses the following local version.
6.6 Level Sets and the Regular Level Set Theorem
One of the most important applications of the submersion theorem is to the study of level sets. Many naturally occurring manifolds — spheres, orthogonal groups, and more — arise as level sets of smooth maps.
- A point \( p \in M \) is a regular point of \( F \) if \( F_{*,p} \) is surjective, and a critical point otherwise.
- A point \( c \in N \) is a regular value of \( F \) if every point in the preimage \( F^{-1}(c) \) is a regular point. (In particular, if \( F^{-1}(c) = \emptyset \), then \( c \) is a regular value vacuously.)
- A point \( c \in N \) is a critical value if it is not a regular value.
Moreover, the tangent space to the level set at any point \( p \in F^{-1}(c) \) is precisely the kernel of the differential:
This provides a practical way to compute tangent spaces to manifolds defined as level sets: one simply computes the kernel of the Jacobian.
At any point \( x \neq 0 \), the differential \( df_x \) is surjective (it is a nonzero linear map to \( \mathbb{R} \)), so every nonzero real number is a regular value. In particular, \( 1 \) is a regular value, and
\[ S^n = f^{-1}(1) \]is a smooth \( n \)-dimensional submanifold of \( \mathbb{R}^{n+1} \). The tangent space at \( p \in S^n \) is
\[ T_p S^n = \ker df_p = \{ v \in \mathbb{R}^{n+1} : \sum_{i=1}^{n+1} p^i v^i = 0 \} = p^\perp, \]the orthogonal complement of \( p \) in \( \mathbb{R}^{n+1} \).
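The recipe “tangent space = kernel of the Jacobian” is directly computable. In this numpy sketch (the particular point \(p \in S^2\) is our choice), the null space of the \(1 \times 3\) Jacobian of \(f(x) = |x|^2\) is extracted via the SVD and checked against \(p^\perp\):

```python
import numpy as np

# For f(x) = |x|^2 on R^3, the Jacobian of f at p is the row vector 2p,
# so T_p S^2 = ker df_p should be the plane orthogonal to p.
p = np.array([1.0, 2.0, 2.0]) / 3.0        # a point on the unit sphere S^2
jac = (2.0 * p).reshape(1, 3)

# Null space via the SVD: the rows of Vt beyond the rank span ker(jac).
_, _, Vt = np.linalg.svd(jac)
kernel = Vt[1:]                            # basis of the 2-dim tangent plane

print(np.allclose(kernel @ p, 0.0))        # True: T_p S^2 = p-perp
```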
One computes \( \Phi_{*,A}(B) = B^T A + A^T B \), which is a symmetric matrix for every \( B \). Now fix \( A \in O(n) \), so that \( A^T A = I_n \). For any symmetric matrix \( S \), the matrix \( B = \frac{1}{2}AS \) satisfies \( \Phi_{*,A}(B) = \frac{1}{2}(S^T A^T A + A^T A S) = \frac{1}{2}(S + S) = S \), so \( \Phi_{*,A} \) is surjective. Hence \( I_n \) is a regular value, and \( O(n) \) is a smooth submanifold of \( M(n, \mathbb{R}) \cong \mathbb{R}^{n^2} \) of dimension
\[ \dim O(n) = n^2 - \frac{n(n+1)}{2} = \frac{n(n-1)}{2}. \]The tangent space at the identity is
\[ T_I O(n) = \ker \Phi_{*,I} = \{ B \in M(n, \mathbb{R}) : B^T + B = 0 \} = \mathfrak{o}(n), \]the space of skew-symmetric matrices.
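The surjectivity argument above admits a quick numerical check (numpy, with a random orthogonal matrix generated via QR as our illustrative choice): the claimed preimage \(B = \frac{1}{2}AS\) really does hit an arbitrary symmetric matrix \(S\).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# A random orthogonal matrix A (QR of a random matrix) and a random symmetric S.
A, _ = np.linalg.qr(rng.standard_normal((n, n)))
M = rng.standard_normal((n, n))
S = M + M.T

# The preimage B = (1/2) A S from the proof satisfies B^T A + A^T B = S,
# using A^T A = I and S^T = S.
B = 0.5 * A @ S
print(np.allclose(B.T @ A + A.T @ B, S))  # True
```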
The regular level set theorem, combined with Sard’s theorem (which states that the set of critical values has measure zero), tells us that “most” level sets of a smooth map are smooth submanifolds.
Chapter 7: Submanifolds
7.1 Embedded Submanifolds
The notion of submanifold gives a rigorous framework for studying lower-dimensional smooth objects sitting inside ambient manifolds. There are two important flavours of submanifold — embedded and immersed — reflecting the distinction between embeddings and immersions from the previous chapter. Embedded submanifolds are the more well-behaved class and suffice for most applications.
Such a chart is called a slice chart (or adapted chart, or submanifold chart) for \( S \) in \( M \), and the set \( U \cap S \) is called a slice of \( U \). The number \( n - k \) is the codimension of \( S \) in \( M \).
The idea behind the slice chart condition is simple: near any point of \( S \), we can find coordinates for \( M \) in which \( S \) is “flat” — it is carved out by setting the last \( n - k \) coordinates to zero. This is the strongest and most natural notion of submanifold.
The topology on an embedded submanifold \( S \) is the subspace topology inherited from \( M \). This is an important point: an embedded submanifold is required to have the subspace topology, not some other topology that might make the inclusion continuous.
7.2 Slice Charts and Local Structure
The slice chart condition can be reformulated in several equivalent ways, each useful in different contexts.
1. \( S \) is a \( k \)-dimensional embedded submanifold of \( M \).
2. For every \( p \in S \), there is a neighbourhood \( U \) of \( p \) in \( M \) and a smooth submersion \( \Phi \colon U \to \mathbb{R}^{n-k} \) such that \( U \cap S = \Phi^{-1}(0) \).
3. For every \( p \in S \), there is a neighbourhood \( U \) of \( p \) in \( M \) and smooth functions \( \Phi^1, \ldots, \Phi^{n-k} \colon U \to \mathbb{R} \) such that \( U \cap S = \{ q \in U : \Phi^1(q) = \cdots = \Phi^{n-k}(q) = 0 \} \) and \( d\Phi^1|_p, \ldots, d\Phi^{n-k}|_p \) are linearly independent in \( T_p^*M \).
Condition (3) gives a very practical way to verify that a subset is an embedded submanifold: one exhibits defining functions whose differentials are independent. This is exactly the setup of the implicit function theorem.
7.3 Immersed Submanifolds
There is a more general notion of submanifold that arises from immersions rather than embeddings.
Every embedded submanifold is an immersed submanifold (with the subspace topology), but the converse is false. The distinction is subtle but important: an immersed submanifold may have a topology that is finer than the subspace topology.
Despite these pathologies, immersed submanifolds arise naturally in several important contexts, particularly in the theory of Lie groups.
7.4 Tangent Space to a Submanifold
If \( S \subseteq M \) is an embedded (or immersed) submanifold and \( \iota \colon S \hookrightarrow M \) is the inclusion, then the differential \( \iota_{*,p} \colon T_pS \to T_pM \) is injective (since \( \iota \) is an immersion). We routinely identify \( T_pS \) with its image \( \iota_{*,p}(T_pS) \subseteq T_pM \), treating \( T_pS \) as a linear subspace of \( T_pM \).
In a slice chart \( (U, (x^1, \ldots, x^n)) \) for \( S \), this subspace is
\[ T_pS = \operatorname{span}\left\{ \frac{\partial}{\partial x^1}\bigg|_p, \ldots, \frac{\partial}{\partial x^k}\bigg|_p \right\} \subseteq T_pM, \]where \( k = \dim S \). That is, \( T_pS \) is the subspace of \( T_pM \) spanned by the coordinate directions tangent to the slice.
When \( S = F^{-1}(c) \) is a regular level set, we have already seen (Proposition 6.19) that \( T_pS = \ker F_{*,p} \). These two descriptions are consistent: in a slice chart adapted to the level set, the tangent directions to \( S \) are precisely those annihilated by the differential of \( F \).
7.5 Restricting Maps to Submanifolds
A key practical question is: if \( F \colon M \to N \) is a smooth map and \( S \subseteq M \), \( T \subseteq N \) are submanifolds with \( F(S) \subseteq T \), is the restricted map \( F|_S \colon S \to T \) smooth?
This is immediate since the composition of smooth maps is smooth. The more subtle question is about restricting the codomain:
7.6 Lie Subgroups
The theory of submanifolds is particularly important in the context of Lie groups, where it provides a clean framework for studying subgroups.
If \( H \) is an embedded submanifold, it is called a closed Lie subgroup or regular Lie subgroup.
This theorem is remarkable because it requires no smoothness hypothesis: one starts with a purely algebraic and topological condition (closed subgroup) and obtains a smooth structure for free. We used this implicitly in Examples 6.21 and 6.22: both \( O(n) \) and \( SL(n, \mathbb{R}) \) are closed subgroups of \( GL(n, \mathbb{R}) \), so Cartan’s theorem guarantees they are Lie subgroups. (We verified this independently using the regular level set theorem.)
7.7 The Whitney Embedding Theorem
One of the most fundamental — and reassuring — results in the theory of smooth manifolds is that every abstract smooth manifold can be faithfully realised as a submanifold of some Euclidean space. This is the content of Whitney’s embedding theorems. From a foundational perspective the result tells us that the category of smooth manifolds is no larger than the category of smooth submanifolds of \(\mathbb{R}^N\); the abstract definition we have been using all along does not secretly produce “exotic” objects that cannot be concretely visualised.
We distinguish two versions, which differ markedly in difficulty.
The strong theorem is essentially sharp: \( \mathbb{RP}^2 \) does not embed in \( \mathbb{R}^3 \), confirming that \( 2n \) cannot in general be reduced to \( 2n - 1 \).
Proof strategy for the weak theorem. The argument for Theorem 7.19 proceeds in two main steps.
Step 1: Constructing an injective immersion into some \( \mathbb{R}^N \). Since \( M \) is second countable, it admits a countable atlas \(\{(U_\alpha, \varphi_\alpha)\}\). Using a partition of unity \(\{\rho_\alpha\}\) subordinate to this cover, one defines a map \( F \colon M \to \mathbb{R}^N \) (for sufficiently large \( N \)) by assembling the locally-defined data \( \rho_\alpha \varphi_\alpha \) into a single globally-defined map. One verifies that \( F \) is an injective immersion: injectivity follows because the functions \( \rho_\alpha \) themselves are included among the components of \( F \), and the immersion property (injectivity of \( dF_p \) for all \( p \)) follows because at least one bump function \( \rho_\alpha \) is nonzero near each point, making the derivative of the local coordinate factor injective.
Step 2: Reducing the target dimension to \( 2n+1 \) via Sard’s theorem. At this stage \( N \) may be very large. The reduction is achieved by repeatedly projecting onto hyperplanes. For a unit vector \( v \in \mathbb{R}^N \), consider the orthogonal projection \( \pi_v \colon \mathbb{R}^N \to v^\perp \cong \mathbb{R}^{N-1} \). The composition \( \pi_v \circ F \) is still an injective immersion provided \( v \) is not in the secant variety \(\{ \frac{F(p) - F(q)}{|F(p)-F(q)|} : p \neq q \}\) (which has dimension \( 2n < N-1 \) for \( N > 2n+1 \)) and not in the tangential variety \(\{ dF_p(w)/|dF_p(w)| : w \in T_pM \setminus \{0\} \}\) (dimension \( 2n-1 \)). By Sard’s theorem, the set of “bad” directions has measure zero in \( S^{N-1} \), so almost every projection is still an injective immersion. Iterating this argument reduces \( N \) to \( 2n+1 \). If \( M \) is compact, injectivity of \( F \) together with compactness automatically makes \( F \) a proper embedding.
The strong theorem and Whitney’s trick. Reducing the target dimension from \( 2n+1 \) to \( 2n \) requires a fundamentally different argument. The projection trick can still eliminate one more dimension at the immersion level — by Sard’s theorem, a generic projection of an injective immersion into \( \mathbb{R}^{2n+1} \) yields an immersion into \( \mathbb{R}^{2n} \) — but the resulting map may no longer be injective: it can have a finite number of transverse double points (self-intersections). Whitney’s celebrated Whitney trick (or Whitney’s cancellation lemma) is a delicate geometric manoeuvre that eliminates pairs of self-intersection points by constructing an embedded “Whitney disc” and performing a controlled isotopy along it. The Whitney trick works in full generality only when \( n \geq 3 \); the low-dimensional cases \( n = 1, 2 \) are handled by separate arguments using the classification of curves and surfaces. The Whitney trick is also one of the key tools in Smale’s proof of the \( h \)-cobordism theorem and in surgery theory.
Chapter 8: Tensors and Riemannian Metrics
8.1 Multilinear Algebra: Tensor Products
We now develop the algebraic machinery of tensors, which provides the natural framework for defining Riemannian metrics and many other geometric structures on manifolds. A tensor is, at its core, a multilinear map. The passage from linear algebra (vectors and covectors) to multilinear algebra (tensors) vastly expands the kinds of geometric objects we can work with.
Let \( V \) be a finite-dimensional real vector space with dual space \( V^* \).
such that for every multilinear map \( \alpha \colon V_1 \times \cdots \times V_r \to W \) into any vector space \( W \), there is a unique linear map \( \tilde{\alpha} \colon V_1 \otimes \cdots \otimes V_r \to W \) with \( \alpha = \tilde{\alpha} \circ \iota \).
The image \( \iota(v_1, \ldots, v_r) \) is denoted \( v_1 \otimes \cdots \otimes v_r \) and is called a decomposable (or simple) tensor. Not every element of the tensor product is decomposable; a general element is a sum of decomposable tensors.
If \( (e_1, \ldots, e_n) \) is a basis for \( V \) and \( (f_1, \ldots, f_m) \) is a basis for \( W \), then \( \{ e_i \otimes f_j : 1 \le i \le n, \, 1 \le j \le m \} \) is a basis for \( V \otimes W \), so \( \dim(V \otimes W) = nm \). More generally, \( \dim(V_1 \otimes \cdots \otimes V_r) = \prod_i \dim V_i \).
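The basis claim can be made concrete by realising \( V \otimes W \) as the space of \( n \times m \) matrices. The following sketch (an illustrative identification, not a definition from the notes) represents \( e_i \otimes f_j \) as the matrix with a single \( 1 \) in entry \( (i, j) \) and checks linear independence:

```python
import numpy as np

# Realise V ⊗ W for dim V = 3, dim W = 2 as 3×2 matrices:
# e_i ⊗ f_j corresponds to np.outer(e_i, f_j), a matrix with one nonzero entry.
n, m = 3, 2
basis = [np.outer(np.eye(n)[i], np.eye(m)[j]) for i in range(n) for j in range(m)]

# The nm = 6 matrices e_i ⊗ f_j are linearly independent, so dim(V ⊗ W) = 6.
flattened = np.stack([b.ravel() for b in basis])
print(np.linalg.matrix_rank(flattened))   # 6
```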
8.2 Tensors on a Vector Space
In the context of tangent spaces, we form tensor products of copies of \( V = T_pM \) and its dual \( V^* = T_p^*M \).
Equivalently, a \( (k, \ell) \)-tensor is a multilinear map
\[ \sigma \colon \underbrace{V \times \cdots \times V}_{k} \times \underbrace{V^* \times \cdots \times V^*}_{\ell} \to \mathbb{R}. \]The number \( k \) is the covariant order (the number of vector arguments, corresponding to the \( k \) covector factors in the tensor product), and \( \ell \) is the contravariant order (the number of covector arguments, corresponding to the \( \ell \) vector factors). The sum \( k + \ell \) is the total order of the tensor.
- \( T^{(0,0)}(V) = \mathbb{R} \) (scalars)
- \( T^{(1,0)}(V) = V^* \) (covectors / linear forms)
- \( T^{(0,1)}(V) = V \) (vectors)
- \( T^{(1,1)}(V) = V^* \otimes V \cong \operatorname{End}(V) \) (linear endomorphisms)
- \( T^{(2,0)}(V) = V^* \otimes V^* \) (bilinear forms)
The space \( T^{(k,\ell)}(V) \) has dimension \( n^{k+\ell} \) where \( n = \dim V \). Given a basis \( (e_1, \ldots, e_n) \) for \( V \) with dual basis \( (\varepsilon^1, \ldots, \varepsilon^n) \) for \( V^* \), a general \( (k, \ell) \)-tensor can be written as
\[ \sigma = \sigma^{j_1 \cdots j_\ell}_{i_1 \cdots i_k} \, \varepsilon^{i_1} \otimes \cdots \otimes \varepsilon^{i_k} \otimes e_{j_1} \otimes \cdots \otimes e_{j_\ell}, \]where the components are given by
\[ \sigma^{j_1 \cdots j_\ell}_{i_1 \cdots i_k} = \sigma(e_{i_1}, \ldots, e_{i_k}, \varepsilon^{j_1}, \ldots, \varepsilon^{j_\ell}). \]The Einstein summation convention (sum over repeated upper-lower index pairs) makes these expressions manageable. Upper indices correspond to contravariant (vector) slots, and lower indices correspond to covariant (covector) slots.
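Einstein-summation contractions are conveniently expressed with `np.einsum`. The sketch below (a hypothetical storage convention: the component array indexed as `sigma[j, i]`, upper index first) evaluates a \( (1,1) \)-tensor on a vector and a covector, and confirms the \( \operatorname{End}(V) \) interpretation:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = rng.standard_normal((3, 3))   # components sigma^j_i of a (1,1)-tensor
v = rng.standard_normal(3)            # a vector (contravariant, upper index)
w = rng.standard_normal(3)            # a covector (covariant, lower index)

# Einstein summation: sigma(v, w) = sigma^j_i v^i w_j.
value = np.einsum('ji,i,j->', sigma, v, w)

# The same number via matrix algebra: w(sigma @ v), the End(V) point of view.
print(np.isclose(value, w @ (sigma @ v)))   # True
```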
8.3 Tensor Bundles and Tensor Fields
We now globalise the construction by assembling tensors over all points of a manifold into a bundle.
The bundle \( T^{(k,\ell)}M = \bigsqcup_{p \in M} T^{(k,\ell)}(T_pM) \) is a smooth vector bundle over \( M \) of rank \( n^{k+\ell} \), so \( T^{(k,\ell)}M \) is a smooth manifold of dimension \( n + n^{k+\ell} \).
In local coordinates \( (x^1, \ldots, x^n) \) on an open set \( U \subseteq M \), a \( (k, \ell) \)-tensor field \( \sigma \) can be written as
\[ \sigma = \sigma^{j_1 \cdots j_\ell}_{i_1 \cdots i_k} \, dx^{i_1} \otimes \cdots \otimes dx^{i_k} \otimes \frac{\partial}{\partial x^{j_1}} \otimes \cdots \otimes \frac{\partial}{\partial x^{j_\ell}}, \]where the component functions \( \sigma^{j_1 \cdots j_\ell}_{i_1 \cdots i_k} \colon U \to \mathbb{R} \) are smooth. The tensor product of two tensor fields is defined pointwise:
\[ (\sigma \otimes \tau)_p = \sigma_p \otimes \tau_p. \]This makes the collection of all tensor fields into a graded algebra over \( C^\infty(M) \).
- \( \mathcal{T}^0_0(M) = C^\infty(M) \): smooth functions.
- \( \mathcal{T}^1_0(M) = \Omega^1(M) \): covector fields (1-forms).
- \( \mathcal{T}^0_1(M) = \mathfrak{X}(M) \): vector fields.
- \( \mathcal{T}^2_0(M) \): covariant 2-tensor fields (bilinear forms on vector fields). Riemannian metrics and Ricci curvature tensors live here.
- \( \mathcal{T}^1_1(M) \): mixed \( (1,1) \)-tensor fields, corresponding to smooth endomorphism fields \( \operatorname{End}(TM) \). Complex structures and almost complex structures are examples.
8.4 Symmetry and Alternation
Many of the most important tensors in geometry possess symmetry properties.
- symmetric if \( \sigma(v_1, \ldots, v_k) \) is unchanged under any permutation of its arguments;
- alternating (or skew-symmetric, or antisymmetric) if \( \sigma(v_1, \ldots, v_k) \) changes sign under any transposition of two arguments.
There are natural projection operators:
- The symmetrisation: \( (\operatorname{Sym} \sigma)(v_1, \ldots, v_k) = \frac{1}{k!} \sum_{\pi \in S_k} \sigma(v_{\pi(1)}, \ldots, v_{\pi(k)}) \).
- The alternation: \( (\operatorname{Alt} \sigma)(v_1, \ldots, v_k) = \frac{1}{k!} \sum_{\pi \in S_k} (\operatorname{sgn} \pi) \, \sigma(v_{\pi(1)}, \ldots, v_{\pi(k)}) \).
The alternating tensors \( \Lambda^k(V^*) \) are the building blocks of differential forms (the subject of a later chapter). For now, the symmetric tensors \( \Sigma^k(V^*) \) are more immediately relevant, as Riemannian metrics are symmetric \( (2,0) \)-tensor fields.
For a 2-tensor \( \sigma \in T^{(2,0)}(V) \), symmetry means \( \sigma(v, w) = \sigma(w, v) \) for all \( v, w \), and skew-symmetry means \( \sigma(v, w) = -\sigma(w, v) \). Every 2-tensor decomposes uniquely as a sum of a symmetric part and a skew-symmetric part:
\[ \sigma = \operatorname{Sym}(\sigma) + \operatorname{Alt}(\sigma), \qquad \text{where} \quad \operatorname{Sym}(\sigma)(v,w) = \tfrac{1}{2}(\sigma(v,w) + \sigma(w,v)) \]and \( \operatorname{Alt}(\sigma)(v,w) = \tfrac{1}{2}(\sigma(v,w) - \sigma(w,v)) \).
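In components, this decomposition is the familiar splitting of a square matrix into symmetric and skew-symmetric parts; a quick numerical illustration:

```python
import numpy as np

# Every 2-tensor (here a 4×4 component matrix) splits uniquely into a
# symmetric part and a skew-symmetric part.
rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
sym = 0.5 * (M + M.T)     # Sym(sigma) in components
alt = 0.5 * (M - M.T)     # Alt(sigma) in components

print(np.allclose(sym, sym.T))     # True: symmetric part
print(np.allclose(alt, -alt.T))    # True: skew-symmetric part
print(np.allclose(sym + alt, M))   # True: they sum back to M
```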
8.5 Riemannian Metrics
We now arrive at one of the most important structures in differential geometry: the Riemannian metric. It is the structure that allows us to measure lengths, angles, areas, and volumes on a smooth manifold — concepts that the smooth structure alone does not provide.
- Symmetry: \( g_p(v, w) = g_p(w, v) \) for all \( p \in M \) and \( v, w \in T_pM \).
- Positive definiteness: \( g_p(v, v) > 0 \) for all \( p \in M \) and \( 0 \neq v \in T_pM \).
- Smoothness: for any smooth vector fields \( X, Y \in \mathfrak{X}(M) \), the function \( g(X, Y) \colon M \to \mathbb{R} \) is smooth.
In other words, a Riemannian metric is a smoothly varying choice of inner product on each tangent space. At each point \( p \), the metric \( g_p \colon T_pM \times T_pM \to \mathbb{R} \) is an inner product, and this inner product varies smoothly from point to point.
In local coordinates \( (x^1, \ldots, x^n) \), the metric is expressed as
\[ g = g_{ij} \, dx^i \otimes dx^j, \]where the component functions are
\[ g_{ij} = g\!\left( \frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j} \right). \]The matrix \( (g_{ij}(p)) \) is a symmetric positive definite \( n \times n \) matrix at each point \( p \). The symmetry of \( g \) means \( g_{ij} = g_{ji} \), and the positive definiteness means the matrix \( (g_{ij}) \) has all positive eigenvalues.
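As a small sanity check (a sketch, not part of the notes' development), the component matrix of the round metric on \( S^2 \) in spherical coordinates, \( \operatorname{diag}(1, \sin^2\theta) \), is symmetric positive definite away from the poles:

```python
import numpy as np

# The round-metric component matrix on S^2 in (theta, phi) coordinates.
def g_matrix(theta):
    return np.diag([1.0, np.sin(theta) ** 2])

for theta in (0.3, 1.0, 2.5):
    G = g_matrix(theta)
    # Symmetric, and all eigenvalues positive => positive definite.
    print(np.allclose(G, G.T), np.all(np.linalg.eigvalsh(G) > 0))  # True True
```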
8.6 Examples of Riemannian Metrics
On \( \mathbb{R}^n \) with standard coordinates \( (x^1, \ldots, x^n) \), the standard metric is
\[ \bar{g} = \delta_{ij} \, dx^i \otimes dx^j = \sum_{i=1}^{n} (dx^i)^2, \]where \( \delta_{ij} \) is the Kronecker delta. This is the Euclidean metric (or flat metric). In matrix form, \( (g_{ij}) = I_n \). The Euclidean metric is “flat” in the sense that its Riemann curvature tensor vanishes identically.
In spherical coordinates on \( S^n \), the round metric has an analogous expression; for \( S^2 \), with coordinates \( (\theta, \phi) \), it takes the familiar form
\[ g_{S^2} = d\theta^2 + \sin^2\theta \, d\phi^2. \]The round metric has constant sectional curvature \( +1 \).
On the upper half-space \( H^n = \{ x \in \mathbb{R}^n : x^n > 0 \} \), the hyperbolic metric is
\[ g_{H^n} = \frac{\delta_{ij}}{(x^n)^2} \, dx^i \otimes dx^j, \]so that \( g_{ij} = \frac{\delta_{ij}}{(x^n)^2} \). This metric has constant sectional curvature \( -1 \) and is a model for hyperbolic geometry.
If \( F \colon M \to N \) is an immersion and \( h \) is a Riemannian metric on \( N \), then the pullback metric
\[ (F^*h)_p(v, w) = h_{F(p)}\big( F_{*,p}(v), F_{*,p}(w) \big), \qquad v, w \in T_pM, \]is a Riemannian metric on \( M \). (Positive definiteness uses the injectivity of \( F_{*,p} \).) This is how the round metric on \( S^n \) is defined, and more generally how any submanifold of a Riemannian manifold inherits a Riemannian metric.
If \( (M_1, g_1) \) and \( (M_2, g_2) \) are Riemannian manifolds, the product metric on \( M_1 \times M_2 \) is
\[ g = \pi_1^* g_1 + \pi_2^* g_2, \]where \( \pi_i \colon M_1 \times M_2 \to M_i \) are the projections. In terms of the natural splitting \( T_{(p,q)}(M_1 \times M_2) \cong T_pM_1 \oplus T_qM_2 \), vectors from different factors are orthogonal. The flat torus \( \mathbb{T}^n = \mathbb{R}^n / \mathbb{Z}^n \) with its standard flat metric is a product of circles with the standard metric.
8.7 Existence of Riemannian Metrics
One might worry that Riemannian metrics are difficult to construct. The following theorem, which relies on partitions of unity, shows that every smooth manifold admits at least one Riemannian metric.
Proof. Let \( \{(U_\alpha, \varphi_\alpha)\}_{\alpha \in A} \) be an atlas for \( M \), and let \( \{\rho_\alpha\} \) be a partition of unity subordinate to the cover \( \{U_\alpha\} \). On each \( U_\alpha \), define \( g_\alpha = \varphi_\alpha^* \bar{g} \), which is just the pullback of the Euclidean metric by the coordinate map \( \varphi_\alpha \). Now define
\[ g = \sum_{\alpha \in A} \rho_\alpha \, g_\alpha. \]This is a well-defined smooth symmetric covariant 2-tensor field on \( M \) (the sum is locally finite). It remains to check positive definiteness. For any \( p \in M \) and \( 0 \neq v \in T_pM \),
\[ g_p(v, v) = \sum_{\alpha} \rho_\alpha(p) \, (g_\alpha)_p(v, v). \]Each term is nonnegative (since \( \rho_\alpha \ge 0 \) and each \( g_\alpha \) is positive definite), and at least one term is strictly positive (since \( \sum \rho_\alpha = 1 \) and \( (g_\alpha)_p(v,v) > 0 \) whenever \( p \in U_\alpha \) and \( v \neq 0 \)). Thus \( g_p(v,v) > 0 \). \( \square \)
8.8 The Musical Isomorphisms
A Riemannian metric establishes a canonical isomorphism between the tangent and cotangent spaces at each point, resolving the non-canonical nature of the identification between a vector space and its dual. These isomorphisms are called the musical isomorphisms because of the notational conventions used in index notation.
The flat map (or index-lowering map) is the bundle map \( \flat \colon TM \to T^*M \), \( v \mapsto v^\flat \), defined at each point \( p \) by
\[ v^\flat(w) = g_p(v, w) \qquad \text{for all } v, w \in T_pM. \]The sharp map (or index-raising map) is the inverse
\[ \sharp \colon T^*M \to TM, \qquad \sharp = \flat^{-1}. \]That is, \( \omega^\sharp \) is the unique vector satisfying \( g(\omega^\sharp, w) = \omega(w) \) for all \( w \in T_pM \).
The flat map \( \flat \) is well-defined and invertible because \( g_p \) is a nondegenerate bilinear form (being positive definite). The name “flat” comes from the fact that in musical notation, the flat symbol \( \flat \) lowers the pitch — and here it “lowers an index” from upper (vector) to lower (covector) position. Similarly, the sharp \( \sharp \) “raises an index.”
In local coordinates, if \( v = v^i \frac{\partial}{\partial x^i} \), then
\[ v^\flat = g_{ij} v^i \, dx^j. \]That is, the components of \( v^\flat \) are \( (v^\flat)_j = g_{ij} v^i \): the metric “lowers the index.” Conversely, if \( \omega = \omega_j \, dx^j \), then
\[ \omega^\sharp = g^{ij} \omega_j \, \frac{\partial}{\partial x^i}, \]where \( (g^{ij}) \) is the inverse matrix of \( (g_{ij}) \), i.e., \( g^{ik}g_{kj} = \delta^i_j \). The components of \( \omega^\sharp \) are \( (\omega^\sharp)^i = g^{ij}\omega_j \): the inverse metric “raises the index.”
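These coordinate formulas are easy to verify numerically. The sketch below (with a randomly generated symmetric positive definite matrix standing in for \( g_{ij} \)) implements flat and sharp with `np.einsum` and checks that sharp inverts flat:

```python
import numpy as np

rng = np.random.default_rng(3)
R = rng.standard_normal((3, 3))
g = R @ R.T + 3 * np.eye(3)            # g_ij: symmetric positive definite
g_inv = np.linalg.inv(g)               # g^ij: the inverse metric

def flat(v):                           # (v^flat)_j = g_ij v^i
    return np.einsum('ij,i->j', g, v)

def sharp(w):                          # (w^sharp)^i = g^ij w_j
    return np.einsum('ij,j->i', g_inv, w)

v = rng.standard_normal(3)
print(np.allclose(sharp(flat(v)), v))      # True: sharp inverts flat
print(np.isclose(flat(v) @ v, v @ g @ v))  # True: v^flat(v) = g(v, v)
```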
The musical isomorphisms extend to tensor fields by acting on individual indices, and they allow us to freely move indices up and down. This is a powerful computational tool in Riemannian geometry.
8.9 The Gradient
The musical isomorphisms give us the correct definition of the gradient vector field on a Riemannian manifold.
Equivalently, \( \operatorname{grad} f \) is the unique smooth vector field satisfying
\[ g(\operatorname{grad} f, X) = df(X) = Xf \qquad \text{for all } X \in \mathfrak{X}(M). \]This definition makes it clear that the gradient depends on the choice of Riemannian metric, unlike the differential \( df \), which is purely a smooth-manifold concept. In Euclidean space \( (\mathbb{R}^n, \bar{g}) \), the gradient reduces to the classical gradient:
\[ \operatorname{grad} f = \frac{\partial f}{\partial x^i} \frac{\partial}{\partial x^i} = \nabla f, \]because \( g^{ij} = \delta^{ij} \). On a general Riemannian manifold, however, the components of the gradient involve the inverse metric:
\[ (\operatorname{grad} f)^i = g^{ij} \frac{\partial f}{\partial x^j}. \]For example, on \( S^2 \) with the round metric \( g = d\theta^2 + \sin^2\theta \, d\phi^2 \), we have \( g^{11} = 1 \) and \( g^{22} = 1/\sin^2\theta \). For the height function \( h = \cos\theta \), we have \( dh = -\sin\theta \, d\theta \), so
\[ \operatorname{grad} h = g^{11}(-\sin\theta) \frac{\partial}{\partial \theta} + g^{22}(0) \frac{\partial}{\partial \phi} = -\sin\theta \frac{\partial}{\partial \theta}. \]The gradient points “downhill” along lines of longitude, as expected.
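The components of this gradient can be checked numerically (a sketch, using the round-metric matrix \( \operatorname{diag}(1, \sin^2\theta) \) at a sample point):

```python
import numpy as np

# grad h on S^2: components (grad h)^i = g^{ij} dh/dx^j in (theta, phi) coordinates.
theta = 1.1
g_inv = np.diag([1.0, 1.0 / np.sin(theta) ** 2])   # inverse round metric
dh = np.array([-np.sin(theta), 0.0])               # dh for h = cos(theta)
grad_h = g_inv @ dh

# Matches grad h = -sin(theta) d/dtheta, as computed above.
print(np.allclose(grad_h, [-np.sin(theta), 0.0]))  # True
```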
- \( \operatorname{grad}(f + h) = \operatorname{grad} f + \operatorname{grad} h \).
- \( \operatorname{grad}(fh) = f \operatorname{grad} h + h \operatorname{grad} f \) (product rule).
- \( \operatorname{grad} f = 0 \) if and only if \( f \) is locally constant.
- At a point \( p \) where \( df_p \neq 0 \), the gradient \( \operatorname{grad} f|_p \) points in the direction of steepest ascent of \( f \), and \( |\operatorname{grad} f|_p| = \|df_p\|_{g^{-1}} \).
8.10 Inner Product on Covectors and Forms
The Riemannian metric \( g \) on \( TM \) induces a corresponding inner product on \( T^*M \) (and more generally on all tensor bundles) via the musical isomorphisms.
for all \( \omega, \eta \in T_p^*M \). In local coordinates,
\[ g^{-1}(\omega, \eta) = g^{ij} \omega_i \eta_j. \]This gives a pointwise inner product on \( T_p^*M \) that is positive definite (since \( g \) is). The norm of a covector is \( |\omega|_{g^{-1}} = \sqrt{g^{ij}\omega_i\omega_j} \). In particular,
\[ |\operatorname{grad} f|^2 = g(\operatorname{grad} f, \operatorname{grad} f) = g^{-1}(df, df) = g^{ij} \frac{\partial f}{\partial x^i} \frac{\partial f}{\partial x^j}. \]More generally, the metric induces inner products on all tensor bundles \( T^{(k,\ell)}M \) by contracting indices with \( g_{ij} \) and \( g^{ij} \). For example, on \( (2,0) \)-tensors:
\[ \langle \sigma, \tau \rangle = g^{i_1 j_1} g^{i_2 j_2} \sigma_{i_1 i_2} \tau_{j_1 j_2}. \]These inner products are essential for the \( L^2 \) theory of differential forms (Hodge theory), which we will encounter in later chapters.
8.11 Lengths of Curves and the Riemannian Distance
A Riemannian metric allows us to measure the lengths of tangent vectors, and thereby the lengths of curves.
In local coordinates, if \( \gamma(t) = (x^1(t), \ldots, x^n(t)) \), then
\[ L(\gamma) = \int_a^b \sqrt{g_{ij}(\gamma(t)) \frac{dx^i}{dt} \frac{dx^j}{dt}} \, dt. \]The length is independent of the parametrisation of the curve (as one can verify by the change-of-variables formula), which is geometrically natural.
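As a numerical sanity check (an illustration, not part of the notes' development), the coordinate length formula applied to the unit circle in the Euclidean plane should give \( 2\pi \):

```python
import numpy as np

# Parametrise the unit circle and approximate L(gamma) by a trapezoid sum.
t = np.linspace(0.0, 2 * np.pi, 20001)
x, y = np.cos(t), np.sin(t)
dxdt, dydt = np.gradient(x, t), np.gradient(y, t)   # dx^i/dt

# Euclidean metric g_ij = delta_ij: the integrand is sqrt((dx/dt)^2 + (dy/dt)^2).
speed = np.sqrt(dxdt ** 2 + dydt ** 2)
length = np.sum(0.5 * (speed[1:] + speed[:-1]) * np.diff(t))  # trapezoid rule
print(abs(length - 2 * np.pi) < 1e-3)   # True
```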
On a connected Riemannian manifold, the Riemannian distance is defined by
\[ d_g(p, q) = \inf \{ L(\gamma) : \gamma \text{ is a piecewise smooth curve from } p \text{ to } q \}, \]and one can show that \( (M, d_g) \) is a metric space whose metric topology coincides with the manifold topology. This is a deep and important result: it says that the Riemannian metric (a smooth tensor field) determines a metric space structure on \( M \), and that this metric space structure is compatible with the original topology. The curves that locally minimise length are called geodesics, and their study is a central topic in Riemannian geometry.
8.12 Transformation of Metric Components
Under a change of coordinates \( x^i \mapsto \tilde{x}^j \), the metric components transform according to the standard covariant tensor transformation law:
\[ \tilde{g}_{k\ell} = \frac{\partial x^i}{\partial \tilde{x}^k} \frac{\partial x^j}{\partial \tilde{x}^\ell} \, g_{ij}. \]This can be written in matrix form as \( \tilde{G} = J^T G J \), where \( J = \left( \frac{\partial x^i}{\partial \tilde{x}^j} \right) \) is the Jacobian of the coordinate change and \( G = (g_{ij}) \). The positive definiteness of \( G \) is preserved since \( J \) is invertible.
For example, consider polar coordinates \( x = r\cos\theta \), \( y = r\sin\theta \) on \( \mathbb{R}^2 \setminus \{0\} \), so that \( dx = \cos\theta \, dr - r\sin\theta \, d\theta \) and \( dy = \sin\theta \, dr + r\cos\theta \, d\theta \). Substituting into \( \bar{g} = dx \otimes dx + dy \otimes dy \):
\[ g = (\cos^2\theta + \sin^2\theta) \, dr^2 + (-r\sin\theta\cos\theta + r\sin\theta\cos\theta)(dr \, d\theta + d\theta \, dr) + r^2(\sin^2\theta + \cos^2\theta) \, d\theta^2, \]which simplifies to
\[ g = dr^2 + r^2 \, d\theta^2. \]The metric matrix is \( (g_{ij}) = \operatorname{diag}(1, r^2) \), and the inverse is \( (g^{ij}) = \operatorname{diag}(1, r^{-2}) \).
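The matrix form of this computation is \( \tilde{G} = J^T G J \) with \( G = I_2 \), which can be verified directly:

```python
import numpy as np

# Jacobian of (x, y) = (r cos(theta), r sin(theta)) with respect to (r, theta).
r, th = 2.0, 0.7
J = np.array([[np.cos(th), -r * np.sin(th)],   # dx/dr, dx/dtheta
              [np.sin(th),  r * np.cos(th)]])  # dy/dr, dy/dtheta
G = np.eye(2)                                  # Euclidean metric in (x, y)

# Covariant transformation law: G~ = J^T G J, expected to be diag(1, r^2).
G_polar = J.T @ G @ J
print(np.allclose(G_polar, np.diag([1.0, r ** 2])))   # True
```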
8.13 Isometries and Local Isometries
A diffeomorphism \( F \colon (M, g) \to (N, h) \) is an isometry if \( F^*h = g \), that is,
\[ h_{F(p)}\big( F_{*,p}(v), F_{*,p}(w) \big) = g_p(v, w) \]for all \( p \in M \) and \( v, w \in T_pM \). More generally, a smooth map \( F \colon M \to N \) is a local isometry if every point \( p \in M \) has a neighbourhood \( U \) such that \( F|_U \colon U \to F(U) \) is an isometry onto its image.
Isometries are the structure-preserving maps of Riemannian geometry. They preserve lengths, angles, areas, curvature, and all other Riemannian invariants. The set of all isometries from \( (M, g) \) to itself forms a group under composition, called the isometry group \( \operatorname{Isom}(M, g) \).
This concludes our introduction to Riemannian metrics. The metric tensor \( g \) is the foundation upon which the entire edifice of Riemannian geometry is built. In subsequent chapters, we will use the metric to define connections, curvature, geodesics, and the Laplacian, revealing the deep interplay between geometry and analysis on manifolds.
Chapter 9: Differential Forms
9.1 The Algebra of Alternating Tensors
In our study of smooth manifolds, we have encountered tangent vectors, cotangent vectors, and general tensors. We now turn to a particularly important class of tensors — the alternating (or antisymmetric) tensors — which will form the algebraic backbone of the theory of differential forms. Differential forms are the natural objects to integrate on manifolds, and they encode geometric and topological information in a remarkably elegant way.
Let \( V \) be a finite-dimensional real vector space of dimension \( n \). Recall that a covariant \( k \)-tensor on \( V \) is a multilinear map \( \alpha \colon V^k \to \mathbb{R} \). We say that \( \alpha \) is alternating (or skew-symmetric) if it changes sign whenever two of its arguments are transposed.
The vector space of alternating \( k \)-tensors on \( V \) is denoted \( \Lambda^k(V^*) \).
By convention, \( \Lambda^0(V^*) = \mathbb{R} \) and \( \Lambda^1(V^*) = V^* \). Since an alternating \( k \)-tensor must vanish whenever two of its arguments are equal (set \( v_i = v_j \) and use the sign change), it follows that \( \Lambda^k(V^*) = 0 \) for \( k > n = \dim V \). The space \( \Lambda^k(V^*) \) has dimension \( \binom{n}{k} \).
Given a basis \( (e_1, \ldots, e_n) \) for \( V \) with dual basis \( (\varepsilon^1, \ldots, \varepsilon^n) \), a basis for \( \Lambda^k(V^*) \) is given by the collection
\[ \{ \varepsilon^{i_1} \wedge \cdots \wedge \varepsilon^{i_k} : 1 \le i_1 < i_2 < \cdots < i_k \le n \}. \]We will define the wedge product momentarily.
The alternation operator. There is a natural projection from the space of all covariant \( k \)-tensors onto the subspace of alternating ones. Define \( \operatorname{Alt} \colon T^k(V^*) \to \Lambda^k(V^*) \) by
\[ (\operatorname{Alt}\, \alpha)(v_1, \ldots, v_k) = \frac{1}{k!} \sum_{\sigma \in S_k} (\operatorname{sgn}\, \sigma)\, \alpha(v_{\sigma(1)}, \ldots, v_{\sigma(k)}). \]One readily checks that \( \operatorname{Alt}\, \alpha \) is indeed alternating, and that \( \operatorname{Alt}\, \alpha = \alpha \) if and only if \( \alpha \) is already alternating.
9.2 The Wedge Product
The tensor product of two alternating tensors is generally not alternating. To obtain an alternating tensor from the product, we apply the alternation operator with appropriate normalization.
For \( \alpha \in \Lambda^k(V^*) \) and \( \beta \in \Lambda^\ell(V^*) \), the wedge product is defined by
\[ \alpha \wedge \beta = \frac{(k+\ell)!}{k!\,\ell!} \operatorname{Alt}(\alpha \otimes \beta). \]The combinatorial prefactor ensures that the wedge product of basis covectors agrees with the determinant. Explicitly, if \( \varepsilon^1, \ldots, \varepsilon^n \) is a dual basis, then
\[ (\varepsilon^{i_1} \wedge \cdots \wedge \varepsilon^{i_k})(v_1, \ldots, v_k) = \det \begin{pmatrix} \varepsilon^{i_1}(v_1) & \cdots & \varepsilon^{i_1}(v_k) \\ \vdots & \ddots & \vdots \\ \varepsilon^{i_k}(v_1) & \cdots & \varepsilon^{i_k}(v_k) \end{pmatrix}. \]The wedge product satisfies several fundamental algebraic properties.
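For instance, on \( \mathbb{R}^3 \) the determinant formula gives \( (\varepsilon^1 \wedge \varepsilon^2)(v, w) = v^1 w^2 - v^2 w^1 \), which is easy to confirm:

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])

# (eps^1 ∧ eps^2)(v, w) = det of the 2×2 matrix of pairings eps^i(v_j).
wedge_12 = np.linalg.det(np.array([[v[0], w[0]],
                                   [v[1], w[1]]]))
print(np.isclose(wedge_12, v[0] * w[1] - v[1] * w[0]))   # True
```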
- Bilinearity: \( \wedge \) is bilinear in each factor.
- Associativity: \( (\alpha \wedge \beta) \wedge \gamma = \alpha \wedge (\beta \wedge \gamma) \).
- Graded commutativity: \( \alpha \wedge \beta = (-1)^{k\ell}\, \beta \wedge \alpha \) for \( \alpha \in \Lambda^k(V^*) \) and \( \beta \in \Lambda^\ell(V^*) \).
The graded commutativity is the key feature distinguishing the exterior algebra from the tensor algebra. In particular, if \( \alpha \) is a 1-form, then \( \alpha \wedge \alpha = 0 \), since \( (-1)^{1 \cdot 1} = -1 \) forces \( \alpha \wedge \alpha = -\alpha \wedge \alpha \).
The exterior algebra of \( V^* \) is the direct sum
\[ \Lambda^*(V^*) = \bigoplus_{k=0}^{n} \Lambda^k(V^*), \]which is a graded, associative, graded-commutative algebra of total dimension \( 2^n \).
9.3 Differential Forms on Manifolds
With the pointwise algebra of alternating tensors in hand, we now globalize to manifolds. At each point \( p \) of a smooth manifold \( M \), we have the cotangent space \( T_p^*M \), and we can form \( \Lambda^k(T_p^*M) \). Assembling these vector spaces as \( p \) varies over \( M \) produces a vector bundle \( \Lambda^k(T^*M) \to M \).
A 0-form is simply a smooth function \( f \in C^\infty(M) = \Omega^0(M) \). A 1-form is a smooth section of the cotangent bundle, which we have already studied extensively. In local coordinates \( (x^1, \ldots, x^n) \) on a chart \( (U, \varphi) \), every \( k \)-form \( \omega \in \Omega^k(M) \) can be written as
\[ \omega = \sum_{i_1 < \cdots < i_k} \omega_{i_1 \cdots i_k}\, dx^{i_1} \wedge \cdots \wedge dx^{i_k}, \]where the coefficient functions \( \omega_{i_1 \cdots i_k} \in C^\infty(U) \) are smooth. The wedge product of differential forms is defined pointwise, making \( \Omega^*(M) = \bigoplus_{k=0}^{n} \Omega^k(M) \) into a graded algebra over \( C^\infty(M) \).
- The 1-form \( \omega = x\, dy - y\, dx \) encodes a rotational quantity.
- The 2-form \( \eta = x\, dy \wedge dz + y\, dz \wedge dx + z\, dx \wedge dy \) is related to the flux of the radial vector field through a surface.
- The 3-form \( \mu = f(x,y,z)\, dx \wedge dy \wedge dz \) is a "volume element" weighted by \( f \).
9.4 The Exterior Derivative
The exterior derivative is a first-order differential operator that generalizes the total derivative of a function to forms of arbitrary degree. It is the single most important operator in the calculus of differential forms.
- \( d \) on functions: For \( f \in \Omega^0(M) = C^\infty(M) \), \( df \) is the differential of \( f \), i.e., \( df(X) = Xf \) for all vector fields \( X \).
- Nilpotency: \( d \circ d = 0 \), i.e., \( d(d\omega) = 0 \) for all \( \omega \).
- Graded Leibniz rule: For \( \alpha \in \Omega^k(M) \) and \( \beta \in \Omega^\ell(M) \), \[ d(\alpha \wedge \beta) = d\alpha \wedge \beta + (-1)^k \alpha \wedge d\beta. \]
The proof of existence proceeds by constructing \( d \) in local coordinates and showing the result is independent of the choice of coordinates. In a coordinate chart \( (x^1, \ldots, x^n) \), if
\[ \omega = \sum_{I} \omega_I\, dx^I, \]where \( I = (i_1, \ldots, i_k) \) is an increasing multi-index and \( dx^I = dx^{i_1} \wedge \cdots \wedge dx^{i_k} \), then
\[ d\omega = \sum_{I} d\omega_I \wedge dx^I = \sum_{I} \sum_{j=1}^{n} \frac{\partial \omega_I}{\partial x^j}\, dx^j \wedge dx^I. \]To see that \( d^2 = 0 \), first check it on a function \( f \):
\[ d(df) = \sum_{i,j} \frac{\partial^2 f}{\partial x^j \partial x^i}\, dx^j \wedge dx^i = 0, \]because mixed partial derivatives are symmetric while \( dx^i \wedge dx^j \) is antisymmetric. For a general \( k \)-form \( \omega = \omega_I\, dx^I \), we use the Leibniz rule:
\[ d^2(\omega_I\, dx^I) = d(d\omega_I \wedge dx^I) = d^2\omega_I \wedge dx^I - d\omega_I \wedge d(dx^I). \]Since \( d^2\omega_I = 0 \) by the function case, and \( d(dx^I) = 0 \) because each \( dx^j \) is closed (\( d(dx^j) = 0 \) by the coordinate computation), both terms vanish.
- If \( f \in C^\infty(\mathbb{R}^3) \), then \( df = \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy + \frac{\partial f}{\partial z} dz \). This corresponds to the gradient \( \nabla f \).
- If \( \omega = P\,dx + Q\,dy + R\,dz \) is a 1-form, then
\[
d\omega = \Bigl(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\Bigr) dx \wedge dy + \Bigl(\frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z}\Bigr) dy \wedge dz + \Bigl(\frac{\partial P}{\partial z} - \frac{\partial R}{\partial x}\Bigr) dz \wedge dx.
\]
This corresponds to the curl \( \nabla \times \mathbf{F} \).
- If \( \eta = A\, dy \wedge dz + B\, dz \wedge dx + C\, dx \wedge dy \) is a 2-form, then
\[
d\eta = \Bigl(\frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} + \frac{\partial C}{\partial z}\Bigr) dx \wedge dy \wedge dz.
\]
This corresponds to the divergence \( \nabla \cdot \mathbf{F} \).
The de Rham complex of \(\mathbb{R}^3\), \[ \Omega^0(\mathbb{R}^3) \xrightarrow{d} \Omega^1(\mathbb{R}^3) \xrightarrow{d} \Omega^2(\mathbb{R}^3) \xrightarrow{d} \Omega^3(\mathbb{R}^3), \]corresponds (via the musical isomorphisms of a Riemannian metric on \(\mathbb{R}^3\)) to
\[ C^\infty(\mathbb{R}^3) \xrightarrow{\nabla} \mathfrak{X}(\mathbb{R}^3) \xrightarrow{\nabla \times} \mathfrak{X}(\mathbb{R}^3) \xrightarrow{\nabla \cdot} C^\infty(\mathbb{R}^3). \]The identity \(d^2 = 0\) says “\(\text{curl} \circ \text{grad} = 0\)” and “\(\text{div} \circ \text{curl} = 0\).” The cohomology groups \(H^k_{\text{dR}}(\mathbb{R}^3)\) all vanish for \(k \geq 1\) (by the Poincaré lemma), which expresses the familiar facts: every irrotational vector field on \(\mathbb{R}^3\) has a potential, and every divergence-free vector field on \(\mathbb{R}^3\) is a curl. On a non-contractible domain these facts can fail, and the failures are measured by \(H^1\) and \(H^2\) respectively.
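The identity \( d^2 = 0 \) in its vector-calculus form can be checked symbolically. Below is a minimal sketch using Python's sympy; the test function `f` and field `F` are hypothetical, and `grad`/`curl`/`div` are written out from the coordinate formulas above.

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

def grad(f):
    # f -> df, identified with the gradient vector field
    return (sp.diff(f, x), sp.diff(f, y), sp.diff(f, z))

def curl(F):
    # 1-form -> 2-form, identified with the curl
    P, Q, R = F
    return (sp.diff(R, y) - sp.diff(Q, z),
            sp.diff(P, z) - sp.diff(R, x),
            sp.diff(Q, x) - sp.diff(P, y))

def div(F):
    # 2-form -> 3-form, identified with the divergence
    P, Q, R = F
    return sp.diff(P, x) + sp.diff(Q, y) + sp.diff(R, z)

# Hypothetical test data: any smooth function and vector field will do.
f = sp.exp(x) * sp.sin(y * z)
F = (x**2 * y, sp.cos(z) + y, x * z**3)

# d^2 = 0 reads "curl grad = 0" and "div curl = 0".
assert all(sp.simplify(c) == 0 for c in curl(grad(f)))
assert sp.simplify(div(curl(F))) == 0
```

The vanishing in both asserts comes from the symmetry of mixed partials, exactly as in the coordinate proof of \( d^2 = 0 \).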
9.5 The Invariant Formula for the Exterior Derivative
While the coordinate formula for \( d \) is useful for computation, it is often valuable to have a coordinate-free expression. For a \( k \)-form \( \omega \) and vector fields \( X_0, \ldots, X_k \), the exterior derivative is given by:
\[ d\omega(X_0, \ldots, X_k) = \sum_{i=0}^{k} (-1)^i X_i\bigl(\omega(X_0, \ldots, \widehat{X_i}, \ldots, X_k)\bigr) + \sum_{0 \le i < j \le k} (-1)^{i+j}\, \omega\bigl([X_i, X_j], X_0, \ldots, \widehat{X_i}, \ldots, \widehat{X_j}, \ldots, X_k\bigr), \]where a hat denotes omission of that argument.
For example, when \( \omega \) is a 1-form and \( X, Y \) are vector fields,
\[ d\omega(X, Y) = X(\omega(Y)) - Y(\omega(X)) - \omega([X, Y]). \]This formula is extremely important in differential geometry, particularly in the theory of connections and curvature.
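The formula for 1-forms can be verified symbolically in coordinates on \( \mathbb{R}^2 \). A sympy sketch (the form \( \omega \) and fields \( X, Y \) below are hypothetical test data):

```python
import sympy as sp

x, y = sp.symbols('x y')

# Hypothetical test data on R^2: omega = a dx + b dy, and vector fields
# X = X1 d/dx + X2 d/dy, Y likewise, given by their component functions.
a, b = x*y, sp.sin(x) + y**2
X = (x**2, y)
Y = (sp.cos(y), x*y)

def apply_vf(V, g):
    # V(g) = V^1 dg/dx + V^2 dg/dy
    return V[0]*sp.diff(g, x) + V[1]*sp.diff(g, y)

def bracket(V, W):
    # components of the Lie bracket [V, W]
    return tuple(apply_vf(V, W[i]) - apply_vf(W, V[i]) for i in range(2))

def omega(V):
    # omega evaluated on the vector field V
    return a*V[0] + b*V[1]

# d(omega) = (b_x - a_y) dx ^ dy, and (dx ^ dy)(X, Y) = X1*Y2 - X2*Y1.
lhs = (sp.diff(b, x) - sp.diff(a, y)) * (X[0]*Y[1] - X[1]*Y[0])
rhs = apply_vf(X, omega(Y)) - apply_vf(Y, omega(X)) - omega(bracket(X, Y))

assert sp.simplify(lhs - rhs) == 0
```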
9.6 Pullback of Differential Forms
If \( F \colon M \to N \) is a smooth map and \( \omega \) is a differential form on \( N \), we can pull \( \omega \) back to a form on \( M \). This operation is contravariantly functorial and interacts beautifully with all the algebraic and differential operations on forms.
The pullback \( F^*\omega \in \Omega^k(M) \) is defined pointwise by \[ (F^*\omega)_p(v_1, \ldots, v_k) = \omega_{F(p)}\bigl(dF_p(v_1), \ldots, dF_p(v_k)\bigr) \]for all \( p \in M \) and \( v_1, \ldots, v_k \in T_pM \).
- \( F^*(\alpha \wedge \beta) = (F^*\alpha) \wedge (F^*\beta) \) for all forms \( \alpha, \beta \) on \( N \).
- \( d(F^*\omega) = F^*(d\omega) \) for all \( \omega \in \Omega^k(N) \), i.e., pullback commutes with the exterior derivative.
- \( (G \circ F)^* = F^* \circ G^* \) for smooth maps \( F \colon M \to N \) and \( G \colon N \to P \).
- \( (\mathrm{Id}_M)^* = \mathrm{Id}_{\Omega^*(M)} \).
The naturality property \( d \circ F^* = F^* \circ d \) is particularly significant: the exterior derivative is a natural operator. The proof uses the coordinate formula for \( d \) together with the chain rule for the differential \( dF \). The last two properties express the fact that \( \Omega^*(-) \) is a contravariant functor from the category of smooth manifolds to the category of graded algebras.
In local coordinates, if \( F \colon M \to N \) is given in coordinates by \( y^j = F^j(x^1, \ldots, x^m) \), then the pullback of a basic 1-form is \( F^*(dy^j) = \sum_i \frac{\partial F^j}{\partial x^i} dx^i \), and pullback of a general form follows from linearity and the wedge product property.
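Naturality, \( d(F^*\omega) = F^*(d\omega) \), can be checked in coordinates. A sympy sketch for a hypothetical map \( F \colon \mathbb{R}^2 \to \mathbb{R}^2 \) and the 1-form \( \omega = x\,dy \):

```python
import sympy as sp

u, v = sp.symbols('u v')

# A hypothetical smooth map F : R^2 -> R^2, (u, v) |-> (x, y) = (u^2 - v^2, 2uv).
Fx, Fy = u**2 - v**2, 2*u*v

# Pull back the 1-form omega = x dy on the target:
# F*omega = Fx * d(Fy) = P du + Q dv.
P = Fx * sp.diff(Fy, u)
Q = Fx * sp.diff(Fy, v)

# d(F*omega) = (dQ/du - dP/dv) du ^ dv.
d_pullback = sp.diff(Q, u) - sp.diff(P, v)

# d(omega) = dx ^ dy, so F*(d omega) = (Jacobian determinant of F) du ^ dv.
pullback_d = sp.diff(Fx, u)*sp.diff(Fy, v) - sp.diff(Fx, v)*sp.diff(Fy, u)

# Naturality: pullback commutes with the exterior derivative.
assert sp.simplify(d_pullback - pullback_d) == 0
```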
9.7 Interior Product and Cartan’s Magic Formula
The interior product (also called contraction) is an algebraic operation that “inserts” a vector field into a differential form, reducing its degree by one. For \( X \in \mathfrak{X}(M) \) and \( \omega \in \Omega^k(M) \), the interior product \( \iota_X \omega \in \Omega^{k-1}(M) \) is defined by \[ (\iota_X \omega)(Y_1, \ldots, Y_{k-1}) = \omega(X, Y_1, \ldots, Y_{k-1}). \]
By convention, \( \iota_X f = 0 \) for \( f \in \Omega^0(M) \).
The interior product is a graded derivation of degree \( -1 \): it satisfies
\[ \iota_X(\alpha \wedge \beta) = (\iota_X \alpha) \wedge \beta + (-1)^k \alpha \wedge (\iota_X \beta) \]for \( \alpha \in \Omega^k(M) \), and \( \iota_X \circ \iota_X = 0 \). Since it is defined pointwise, it is \( C^\infty(M) \)-linear in both \( X \) and \( \omega \).
The most remarkable formula involving the interior product connects it to the Lie derivative of differential forms.
Cartan’s magic formula states that for every \( X \in \mathfrak{X}(M) \) and \( \omega \in \Omega^k(M) \), \[ \mathcal{L}_X \omega = \iota_X(d\omega) + d(\iota_X \omega). \]Equivalently, \( \mathcal{L}_X = \iota_X \circ d + d \circ \iota_X \).
Cartan’s formula is indispensable in computations involving the Lie derivative. It also provides the key tool for proving homotopy invariance of de Rham cohomology. The operators \( d \), \( \iota_X \), and \( \mathcal{L}_X \) satisfy the following graded commutator relations:
- \( [d, d] = 2d^2 = 0 \),
- \( [\iota_X, \iota_Y] = 0 \),
- \( [d, \iota_X] = \mathcal{L}_X \) (Cartan's formula),
- \( [\mathcal{L}_X, \iota_Y] = \iota_{[X,Y]} \),
- \( [\mathcal{L}_X, d] = 0 \),
- \( [\mathcal{L}_X, \mathcal{L}_Y] = \mathcal{L}_{[X,Y]} \).
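Cartan’s formula can be tested in coordinates on \( \mathbb{R}^2 \), computing \( \mathcal{L}_X \omega \) from the standard component formula \( (\mathcal{L}_X \omega)_i = X^j \partial_j \omega_i + \omega_j \partial_i X^j \); the form and field below are hypothetical test data.

```python
import sympy as sp

x, y = sp.symbols('x y')

# Hypothetical test data on R^2: omega = a dx + b dy and X = X1 d/dx + X2 d/dy.
a, b = x**2*y, sp.sin(x*y)
X1, X2 = y, x*y**2

# Lie derivative from the component formula
# (L_X omega)_i = X^j d_j(omega_i) + omega_j d_i(X^j).
lie_dx = X1*sp.diff(a, x) + X2*sp.diff(a, y) + a*sp.diff(X1, x) + b*sp.diff(X2, x)
lie_dy = X1*sp.diff(b, x) + X2*sp.diff(b, y) + a*sp.diff(X1, y) + b*sp.diff(X2, y)

# Cartan's formula: L_X omega = iota_X(d omega) + d(iota_X omega), with
# d omega = (b_x - a_y) dx ^ dy and iota_X(dx ^ dy) = X1 dy - X2 dx.
c = sp.diff(b, x) - sp.diff(a, y)
f = a*X1 + b*X2                       # iota_X omega, a function
cartan_dx = -c*X2 + sp.diff(f, x)
cartan_dy =  c*X1 + sp.diff(f, y)

assert sp.simplify(lie_dx - cartan_dx) == 0
assert sp.simplify(lie_dy - cartan_dy) == 0
```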
Chapter 10: Orientations and Integration on Manifolds
10.1 Orientation of Vector Spaces
Before we can integrate differential forms on manifolds, we need the concept of orientation. The idea is simple: an orientation is a consistent choice of “which ordered bases are positively oriented.” We begin with the linear algebra.
Two ordered bases \( (v_1, \ldots, v_n) \) and \( (w_1, \ldots, w_n) \) of an \( n \)-dimensional real vector space \( V \) are declared equivalent if the change-of-basis matrix \( B \) defined by \( w_j = \sum_i B^i_j v_i \) has \( \det B > 0 \). This relation is an equivalence relation on the set of ordered bases of \( V \), and it partitions the bases into exactly two equivalence classes. An orientation of \( V \) is a choice of one of these two equivalence classes. The chosen class is called the class of positively oriented bases; the other class consists of negatively oriented bases. A vector space together with a choice of orientation is called an oriented vector space.
There is an equivalent description using the top exterior power. Since \( \dim \Lambda^n(V^*) = 1 \), a nonzero element \( \mu \in \Lambda^n(V^*) \) determines an orientation: a basis \( (v_1, \ldots, v_n) \) is positively oriented if and only if \( \mu(v_1, \ldots, v_n) > 0 \). Two nonzero \( n \)-forms \( \mu \) and \( \mu' \) determine the same orientation if and only if \( \mu' = c\mu \) for some \( c > 0 \).
10.2 Orientations of Manifolds
We now extend the notion of orientation from vector spaces to manifolds by requiring a consistent, continuously varying choice of orientation on each tangent space.
A pointwise orientation of \( M \) assigns an orientation to each tangent space \( T_pM \); it is continuous if every point lies in a coordinate chart whose coordinate frame \( \bigl(\frac{\partial}{\partial x^1}, \ldots, \frac{\partial}{\partial x^n}\bigr) \) is positively oriented at each point of the chart. An orientation of \( M \) is a continuous pointwise orientation, and \( M \) is orientable if one exists. Equivalently, \( M \) is orientable if and only if there exists a nowhere-vanishing smooth \( n \)-form on \( M \). This is the most useful characterization in practice.
- Every \( \mathbb{R}^n \) is orientable, with the standard orientation given by \( dx^1 \wedge \cdots \wedge dx^n \).
- Every sphere \( S^n \) is orientable.
- Every Lie group is orientable (any nonzero left-invariant \( n \)-form is nowhere vanishing).
- The Möbius band is not orientable.
- The real projective space \( \mathbb{R}P^n \) is orientable if and only if \( n \) is odd.
A smooth map \( F \colon M \to N \) between oriented \( n \)-manifolds is orientation-preserving if the differential \( dF_p \colon T_pM \to T_{F(p)}N \) maps positively oriented bases to positively oriented bases for all \( p \), and orientation-reversing if it reverses the orientation at every point. For a diffeomorphism, this is equivalent to \( F^*\mu_N = f \mu_M \) with \( f > 0 \) everywhere (orientation-preserving) or \( f < 0 \) everywhere (orientation-reversing).
10.3 Volume Forms and the Riemannian Volume Form
A nowhere-vanishing \( n \)-form on an oriented \( n \)-manifold serves as a “volume element.” On a Riemannian manifold, there is a canonical choice.
In local coordinates \( (x^1, \ldots, x^n) \) compatible with the orientation,
\[ dV_g = \sqrt{\det(g_{ij})}\, dx^1 \wedge \cdots \wedge dx^n, \]where \( g_{ij} = g\bigl(\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}\bigr) \).
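As a sanity check of the coordinate formula, the following sympy sketch computes the induced metric on \( S^2 \) in spherical coordinates and recovers the familiar area form \( \sin\theta\, d\theta \wedge d\phi \).

```python
import sympy as sp

theta, phi = sp.symbols('theta phi')

# Spherical parametrisation of the unit sphere S^2 (a worked example).
r = sp.Matrix([sp.sin(theta)*sp.cos(phi),
               sp.sin(theta)*sp.sin(phi),
               sp.cos(theta)])
r_th, r_ph = r.diff(theta), r.diff(phi)

# Induced metric g_ij = <d_i r, d_j r> in the coordinates (theta, phi).
g = sp.Matrix([[r_th.dot(r_th), r_th.dot(r_ph)],
               [r_ph.dot(r_th), r_ph.dot(r_ph)]])

# det g = sin(theta)^2, so dV_g = sin(theta) dtheta ^ dphi for 0 < theta < pi.
assert sp.simplify(g.det() - sp.sin(theta)**2) == 0
```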
Example. We compute the integral of the 2-form \[ \omega = x\,dy \wedge dz + y\,dz \wedge dx + z\,dx \wedge dy \](where \(x, y, z\) are the standard coordinates on \(\mathbb{R}^3\) restricted to \(S^2 \subset \mathbb{R}^3\)) over the unit sphere \(S^2\) with the outward orientation.
We use spherical coordinates \(x = \sin\theta\cos\phi\), \(y = \sin\theta\sin\phi\), \(z = \cos\theta\), where \(\theta \in (0,\pi)\) and \(\phi \in (0, 2\pi)\). First, we compute the pullbacks of the coordinate differentials:
\[ dx = \cos\theta\cos\phi\,d\theta - \sin\theta\sin\phi\,d\phi, \]\[ dy = \cos\theta\sin\phi\,d\theta + \sin\theta\cos\phi\,d\phi, \]\[ dz = -\sin\theta\,d\theta. \]We compute \(dy \wedge dz\):
\[ dy \wedge dz = (\cos\theta\sin\phi\,d\theta + \sin\theta\cos\phi\,d\phi) \wedge (-\sin\theta\,d\theta) = -\sin^2\theta\cos\phi\,d\phi \wedge d\theta = \sin^2\theta\cos\phi\,d\theta \wedge d\phi. \]Similarly, \(dz \wedge dx = \sin^2\theta\sin\phi\,d\theta \wedge d\phi\) and \(dx \wedge dy = \sin\theta\cos\theta\,d\theta \wedge d\phi\) — but we can compute the sum more efficiently. Note that
\[ x\,dy \wedge dz + y\,dz \wedge dx + z\,dx \wedge dy \]restricted to \(S^2\) equals the Riemannian volume form \(\sin\theta\,d\theta \wedge d\phi\). Indeed, the form \(\omega\) is the standard area form on the sphere; one can verify:
\[ x \cdot \sin^2\theta\cos\phi + y \cdot \sin^2\theta\sin\phi + z \cdot \sin\theta\cos\theta = \sin^3\theta\cos^2\phi + \sin^3\theta\sin^2\phi + \sin\theta\cos^2\theta = \sin\theta. \]A cleaner approach: the outward unit normal to \(S^2\) at a point \((x,y,z)\) is \(\nu = (x,y,z)\) itself, and the area form equals the contraction \(\iota_\nu(dx \wedge dy \wedge dz)\), which gives precisely \(\omega\). Therefore
\[ \int_{S^2} \omega = \int_{S^2} dV_{g_{S^2}} = \text{area of } S^2 = 4\pi. \]This is consistent with the divergence theorem: \(d\omega = (1+1+1)\,dx \wedge dy \wedge dz = 3\,dV_{\mathbb{R}^3}\), so
\[ \int_{S^2} \omega = \int_{B^3} d\omega = 3 \cdot \text{vol}(B^3) = 3 \cdot \frac{4\pi}{3} = 4\pi. \qquad \checkmark \]
10.4 Integration of Differential Forms
We are now in a position to define integration of differential forms on oriented manifolds. The key observation is that an \( n \)-form on an oriented \( n \)-manifold can be integrated using partitions of unity, because a top-degree form transforms under coordinate changes by the absolute value of the Jacobian determinant (with the correct sign, thanks to orientation).
- Single chart: If \( \operatorname{supp}(\omega) \) is contained in a single positively oriented coordinate chart \( (U, \varphi) \) with coordinates \( (x^1, \ldots, x^n) \), write \( \omega = f\, dx^1 \wedge \cdots \wedge dx^n \) on \( U \), and define
\[
\int_M \omega = \int_{\varphi(U)} f \circ \varphi^{-1}\, dx^1 \cdots dx^n,
\]
where the right-hand side is the ordinary Lebesgue integral on \( \mathbb{R}^n \).
- General case: Choose a partition of unity \( \{\psi_\alpha\} \) subordinate to a positively oriented atlas, and define \[ \int_M \omega = \sum_\alpha \int_M \psi_\alpha \omega. \]
The crucial point is that this definition is independent of the choice of oriented atlas and partition of unity. This follows from the change-of-variables formula for multiple integrals: if \( \varphi_\beta \circ \varphi_\alpha^{-1} \) is the transition map, the Jacobian determinant is positive (by the orientation assumption), so the absolute value signs in the change-of-variables formula can be dropped.
- Linearity: \( \int_M (a\omega + b\eta) = a \int_M \omega + b \int_M \eta \).
- Orientation reversal: If \( \overline{M} \) denotes \( M \) with the opposite orientation, then \( \int_{\overline{M}} \omega = -\int_M \omega \).
- Diffeomorphism invariance: If \( F \colon M \to N \) is an orientation-preserving diffeomorphism, then \( \int_N \omega = \int_M F^*\omega \).
10.5 Manifolds with Boundary and Induced Orientation
To state Stokes’s theorem, we need manifolds with boundary and an orientation convention for the boundary.
Recall that a smooth manifold with boundary \( M \) is locally modeled on the upper half-space \( \mathbb{H}^n = \{x \in \mathbb{R}^n : x^n \ge 0\} \). The boundary \( \partial M \) is a smooth \( (n-1) \)-manifold (without boundary).
10.6 Stokes’s Theorem
We now arrive at the crowning result of the theory of differential forms and integration.
Stokes’s theorem: if \( M \) is an oriented smooth \( n \)-manifold with boundary and \( \omega \) is a compactly supported smooth \( (n-1) \)-form on \( M \), then \[ \int_M d\omega = \int_{\partial M} \omega, \]where \( \partial M \) carries the induced (Stokes) orientation. If \( \partial M = \emptyset \), the right-hand side is zero.
Case 1: Interior chart. Suppose \( \operatorname{supp}(\omega) \subset U \) where \( U \) is diffeomorphic to an open subset of \( \mathbb{R}^n \). In coordinates, write
\[ \omega = \sum_{j=1}^{n} (-1)^{j-1} f_j\, dx^1 \wedge \cdots \wedge \widehat{dx^j} \wedge \cdots \wedge dx^n. \]Then
\[ d\omega = \Bigl(\sum_{j=1}^{n} \frac{\partial f_j}{\partial x^j}\Bigr) dx^1 \wedge \cdots \wedge dx^n. \]The integral \( \int_M d\omega = \sum_j \int_{\mathbb{R}^n} \frac{\partial f_j}{\partial x^j}\, dx^1 \cdots dx^n \). Each term vanishes by iterated integration: integrating \( \frac{\partial f_j}{\partial x^j} \) with respect to \( x^j \) over all of \( \mathbb{R} \), the compact support ensures the integral of the derivative is zero. Since \( U \cap \partial M = \emptyset \), the right-hand side is also zero.
Case 2: Boundary chart. Suppose \( \operatorname{supp}(\omega) \subset U \) where \( U \) is diffeomorphic to an open subset of \( \mathbb{H}^n = \{x^n \ge 0\} \). The computation is similar, but now integrating \( \frac{\partial f_n}{\partial x^n} \) over \( x^n \in [0, \infty) \) yields a boundary contribution. Specifically,
\[ \int_0^\infty \frac{\partial f_n}{\partial x^n}\, dx^n = -f_n(x^1, \ldots, x^{n-1}, 0), \]and tracking signs using the induced orientation shows this equals \( \int_{\partial M} \omega \).
10.7 Classical Special Cases of Stokes’s Theorem
The classical integral theorems of vector calculus are all special cases of Stokes’s theorem. This unification is one of the great achievements of the language of differential forms.
Fundamental theorem of calculus: \( \int_a^b f'(t)\, dt = f(b) - f(a) \). This is Stokes’s theorem with \( M = [a,b] \), \( \omega = f \), and \( d\omega = f'\, dt \); the boundary \( \partial M = \{a, b\} \) carries the induced orientation (\(+\) at \( b \), \(-\) at \( a \)).
Green’s theorem: for a region \( D \subset \mathbb{R}^2 \) with boundary curve \( \partial D \), \( \oint_{\partial D} (P\,dx + Q\,dy) = \iint_D \bigl(\tfrac{\partial Q}{\partial x} - \tfrac{\partial P}{\partial y}\bigr)\, dA \). This is Stokes’s theorem with \( \omega = P\, dx + Q\, dy \).
Divergence theorem: for a region \( \Omega \subset \mathbb{R}^3 \) with boundary surface \( \partial\Omega \), \( \iint_{\partial\Omega} \mathbf{F} \cdot \mathbf{n}\, dS = \iiint_\Omega (\nabla \cdot \mathbf{F})\, dV \). This is Stokes’s theorem with \( \omega = F^1\, dy \wedge dz + F^2\, dz \wedge dx + F^3\, dx \wedge dy \).
Consider the 1-form \(\omega = -y\,dx + x\,dy\) on \(\mathbb{R}^3\), and let \(M\) be the closed upper unit hemisphere \(\{x^2 + y^2 + z^2 = 1,\ z \ge 0\}\), whose boundary \(\partial M\) is the unit circle in the \(xy\)-plane. We compute both sides of Stokes’s theorem \(\int_M d\omega = \int_{\partial M} \omega\).
Right-hand side: Parametrise \(\partial M\) by \(\gamma(t) = (\cos t, \sin t, 0)\), \(t \in [0, 2\pi]\). Then
\[ \int_{\partial M} \omega = \int_0^{2\pi} (-\sin t)(-\sin t) + (\cos t)(\cos t)\,dt = \int_0^{2\pi} 1\,dt = 2\pi. \]Left-hand side: We compute \(d\omega = d(-y\,dx + x\,dy) = -dy \wedge dx + dx \wedge dy = 2\,dx \wedge dy\). We integrate over the upper hemisphere using the parametrisation \(F(\theta, \phi) = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)\) for \(\theta \in [0, \pi/2]\), \(\phi \in [0, 2\pi]\). The pullback of \(dx \wedge dy\) under \(F\) is \(\cos\theta\sin\theta\,d\theta \wedge d\phi\) (the \(z\)-component of the area form). Hence
\[ \int_M d\omega = \int_0^{2\pi}\int_0^{\pi/2} 2\cos\theta\sin\theta\,d\theta\,d\phi = 2\pi \int_0^{\pi/2} \sin(2\theta)\,d\theta = 2\pi \cdot \left[-\frac{\cos(2\theta)}{2}\right]_0^{\pi/2} = 2\pi \cdot 1 = 2\pi. \checkmark \]Both sides equal \(2\pi\), confirming Stokes’ theorem.
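The example above can be verified end to end in sympy; both sides reduce to elementary trigonometric integrals.

```python
import sympy as sp

t, theta, phi = sp.symbols('t theta phi')

# Right-hand side: omega = -y dx + x dy over the boundary circle
# gamma(t) = (cos t, sin t, 0).
xb, yb = sp.cos(t), sp.sin(t)
rhs = sp.integrate(-yb*sp.diff(xb, t) + xb*sp.diff(yb, t), (t, 0, 2*sp.pi))

# Left-hand side: d(omega) = 2 dx ^ dy over the upper hemisphere,
# parametrised by theta in [0, pi/2], phi in [0, 2*pi].
X = sp.sin(theta)*sp.cos(phi)
Y = sp.sin(theta)*sp.sin(phi)
# Coefficient of d(theta) ^ d(phi) in the pullback of dx ^ dy.
jac = sp.diff(X, theta)*sp.diff(Y, phi) - sp.diff(X, phi)*sp.diff(Y, theta)
lhs = sp.integrate(sp.integrate(2*jac, (theta, 0, sp.pi/2)), (phi, 0, 2*sp.pi))

assert sp.simplify(rhs - 2*sp.pi) == 0
assert sp.simplify(lhs - rhs) == 0
```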
10.8 Divergence and the Divergence Theorem on Riemannian Manifolds
On a Riemannian manifold \( (M, g) \), one can define the divergence of a vector field and state the divergence theorem in a coordinate-free manner.
The divergence of \( X \in \mathfrak{X}(M) \) is the function \( \operatorname{div} X \in C^\infty(M) \) defined by \( \mathcal{L}_X(dV_g) = (\operatorname{div} X)\, dV_g \), or equivalently, \( d(\iota_X\, dV_g) = (\operatorname{div} X)\, dV_g \) (by Cartan’s formula, since \( d(dV_g) = 0 \)).
In local coordinates with \( dV_g = \sqrt{\det g}\, dx^1 \wedge \cdots \wedge dx^n \) and \( X = X^i \partial_i \), one computes
\[ \operatorname{div} X = \frac{1}{\sqrt{\det g}} \frac{\partial}{\partial x^i}\bigl(\sqrt{\det g}\, X^i\bigr). \]The divergence theorem then states that for a compact oriented Riemannian manifold \( (M, g) \) with boundary, \[ \int_M (\operatorname{div} X)\, dV_g = \int_{\partial M} g(X, \nu)\, dV_{\bar{g}}, \]where \( \nu \) is the outward unit normal along \( \partial M \) and \( \bar{g} \) is the induced metric on \( \partial M \).
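The coordinate formula can be sanity-checked against a Cartesian computation. A sympy sketch for the radial unit field on \( \mathbb{R}^2 \setminus \{0\} \) (a hypothetical example), computed once in Cartesian and once in polar coordinates:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
r = sp.symbols('r', positive=True)

# The radial unit field on R^2 \ {0} in Cartesian coordinates: X = (x, y)/|p|.
rho = sp.sqrt(x**2 + y**2)
div_cart = sp.simplify(sp.diff(x/rho, x) + sp.diff(y/rho, y))

# In polar coordinates (r, theta): g = diag(1, r^2), so sqrt(det g) = r,
# and X = d/dr, i.e. X^r = 1, X^theta = 0.  The coordinate formula gives
# div X = (1/r) d/dr (r * 1) = 1/r.
div_polar = sp.simplify(sp.diff(r * 1, r) / r)

# Both computations agree: div X = 1/|p| = 1/r.
assert sp.simplify(div_cart - 1/rho) == 0
assert sp.simplify(div_polar - 1/r) == 0
```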
Chapter 11: De Rham Cohomology
11.1 Closed and Exact Forms
The identity \( d^2 = 0 \) — which states that the exterior derivative of an exterior derivative is always zero — is deceptively simple. Its consequences, however, are profound: it gives rise to a cohomology theory that captures deep topological information about the manifold. A \( k \)-form \( \omega \in \Omega^k(M) \) is called:
- Closed if \( d\omega = 0 \).
- Exact if \( \omega = d\eta \) for some \( \eta \in \Omega^{k-1}(M) \).
Write \( Z^k(M) = \ker\bigl(d \colon \Omega^k(M) \to \Omega^{k+1}(M)\bigr) \) for the space of closed \( k \)-forms and \( B^k(M) = \operatorname{im}\bigl(d \colon \Omega^{k-1}(M) \to \Omega^k(M)\bigr) \) for the space of exact \( k \)-forms. Since \( d^2 = 0 \), every exact form is closed: \( B^k(M) \subseteq Z^k(M) \). The central question of de Rham cohomology is: is every closed form exact? The answer depends on the topology of \( M \). The discrepancy between closed and exact forms measures the “nontrivial topology” (holes, cycles) of the manifold.
To ground the discussion in physics: in electrostatics, the electric field \(\mathbf{E}\) satisfies \(\nabla \times \mathbf{E} = 0\) (i.e., the 1-form \(E_x\,dx + E_y\,dy + E_z\,dz\) is closed). The question “can \(\mathbf{E}\) be written as \(-\nabla \phi\) for a global potential \(\phi\)?” (i.e., “is the closed 1-form exact?”) depends on the topology of the domain. In simply connected space (\(\mathbb{R}^3\) or any contractible region), the Poincaré lemma says yes. But if the domain has a “hole” — such as the region \(\mathbb{R}^3 \setminus \{\text{line}\}\) around an infinite wire — then closed forms need not be exact, and a global potential may not exist. De Rham cohomology is precisely the algebraic gadget that measures this obstruction: \(H^1_{\text{dR}}(M) = 0\) if and only if every closed 1-form on \(M\) is exact, i.e., every irrotational vector field has a global potential.
The standard example is the angle form \( \omega = \frac{-y\,dx + x\,dy}{x^2 + y^2} \) on \( \mathbb{R}^2 \setminus \{0\} \): a direct computation shows \( d\omega = 0 \), yet \( \int_{S^1} \omega = 2\pi \neq 0 \). By Stokes’s theorem, if \( \omega = df \) for some function \( f \), then this integral would be zero. Thus \( \omega \) is closed but not exact on \( \mathbb{R}^2 \setminus \{0\} \), reflecting the fact that \( \mathbb{R}^2 \setminus \{0\} \) has a “hole.”
11.2 The De Rham Cohomology Groups
The \( k \)-th de Rham cohomology group of \( M \) is the quotient vector space \[ H^k_{\text{dR}}(M) = Z^k(M) / B^k(M). \]Elements of \( H^k_{\text{dR}}(M) \) are equivalence classes \( [\omega] \), where \( \omega \) is a closed \( k \)-form and \( [\omega] = [\omega'] \) if and only if \( \omega - \omega' \) is exact.
The de Rham cohomology groups are real vector spaces. They are diffeomorphism invariants of \( M \), and in fact, much more is true — they are homotopy invariants. We begin by computing the simplest case.
A closed 0-form is a smooth function \( f \) with \( df = 0 \), i.e., a locally constant function. Hence \[ H^0_{\text{dR}}(M) \cong \mathbb{R}^c, \]where \( c \) is the number of connected components of \( M \).
11.3 Functoriality and Diffeomorphism Invariance
Smooth maps between manifolds induce maps on cohomology that go in the “reverse” direction, making de Rham cohomology a contravariant functor. If \( F \colon M \to N \) is smooth, the pullback descends to a well-defined linear map \( F^* \colon H^k_{\text{dR}}(N) \to H^k_{\text{dR}}(M) \), \( [\omega] \mapsto [F^*\omega] \), because:
- \( F^* \) maps closed forms to closed forms (since \( d \circ F^* = F^* \circ d \)).
- \( F^* \) maps exact forms to exact forms (since \( F^*(d\eta) = d(F^*\eta) \)).
Moreover:
- \( (G \circ F)^* = F^* \circ G^* \) on cohomology.
- \( (\mathrm{Id}_M)^* = \mathrm{Id}_{H^*_{\text{dR}}(M)} \).
11.4 Homotopy Invariance
De Rham cohomology is invariant under a much weaker equivalence than diffeomorphism: it is a homotopy invariant. This is one of its most powerful features.
If \( F, G \colon M \to N \) are smoothly homotopic maps, then the induced maps \( F^*, G^* \colon H^k_{\text{dR}}(N) \to H^k_{\text{dR}}(M) \) coincide for all \( k \). Consequently, homotopy equivalent manifolds have isomorphic de Rham cohomology.
The proof uses a chain homotopy (or homotopy operator) \( h \colon \Omega^k(N) \to \Omega^{k-1}(M) \) satisfying the fundamental identity
\[ G^*\omega - F^*\omega = d(h\omega) + h(d\omega) \]for all \( \omega \in \Omega^k(N) \). If \( \omega \) is closed, this gives \( G^*\omega - F^*\omega = d(h\omega) \), which means \( [G^*\omega] = [F^*\omega] \) in \( H^k_{\text{dR}}(M) \). The operator \( h \) is constructed explicitly using integration along the fiber of the homotopy \( H \colon M \times [0,1] \to N \).
11.5 The Poincaré Lemma
The most fundamental computation in de Rham cohomology is the cohomology of \( \mathbb{R}^n \) (or, more generally, any contractible space).
Poincaré lemma: \( H^0_{\text{dR}}(\mathbb{R}^n) \cong \mathbb{R} \), and \( H^k_{\text{dR}}(\mathbb{R}^n) = 0 \) for all \( k \ge 1 \). That is, every closed \( k \)-form on \( \mathbb{R}^n \) (with \( k \ge 1 \)) is exact. More generally, the same holds for any star-shaped open subset of \( \mathbb{R}^n \), or any contractible manifold.
The quickest proof is via homotopy invariance: \( \mathbb{R}^n \) is contractible, hence homotopy equivalent to a point. Since the cohomology of a point is \( \mathbb{R} \) in degree 0 and 0 in all positive degrees, the result follows.
More concretely, we construct the homotopy operator \( h \) explicitly. Let \( \omega = \sum_I f_I(x)\, dx^I \) be a closed \( k \)-form on a star-shaped domain (star-shaped about the origin). Define
\[ (h\omega)(x) = \sum_I \sum_{j \in I} \pm \Bigl(\int_0^1 t^{k-1} f_I(tx)\, dt\Bigr) x^j\, dx^{I \setminus \{j\}}, \]where the precise signs come from contracting with the radial vector field \( R = \sum_i x^i \frac{\partial}{\partial x^i} \) and integrating in \( t \). One then verifies that \( dh\omega + hd\omega = \omega \) for \( k \ge 1 \), so that closed implies exact.
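For \( k = 1 \) the homotopy operator is simply a line-integral recipe for a potential: for a closed 1-form \( \omega = f\,dx + g\,dy \) on a star-shaped domain, \( (h\omega)(x,y) = \int_0^1 \bigl( x f(tx,ty) + y g(tx,ty) \bigr) dt \) satisfies \( d(h\omega) = \omega \). A sympy sketch with hypothetical polynomial data:

```python
import sympy as sp

x, y, t = sp.symbols('x y t')

# A hypothetical closed 1-form on R^2: omega = f dx + g dy with f_y = g_x.
f = 2*x*y + y**3
g = x**2 + 3*x*y**2 + y
assert sp.diff(f, y) == sp.diff(g, x)   # omega is closed

# Homotopy operator for k = 1 on a star-shaped domain:
# (h omega)(x, y) = int_0^1 [ x f(tx, ty) + y g(tx, ty) ] dt.
F = sp.integrate(x*f.subs([(x, t*x), (y, t*y)])
                 + y*g.subs([(x, t*x), (y, t*y)]), (t, 0, 1))

# Then d(h omega) = omega: F is a global potential for omega.
assert sp.simplify(sp.diff(F, x) - f) == 0
assert sp.simplify(sp.diff(F, y) - g) == 0
```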
11.6 The Mayer–Vietoris Sequence
The Mayer–Vietoris sequence is the primary computational tool for de Rham cohomology. It allows us to compute the cohomology of a manifold by breaking it into simpler pieces.
If \( M = U \cup V \) is a union of two open sets, there is a long exact sequence \[ \cdots \to H^k_{\text{dR}}(M) \xrightarrow{(i^*,\, j^*)} H^k_{\text{dR}}(U) \oplus H^k_{\text{dR}}(V) \xrightarrow{k^* - \ell^*} H^k_{\text{dR}}(U \cap V) \xrightarrow{\delta^*} H^{k+1}_{\text{dR}}(M) \to \cdots, \]where \( i, j \) are the inclusions of \( U, V \) into \( M \), and \( k, \ell \) are the inclusions of \( U \cap V \) into \( U, V \) respectively. The map \( \delta^* \) is the connecting homomorphism.
The sequence arises from the short exact sequence of cochain complexes
\[ 0 \to \Omega^*(M) \xrightarrow{(i^*, j^*)} \Omega^*(U) \oplus \Omega^*(V) \xrightarrow{k^* - \ell^*} \Omega^*(U \cap V) \to 0. \]Surjectivity of the last map uses a partition of unity argument. The connecting homomorphism \( \delta^* \) is constructed by the usual diagram chase.
The Mayer–Vietoris sequence is extremely powerful because it reduces the computation of cohomology to knowledge of simpler pieces. Combined with the Poincaré lemma (which gives us the cohomology of contractible sets), it allows us to compute the cohomology of many manifolds by induction.
11.7 Computations
Cover \( S^1 \) by two open arcs \( U \) and \( V \), each contractible, whose intersection \( U \cap V \) has two contractible components. The relevant part of the Mayer–Vietoris sequence is \( 0 \to \mathbb{R} \to \mathbb{R}^2 \to \mathbb{R}^2 \xrightarrow{\delta^*} H^1_{\text{dR}}(S^1) \to 0 \). The map \( \mathbb{R}^2 \to \mathbb{R}^2 \) has rank 1 (one can verify), so its image, which by exactness equals \( \ker \delta^* \), is 1-dimensional, and \( H^1_{\text{dR}}(S^1) \cong \mathbb{R} \). This confirms our earlier observation that the angle form \( d\theta \) represents a nontrivial cohomology class.
By induction using Mayer–Vietoris (covering \( S^n \) by two open sets slightly larger than the closed hemispheres, whose intersection deformation retracts onto the equatorial \( S^{n-1} \)), one computes \( H^k_{\text{dR}}(S^n) \cong \mathbb{R} \) for \( k = 0, n \) and \( H^k_{\text{dR}}(S^n) = 0 \) otherwise. The generator of \( H^n_{\text{dR}}(S^n) \) is the class of any volume form on \( S^n \), normalized so that \( \int_{S^n} \omega = 1 \).
For the \( n \)-torus, \( H^k_{\text{dR}}(T^n) \cong \mathbb{R}^{\binom{n}{k}} \). This can be proved using the Künneth formula for de Rham cohomology: if \( M \) and \( N \) are manifolds of finite type, then
\[ H^k_{\text{dR}}(M \times N) \cong \bigoplus_{p+q=k} H^p_{\text{dR}}(M) \otimes H^q_{\text{dR}}(N). \]For \( T^n \), this gives \( H^*_{\text{dR}}(T^n) \cong \Lambda^*(\mathbb{R}^n) \), the exterior algebra on \( n \) generators of degree 1. The generators of \( H^1_{\text{dR}}(T^n) \) are represented by the \( n \) angle forms \( d\theta^1, \ldots, d\theta^n \), and the generators in higher degree are their wedge products.
11.8 Top Cohomology and Degree Theory
The top-degree cohomology of a manifold is closely related to orientability. Let \( M \) be a connected smooth \( n \)-manifold.
- If \( M \) is compact and orientable, then \( H^n_{\text{dR}}(M) \cong \mathbb{R} \). Integration gives an isomorphism \( \int_M \colon H^n_{\text{dR}}(M) \xrightarrow{\sim} \mathbb{R} \).
- If \( M \) is compact and not orientable, then \( H^n_{\text{dR}}(M) = 0 \).
- If \( M \) is not compact, then \( H^n_{\text{dR}}(M) = 0 \).
This theorem has a beautiful application to degree theory. If \( F \colon M \to N \) is a smooth map between compact, connected, oriented \( n \)-manifolds, the induced map \( F^* \colon H^n_{\text{dR}}(N) \to H^n_{\text{dR}}(M) \) is a linear map \( \mathbb{R} \to \mathbb{R} \), hence multiplication by a scalar. This scalar is the degree of \( F \).
\[ \int_M F^*\omega = (\deg F) \int_N \omega \]for all \( \omega \in \Omega^n(N) \).
The degree is always an integer, and it counts (with signs according to orientation) the number of preimages of a regular value. It is a homotopy invariant: homotopic maps have the same degree.
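Degree is easy to see symbolically for circle maps. The following sympy sketch computes the degree of the map \( z \mapsto z^k \) on \( S^1 \) (a hypothetical example with \( k = 3 \)) by integrating the pullback of the angle form:

```python
import sympy as sp

t = sp.symbols('t')
k = 3   # hypothetical example: the map F(z) = z^k on the unit circle

# Composing the parametrisation t -> (cos t, sin t) with F gives
# t -> (cos(kt), sin(kt)).  Pull back the angle form
# omega = (-y dx + x dy)/(x^2 + y^2), whose total integral over S^1 is 2*pi.
X, Y = sp.cos(k*t), sp.sin(k*t)
pullback = sp.simplify((-Y*sp.diff(X, t) + X*sp.diff(Y, t)) / (X**2 + Y**2))

# deg F = (integral of F*omega) / (integral of omega) = k.
deg = sp.integrate(pullback, (t, 0, 2*sp.pi)) / (2*sp.pi)
assert sp.simplify(deg - k) == 0
```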
11.9 Poincaré Duality
Poincaré duality is the crowning theorem of the de Rham theory of compact orientable manifolds. It reveals a striking symmetry among the cohomology groups: the \( k \)-th and \( (n-k) \)-th de Rham cohomology groups of a compact oriented \( n \)-manifold are canonically dual to each other. This symmetry has deep consequences for the topology of such manifolds and is one of the primary reasons that compact orientable manifolds are so well-behaved compared to their non-compact or non-orientable counterparts.
The Poincaré pairing. Let \( M \) be a compact, connected, oriented, smooth manifold of dimension \( n \), without boundary. For \( 0 \leq k \leq n \), define the Poincaré pairing
\[ \langle \cdot, \cdot \rangle \colon H^k_{\text{dR}}(M) \times H^{n-k}_{\text{dR}}(M) \to \mathbb{R}, \qquad \langle [\alpha], [\beta] \rangle = \int_M \alpha \wedge \beta. \]We must first verify that this is well-defined, i.e., that the integral depends only on the cohomology classes \( [\alpha] \) and \( [\beta] \), not on the choice of representatives. Suppose \( \alpha' = \alpha + d\mu \) is another representative of \( [\alpha] \); note that \( d\beta = 0 \) since \( \beta \) represents a cohomology class. Then
\[ \int_M \alpha' \wedge \beta = \int_M \alpha \wedge \beta + \int_M d\mu \wedge \beta = \int_M \alpha \wedge \beta + \int_M d(\mu \wedge \beta), \]where we used \( d\mu \wedge \beta = d(\mu \wedge \beta) \pm \mu \wedge d\beta = d(\mu \wedge \beta) \) since \( d\beta = 0 \). By Stokes’ theorem and the fact that \( \partial M = \emptyset \), the last integral vanishes. An analogous argument handles a change of representative for \( [\beta] \). Hence the Poincaré pairing is well-defined.
Poincaré duality asserts that for \( M \) compact, connected, oriented, and without boundary, the Poincaré pairing is nondegenerate. Consequently, there is a canonical isomorphism
\[ H^k_{\text{dR}}(M) \cong \bigl(H^{n-k}_{\text{dR}}(M)\bigr)^*. \]Since the de Rham cohomology groups are finite-dimensional (which follows from the fact that compact manifolds admit finite good covers), this implies
\[ H^k_{\text{dR}}(M) \cong H^{n-k}_{\text{dR}}(M). \]Consequences for Betti numbers. The Betti numbers of \( M \) are \( b_k = \dim H^k_{\text{dR}}(M) \). Poincaré duality immediately yields
\[ b_k = b_{n-k} \quad \text{for all } 0 \leq k \leq n. \]This symmetry of Betti numbers is a powerful constraint on the topology of compact oriented manifolds.
A useful corollary: a compact orientable manifold of odd dimension \( n \) has Euler characteristic \( \chi(M) = \sum_{k=0}^{n} (-1)^k b_k = 0 \). Indeed, pairing the \( k \)-th and \( (n-k) \)-th terms using \( b_k = b_{n-k} \), the coefficients combine as \( (-1)^k + (-1)^{n-k} = (-1)^k\bigl(1 + (-1)^n\bigr) = (-1)^k(1 - 1) = 0 \) since \( n \) is odd. Hence the terms cancel in pairs and \( \chi(M) = 0 \).
Chapter 12: Integral Curves, Flows, and the Frobenius Theorem
12.1 Integral Curves of Vector Fields
A vector field on a smooth manifold assigns a “direction of motion” to each point. The integral curves of a vector field are the curves that are everywhere tangent to the field — they represent the trajectories of particles moving according to the vector field.
In local coordinates \( (x^1, \ldots, x^n) \), if \( V = V^i \frac{\partial}{\partial x^i} \) and \( \gamma(t) = (\gamma^1(t), \ldots, \gamma^n(t)) \), this becomes the system of ordinary differential equations
\[ \frac{d\gamma^i}{dt}(t) = V^i(\gamma^1(t), \ldots, \gamma^n(t)), \quad i = 1, \ldots, n. \]The existence and uniqueness of integral curves is guaranteed by the fundamental theorem of ODEs.
By the uniqueness assertion, for each \( p \in M \) there is a unique maximal integral curve \( \gamma_p \colon I_p \to M \) starting at \( p \), defined on the largest possible interval \( I_p \).
- On \( \mathbb{R}^2 \), the vector field \( V = -y\, \frac{\partial}{\partial x} + x\, \frac{\partial}{\partial y} \) has integral curves \( \gamma(t) = (r\cos(t+\theta_0), r\sin(t+\theta_0)) \), which are circles centered at the origin.
- On \( \mathbb{R} \), the vector field \( V = x^2 \frac{\partial}{\partial x} \) has integral curves \( \gamma(t) = \frac{x_0}{1 - x_0 t} \) for \( \gamma(0) = x_0 \neq 0 \). For \( x_0 > 0 \), this blows up in finite time at \( t = 1/x_0 \), showing that integral curves need not exist for all time.
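Both examples can be checked symbolically; the sympy sketch below verifies the integral-curve equations directly.

```python
import sympy as sp

t, x0, r, th0 = sp.symbols('t x0 r theta0')

# Rotation field V = -y d/dx + x d/dy: the circular curves from the example.
g1 = r*sp.cos(t + th0)
g2 = r*sp.sin(t + th0)
assert sp.simplify(sp.diff(g1, t) + g2) == 0   # gamma_1' = -gamma_2
assert sp.simplify(sp.diff(g2, t) - g1) == 0   # gamma_2' =  gamma_1

# Blow-up field V = x^2 d/dx on R: gamma(t) = x0 / (1 - x0 t).
gamma = x0 / (1 - x0*t)
assert sp.simplify(sp.diff(gamma, t) - gamma**2) == 0   # gamma' = gamma^2
assert gamma.subs(t, 0) == x0                           # gamma(0) = x0
```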
12.2 Flows
The collection of all integral curves of a vector field assembles into a single object called the flow of the vector field. A global flow on \( M \) is a smooth map \( \Theta \colon \mathbb{R} \times M \to M \), written \( \Theta_t(p) = \Theta(t, p) \), such that:
- \( \Theta_0 = \mathrm{Id}_M \), i.e., \( \Theta_0(p) = p \) for all \( p \).
- \( \Theta_{t+s} = \Theta_t \circ \Theta_s \) for all \( t, s \in \mathbb{R} \) (the group law).
The group law says that flowing for time \( s \) and then for time \( t \) is the same as flowing for time \( t + s \). In particular, each \( \Theta_t \) is a diffeomorphism of \( M \) with inverse \( \Theta_{-t} \). Thus a global flow is a smooth action of \( (\mathbb{R}, +) \) on \( M \) — a one-parameter group of diffeomorphisms.
Not every vector field is complete. When integral curves may fail to exist for all time, we need the notion of a local flow.
The key theorems about flows establish existence, uniqueness, and the group property in the local setting.
- For each \( (t, p) \in \mathcal{D} \) and \( s \in \mathbb{R} \) such that \( (s, \Theta_t(p)) \in \mathcal{D} \), we have \( (t+s, p) \in \mathcal{D} \) and \[ \Theta_s(\Theta_t(p)) = \Theta_{t+s}(p). \]
- For each \( t \) such that \( \Theta_t \) is defined, it is a diffeomorphism from its domain onto its image, with \( \Theta_t^{-1} = \Theta_{-t} \).
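The group law can be checked symbolically for the local flow of \( x^2\, \partial/\partial x \) from the previous section (a standard example; the formulas hold wherever both sides are defined):

```python
import sympy as sp

t, s, x = sp.symbols('t s x')

# Local flow of V = x^2 d/dx on R: Theta_t(x) = x / (1 - t x), valid while t x < 1.
def Theta(time, p):
    return p / (1 - time * p)

# Theta_0 is the identity.
assert sp.simplify(Theta(0, x) - x) == 0
# Group law: Theta_s(Theta_t(x)) = Theta_{t+s}(x) wherever both sides are defined.
assert sp.simplify(Theta(s, Theta(t, x)) - Theta(t + s, x)) == 0
# Theta_{-t} inverts Theta_t.
assert sp.simplify(Theta(-t, Theta(t, x)) - x) == 0
```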
Completeness is guaranteed under natural geometric conditions.
- If \( V \) has compact support, then \( V \) is complete.
- If \( M \) is compact, then every smooth vector field on \( M \) is complete.
12.3 Regular and Singular Points; the Canonical Form Theorem
The behavior of integral curves is qualitatively different at points where the vector field vanishes compared to points where it does not.
- A singular point (or zero) of \( V \) if \( V_p = 0 \).
- A regular point of \( V \) if \( V_p \neq 0 \).
Near a regular point, the flow of a vector field can be completely “straightened out” by a suitable choice of coordinates. This is the content of the following important theorem.
Canonical form theorem: if \( p \) is a regular point of \( V \), then there exists a smooth chart \( (y^1, \ldots, y^n) \) centered at \( p \) in which \( V = \frac{\partial}{\partial y^1} \). In these coordinates, the integral curves of \( V \) are simply the lines \( t \mapsto (y^1 + t, y^2, \ldots, y^n) \) — they are straight and parallel.
The proof defines \( \Phi(t, a^2, \ldots, a^n) = \Theta_t(0, a^2, \ldots, a^n) \), where \( \Theta \) is the flow of \( V \) and \( (0, a^2, \ldots, a^n) \) lies in the hyperplane \( \{x^1 = 0\} \), after first choosing linear coordinates \( (x^1, \ldots, x^n) \) with \( V_p = \partial/\partial x^1 \big|_p \). By construction, \( \Phi \) maps the line \( t \mapsto (t, a^2, \ldots, a^n) \) to the integral curve of \( V \) through \( (0, a^2, \ldots, a^n) \). The differential \( d\Phi_0 \) is the identity (one can check), so by the inverse function theorem, \( \Phi \) is a local diffeomorphism near the origin. Setting \( \varphi = \Phi^{-1} \) gives the desired coordinates.
12.4 Lie Derivatives of Vector Fields
We introduced the Lie derivative of differential forms in Chapter 9 via Cartan’s formula. Now we consider the Lie derivative of one vector field with respect to another, which has a more subtle definition involving the flow.
The key idea is that we cannot directly subtract \( W_{\Theta_t(p)} \) and \( W_p \), because they live in different tangent spaces. We use the flow \( \Theta_t \) to push \( W_{\Theta_t(p)} \) back to \( T_pM \) via the differential of \( \Theta_{-t} \), and then take the derivative:
\[ (\mathcal{L}_V W)_p = \frac{d}{dt}\bigg|_{t=0} d(\Theta_{-t})_{\Theta_t(p)}\bigl(W_{\Theta_t(p)}\bigr). \]
The remarkable fact is that this dynamical construction yields the same result as the purely algebraic Lie bracket.
Expanding using the product rule for the derivative at \( t = 0 \), one obtains two terms: \( V_p(Wf) - W_p(Vf) = [V,W]_p(f) \).
The Lie derivative satisfies several important properties that make it a powerful tool for studying the geometry of vector fields.
- \( \mathcal{L}_X Y = -\mathcal{L}_Y X \) (antisymmetry, following from \( [X,Y] = -[Y,X] \)).
- \( \mathcal{L}_X(fY) = (Xf)Y + f\mathcal{L}_X Y \) for \( f \in C^\infty(M) \).
- The Jacobi identity: \( \mathcal{L}_X[Y,Z] = [\mathcal{L}_X Y, Z] + [Y, \mathcal{L}_X Z] \), i.e., \( \mathcal{L}_X \) is a derivation of the Lie bracket.
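The equality of the dynamical and algebraic definitions can be verified exactly for linear vector fields on \( \mathbb{R}^2 \), whose flows are explicit. A sympy sketch (the matrices A and B below are hypothetical test data):

```python
import sympy as sp

t, x1, x2 = sp.symbols('t x1 x2')

# Hypothetical linear vector fields on R^2: V(p) = A p (rotation) and
# W(p) = B p (anisotropic scaling).
A = sp.Matrix([[0, -1], [1, 0]])
B = sp.Matrix([[1, 0], [0, 2]])
p = sp.Matrix([x1, x2])

# The flow of V is the rotation Theta_t = exp(tA).
Theta = sp.Matrix([[sp.cos(t), -sp.sin(t)],
                   [sp.sin(t),  sp.cos(t)]])

# Dynamical definition: (L_V W)_p = d/dt|_{t=0} d(Theta_{-t})(W_{Theta_t(p)})
#                                 = d/dt|_{t=0} Theta_{-t} B Theta_t p.
pulled_back = Theta.subs(t, -t) * B * Theta * p
lie = sp.diff(pulled_back, t).subs(t, 0)

# Algebraic bracket of linear fields: [V, W](p) = (BA - AB) p.
bracket = (B*A - A*B) * p

assert sp.simplify(lie - bracket) == sp.zeros(2, 1)
```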
12.5 Commuting Vector Fields and Flows
The Lie bracket measures the failure of two flows to commute. When the bracket vanishes, the flows commute, and this has profound geometric consequences. For smooth vector fields \( V \) and \( W \) with flows \( \Theta \) and \( \Psi \), the following are equivalent:
- \( [V, W] = 0 \).
- \( \mathcal{L}_V W = 0 \).
- \( \mathcal{L}_W V = 0 \).
- The flows commute wherever both sides are defined: \( \Theta_t \circ \Psi_s = \Psi_s \circ \Theta_t \).
- \( W \) is invariant under the flow of \( V \): \( (\Theta_t)_* W = W \) wherever defined.
- \( V \) is invariant under the flow of \( W \): \( (\Psi_s)_* V = V \) wherever defined.
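For a concrete illustration of these equivalences (with the illustrative pair \( V = x\,\partial_x \), \( W = \partial_y \) on \( \mathbb{R}^2 \), whose flows can be written down in closed form), one can check symbolically that a vanishing bracket goes hand in hand with commuting flows:

```python
import sympy as sp

x, y, t, s = sp.symbols('x y t s')

# V = x d/dx and W = d/dy on R^2 (an illustrative commuting pair)
Vc, Wc = [x, 0], [0, 1]
coords = [x, y]

# Coordinate bracket: [V,W]^k = V^i d_i W^k - W^i d_i V^k
bracket = [sp.simplify(sum(Vc[i]*sp.diff(Wc[k], coords[i])
                           - Wc[i]*sp.diff(Vc[k], coords[i])
                           for i in range(2))) for k in range(2)]
assert bracket == [0, 0]

# Their flows: Theta_t(x, y) = (e^t x, y) and Psi_s(x, y) = (x, y + s)
Theta = lambda t, p: (sp.exp(t)*p[0], p[1])
Psi   = lambda s, p: (p[0], p[1] + s)

# The flows commute: Theta_t o Psi_s = Psi_s o Theta_t
assert Theta(t, Psi(s, (x, y))) == Psi(s, Theta(t, (x, y)))
```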
12.6 Lie Derivatives of Differential Forms
We have already encountered Cartan’s magic formula in Chapter 9. Now that we have the flow interpretation of the Lie derivative, we can give a more conceptual treatment.
The Lie derivative of a differential form \( \omega \) along a vector field \( X \) with flow \( \Theta \) is
\[ (\mathcal{L}_X \omega)_p = \frac{d}{dt}\bigg|_{t=0} (\Theta_t^* \omega)_p. \]This pulls the form back from \( \Theta_t(p) \) to \( p \) using the flow and then differentiates.
A useful consequence is the following formula for the time derivative of the pullback along a flow:
\[ \frac{d}{dt} \Theta_t^* \omega = \Theta_t^* (\mathcal{L}_X \omega). \]This identity is used constantly in applications to fluid dynamics, where it describes how a differential form is transported by a flow.
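The transport identity can be verified symbolically in a simple case. Here is a sketch for the illustrative field \( X = x\,\partial_x \) on \( \mathbb{R} \), whose flow is \( \Theta_t(x) = e^t x \), applied to 1-forms \( \omega = g(x)\,dx \) for concrete choices of \( g \):

```python
import sympy as sp

x, t = sp.symbols('x t')

# X = x d/dx on R, whose flow is Theta_t(x) = e^t x (an illustrative choice).
flow = sp.exp(t) * x

def check(gx):
    """Verify d/dt (Theta_t^* w) = Theta_t^*(L_X w) for the 1-form w = gx dx."""
    # Pullback: Theta_t^*(g dx) = g(e^t x) * d(e^t x)/dx * dx
    pullback = gx.subs(x, flow) * sp.diff(flow, x)
    # Cartan's magic formula in 1D: L_X w = d(i_X w) = d(x g) = (g + x g') dx
    lie = gx + x * sp.diff(gx, x)
    # Pull L_X w back along the flow
    pullback_lie = lie.subs(x, flow) * sp.diff(flow, x)
    return sp.simplify(sp.diff(pullback, t) - pullback_lie) == 0

assert check(x**2)
assert check(sp.sin(x))
```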
12.7 Tangent Distributions and the Frobenius Theorem
The Frobenius theorem is one of the fundamental results in differential geometry. It provides a necessary and sufficient condition for a family of subspaces of the tangent bundle to be “integrable” — that is, to arise as the tangent spaces to a foliation of the manifold by submanifolds.
To motivate the theorem, consider the following question: given a system of first-order PDEs of the form
\[ \frac{\partial F}{\partial x^i} = f_i(x^1, \ldots, x^n), \quad i = 1, \ldots, n, \]when does a solution \(F\) exist? The necessary condition is obvious: the mixed partial derivatives must be equal, so \(\frac{\partial f_i}{\partial x^j} = \frac{\partial f_j}{\partial x^i}\) — i.e., the 1-form \(\omega = f_i\,dx^i\) must be closed. In the Frobenius context, this integrability condition is the requirement that the distribution defined by the 1-form be involutive.
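A minimal symbolic sketch of this integrability test, for an illustrative choice of right-hand sides \( f_1, f_2 \) on \( \mathbb{R}^2 \) (where closedness suffices, since the plane is simply connected):

```python
import sympy as sp

x, y = sp.symbols('x y')

# Candidate right-hand sides for dF/dx = f1, dF/dy = f2 (illustrative choices)
f1 = 2*x*y + 1
f2 = x**2 + sp.cos(y)

# Integrability condition: df1/dy == df2/dx, i.e. the 1-form f1 dx + f2 dy is closed
assert sp.diff(f1, y) == sp.diff(f2, x)  # both equal 2*x

# Since the condition holds on the simply connected R^2, a potential F exists:
F = sp.integrate(f1, x)                      # x**2*y + x, up to a function of y
F = F + sp.integrate(f2 - sp.diff(F, y), y)  # add the missing y-dependence
assert sp.diff(F, x) == f1 and sp.simplify(sp.diff(F, y) - f2) == 0
print(F)  # x**2*y + x + sin(y)
```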
In geometric control theory, a distribution \(D\) describes the “allowed directions of motion” at each point of the configuration space. A car, for instance, can move forward or backward and can turn, but it cannot slide sideways — this non-holonomic constraint is captured by a non-integrable distribution. The Frobenius theorem says: if \(D\) is involutive, then you are confined to a single “integral manifold” (a leaf of the foliation) and cannot reach the whole manifold. If \(D\) is non-involutive (like the car), you may be able to access all of the configuration space by combining the allowed motions cleverly. The contact structure on \(\mathbb{R}^3\) (Example 12.27 below) is the archetypal non-integrable distribution, arising in the geometry of rolling without slipping and in thermodynamics.
If \( D \) is integrable, it must be involutive: if \( N \) is an integral submanifold through \( p \), and \( X, Y \) are tangent to \( D \), then \( X \) and \( Y \) restrict to vector fields on \( N \), so \( [X,Y] \) is also tangent to \( N \), hence tangent to \( D \). The Frobenius theorem is the remarkable converse.
Base case \( k = 1 \): A rank-1 distribution is spanned by a single nonvanishing vector field \( V \). Any rank-1 distribution is automatically involutive (the bracket of multiples of \( V \) is again a multiple of \( V \)). The integral submanifolds are the integral curves of \( V \), which exist by the existence theorem for ODEs.
Inductive step: Assume the result for rank \( k-1 \). Let \( D \) be an involutive rank-\( k \) distribution, locally spanned by vector fields \( V_1, \ldots, V_k \). Using the flow-box theorem, we may choose coordinates so that \( V_1 = \frac{\partial}{\partial y^1} \). Since \( [V_1, V_j] \in D \) for all \( j \) (involutivity), we can modify \( V_2, \ldots, V_k \) so that \( [V_1, V_j] = 0 \) for \( j = 2, \ldots, k \) (by subtracting appropriate multiples of \( V_1 \)). Since these modified vector fields commute with \( V_1 = \frac{\partial}{\partial y^1} \), their coefficients are independent of \( y^1 \). The distribution spanned by the projections of \( V_2, \ldots, V_k \) onto the “transverse” slice \( \{y^1 = \text{const}\} \) is a rank-\( (k-1) \) involutive distribution, to which we apply the inductive hypothesis.
The Frobenius theorem has a clean statement for commuting vector fields, which can be viewed as a “simultaneous straightening” result.
This is a generalization of the canonical form theorem (which handles the case \( k = 1 \)) and is proved similarly, using the commutativity of the flows to construct the coordinate chart.
The differential-forms (Pfaffian system) formulation of Frobenius. There is a completely equivalent reformulation of the Frobenius theorem in the language of differential forms, which is often more convenient in practice — especially in the study of PDEs and geometric structures.
A rank-\(k\) distribution \( D \) on an \(n\)-manifold \( M \) can be locally described not by giving \(k\) spanning vector fields, but by giving \(n - k\) linearly independent 1-forms that annihilate \( D \). Concretely, near any point \( p \in M \), there exist smooth 1-forms \( \theta^1, \ldots, \theta^{n-k} \) such that
\[ D_q = \ker \theta^1_q \cap \ker \theta^2_q \cap \cdots \cap \ker \theta^{n-k}_q \quad \text{for all } q \text{ near } p. \]The collection \( \{\theta^1, \ldots, \theta^{n-k}\} \) is called a Pfaffian system (or a codistribution) defining \( D \). The ideal in the exterior algebra generated by \( \theta^1, \ldots, \theta^{n-k} \) is denoted \( \mathcal{I}(D^\perp) \).
That is, each \( d\theta^i \) lies in the ideal \( \mathcal{I}(D^\perp) \) — meaning there exist smooth 1-forms \( \alpha^i{}_j \) such that \( d\theta^i = \sum_j \alpha^i{}_j \wedge \theta^j \).
This formulation is equivalent to the vector-field version: by the invariant formula for the exterior derivative, \( d\theta^i(X, Y) = X\theta^i(Y) - Y\theta^i(X) - \theta^i([X, Y]) \), so for \( X, Y \in D \) (where \( \theta^i(X) = \theta^i(Y) = 0 \)) the condition \( d\theta^i(X, Y) = 0 \) is exactly the statement that \( [X, Y] \in D \).
The distribution \( D = \ker \theta \) is a rank-2 distribution on \( \mathbb{R}^3 \) (at each point, \( D_p \) is the 2-dimensional subspace of \( T_p \mathbb{R}^3 \) on which \( \theta \) vanishes). We test integrability using the forms criterion. Compute:
\[ d\theta = d(dz - y\, dx) = -dy \wedge dx = dx \wedge dy. \]We ask: does \( d\theta \) lie in the ideal generated by \( \theta \)? If it did, we would have \( d\theta = \alpha \wedge \theta \) for some 1-form \( \alpha \), and hence \( d\theta \wedge \theta = \alpha \wedge \theta \wedge \theta = 0 \). But a direct computation gives \( d\theta \wedge \theta = (dx \wedge dy) \wedge (dz - y\,dx) = dx \wedge dy \wedge dz \neq 0 \), so \( d\theta \notin \mathcal{I}(\theta) \). By the Frobenius theorem (forms version), \( D \) is not integrable. This distribution is precisely the standard contact structure on \( \mathbb{R}^3 \), a fundamental example in contact geometry and geometric control theory.
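The same conclusion can be reached through the vector-field criterion. A symbolic sketch (the spanning fields \( X_1 = \partial_x + y\,\partial_z \) and \( X_2 = \partial_y \) are a convenient, non-canonical choice of local frame for \( D \)):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
coords = [x, y, z]

# Two vector fields spanning D = ker(dz - y dx), written in components:
# X1 = d/dx + y d/dz,  X2 = d/dy   (check: theta(X1) = y - y = 0, theta(X2) = 0)
X1 = [1, 0, y]
X2 = [0, 1, 0]

theta = lambda W: W[2] - y * W[0]   # theta(W) = dz(W) - y dx(W)
assert theta(X1) == 0 and theta(X2) == 0

# Coordinate bracket: [X1,X2]^k = X1^i d_i X2^k - X2^i d_i X1^k
bracket = [sp.simplify(sum(X1[i]*sp.diff(X2[k], coords[i])
                           - X2[i]*sp.diff(X1[k], coords[i])
                           for i in range(3))) for k in range(3)]
print(bracket)               # [0, 0, -1], i.e. [X1, X2] = -d/dz
assert theta(bracket) != 0   # the bracket leaves D: not involutive, hence not integrable
```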
Foliations. When a distribution \( D \) is integrable (equivalently, involutive), the Frobenius theorem guarantees the existence of integral submanifolds through every point. The collection of all maximal connected integral submanifolds is called a foliation of \( M \), and each maximal integral submanifold is called a leaf of the foliation. Thus, an integrable rank-\(k\) distribution partitions \( M \) into a disjoint union of immersed \(k\)-dimensional submanifolds — the leaves — that fit together smoothly in the sense that local coordinates can be chosen making the leaves locally look like parallel \(k\)-dimensional planes in \(\mathbb{R}^n\). The Reeb foliation of \( S^3 \) and the fibres of a submersion \( f \colon M \to N \) (which form a foliation of \( M \) of rank \( \dim M - \dim N \)) are canonical examples.
12.8 Affine Connections and Covariant Derivatives
We now introduce a fundamentally new structure on manifolds: the connection, which provides a way to differentiate vector fields (and more general tensor fields) along curves and in the direction of other vector fields. Unlike the Lie derivative, a connection allows us to differentiate a vector field at a point using only the direction of differentiation at that point, not a full vector field.
The need for connections arises because there is no canonical way to compare tangent vectors at different points of a manifold — the tangent spaces \( T_pM \) and \( T_qM \) are abstractly isomorphic but not canonically so. A connection provides the additional structure needed to “connect” nearby tangent spaces.
satisfying the following three properties:
- \( C^\infty(M) \)-linearity in \( X \): \( \nabla_{fX_1 + gX_2} Y = f \nabla_{X_1} Y + g \nabla_{X_2} Y \) for \( f, g \in C^\infty(M) \).
- \( \mathbb{R} \)-linearity in \( Y \): \( \nabla_X(aY_1 + bY_2) = a \nabla_X Y_1 + b \nabla_X Y_2 \) for \( a, b \in \mathbb{R} \).
- Leibniz rule in \( Y \): \( \nabla_X(fY) = (Xf)Y + f \nabla_X Y \) for \( f \in C^\infty(M) \).
The \( C^\infty(M) \)-linearity in \( X \) is crucial: it means that \( (\nabla_X Y)_p \) depends only on the value \( X_p \in T_pM \), not on the global behavior of \( X \). This is in stark contrast to the Lie bracket \( [X, Y] \), which depends on the derivatives of \( X \). On the other hand, the Leibniz rule in \( Y \) means that \( \nabla_X Y \) at \( p \) depends on the values of \( Y \) along a curve tangent to \( X_p \), not just at \( p \) itself. However, one can show the following locality results.
12.9 Christoffel Symbols
In local coordinates \( (x^1, \ldots, x^n) \), a connection is completely determined by specifying the covariant derivatives of the coordinate vector fields with respect to each other.
Given the Christoffel symbols, the covariant derivative of an arbitrary vector field \( Y = Y^j \frac{\partial}{\partial x^j} \) in the direction \( X = X^i \frac{\partial}{\partial x^i} \) is
\[ \nabla_X Y = X^i \Bigl(\frac{\partial Y^k}{\partial x^i} + \Gamma^k_{ij} Y^j\Bigr) \frac{\partial}{\partial x^k}. \]The expression \( \frac{\partial Y^k}{\partial x^i} + \Gamma^k_{ij} Y^j \) represents the “covariant components” of \( \nabla_X Y \); the Christoffel symbols provide the “correction term” that accounts for the twisting of the coordinate frame.
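To illustrate the coordinate formula, here is a small sympy sketch using flat \( \mathbb{R}^2 \) in polar coordinates \( (r, \varphi) \), whose nonzero Christoffel symbols are the standard \( \Gamma^r_{\varphi\varphi} = -r \), \( \Gamma^\varphi_{r\varphi} = \Gamma^\varphi_{\varphi r} = 1/r \):

```python
import sympy as sp

r, ph = sp.symbols('r phi', positive=True)
coords = [r, ph]
n = 2

# Christoffel symbols of flat R^2 in polar coordinates: Gamma[k][i][j] = G^k_{ij}
Gamma = [[[0]*n for _ in range(n)] for _ in range(n)]
Gamma[0][1][1] = -r                       # G^r_{phi phi} = -r
Gamma[1][0][1] = Gamma[1][1][0] = 1/r     # G^phi_{r phi} = G^phi_{phi r} = 1/r

def nabla(X, Y):
    """(nabla_X Y)^k = X^i (dY^k/dx^i + G^k_{ij} Y^j)."""
    return [sp.simplify(sum(X[i]*(sp.diff(Y[k], coords[i])
                                  + sum(Gamma[k][i][j]*Y[j] for j in range(n)))
                            for i in range(n))) for k in range(n)]

# nabla_{d/dphi} d/dphi = -r d/dr: the coordinate frame twists, and Gamma corrects for it
print(nabla([0, 1], [0, 1]))   # [-r, 0]
# nabla_{d/dr} d/dr = 0: radial lines are geodesics
print(nabla([1, 0], [1, 0]))   # [0, 0]
```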
12.10 Geodesics and Parallel Transport
Two of the most important constructions associated with a connection are geodesics and parallel transport.
In local coordinates, this becomes the geodesic equation:
\[ \frac{d^2 \gamma^k}{dt^2} + \Gamma^k_{ij}(\gamma(t)) \frac{d\gamma^i}{dt} \frac{d\gamma^j}{dt} = 0, \quad k = 1, \ldots, n. \]This is a second-order system of ODEs, so by the existence and uniqueness theorem, given any point \( p \in M \) and any tangent vector \( v \in T_pM \), there exists a unique maximal geodesic \( \gamma \) with \( \gamma(0) = p \) and \( \gamma'(0) = v \).
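As a numerical illustration (not part of the notes), one can integrate the geodesic equation on the hyperbolic upper half-plane with a standard RK4 scheme and check that the geodesic through \( (0,1) \) with initial velocity \( (1,0) \) traces the unit semicircle \( x^2 + y^2 = 1 \):

```python
import numpy as np

# Geodesics on the hyperbolic upper half-plane (metric (dx^2 + dy^2)/y^2),
# whose nonzero Christoffel symbols are G^x_{xy} = G^x_{yx} = -1/y,
# G^y_{xx} = 1/y, G^y_{yy} = -1/y.

def rhs(state):
    """First-order form of the geodesic equation x''^k + G^k_{ij} x'^i x'^j = 0."""
    x, y, vx, vy = state
    ax = 2 * vx * vy / y            # x'' = -2 G^x_{xy} x' y' = 2 x' y' / y
    ay = (vy**2 - vx**2) / y        # y'' = -(x'^2 - y'^2) / y
    return np.array([vx, vy, ax, ay])

def rk4(state, h, steps):
    """A standard fourth-order Runge-Kutta integrator."""
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(state + h/2 * k1)
        k3 = rhs(state + h/2 * k2)
        k4 = rhs(state + h * k3)
        state = state + h/6 * (k1 + 2*k2 + 2*k3 + k4)
    return state

# Integrate to t = 1 from (0, 1) with unit velocity (1, 0)
state = rk4(np.array([0.0, 1.0, 1.0, 0.0]), h=0.001, steps=1000)
x, y, vx, vy = state
print(x, y)  # close to (tanh 1, sech 1)

# The point stays on the unit circle, and the hyperbolic speed |v|/y stays 1:
assert abs(x**2 + y**2 - 1) < 1e-8
assert abs(np.hypot(vx, vy) / y - 1) < 1e-8
```

Constancy of the speed \( |\gamma'|_g \) along the curve reflects the fact that geodesics of the Levi-Civita connection are parametrized proportionally to arc length.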
Given \( v_0 \in T_{\gamma(a)}M \), there exists a unique parallel vector field \( V \) along \( \gamma \) with \( V(a) = v_0 \). The map \( P_\gamma \colon T_{\gamma(a)}M \to T_{\gamma(b)}M \) defined by \( P_\gamma(v_0) = V(b) \) is called parallel transport along \( \gamma \). It is a linear isomorphism.
A geodesic is precisely a curve whose velocity vector is parallel along itself — the curve “goes straight” in the sense determined by the connection. Parallel transport moves vectors along curves “without rotating” them, according to the connection.
12.11 Torsion and the Levi-Civita Connection
Not all connections are created equal. On a Riemannian manifold, there is a canonical connection that is uniquely determined by two natural requirements.
A connection is torsion-free (or symmetric) if \( T = 0 \). In coordinates, this means \( \Gamma^k_{ij} = \Gamma^k_{ji} \).
The torsion measures the extent to which the connection fails to be “symmetric.” While the torsion is a geometric object of interest in its own right (it appears in Einstein–Cartan theory and other generalizations of general relativity), the most important connection in Riemannian geometry is torsion-free.
or equivalently, for all vector fields \( X, Y, Z \):
\[ X(g(Y, Z)) = g(\nabla_X Y, Z) + g(Y, \nabla_X Z). \]This means that parallel transport preserves inner products.
- Torsion-free: \( \nabla_X Y - \nabla_Y X = [X, Y] \).
- Metric-compatible: \( \nabla g = 0 \).
Existence: Define \( \nabla_X Y \) by the Koszul formula (the right-hand side is \( C^\infty \)-linear in \( Z \), so it defines a vector field via the metric). One then verifies the three axioms of a connection and the two additional properties (torsion-free, metric-compatible).
In coordinates, the Christoffel symbols of the Levi-Civita connection are given by
\[ \Gamma^k_{ij} = \frac{1}{2} g^{k\ell}\Bigl(\frac{\partial g_{j\ell}}{\partial x^i} + \frac{\partial g_{i\ell}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^\ell}\Bigr). \]
12.12 The Riemann Curvature Tensor
The curvature of a connection measures the extent to which parallel transport around an infinitesimal loop fails to return a vector to itself. It is the fundamental local invariant of Riemannian geometry.
The curvature tensor measures the failure of second covariant derivatives to commute. If the connection is flat (\( R = 0 \)), then covariant differentiation in different directions commutes, and parallel transport is path-independent.
In local coordinates, the curvature tensor has components
\[ R^l{}_{ijk} = \frac{\partial \Gamma^l_{jk}}{\partial x^i} - \frac{\partial \Gamma^l_{ik}}{\partial x^j} + \Gamma^l_{im} \Gamma^m_{jk} - \Gamma^l_{jm} \Gamma^m_{ik}. \]
12.13 Symmetries of the Riemann Tensor
For the Levi-Civita connection on a Riemannian manifold, the curvature tensor possesses remarkable symmetry properties. We define the fully covariant version \( \operatorname{Rm}(X, Y, Z, W) = g(R(X, Y)Z, W) \).
- Skew-symmetry in the first pair: \( \operatorname{Rm}(X, Y, Z, W) = -\operatorname{Rm}(Y, X, Z, W) \).
- Skew-symmetry in the second pair: \( \operatorname{Rm}(X, Y, Z, W) = -\operatorname{Rm}(X, Y, W, Z) \).
- Pair symmetry: \( \operatorname{Rm}(X, Y, Z, W) = \operatorname{Rm}(Z, W, X, Y) \).
- First Bianchi identity: \( R(X, Y)Z + R(Y, Z)X + R(Z, X)Y = 0 \) (equivalently, \( \operatorname{Rm}(X, Y, Z, W) + \operatorname{Rm}(Y, Z, X, W) + \operatorname{Rm}(Z, X, Y, W) = 0 \)).
These symmetries drastically reduce the number of independent components. On an \( n \)-dimensional manifold, the Riemann tensor has \( \frac{n^2(n^2 - 1)}{12} \) independent components (for instance, 1 in dimension 2, 6 in dimension 3, and 20 in dimension 4).
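These symmetries, and the coordinate formulas of the preceding sections, can be checked symbolically on the round unit 2-sphere (a standard example; the sketch below recomputes the Christoffel symbols and curvature from the metric \( g = d\theta^2 + \sin^2\theta\, d\varphi^2 \)):

```python
import sympy as sp

th, ph = sp.symbols('theta phi', positive=True)
coords = [th, ph]
n = 2

# Round metric on the unit 2-sphere: g = dtheta^2 + sin^2(theta) dphi^2
g = sp.Matrix([[1, 0], [0, sp.sin(th)**2]])
ginv = g.inv()

# Levi-Civita Christoffel symbols: G^k_{ij} = (1/2) g^{kl}(d_i g_{jl} + d_j g_{il} - d_l g_{ij})
Gamma = [[[sp.simplify(sum(ginv[k, l]*(sp.diff(g[j, l], coords[i])
                                       + sp.diff(g[i, l], coords[j])
                                       - sp.diff(g[i, j], coords[l]))
                           for l in range(n))/2)
           for j in range(n)] for i in range(n)] for k in range(n)]

# Curvature components: R^l_{ijk} = d_i G^l_{jk} - d_j G^l_{ik} + G^l_{im} G^m_{jk} - G^l_{jm} G^m_{ik}
def R(l, i, j, k):
    return sp.simplify(sp.diff(Gamma[l][j][k], coords[i])
                       - sp.diff(Gamma[l][i][k], coords[j])
                       + sum(Gamma[l][i][m]*Gamma[m][j][k]
                             - Gamma[l][j][m]*Gamma[m][i][k] for m in range(n)))

# Fully covariant tensor Rm_{ijkl} = g_{lm} R^m_{ijk}
def Rm(i, j, k, l):
    return sp.simplify(sum(g[l, m]*R(m, i, j, k) for m in range(n)))

# Skew-symmetry in the first pair and pair symmetry, on the index combination (th, ph, ph, th):
assert sp.simplify(Rm(0, 1, 1, 0) + Rm(1, 0, 1, 0)) == 0
assert sp.simplify(Rm(0, 1, 1, 0) - Rm(1, 0, 0, 1)) == 0

# Sectional curvature of the plane spanned by d/dtheta, d/dphi: K = Rm / (g_thth * g_phph)
K = sp.simplify(Rm(0, 1, 1, 0) / (g[0, 0]*g[1, 1]))
assert K == 1   # the unit sphere has constant sectional curvature 1
```

In dimension 2 there is only one independent component of \( \operatorname{Rm} \), consistent with the count \( n^2(n^2-1)/12 = 1 \).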
12.14 Sectional, Ricci, and Scalar Curvature
The full Riemann tensor contains a wealth of information. Various traces and contractions extract geometrically meaningful quantities.
This is independent of the choice of basis \( (X, Y) \) for \( \sigma \).
The sectional curvature \( K(\sigma) \) is the Gauss curvature of the “slice” of \( M \) through \( p \) tangent to \( \sigma \) — it measures how geodesics starting at \( p \) in the plane \( \sigma \) spread apart or converge. The sectional curvature completely determines the Riemann tensor.
where \( (E_1, \ldots, E_n) \) is any orthonormal frame. The Ricci tensor is a symmetric \( (0,2) \)-tensor.
It is a smooth function on \( M \).
12.15 Einstein Manifolds
for some constant \( \lambda \in \mathbb{R} \). Equivalently, \( S = n\lambda \), so the scalar curvature is constant.
Einstein manifolds arise naturally in general relativity (the vacuum Einstein field equations with cosmological constant are \( \operatorname{Ric} = \lambda g \)) and in many areas of differential geometry. Important examples include spaces of constant sectional curvature (spheres, Euclidean spaces, hyperbolic spaces), complex projective spaces with the Fubini–Study metric, and products of Einstein manifolds with equal Einstein constants.
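As a check (an illustrative computation, not from the notes), the round 2-sphere of radius \( a \) is Einstein with \( \lambda = 1/a^2 \) and scalar curvature \( S = n\lambda = 2/a^2 \):

```python
import sympy as sp

th, ph, a = sp.symbols('theta phi a', positive=True)
coords = [th, ph]
n = 2

# Round metric on the 2-sphere of radius a: g = a^2 (dtheta^2 + sin^2(theta) dphi^2)
g = a**2 * sp.Matrix([[1, 0], [0, sp.sin(th)**2]])
ginv = g.inv()

# Christoffel symbols of the Levi-Civita connection
Gamma = [[[sum(ginv[k, l]*(sp.diff(g[j, l], coords[i]) + sp.diff(g[i, l], coords[j])
               - sp.diff(g[i, j], coords[l])) for l in range(n))/2
           for j in range(n)] for i in range(n)] for k in range(n)]

# R^l_{ijk}, and the Ricci tensor Ric_{jk} = R^i_{ijk} (trace over the first pair)
def R(l, i, j, k):
    return (sp.diff(Gamma[l][j][k], coords[i]) - sp.diff(Gamma[l][i][k], coords[j])
            + sum(Gamma[l][i][m]*Gamma[m][j][k] - Gamma[l][j][m]*Gamma[m][i][k]
                  for m in range(n)))

Ric = sp.Matrix(n, n, lambda j, k: sp.simplify(sum(R(i, i, j, k) for i in range(n))))

# Einstein condition Ric = lambda * g with lambda = 1/a^2
assert sp.simplify(Ric - g / a**2) == sp.zeros(n, n)

# Scalar curvature S = g^{jk} Ric_{jk} = n * lambda = 2/a^2
S = sp.simplify(sum(ginv[j, k]*Ric[j, k] for j in range(n) for k in range(n)))
assert S == 2 / a**2
```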
12.16 The Exponential Map and the Hopf–Rinow Theorem
We conclude with two important results about geodesics on Riemannian manifolds.
The exponential map at a point \( p \in M \) is defined by
\[ \exp_p \colon T_pM \supset U \to M, \quad \exp_p(v) = \gamma_v(1), \]where \( \gamma_v \) is the geodesic with \( \gamma_v(0) = p \) and \( \gamma_v'(0) = v \), and \( U \) is a neighborhood of \( 0 \in T_pM \) where this is defined.
The exponential map is a local diffeomorphism near \( 0 \), and the coordinates it induces (called normal coordinates or geodesic coordinates) have the special property that \( \Gamma^k_{ij}(p) = 0 \) at the center point.
The Riemannian distance function is defined by
\[ d(p, q) = \inf \Bigl\{ \int_0^1 |\gamma'(t)|_g\, dt : \gamma \text{ is a piecewise smooth curve from } p \text{ to } q \Bigr\}. \]This makes \( (M, d) \) into a metric space whose topology agrees with the manifold topology.
- \( (M, d) \) is a complete metric space (every Cauchy sequence converges).
- The exponential map \( \exp_p \) is defined on all of \( T_pM \) for some (equivalently, every) \( p \in M \).
- Every closed and bounded subset of \( M \) is compact.
The Hopf–Rinow theorem is a cornerstone of Riemannian geometry. It shows that metric completeness and geodesic completeness (the ability to extend geodesics indefinitely) coincide, and that on a complete manifold any two points are joined by a minimizing geodesic. All compact Riemannian manifolds are complete.