MATH 148: Calculus 2 for Honours Mathematics
These notes cover MATH 148 (Calculus 2 for Honours Mathematics) at the University of Waterloo. The course covers the same broad territory as MATH 138 — integration, differential equations, series, and curves — but with the rigour appropriate to a student who has seen epsilon-delta analysis in MATH 147. Where MATH 138 treats the integral as a limiting process to be computed, we develop it as a precise mathematical object: the Darboux integral, built from upper and lower sums, whose existence can be verified from first principles. The payoff is that every theorem we state has an honest proof, and the machinery we build here — uniform convergence, the Fundamental Theorem with full rigour, the Picard–Lindelöf existence theorem — forms the foundation of real analysis.
Chapter 1: The Riemann Integral
Defining integration carefully turns out to be subtle. The naive idea — “sum up infinitely many infinitely thin rectangles” — is not a definition; it is a picture. The Darboux approach makes this precise by bracketing the integral between two quantities we can actually compute: the upper sum, which overestimates by taking the supremum on each subinterval, and the lower sum, which underestimates by taking the infimum. A function is integrable precisely when these two estimates can be made arbitrarily close.
1.1 Upper and Lower Sums
We work throughout with bounded functions on a closed interval \([a,b]\).
Definition 1.1 (Partition). A partition of \([a,b]\) is a finite set \(P = \{x_0, x_1, \ldots, x_n\}\) with \(a = x_0 < x_1 < \cdots < x_n = b\). The mesh of \(P\) is \(\|P\| = \max_{1 \le i \le n}(x_i - x_{i-1})\). A partition \(Q\) is a refinement of \(P\) if \(P \subseteq Q\).
Definition 1.2 (Upper and Lower Sums). Let \(f\) be bounded on \([a,b]\) and let \(P = \{x_0, \ldots, x_n\}\) be a partition. Set \(m_i = \inf_{x \in [x_{i-1},x_i]} f(x)\) and \(M_i = \sup_{x \in [x_{i-1},x_i]} f(x)\). The lower and upper sums of \(f\) over \(P\) are \[L(f,P) = \sum_{i=1}^n m_i(x_i - x_{i-1}), \qquad U(f,P) = \sum_{i=1}^n M_i(x_i - x_{i-1}).\]
Notice that \(L(f,P) \le U(f,P)\) always, since \(m_i \le M_i\). The key monotonicity property is that refining a partition can only improve the estimates: upper sums decrease and lower sums increase.
Lemma 1.3 (Refinement Lemma). If \(Q\) is a refinement of \(P\), then \(L(f,P) \le L(f,Q)\) and \(U(f,Q) \le U(f,P)\).
Proof. It suffices to treat the case where \(Q = P \cup \{c\}\) adds a single point, say \(x_{k-1} < c < x_k\); the general case follows by induction. Splitting \([x_{k-1}, x_k]\) at \(c\) replaces the term \(M_k(x_k - x_{k-1})\) in \(U(f,P)\) by \(M'(c - x_{k-1}) + M''(x_k - c)\), where \(M' = \sup_{[x_{k-1},c]} f\) and \(M'' = \sup_{[c,x_k]} f\). Both \(M'\) and \(M''\) are at most \(M_k\), since the supremum over a larger set is at least as large. Hence \(U(f,Q) \le U(f,P)\). The lower sum case is analogous. \(\square\)
Corollary 1.4. For any two partitions \(P\) and \(Q\) of \([a,b]\), \(L(f,P) \le U(f,Q)\).
Proof. Let \(R = P \cup Q\). Then \(L(f,P) \le L(f,R) \le U(f,R) \le U(f,Q)\). \(\square\)
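To see the refinement behaviour numerically, here is a short sketch (illustrative only): for the increasing function \(f(x) = x^2\) on \([0,1]\), the infimum and supremum on each subinterval occur at the endpoints, so the Darboux sums can be computed exactly, and refining the uniform partition shrinks the gap \(U - L = 1/n\).

```python
def darboux_sums(f, xs):
    """Lower and upper Darboux sums of an *increasing* f on the partition xs."""
    lower = sum(f(xs[i - 1]) * (xs[i] - xs[i - 1]) for i in range(1, len(xs)))
    upper = sum(f(xs[i]) * (xs[i] - xs[i - 1]) for i in range(1, len(xs)))
    return lower, upper

def f(x):
    return x * x          # increasing on [0, 1], so inf/sup sit at the endpoints

for n in (4, 16, 64):
    xs = [i / n for i in range(n + 1)]        # uniform partition of [0, 1]
    lo, up = darboux_sums(f, xs)
    print(f"n={n:3d}: L={lo:.6f}  U={up:.6f}  gap={up - lo:.6f}")  # gap = 1/n here
```

Both sums bracket the true value \(\int_0^1 x^2\,dx = 1/3\), and the gap telescopes to \((f(1)-f(0))/n = 1/n\), witnessing the Cauchy criterion of Theorem 1.7.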
Since every lower sum is thus at most every upper sum, \[\sup_P L(f,P) \le \inf_P U(f,P),\]and the gap between these two quantities measures how “integrable” \(f\) is.
1.2 Riemann Integrability
We now have the tools to say precisely what it means for a function to be integrable. Write \(\underline{\int} f = \sup_P L(f,P)\) for the lower integral and \(\overline{\int} f = \inf_P U(f,P)\) for the upper integral. A bounded function \(f\) is integrable on \([a,b]\) if \(\underline{\int} f = \overline{\int} f\). The common value is the definite integral \(\displaystyle\int_a^b f(x)\,dx\).
The following criterion gives a practical way to verify integrability without computing the supremum and infimum explicitly.
Theorem 1.7 (Cauchy Criterion for Integrability). A bounded function \(f\) is integrable on \([a,b]\) if and only if for every \(\varepsilon > 0\) there exists a partition \(P\) such that \(U(f,P) - L(f,P) < \varepsilon\).
Proof. If the criterion holds, then for each \(\varepsilon > 0\) we find \(P\) with \(U(f,P) - L(f,P) < \varepsilon\). Since \(L(f,P) \le \underline{\int} f \le \overline{\int} f \le U(f,P)\), we get \(0 \le \overline{\int} f - \underline{\int} f < \varepsilon\). Since \(\varepsilon\) is arbitrary, the two integrals agree. Conversely, if \(f\) is integrable with integral \(I\), choose \(P_1\) with \(L(f,P_1) > I - \varepsilon/2\) and \(P_2\) with \(U(f,P_2) < I + \varepsilon/2\); by the Refinement Lemma, the common refinement \(P = P_1 \cup P_2\) satisfies \(U(f,P) - L(f,P) < \varepsilon\). \(\square\)
Theorem 1.8 (Continuous Functions are Integrable). If \(f : [a,b] \to \mathbb{R}\) is continuous, then \(f\) is integrable on \([a,b]\).
Theorem 1.9 (Monotone Functions are Integrable). If \(f : [a,b] \to \mathbb{R}\) is monotone and bounded, then \(f\) is integrable on \([a,b]\).
Remark 1.10. More generally, a bounded function with finitely many discontinuities is integrable (each discontinuity can be enclosed in a small subinterval contributing negligible oscillation). The full characterisation — due to Lebesgue — is that a bounded function is Riemann integrable if and only if its set of discontinuities has measure zero. This is proved in PMATH 451.
1.3 Properties of the Integral
The integral inherits the expected algebraic and order properties directly from the definitions.
Theorem 1.11 (Properties of the Definite Integral). Let \(f, g : [a,b] \to \mathbb{R}\) be integrable. Then:
(i) Linearity: \(\int_a^b (cf + g) = c\int_a^b f + \int_a^b g\) for any \(c \in \mathbb{R}\).
(ii) Additivity: If \(c \in (a,b)\), then \(\int_a^b f = \int_a^c f + \int_c^b f\).
(iii) Monotonicity: If \(f \le g\) on \([a,b]\), then \(\int_a^b f \le \int_a^b g\).
(iv) Triangle inequality: \(|f|\) is integrable and \(\left|\int_a^b f\right| \le \int_a^b |f|\).
(v) Bounds: If \(m \le f \le M\) on \([a,b]\), then \(m(b-a) \le \int_a^b f \le M(b-a)\).
A first consequence is the Mean Value Theorem for Integrals: if \(f : [a,b] \to \mathbb{R}\) is continuous, then there exists \(c \in [a,b]\) with \(f(c) = \frac{1}{b-a}\int_a^b f\).
Proof. By the Extreme Value Theorem, \(f\) attains its minimum \(m\) and maximum \(M\) on \([a,b]\). By Theorem 1.11(v), \(m \le \frac{1}{b-a}\int_a^b f \le M\). By the Intermediate Value Theorem, \(f\) takes every value between \(m\) and \(M\), so in particular it takes the value \(\frac{1}{b-a}\int_a^b f\) at some \(c \in [a,b]\). \(\square\)
1.4 The Fundamental Theorem of Calculus
The Fundamental Theorem is the central result of calculus: it says that differentiation and integration are inverse operations. We prove both parts with full rigour.
Theorem 1.14 (Fundamental Theorem of Calculus, Part 1). Let \(f : [a,b] \to \mathbb{R}\) be integrable and define \(F(x) = \int_a^x f(t)\,dt\). If \(f\) is continuous at \(x_0 \in [a,b]\), then \(F\) is differentiable at \(x_0\) and \(F'(x_0) = f(x_0)\).
Proof. For \(h > 0\), \[\frac{F(x_0+h) - F(x_0)}{h} - f(x_0) = \frac{1}{h}\int_{x_0}^{x_0+h} \big(f(t) - f(x_0)\big)\,dt.\]Given \(\varepsilon > 0\), by continuity of \(f\) at \(x_0\) there exists \(\delta > 0\) such that \(|t - x_0| < \delta\) implies \(|f(t) - f(x_0)| < \varepsilon\). For \(0 < h < \delta\), every \(t \in [x_0, x_0+h]\) satisfies \(|t-x_0| < \delta\), so the right-hand side is at most \(\frac{1}{h} \cdot \varepsilon h = \varepsilon\) in absolute value. Hence the limit as \(h \to 0^+\) is \(f(x_0)\); the case \(h < 0\) is symmetric. \(\square\)
The geometric meaning is immediate: \(F(x)\) measures accumulated area, and its rate of change at any point is precisely the height of \(f\) at that point. If \(f\) is large at \(x_0\), area is accumulating quickly; if \(f\) is small or negative, area accumulates slowly or decreases.
Theorem 1.15 (Fundamental Theorem of Calculus, Part 2). If \(f : [a,b] \to \mathbb{R}\) is continuous and \(G\) is any antiderivative of \(f\) on \([a,b]\), then \(\int_a^b f = G(b) - G(a)\).
Proof. Let \(F(x) = \int_a^x f(t)\,dt\). By Part 1, \(F' = f = G'\) on \([a,b]\). Hence \((G - F)' = 0\) on \([a,b]\), so \(G - F\) is constant by the Mean Value Theorem: \(G(x) - F(x) = C\) for all \(x\). Setting \(x = a\) gives \(C = G(a) - F(a) = G(a)\) since \(F(a) = 0\). Therefore \(G(b) - F(b) = G(a)\), i.e., \(\int_a^b f = F(b) = G(b) - G(a)\). \(\square\)
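A numerical illustration of Part 1 (a sketch, with \(f = \cos\) chosen for convenience and the accumulation function approximated by a midpoint rule): the symmetric difference quotient of \(F(x) = \int_0^x \cos t\,dt\) recovers \(\cos x\).

```python
import math

def F(x, n=10_000):
    """Midpoint-rule approximation of the accumulation function ∫_0^x cos t dt."""
    h = x / n
    return h * sum(math.cos((k + 0.5) * h) for k in range(n))

# FTC Part 1 predicts F'(x0) = cos(x0); check with a symmetric difference quotient.
x0, h = 1.0, 1e-4
derivative = (F(x0 + h) - F(x0 - h)) / (2 * h)
print(derivative, math.cos(x0))  # both ≈ 0.5403
```

Of course \(F(x) = \sin x\) exactly, so the agreement is no surprise; the point is that the derivative of accumulated area is the integrand, computed without using the closed form.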
Chapter 2: Integration Techniques
With the theoretical foundation secure, we develop the computational toolkit. Every technique is ultimately a restatement of either the chain rule (substitution) or the product rule (integration by parts).
2.1 Substitution
Theorem 2.1 (Substitution Rule). If \(g\) is continuously differentiable on \([a,b]\) and \(f\) is continuous on \(g([a,b])\), then \[\int_a^b f(g(x))\,g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du.\]
Proof. Let \(F\) be an antiderivative of \(f\). By the chain rule, \(\frac{d}{dx}F(g(x)) = f(g(x))g'(x)\). By FTC Part 2, both sides equal \(F(g(b)) - F(g(a))\). \(\square\)
2.2 Integration by Parts
Theorem 2.2 (Integration by Parts). If \(f\) and \(g\) are continuously differentiable on \([a,b]\), then \[\int_a^b f(x)g'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f'(x)g(x)\,dx.\]
Proof. The product rule gives \((fg)' = f'g + fg'\). Integrating and applying FTC Part 2 yields the result. \(\square\)
The mnemonic is \(\int u\,dv = uv - \int v\,du\). The art lies in choosing \(u\) and \(dv\). A useful hierarchy for \(u\): logarithms, inverse trig functions, algebraic functions (polynomials), trig functions, exponentials (LIATE). Among the key applications: \(\int \ln x\,dx = x\ln x - x + C\) (taking \(u = \ln x\), \(dv = dx\)); the reduction formula for \(\int x^n e^x\,dx\); and the self-referential trick for \(\int e^x \sin x\,dx\) where integrating by parts twice returns the original integral.
2.3 Trigonometric Substitution
When an integrand contains \(\sqrt{a^2 - x^2}\), \(\sqrt{a^2 + x^2}\), or \(\sqrt{x^2 - a^2}\), a trigonometric substitution eliminates the radical by exploiting a Pythagorean identity.
| Form | Substitution | Identity |
|---|---|---|
| \(\sqrt{a^2 - x^2}\) | \(x = a\sin\theta\) | \(1 - \sin^2\theta = \cos^2\theta\) |
| \(\sqrt{a^2 + x^2}\) | \(x = a\tan\theta\) | \(1 + \tan^2\theta = \sec^2\theta\) |
| \(\sqrt{x^2 - a^2}\) | \(x = a\sec\theta\) | \(\sec^2\theta - 1 = \tan^2\theta\) |
After substituting and integrating in \(\theta\), one converts back to \(x\) using a reference triangle. For example, \(\int_0^1 \sqrt{1-x^2}\,dx = \frac{\pi}{4}\), computing the area of a quarter-circle.
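A quick numerical check of the quarter-circle example (a sketch: the substitution \(x = \sin\theta\), \(dx = \cos\theta\,d\theta\) converts the integral into \(\int_0^{\pi/2}\cos^2\theta\,d\theta\), which we approximate by a midpoint rule):

```python
import math

# Substituting x = sin θ turns ∫_0^1 √(1-x²) dx into ∫_0^{π/2} cos²θ dθ.
n = 100_000
h = (math.pi / 2) / n
approx = h * sum(math.cos((k + 0.5) * h) ** 2 for k in range(n))
print(approx, math.pi / 4)  # both ≈ 0.7853981...
```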
2.4 Partial Fractions
Every rational function \(p(x)/q(x)\) with \(\deg p < \deg q\) can be decomposed into simpler fractions, each of which is straightforward to integrate. The decomposition rests on the Fundamental Theorem of Algebra, one consequence of which is that every real polynomial factors completely into linear and irreducible quadratic factors over \(\mathbb{R}\).
Definition 2.3 (Partial Fraction Decomposition). Let \(r(x) = p(x)/q(x)\) with \(\deg p < \deg q\). Factor \(q(x)\) as \(a\prod_j (x - \alpha_j)^{m_j} \prod_k (x^2 + b_k x + c_k)^{n_k}\) over \(\mathbb{R}\). The partial fraction decomposition writes \(r(x)\) as a sum: each linear factor \((x - \alpha_j)^{m_j}\) contributes \(\sum_{s=1}^{m_j} \frac{A_{j,s}}{(x-\alpha_j)^s}\), and each irreducible quadratic factor \((x^2 + b_k x + c_k)^{n_k}\) contributes \(\sum_{s=1}^{n_k} \frac{B_{k,s}x + C_{k,s}}{(x^2+b_kx+c_k)^s}\).
If \(\deg p \ge \deg q\), perform polynomial long division first to write \(r = q_0 + p_1/q\) with \(\deg p_1 < \deg q\). The integrals of each piece involve \(\ln|x - \alpha|\), \(\arctan\), and — for repeated quadratic factors — recursive reduction formulas.
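The coefficient-finding can be mechanised. A minimal sketch using exact rational arithmetic and the cover-up (Heaviside) evaluation for the simple example \(\frac{1}{(x-1)(x+1)} = \frac{A}{x-1} + \frac{B}{x+1}\):

```python
from fractions import Fraction

# Cover-up method: multiply through by (x-1) and set x = 1 to isolate A;
# multiply through by (x+1) and set x = -1 to isolate B.
A = Fraction(1, 1 + 1)    # A = 1/(x+1) at x = 1   ->  1/2
B = Fraction(1, -1 - 1)   # B = 1/(x-1) at x = -1  -> -1/2
print(A, B)  # 1/2 -1/2

# Sanity check: the decomposition agrees with the original at sample points.
for x in (Fraction(2), Fraction(5), Fraction(-3)):
    assert Fraction(1) / ((x - 1) * (x + 1)) == A / (x - 1) + B / (x + 1)
```

Integrating each piece then gives \(\int \frac{dx}{x^2-1} = \frac{1}{2}\ln|x-1| - \frac{1}{2}\ln|x+1| + C\).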
Chapter 3: Improper Integrals
So far our integral handles bounded functions on bounded intervals. Many natural integrands — Gaussian densities, the Gamma function, inverse powers near a singularity — require extending this definition. Improper integrals do so by taking limits, and the question of convergence becomes central.
3.1 Type I and Type II Improper Integrals
Definition 3.1 (Type I Improper Integral). If \(f\) is integrable on \([a,t]\) for every \(t > a\), define \[\int_a^\infty f(x)\,dx = \lim_{t\to\infty} \int_a^t f(x)\,dx\]when this limit exists (and is finite). Similarly for \(\int_{-\infty}^a\) and \(\int_{-\infty}^\infty\) (the last requires splitting at some finite point \(c\) and demanding both halves converge independently).
Definition 3.2 (Type II Improper Integral). If \(f\) is unbounded near \(a\) but integrable on \([t,b]\) for every \(t \in (a,b)\), define \(\int_a^b f = \lim_{t\to a^+} \int_t^b f\) when the limit exists. Similarly for singularities at \(b\) or at an interior point.
Theorem 3.3 (\(p\)-Test, Type I). The integral \(\int_1^\infty x^{-p}\,dx\) converges if and only if \(p > 1\). When \(p > 1\), its value is \(\frac{1}{p-1}\).
Theorem 3.4 (\(p\)-Test, Type II). The integral \(\int_0^1 x^{-p}\,dx\) converges if and only if \(p < 1\). When \(p < 1\), its value is \(\frac{1}{1-p}\).
Notice that the two \(p\)-tests have complementary conditions: the integral to infinity converges for large \(p\) (the function decays fast enough), while the integral near zero converges for small \(p\) (the singularity is not too severe). Together they govern a vast array of comparison arguments.
3.2 Absolute Convergence
Not every convergent improper integral converges absolutely: when the integrand changes sign, the integral may converge through cancellation rather than through smallness of \(|f|\). This distinction is important.
Definition 3.5 (Absolute Convergence of an Integral). The integral \(\int_a^\infty f(x)\,dx\) converges absolutely if \(\int_a^\infty |f(x)|\,dx\) converges.
Theorem 3.6 (Absolute Convergence Implies Convergence). If \(\int_a^\infty |f(x)|\,dx < \infty\), then \(\int_a^\infty f(x)\,dx\) converges.
Proof. Write \(f = f^+ - f^-\) where \(f^+ = \max(f,0)\) and \(f^- = \max(-f,0)\). Both \(f^+\) and \(f^-\) are non-negative and bounded by \(|f|\), so both integrals converge by the comparison test. Hence \(\int f = \int f^+ - \int f^-\) converges. \(\square\)
The integral \(\int_1^\infty \frac{\sin x}{x}\,dx\) converges conditionally but not absolutely: the oscillation provides cancellation that the absolute value destroys. This parallels the series distinction between absolute and conditional convergence.
3.3 Comparison Tests
Theorem 3.7 (Direct Comparison Test). Suppose \(0 \le g(x) \le f(x)\) for \(x \ge a\), with both functions continuous.
- If \(\int_a^\infty f\,dx\) converges, then \(\int_a^\infty g\,dx\) converges.
- If \(\int_a^\infty g\,dx\) diverges, then \(\int_a^\infty f\,dx\) diverges.
Theorem 3.8 (Limit Comparison Test). Suppose \(f, g > 0\) on \([a,\infty)\) and \(\lim_{x \to \infty} f(x)/g(x) = L\).
- If \(0 < L < \infty\), then \(\int_a^\infty f\) and \(\int_a^\infty g\) either both converge or both diverge.
- If \(L = 0\) and \(\int_a^\infty g\) converges, then \(\int_a^\infty f\) converges.
Definition 3.9 (Gamma Function). For \(x > 0\), the Gamma function is \(\Gamma(x) = \int_0^\infty t^{x-1}e^{-t}\,dt\). This integral converges: the singularity at 0 is of type \(t^{x-1}\) (integrable when \(x > 0\)), and the decay \(e^{-t}\) dominates any polynomial at infinity.
Integrating by parts with \(u = t^x\) and \(dv = e^{-t}\,dt\) yields the functional equation \(\Gamma(x+1) = x\,\Gamma(x)\). Since \(\Gamma(1) = 1\), induction gives \(\Gamma(n) = (n-1)!\) for positive integers \(n\). By the Bohr–Mollerup theorem, \(\Gamma\) is the unique log-convex function on \((0,\infty)\) with \(\Gamma(1) = 1\) satisfying this functional equation; in this sense it is the natural extension of the factorial to the positive reals.
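These identities are easy to spot-check numerically; the sketch below uses Python's `math.gamma`, which implements the same function:

```python
import math

# Γ(n) = (n-1)! for positive integers n
for n in range(1, 8):
    assert math.isclose(math.gamma(n), math.factorial(n - 1))

# Functional equation Γ(x+1) = x·Γ(x) at a non-integer point
x = 2.5
print(math.gamma(x + 1), x * math.gamma(x))  # both ≈ 3.3233509...
```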
Chapter 4: Applications of Integration
Integration measures more than area; it is the natural tool for any quantity built by accumulation — volume, arc length, work, and probability. The unifying idea is always the same: approximate by a Riemann sum, pass to the limit, and the approximation becomes an integral.
4.1 Area and Volume
The area between two curves \(y = f(x)\) and \(y = g(x)\) with \(f \ge g\) on \([a,b]\) is \(A = \int_a^b \big(f(x) - g(x)\big)\,dx\). When the curves cross, one must identify the crossing points and split the integral accordingly.
For volumes, we use the method of cross-sections: if \(A(x)\) is the area of the cross-section of a solid at position \(x\), then the volume is \(V = \int_a^b A(x)\,dx\). Two specialisations are particularly useful: the disk method, where rotating \(y = f(x)\) about the \(x\)-axis gives circular cross-sections with \(A(x) = \pi f(x)^2\), and the shell method, where rotating the region under \(y = f(x)\) about the \(y\)-axis gives \(V = \int_a^b 2\pi x\,f(x)\,dx\).
4.2 Arc Length
For a smooth parametric curve \(\vec{r}(t) = (x(t), y(t))\), \(t \in [a,b]\), the length is obtained by approximating the curve by polygonal segments and passing to the limit. The result is clean because the Euclidean norm of the velocity vector \(\vec{r}'(t) = (x'(t), y'(t))\) is exactly the instantaneous speed, giving \(L = \int_a^b \|\vec{r}'(t)\|\,dt\).
For a graph \(y = f(x)\), parametrised as \(\vec{r}(t) = (t, f(t))\), this reduces to the familiar formula \(L = \int_a^b \sqrt{1 + (f'(x))^2}\,dx\).
The rigorous justification: approximate by polygonal segments with vertices at \(\vec{r}(t_0), \vec{r}(t_1), \ldots, \vec{r}(t_n)\). The length of the \(i\)-th segment is \(\|\vec{r}(t_i) - \vec{r}(t_{i-1})\|\). By the mean value theorem applied to each component, \(\|\vec{r}(t_i) - \vec{r}(t_{i-1})\| \approx \|\vec{r}'(\xi_i)\|(t_i - t_{i-1})\) for some \(\xi_i\) in the subinterval, and the sum converges to the integral.
Definition 4.5 (Arc Length Function and Unit Tangent). The arc length function \(s(t) = \int_a^t \|\vec{r}'(\tau)\|\,d\tau\) satisfies \(s'(t) = \|\vec{r}'(t)\|\). The unit tangent vector is \(\hat{T}(t) = \vec{r}'(t)/\|\vec{r}'(t)\|\).
4.3 Work and Physics Applications
If a variable force \(F(x)\) acts on an object moving along the \(x\)-axis from \(a\) to \(b\), the work done is \[W = \int_a^b F(x)\,dx.\]Hooke’s Law \(F(x) = kx\) for a spring gives \(W = \frac{1}{2}k(b^2 - a^2)\). Gravitational and hydrostatic pressure problems follow the same accumulation principle: slice the region into thin pieces, approximate the contribution of each slice, and integrate.
Chapter 5: Differential Equations
A differential equation relates a function to its derivatives, encoding how a quantity changes in response to its current state. The subject sits at the intersection of analysis, geometry (direction fields), and modelling. For MATH 148, we emphasise: (a) two families of equations with explicit solution formulas, (b) the geometric picture of solution curves, and (c) the fundamental existence and uniqueness theorem that guarantees a unique solution before we search for it.
5.1 Separable Equations
Definition 5.1 (Separable Differential Equation). A first-order ODE is separable if it has the form \(y' = f(x)\,g(y)\). The variables can be “separated”: \(\frac{dy}{g(y)} = f(x)\,dx\).
The method is: (1) find equilibrium solutions \(g(y_0) = 0\); (2) for non-equilibrium solutions, integrate both sides after separating, obtaining \(\int \frac{dy}{g(y)} = \int f(x)\,dx + C\); (3) solve for \(y\) if possible.
The justification is clean: if \(y = \varphi(x)\) is a solution, then \(\varphi'(x)/g(\varphi(x)) = f(x)\) wherever \(g(\varphi(x)) \ne 0\); integrating both sides in \(x\) and substituting \(u = \varphi(x)\) on the left recovers \(\int \frac{dy}{g(y)} = \int f(x)\,dx + C\), with the equality of antiderivatives guaranteed by the FTC.
5.2 Linear First-Order Equations
Definition 5.2 (First-Order Linear ODE). An ODE is linear and first-order if it has the form \(y' = p(x)\,y + q(x)\), or equivalently \(y' - p(x)\,y = q(x)\).
The general solution is \[y(x) = e^{P(x)}\left(\int q(x)\,e^{-P(x)}\,dx + C\right),\]where \(P(x) = \int p(x)\,dx\). The function \(\mu(x) = e^{-P(x)}\) is the integrating factor.
Proof. Multiply the equation by \(\mu(x) = e^{-P(x)}\). The left side becomes \(\frac{d}{dx}[\mu(x)\,y(x)]\) by the product rule (since \(\mu' = -p(x)\mu\)). Integrating gives \(\mu(x)y(x) = \int q(x)\mu(x)\,dx + C\), and dividing by \(\mu(x)>0\) yields the result. \(\square\)
5.3 Existence and Uniqueness
Before solving any initial value problem, we should ask: does a solution exist, and is it unique? Without an existence theorem, our search for a solution might be futile; without uniqueness, a single initial condition might lead to multiple incompatible solutions. The following theorem, known as the Picard–Lindelöf theorem, answers both questions under mild assumptions.
Theorem 5.4 (Picard–Lindelöf Theorem). Suppose \(f(x,y)\) is continuous on a rectangle containing \((x_0, y_0)\) in its interior and satisfies a Lipschitz condition in \(y\) there: \(|f(x,y_1) - f(x,y_2)| \le K|y_1 - y_2|\). Then the initial value problem \(y' = f(x,y)\), \(y(x_0) = y_0\) has exactly one solution on some interval about \(x_0\). The proof constructs the Picard iterates \[y^{(0)}(x) \equiv y_0, \qquad y^{(n+1)}(x) = y_0 + \int_{x_0}^x f\big(t, y^{(n)}(t)\big)\,dt.\]One verifies that this sequence converges uniformly on a small interval using the Lipschitz condition and the contraction mapping principle.
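The iteration is concrete enough to run. A sketch for \(y' = y\), \(y(0) = 1\) (whose unique solution is \(e^x\)), representing each iterate as a list of polynomial coefficients in exact arithmetic — after \(n\) steps the iterate is the degree-\(n\) Taylor polynomial of \(e^x\):

```python
from fractions import Fraction

def picard_step(coeffs):
    """One Picard iterate for y' = y, y(0) = 1: y_new(x) = 1 + ∫_0^x y(t) dt."""
    new = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    new[0] = Fraction(1)   # constant term comes from the initial condition y(0) = 1
    return new

y = [Fraction(1)]          # y_0(x) ≡ 1
for _ in range(5):
    y = picard_step(y)

# After n steps: the degree-n Taylor polynomial of e^x, with coefficients 1/k!
print([str(c) for c in y])  # ['1', '1', '1/2', '1/6', '1/24', '1/120']
```

The uniform convergence of these polynomials on compact intervals is exactly what the contraction argument guarantees.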
Remark 5.5. The Lipschitz hypothesis is necessary. The equation \(y' = y^{2/3}\), \(y(0) = 0\) has both \(y \equiv 0\) and \(y = (x/3)^3\) as solutions: uniqueness fails because \(\partial f/\partial y = \frac{2}{3}y^{-1/3}\) is unbounded near \(y = 0\).
5.4 Applications
The following models all reduce to separable or linear first-order ODEs.
Exponential Growth/Decay. The equation \(y' = ky\) (with \(k\) constant) has the unique solution \(y = y_0 e^{k(x-x_0)}\). For radioactive decay, \(k < 0\) and the half-life is \(T_{1/2} = (\ln 2)/|k|\).
Newton’s Law of Cooling. \(T'(t) = k(T - T_e)\) (temperature difference decays exponentially). Solution: \(T(t) = T_e + (T_0 - T_e)e^{kt}\), \(k < 0\).
Logistic Growth. The equation \(P' = kP(1 - P/M)\), with growth rate \(k > 0\) and carrying capacity \(M\), is separable; partial fractions give \[P(t) = \frac{MP_0}{P_0 + (M - P_0)e^{-kt}}.\]As \(t \to \infty\), \(P(t) \to M\). The inflection point occurs at \(P = M/2\), where growth is fastest.
Chapter 6: Sequences and Series
We turn from functions to infinite sums. Intuitively, adding infinitely many numbers can give a finite result — as the geometric series \(1 + 1/2 + 1/4 + \cdots = 2\) shows — but the conditions under which this happens are subtle. The theory of convergence tests organises our knowledge: we learn to compare unknown series against known benchmarks, and to distinguish the more stable absolute convergence from the fragile conditional convergence.
6.1 Sequences and Series
Definition 6.1 (Convergence of a Series). The infinite series \(\sum_{n=1}^\infty a_n\) converges to \(S \in \mathbb{R}\) if the sequence of partial sums \(S_k = \sum_{n=1}^k a_n\) satisfies \(\lim_{k \to \infty} S_k = S\). Otherwise it diverges.
Theorem 6.2 (Geometric Series). The geometric series \(\sum_{n=0}^\infty r^n\) converges to \(\frac{1}{1-r}\) if \(|r| < 1\), and diverges if \(|r| \ge 1\).
Theorem 6.3 (Divergence Test). If \(\sum a_n\) converges, then \(a_n \to 0\). Equivalently, if \(a_n \not\to 0\), the series diverges.
The converse fails: \(\sum 1/n\) diverges despite \(1/n \to 0\).
6.2 Convergence Tests
Theorem 6.4 (Integral Test). Let \(f : [1,\infty) \to \mathbb{R}\) be positive, continuous, and decreasing with \(f(n) = a_n\). Then \(\sum_{n=1}^\infty a_n\) converges if and only if \(\int_1^\infty f(x)\,dx\) converges. Moreover, if the series converges to \(S\), then \(\int_{n+1}^\infty f \le S - S_n \le \int_n^\infty f\).
Theorem 6.5 (\(p\)-Series). The series \(\sum_{n=1}^\infty n^{-p}\) converges if and only if \(p > 1\).
Theorem 6.6 (Comparison Test). Suppose \(0 \le a_n \le b_n\) for all \(n\).
- If \(\sum b_n\) converges, then \(\sum a_n\) converges.
- If \(\sum a_n\) diverges, then \(\sum b_n\) diverges.
Theorem 6.7 (Limit Comparison Test). Suppose \(a_n, b_n > 0\) and \(L = \lim_{n\to\infty} a_n/b_n\).
- If \(0 < L < \infty\): \(\sum a_n\) and \(\sum b_n\) converge or diverge together.
- If \(L = 0\) and \(\sum b_n < \infty\): then \(\sum a_n < \infty\).
Theorem 6.8 (Alternating Series Test). If \(\{a_n\}\) is decreasing and \(a_n \to 0\), then \(\sum_{n=1}^\infty (-1)^{n-1}a_n\) converges. The error in approximating the sum by \(S_k\) satisfies \(|S - S_k| \le a_{k+1}\).
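The error bound of the Alternating Series Test can be verified directly; a sketch for the alternating harmonic series, whose sum is \(\ln 2\):

```python
import math

S = math.log(2)            # the alternating harmonic series sums to ln 2
partial = 0.0
for n in range(1, 101):
    partial += (-1) ** (n - 1) / n
    assert abs(S - partial) <= 1 / (n + 1)   # Theorem 6.8: |S - S_k| <= a_{k+1}

print(partial, S)  # S_100 ≈ 0.68817, ln 2 ≈ 0.69315
```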
Theorem 6.9 (Ratio Test). Let \(L = \lim_{n\to\infty}|a_{n+1}/a_n|\).
- If \(L < 1\): the series converges absolutely.
- If \(L > 1\): the series diverges.
- If \(L = 1\): the test is inconclusive.
6.3 Absolute and Conditional Convergence
Definition 6.10 (Absolute and Conditional Convergence). A series \(\sum a_n\) converges absolutely if \(\sum |a_n| < \infty\). It converges conditionally if \(\sum a_n\) converges but \(\sum |a_n| = \infty\).
Theorem 6.11 (Absolute Convergence Implies Convergence). If \(\sum |a_n| < \infty\), then \(\sum a_n\) converges.
Proof. Write \(a_n = a_n^+ - a_n^-\) as in the integral case. Both \(\sum a_n^+\) and \(\sum a_n^-\) are bounded above by \(\sum |a_n| < \infty\) and are non-negative, hence convergent. Their difference converges. \(\square\)
Theorem 6.12 (Riemann Rearrangement).
- If \(\sum a_n\) converges absolutely, then every rearrangement converges to the same sum.
- If \(\sum a_n\) converges conditionally, then for any \(\alpha \in \mathbb{R} \cup \{\pm\infty\}\), there exists a rearrangement converging to \(\alpha\).
This dramatic theorem, due to Riemann, shows that absolute convergence is the “right” notion of convergence for infinite sums: only then does the sum not depend on the order of terms.
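The conditional case of Riemann's theorem is constructive: greedily add positive terms until the partial sum exceeds the target \(\alpha\), then negative terms until it drops below, and repeat; since the terms tend to zero, the overshoot shrinks and the partial sums converge to \(\alpha\). A sketch for the alternating harmonic series (the function name and term budget are illustrative choices):

```python
def rearrange(alpha, num_terms=100_000):
    """Greedy rearrangement of Σ (-1)^(n-1)/n whose partial sums chase alpha."""
    pos, neg = 1, 2       # next unused odd (positive) and even (negative) denominators
    s = 0.0
    for _ in range(num_terms):
        if s <= alpha:
            s += 1 / pos  # add the next positive term 1, 1/3, 1/5, ...
            pos += 2
        else:
            s -= 1 / neg  # add the next negative term -1/2, -1/4, ...
            neg += 2
    return s

print(rearrange(1.5), rearrange(-2.0))  # ≈ 1.5 and ≈ -2.0
```

The greedy scheme works because both the positive part \(\sum 1/(2k-1)\) and the negative part \(\sum 1/(2k)\) diverge: any target can be reached, which is precisely the content of conditional convergence.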
6.4 Power Series and Radius of Convergence
Definition 6.13 (Power Series). A power series centred at \(a\) is a series of the form \(\sum_{n=0}^\infty c_n(x-a)^n\), where \(x\) is a real variable and \(c_n \in \mathbb{R}\).
Theorem 6.14 (Structure of Convergence). For a power series \(\sum c_n(x-a)^n\), exactly one of the following holds:
- The series converges only at \(x = a\).
- There exists \(R \in (0,\infty)\) such that the series converges absolutely for \(|x-a| < R\) and diverges for \(|x-a| > R\).
- The series converges absolutely for all \(x \in \mathbb{R}\).
Proof sketch. If the series converges at some \(x_1 \ne a\), then the terms \(c_n(x_1-a)^n \to 0\), so they are bounded: \(|c_n| \le M/|x_1-a|^n\). For \(|x-a| < |x_1-a|\), the terms are bounded by \(M\rho^n\) with \(\rho = |x-a|/|x_1-a| < 1\), giving geometric-series convergence. The set of \(x\) for which convergence holds is therefore an interval centred at \(a\). \(\square\)
The number \(R\) is the radius of convergence. It can be computed via the Hadamard formula \(1/R = \limsup_{n\to\infty} |c_n|^{1/n}\), or via the ratio test: if \(\lim |c_{n+1}/c_n| = L\), then \(R = 1/L\).
6.5 Uniform Convergence
Uniform convergence is the key property that allows us to interchange limits with integration and differentiation. It is strictly stronger than pointwise convergence, and power series provide our main supply of uniformly convergent sequences.
Definition 6.15 (Uniform Convergence). A sequence of functions \(f_n : E \to \mathbb{R}\) converges uniformly to \(f : E \to \mathbb{R}\) if for every \(\varepsilon > 0\) there exists \(N\) (independent of \(x\)) such that \(n \ge N\) implies \(|f_n(x) - f(x)| < \varepsilon\) for all \(x \in E\).
The contrast with pointwise convergence is that for pointwise convergence \(N\) may depend on \(x\); for uniform convergence a single \(N\) works everywhere simultaneously.
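The standard example is \(f_n(x) = x^n\): it converges pointwise to \(0\) on \([0,1)\) but not uniformly, since \(\sup_{[0,1)} |f_n| = 1\) for every \(n\); on a compact subinterval \([0, r]\) with \(r < 1\), however, the supremum is \(r^n \to 0\), so convergence is uniform there. A small numerical sketch:

```python
# f_n(x) = x^n on [0, 1): pointwise limit 0, but sup |f_n| = 1 for every n,
# so the convergence is not uniform. On [0, 0.9] the sup is 0.9^n -> 0.
grid = [i / 1000 for i in range(1000)]       # sample points in [0, 1)
for n in (1, 10, 100):
    sup_full = max(x ** n for x in grid)     # stays close to 1 for every n
    sup_compact = 0.9 ** n                   # tends to 0: uniform on [0, 0.9]
    print(f"n={n}: sup on [0,1) grid = {sup_full:.4f}, sup on [0,0.9] = {sup_compact:.2e}")
```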
Theorem 6.16 (Uniform Limit of Continuous Functions is Continuous). If \(f_n\) are continuous on \(E\) and \(f_n \to f\) uniformly, then \(f\) is continuous on \(E\).
Theorem 6.17 (Integration of Uniform Limits). If each \(f_n\) is integrable on \([a,b]\) and \(f_n \to f\) uniformly, then \(f\) is integrable on \([a,b]\) and \(\int_a^b f_n \to \int_a^b f\).
Proof. Integrability of \(f\) follows from the Cauchy criterion (Theorem 1.7): for large \(n\), a partition controlling the oscillation of \(f_n\) also controls that of \(f\). For the limit, \(\left|\int_a^b f_n - \int_a^b f\right| \le \int_a^b |f_n - f| \le (b-a)\sup_x|f_n(x)-f(x)| \to 0\). \(\square\)
Theorem 6.18 (Power Series Converge Uniformly on Compact Subintervals). If \(\sum c_n(x-a)^n\) has radius of convergence \(R > 0\), then for any \(0 < r < R\) the series converges uniformly on \([a-r, a+r]\).
Proof. For \(|x-a| \le r < R\), we have \(|c_n(x-a)^n| \le |c_n|r^n\). Since \(r < R\), the series converges absolutely at \(x = a + r\) (Theorem 6.14), so \(\sum |c_n|r^n < \infty\). The Weierstrass \(M\)-test (with \(M_n = |c_n|r^n\)) then gives uniform convergence. \(\square\)
Chapter 7: Taylor Series
Taylor series are the ultimate tool for approximating functions by polynomials. They connect differentiation, integration, and power series into a single coherent picture. In MATH 148, we go beyond computation: we prove that power series can be differentiated and integrated termwise within the radius of convergence, and we examine the theoretical conditions under which a function equals its Taylor series.
7.1 Taylor and Maclaurin Series
Definition 7.1 (Taylor Series). Suppose \(f\) has derivatives of all orders at \(a\). The Taylor series of \(f\) centred at \(a\) is \(\sum_{n=0}^\infty \frac{f^{(n)}(a)}{n!}(x-a)^n\). When \(a = 0\) this is the Maclaurin series. The \(n\)-th partial sum \(T_{n,a}(x) = \sum_{k=0}^n \frac{f^{(k)}(a)}{k!}(x-a)^k\) is the Taylor polynomial of degree \(n\).
Theorem 7.2 (Taylor's Theorem with Lagrange Remainder). If \(f\) is \((n+1)\)-times differentiable on an open interval containing \(a\) and \(x\), then \(f(x) = T_{n,a}(x) + R_{n,a}(x)\), where \(R_{n,a}(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1}\) for some \(c\) between \(a\) and \(x\).
Proof. Define \(g(t) = f(x) - \sum_{k=0}^n \frac{f^{(k)}(t)}{k!}(x-t)^k - K(x-t)^{n+1}\) where \(K\) is chosen so that \(g(a) = 0\). Clearly \(g(x) = 0\). By Rolle’s Theorem applied to \(g\) on the interval between \(a\) and \(x\), there exists \(c\) with \(g'(c) = 0\). Computing \(g'(t)\) (telescoping cancellation leaves only \(-\frac{f^{(n+1)}(t)}{n!}(x-t)^n + K(n+1)(x-t)^n\)) and setting it to zero yields \(K = \frac{f^{(n+1)}(c)}{(n+1)!}\). \(\square\)
Theorem 7.3 (Convergence of Taylor Series). If \(f\) has derivatives of all orders on an interval \(I\) containing \(a\) and there exists \(M > 0\) with \(|f^{(n)}(x)| \le M\) for all \(n\) and all \(x \in I\), then \(f(x) = \sum_{n=0}^\infty \frac{f^{(n)}(a)}{n!}(x-a)^n\) for all \(x \in I\).
Proof. By Taylor’s Theorem, \(|R_{n,a}(x)| \le M\frac{|x-a|^{n+1}}{(n+1)!} \to 0\) as \(n \to \infty\), since \(r^n/n! \to 0\) for any fixed \(r\). \(\square\)
\[e^x = \sum_{n=0}^\infty \frac{x^n}{n!}, \quad \cos x = \sum_{k=0}^\infty \frac{(-1)^k x^{2k}}{(2k)!}, \quad \sin x = \sum_{k=0}^\infty \frac{(-1)^k x^{2k+1}}{(2k+1)!}.\]For \(|x| \le 1\): \(\ln(1+x) = \sum_{n=1}^\infty \frac{(-1)^{n-1}x^n}{n}\) and \(\arctan x = \sum_{n=0}^\infty \frac{(-1)^n x^{2n+1}}{2n+1}\). Setting \(x=1\) in the arctan series gives Leibniz’s formula \(\pi/4 = 1 - 1/3 + 1/5 - \cdots\).
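The hypothesis of Theorem 7.3 holds for \(\sin\) with \(M = 1\), so the remainder satisfies \(|\sin x - T_n(x)| \le \frac{|x|^{n+1}}{(n+1)!}\). A sketch verifying the bound at \(x = 2\):

```python
import math

def sin_taylor(x, n_terms):
    """Maclaurin partial sum of sin x: sum of (-1)^k x^(2k+1)/(2k+1)!."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n_terms))

x = 2.0
for n_terms in (2, 4, 6):
    degree = 2 * n_terms - 1                 # degree n of the Taylor polynomial used
    err = abs(math.sin(x) - sin_taylor(x, n_terms))
    bound = abs(x) ** (degree + 1) / math.factorial(degree + 1)
    assert err <= bound                      # |sin x - T_n(x)| <= |x|^(n+1)/(n+1)!
    print(f"degree {degree}: error {err:.3e} <= bound {bound:.3e}")
```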
7.2 Termwise Differentiation and Integration
The central analytical fact about power series is that they behave exactly like polynomials with respect to differentiation and integration: we may differentiate or integrate term by term, and the radius of convergence is preserved.
The precise statement: if \(f(x) = \sum_{n=0}^\infty c_n(x-a)^n\) has radius of convergence \(R > 0\), then \(f\) is differentiable on \((a-R, a+R)\) with \(f'(x) = \sum_{n=1}^\infty n\,c_n(x-a)^{n-1}\). The differentiated series also has radius of convergence \(R\).
Proof sketch. Fix \(x_0\) with \(|x_0 - a| < r < R\) and let \(g\) denote the differentiated series. For \(x \ne x_0\) near \(x_0\), \[\frac{f(x) - f(x_0)}{x - x_0} = \sum_{n=1}^\infty c_n\,\frac{(x-a)^n - (x_0-a)^n}{x - x_0}.\]For each \(n\), by the factorisation \(\frac{u^n - v^n}{u-v} = u^{n-1}+u^{n-2}v+\cdots+v^{n-1}\) (with \(u = x-a\), \(v = x_0-a\)), the \(n\)-th summand converges to \(n\,c_n(x_0-a)^{n-1}\) as \(x \to x_0\). The key step is that this convergence is dominated uniformly in \(n\) by a convergent series (each summand is bounded by \(n|c_n|r^{n-1}\) on \([a-r,a+r]\)), allowing us to interchange the limit with the sum. Hence \(f'(x_0) = g(x_0)\). \(\square\)
Repeated application shows \(f\) has derivatives of all orders, and evaluating \(f^{(k)}(a)\) gives \(k!\,c_k\), confirming that the coefficients must be \(c_k = f^{(k)}(a)/k!\) — the Taylor coefficients. In other words, if a function has a power series representation, that representation must be its Taylor series.
Termwise integration holds as well: if \(f(x) = \sum_{n=0}^\infty c_n(x-a)^n\) for \(|x-a| < R\), then \(\int_a^x f(t)\,dt = \sum_{n=0}^\infty \frac{c_n}{n+1}(x-a)^{n+1}\) for \(|x-a| < R\).
Proof. The integrated series has the same radius of convergence \(R\) (check via the Hadamard formula). By Theorem 6.17, we may integrate the uniformly convergent partial sums termwise on any \([a-r, a+r]\) with \(r < R\), and the limit of the integral equals the integral of the limit. \(\square\)
7.3 Applications
Termwise integration of \(e^{-t^2} = \sum_{n=0}^\infty \frac{(-1)^n t^{2n}}{n!}\) gives \[\int_0^x e^{-t^2}\,dt = \sum_{n=0}^\infty \frac{(-1)^n x^{2n+1}}{n!(2n+1)}.\]This series converges for all \(x\) and gives the error function \(\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt\) to any desired precision.
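The series is practical for computation; a sketch comparing the truncated series (scaled by \(2/\sqrt{\pi}\)) against Python's built-in `math.erf`:

```python
import math

def erf_series(x, n_terms=30):
    """erf(x) = (2/√π) · Σ (-1)^n x^(2n+1) / (n! (2n+1)), truncated."""
    s = sum((-1) ** n * x ** (2 * n + 1) / (math.factorial(n) * (2 * n + 1))
            for n in range(n_terms))
    return 2 / math.sqrt(math.pi) * s

for x in (0.5, 1.0, 2.0):
    print(x, erf_series(x), math.erf(x))  # the two columns agree
```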
Substituting \(x = i\theta\) into the exponential series and grouping real and imaginary parts yields Euler's formula \[e^{i\theta} = \cos\theta + i\sin\theta.\]This is the starting point for the theory of complex power series.
Brief Note on Complex Power Series. A power series \(\sum_{n=0}^\infty c_n(z-a)^n\) with \(c_n, a, z \in \mathbb{C}\) converges in a disk \(|z-a| < R\) in the complex plane, where \(R\) is again given by the Hadamard formula. Within this disk, all the same termwise differentiation and integration theorems hold, and the sum defines an analytic (holomorphic) function. The real and imaginary parts of a complex power series satisfy the Cauchy–Riemann equations. This is the entry point to PMATH 352 (Complex Analysis).
Appendix: Vector-Valued Functions and Curves
We close with a brief treatment of vector-valued functions, which gives a coordinate-free language for parametric curves and unifies arc length, velocity, and tangent vectors.
Definition A.1 (Vector-Valued Function). A vector-valued function \(\vec{r} : I \to \mathbb{R}^2\) assigns to each \(t \in I\) a vector \(\vec{r}(t) = (x(t), y(t))\). The range of \(\vec{r}\) is called a parametric curve. The function is continuous (resp. differentiable) at \(t_0\) if both component functions are.
In a physical context, \(\vec{r}'(t)\) is the velocity vector and \(\|\vec{r}'(t)\|\) is the speed. The curve is smooth at \(t_0\) if \(\vec{r}'(t_0) \ne (0,0)\), in which case \(\vec{r}'(t_0)\) is tangent to the curve.
The arc length formula \(L = \int_a^b \|\vec{r}'(t)\|\,dt\) is justified by polygonal approximation. Partition \([a,b]\) by \(P = \{t_0, \ldots, t_n\}\) with \(\Delta t_i = t_i - t_{i-1}\). By the MVT, \(x(t_i)-x(t_{i-1}) = x'(\xi_i)\Delta t_i\) and \(y(t_i)-y(t_{i-1}) = y'(\eta_i)\Delta t_i\) for some \(\xi_i, \eta_i\) in the subinterval. Since \(\|\vec{r}'\|\) is continuous, as \(\|P\| \to 0\) the resulting Riemann-type sums converge to the integral. A careful argument using uniform continuity makes this precise.
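The polygonal approximation can be watched converging. A sketch for the unit circle \(\vec{r}(t) = (\cos t, \sin t)\), \(t \in [0, 2\pi]\), whose length is \(\int_0^{2\pi} \|\vec{r}'(t)\|\,dt = 2\pi\):

```python
import math

def polygonal_length(n):
    """Length of the inscribed polygon with vertices r(t_i), r(t) = (cos t, sin t)."""
    ts = [2 * math.pi * i / n for i in range(n + 1)]
    return sum(math.dist((math.cos(ts[i - 1]), math.sin(ts[i - 1])),
                         (math.cos(ts[i]), math.sin(ts[i])))
               for i in range(1, n + 1))

for n in (6, 24, 96, 384):
    print(n, polygonal_length(n))  # increases toward 2π ≈ 6.2832
```

The polygonal lengths increase with refinement, exactly as the supremum definition of curve length predicts.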
Remark A.4 (Arc Length Parametrisation). The arc length function \(s(t) = \int_a^t \|\vec{r}'(\tau)\|\,d\tau\) has \(s'(t) = \|\vec{r}'(t)\| > 0\) whenever the curve is smooth. Hence \(s\) is strictly increasing and can be inverted: reparametrising \(\vec{r}\) by arc length yields a curve with speed identically 1, the natural geometric parametrisation. The curvature \(\kappa = \|d\hat{T}/ds\|\) (rate of turning of the unit tangent with respect to arc length) is then a geometric invariant independent of parametrisation.
These notes are intended to accompany the lectures of MATH 148 and should be read alongside a textbook such as Spivak’s Calculus or Apostol’s Calculus, Vol. 1. Problem sets are the primary vehicle for internalising the material; reading without problem-solving is insufficient.