ECE 313: Digital Signal Processing
Zhou Wang
Estimated study time: 1 hr 26 min
These notes provide a rigorous, self-contained treatment of digital signal processing at the level of a third- or fourth-year electrical and computer engineering course. The exposition follows the logical arc of the course: from the algebraic foundations of discrete-time signals and the z-transform, through spectral analysis via the DTFT and DFT, to practical filter design and multirate processing.
Sources and References
The following publicly available references underpin the technical content of these notes.
Oppenheim, A. V. & Schafer, R. W. Discrete-Time Signal Processing, 3rd ed. Prentice Hall, 2010. The standard graduate-level reference for the field; Chapters 2–8 cover z-transform, DTFT, DFT, FFT, and filter design in full rigor.
Proakis, J. G. & Manolakis, D. G. Digital Signal Processing: Principles, Algorithms, and Applications, 4th ed. Pearson, 2006. Comprehensive treatment with detailed examples; especially strong on IIR design and adaptive filters.
Smith, J. O. Mathematics of the Discrete Fourier Transform (DFT). CCRMA, Stanford University. Freely available at ccrma.stanford.edu/~jos/mdft/. Detailed mathematical development of the DFT including spectral leakage, windows, and FFT.
Smith, J. O. Introduction to Digital Filters with Audio Applications. CCRMA, Stanford University. Available at ccrma.stanford.edu/~jos/filters/. Covers z-transform, LTI systems, and filter design with worked examples.
McClellan, J. H., Parks, T. W. & Rabiner, L. R. “A computer program for designing optimum FIR linear phase digital filters.” IEEE Trans. Audio Electroacoust., vol. 21, no. 6, pp. 506–526, 1973. The original Parks–McClellan paper.
MIT OpenCourseWare 6.341: Discrete-Time Signal Processing (open access). Lecture notes and problem sets covering the core ECE 313 syllabus, freely available at ocw.mit.edu.
Vaidyanathan, P. P. Multirate Systems and Filter Banks. Prentice Hall, 1993. The authoritative treatment of polyphase decompositions, noble identities, and perfect-reconstruction filter banks.
Haykin, S. Adaptive Filter Theory, 4th ed. Prentice Hall, 2002. Full development of the LMS algorithm and its variants with convergence proofs.
Chapter 1: Discrete-Time Signals and Systems
1.1 The Discrete-Time Signal Model
A discrete-time signal is a function \(x : \mathbb{Z} \to \mathbb{C}\), written \(x[n]\) for \(n \in \mathbb{Z}\). This abstraction arises naturally whenever a continuous-time signal \(x_a(t)\) is sampled uniformly at period \(T_s\), producing the sequence \(x[n] = x_a(nT_s)\). The integer \(n\) is the sample index; the physical time instant is \(t = nT_s\).
- Unit impulse: \(\delta[n] = 1\) if \(n = 0\), and \(\delta[n] = 0\) otherwise.
- Unit step: \(u[n] = 1\) if \(n \ge 0\), and \(u[n] = 0\) otherwise.
- Real sinusoid: \(x[n] = A\cos(\omega_0 n + \phi)\), where \(\omega_0 \in [0, \pi]\) is the discrete-time frequency in radians per sample.
- Complex exponential: \(x[n] = e^{j\omega_0 n}\). Because \(e^{j(\omega_0 + 2\pi)n} = e^{j\omega_0 n}\) for all \(n \in \mathbb{Z}\), the complex exponential is periodic in \(\omega_0\) with period \(2\pi\). This periodicity is the fundamental difference from the continuous-time case.
Every signal can be decomposed into a weighted sum of shifted impulses:
\[ x[n] = \sum_{k=-\infty}^{\infty} x[k]\,\delta[n - k]. \]This sifting representation is not merely a formula — it is the discrete-time analog of the Dirac-delta sifting property and underpins the entire theory of LTI systems.
1.1.1 Signal Energy and Power
The energy of a signal is
\[ E_x = \sum_{n=-\infty}^{\infty} |x[n]|^2. \]If \(E_x < \infty\) the signal is called an energy signal. For signals with infinite energy the average power
\[ P_x = \lim_{N \to \infty} \frac{1}{2N+1} \sum_{n=-N}^{N} |x[n]|^2 \]may be finite; such signals are power signals. Periodic signals are archetypal power signals; finite-length signals are archetypal energy signals.
1.2 Linear Time-Invariant Systems
A system \(H\) is linear if it obeys superposition:
\[ H\{a\,x_1[n] + b\,x_2[n]\} = a\,H\{x_1[n]\} + b\,H\{x_2[n]\} \]for all sequences \(x_1, x_2\) and scalars \(a, b \in \mathbb{C}\). It is time-invariant if a shift of the input causes an identical shift of the output:
\[ y[n] = H\{x[n]\} \implies H\{x[n - n_0]\} = y[n - n_0] \]for every integer \(n_0\).
The decisive structural consequence of linearity and time-invariance is the convolution representation: if \(h[n] = H\{\delta[n]\}\) denotes the impulse response, then the output for any input is
\[ y[n] = \sum_{k=-\infty}^{\infty} x[k]\,h[n-k] = (x * h)[n]. \]
Convolution is commutative, associative, and distributive over addition. The commutativity \((x * h)[n] = (h * x)[n]\) is not obvious from the definition but follows directly by the substitution \(m = n - k\).
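As a concrete illustration, the convolution sum can be evaluated directly for finite-length sequences; the short pure-Python sketch below (the helper name `conv` is ours, not from any library) also confirms commutativity numerically.

```python
def conv(x, h):
    """Linear convolution of two finite-length sequences (lists)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):        # y[n] = sum_k x[k] h[n - k]
        for m, hm in enumerate(h):
            y[k + m] += xk * hm
    return y

x = [1.0, 2.0, 0.0, -1.0]
h = [0.5, 0.25, 0.125]
y1, y2 = conv(x, h), conv(h, x)       # commutativity: x * h == h * x
```

The output length \(L + M - 1\) and the equality of the two orderings follow directly from the substitution \(m = n - k\) in the convolution sum.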
1.2.1 Properties of LTI Systems
- Causal: \(h[n] = 0\) for all \(n < 0\). Equivalently, the output at time \(n\) depends only on present and past inputs.
- BIBO stable: Every bounded input produces a bounded output. The necessary and sufficient condition is absolute summability of the impulse response: \[ \sum_{n=-\infty}^{\infty} |h[n]| < \infty. \]
- FIR (Finite Impulse Response): \(h[n] = 0\) outside a finite interval \([0, M]\). Equivalently, the system is described by a finite-length convolution sum.
- IIR (Infinite Impulse Response): \(h[n]\) is nonzero for infinitely many \(n\). Typically realized by a linear constant-coefficient difference equation with feedback.
1.2.2 Linear Constant-Coefficient Difference Equations
IIR systems are described by equations of the form
\[ \sum_{k=0}^{N} a_k\,y[n-k] = \sum_{k=0}^{M} b_k\,x[n-k], \]with \(a_0 = 1\) by convention. The system function (transfer function) is obtained by taking the z-transform of both sides, leading directly to Chapter 2.
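With \(a_0 = 1\), the difference equation can be run directly as a recursion for \(y[n]\). A minimal sketch, assuming zero initial conditions (the function name `lccde_filter` is ours):

```python
def lccde_filter(b, a, x):
    """Direct implementation of sum_k a[k] y[n-k] = sum_k b[k] x[n-k],
    assuming a[0] == 1 and zero initial conditions."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc)
    return y

# First-order IIR y[n] = x[n] + 0.5 y[n-1]: impulse response is 0.5**n
h = lccde_filter([1.0], [1.0, -0.5], [1.0, 0.0, 0.0, 0.0])
```

Feeding in a unit impulse recovers the impulse response \(0.5^n u[n]\), illustrating why a feedback term makes the response infinite in extent.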
Chapter 2: The Z-Transform
2.1 Definition and Region of Convergence
The z-transform of a sequence \(x[n]\) is
\[ X(z) = \sum_{n=-\infty}^{\infty} x[n]\,z^{-n}, \]where \(z \in \mathbb{C}\) is a complex variable. The region of convergence (ROC) is the set of \(z\) for which the sum converges absolutely.
The ROC has the form of an annulus \(R_- < |z| < R_+\) in the complex plane. The inner and outer radii are determined by the growth rate of \(x[n]\) for \(n \to -\infty\) and \(n \to +\infty\) respectively. Key geometric facts:
- For a right-sided sequence (zero for \(n < N_1\)), the ROC is the exterior of a circle: \(|z| > R_-\).
- For a left-sided sequence (zero for \(n > N_2\)), the ROC is the interior of a circle: \(|z| < R_+\).
- For a two-sided sequence, the ROC is an annulus (possibly empty).
- For a finite-length sequence, the ROC is the entire complex plane except possibly \(z = 0\) or \(z = \infty\).
2.1.1 Common Z-Transform Pairs
The following pairs are derived by direct summation of geometric series.
\[ \delta[n] \;\longleftrightarrow\; 1, \quad \text{ROC: all } z. \]\[ u[n] \;\longleftrightarrow\; \frac{1}{1 - z^{-1}}, \quad |z| > 1. \]\[ a^n u[n] \;\longleftrightarrow\; \frac{1}{1 - az^{-1}}, \quad |z| > |a|. \]\[ -a^n u[-n-1] \;\longleftrightarrow\; \frac{1}{1 - az^{-1}}, \quad |z| < |a|. \]The last two entries demonstrate that the same rational function corresponds to two different sequences depending on the ROC.
2.2 Properties of the Z-Transform
- Linearity: \(ax[n] + by[n] \leftrightarrow aX(z) + bY(z)\).
- Time shift: \(x[n - n_0] \leftrightarrow z^{-n_0} X(z)\).
- Multiplication by \(a^n\): \(a^n x[n] \leftrightarrow X(z/a)\), ROC scaled by \(|a|\).
- Time reversal: \(x[-n] \leftrightarrow X(1/z)\), ROC inverted.
- Convolution: \((x * h)[n] \leftrightarrow X(z)\,H(z)\).
- Differentiation in z: \(n\,x[n] \leftrightarrow -z\,\frac{dX}{dz}(z)\).
The convolution property is the algebraic engine of LTI analysis: convolution in time becomes multiplication in the z-domain.
2.3 System Function and Poles/Zeros
For an LTI system described by the difference equation \(\sum_k a_k y[n-k] = \sum_k b_k x[n-k]\), the system function is the rational function
\[ H(z) = \frac{\sum_{k=0}^{M} b_k\,z^{-k}}{\sum_{k=0}^{N} a_k\,z^{-k}} = \frac{B(z^{-1})}{A(z^{-1})}. \]Multiplying numerator and denominator by \(z^{\max(M,N)}\) yields a ratio of polynomials in \(z\). The roots of \(B(z)\) are the zeros of \(H\); the roots of \(A(z)\) are the poles. The pole-zero plot in the \(z\)-plane is the primary tool for understanding the frequency-selective behavior of a filter.
2.4 Inverse Z-Transform
Three methods are standard:
Partial-fraction expansion. Express \(X(z)\) as a sum of simple terms, each corresponding to a known transform pair (with ROC consistent with the stated ROC). For a simple pole at \(z = a\):
\[ \frac{A}{1 - az^{-1}} \longleftrightarrow \begin{cases} A\,a^n\,u[n] & \text{if } |z| > |a|, \\ -A\,a^n\,u[-n-1] & \text{if } |z| < |a|. \end{cases} \]Long division. Dividing numerator by denominator as power series in \(z^{-1}\) yields the causal sequence \(x[0], x[1], x[2], \ldots\) directly. This is particularly useful for computing the first few terms.
Contour integral. By the theory of complex analysis, \(x[n] = \frac{1}{2\pi j}\oint_C X(z)\,z^{n-1}\,dz\) where \(C\) is any counterclockwise closed contour encircling the origin and lying within the ROC. Application of the residue theorem recovers the partial-fraction result.
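The long-division method is easy to mechanize. A sketch, assuming a causal ROC and \(a_0 \ne 0\) (the helper name `power_series` is ours), that recovers the first few samples of the inverse transform of a rational \(X(z)\):

```python
def power_series(b, a, num_terms):
    """Long division of B(z^-1)/A(z^-1) as a power series in z^-1,
    returning x[0], x[1], ... (causal ROC assumed, a[0] != 0)."""
    b = list(b) + [0.0] * num_terms   # working remainder
    x = []
    for n in range(num_terms):
        coeff = b[n] / a[0]
        x.append(coeff)
        # subtract coeff * z^-n * A(z^-1) from the remainder
        for k in range(1, len(a)):
            if n + k < len(b):
                b[n + k] -= coeff * a[k]
    return x

# 1/(1 - 0.5 z^-1) with |z| > 0.5  ->  x[n] = 0.5**n
terms = power_series([1.0], [1.0, -0.5], 5)
```

For \(X(z) = 1/(1 - 0.5z^{-1})\) the series reproduces \(x[n] = 0.5^n\), matching the transform pair in Section 2.1.1.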
Chapter 3: The Discrete-Time Fourier Transform
3.1 Definition and Convergence
The DTFT of a sequence \(x[n]\) is
\[ X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x[n]\,e^{-j\omega n}. \]The inverse DTFT recovers the sequence:
\[ x[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} X(e^{j\omega})\,e^{j\omega n}\,d\omega. \]The DTFT converges if \(x[n]\) is absolutely summable (\(\ell^1\)). In \(\ell^2\), convergence holds in the mean-square sense with the Parseval relation
\[ \sum_{n=-\infty}^{\infty} |x[n]|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |X(e^{j\omega})|^2\,d\omega. \]The DTFT is the z-transform evaluated on the unit circle: \(X(e^{j\omega}) = X(z)\big|_{z = e^{j\omega}}\). This is valid only when the unit circle lies inside the ROC — exactly the stability condition.
3.1.1 Periodicity and Symmetry
\(X(e^{j\omega})\) is periodic in \(\omega\) with period \(2\pi\). This is the discrete-time counterpart of the discrete (line) spectrum of periodic continuous-time signals, but here the roles are exchanged: a discrete (sampled) time domain gives a continuous, periodic frequency domain.
For real sequences, \(X(e^{-j\omega}) = X^*(e^{j\omega})\) (conjugate symmetry), which implies \(|X(e^{j\omega})|\) is even and \(\angle X(e^{j\omega})\) is odd.
3.2 Properties of the DTFT
The DTFT inherits linearity, time-shift (\(x[n-n_0] \leftrightarrow e^{-j\omega n_0} X(e^{j\omega})\)), frequency-shift (\(e^{j\omega_0 n} x[n] \leftrightarrow X(e^{j(\omega - \omega_0)})\)), and convolution (\((x * h)[n] \leftrightarrow X(e^{j\omega})\,H(e^{j\omega})\)) properties analogous to those of the z-transform.
Two properties deserve emphasis. First, multiplication in time corresponds to periodic convolution in frequency:
\[ x[n]\,w[n] \;\longleftrightarrow\; \frac{1}{2\pi}\int_{-\pi}^{\pi} X(e^{j\theta})\,W(e^{j(\omega-\theta)})\,d\theta, \]the mechanism behind windowing in Chapter 5. Second, the generalized Parseval relation states
\[ \sum_{n=-\infty}^{\infty} x[n]\,y^*[n] = \frac{1}{2\pi}\int_{-\pi}^{\pi} X(e^{j\omega})\,Y^*(e^{j\omega})\,d\omega. \]With \(y = x\) this gives the energy identity \(E_x = \frac{1}{2\pi}\int_{-\pi}^{\pi} |X(e^{j\omega})|^2\,d\omega\). The quantity \(S_{xx}(\omega) = |X(e^{j\omega})|^2\) is the energy spectral density.
3.3 Frequency Response of LTI Systems
For a stable LTI system with impulse response \(h[n]\), the DTFT \(H(e^{j\omega})\) is the frequency response. The output to the input \(x[n]\) satisfies
\[ Y(e^{j\omega}) = H(e^{j\omega})\,X(e^{j\omega}). \]The frequency response admits the polar decomposition \(H(e^{j\omega}) = |H(e^{j\omega})|\,e^{j\angle H(e^{j\omega})}\), where \(|H(e^{j\omega})|\) is the magnitude response (gain) and \(\angle H(e^{j\omega})\) is the phase response.
The group delay is defined as
\[ \tau(\omega) = -\frac{d}{d\omega}\,\angle H(e^{j\omega}). \]It measures the delay (in samples) experienced by a narrowband signal centered at \(\omega\). A linear-phase filter has constant group delay, so all frequency components are delayed equally — a desirable property for applications requiring undistorted pulse shapes.
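For a finite impulse response these quantities are directly computable. A sketch evaluating the DTFT of a length-5 moving average, a symmetric filter with \(M = 4\), whose group delay should be \(M/2 = 2\) samples:

```python
import cmath

def freq_response(h, w):
    """DTFT of a finite impulse response at frequency w (rad/sample)."""
    return sum(hn * cmath.exp(-1j * w * n) for n, hn in enumerate(h))

h = [0.2] * 5                         # 5-tap moving average, M = 4
H0 = freq_response(h, 0.0)            # DC gain: should be 1
H1 = freq_response(h, 0.3)
# linear phase: angle(H) = -w*M/2, so the delay is M/2 = 2 samples
delay = -cmath.phase(H1) / 0.3
```

The constant ratio \(-\angle H(e^{j\omega})/\omega = M/2\) (wherever \(\tilde{H}(\omega) > 0\)) is exactly the linear-phase property developed in Chapter 7.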
Chapter 4: Sampling Theory and Digital Processing of Analog Signals
4.1 The Sampling Theorem
The bridge between continuous-time and discrete-time processing is uniform sampling. Let \(x_a(t)\) be a continuous-time signal with Fourier transform \(X_a(\Omega)\), and suppose \(x_a\) is bandlimited: \(X_a(\Omega) = 0\) for \(|\Omega| > \Omega_N\).
The sampling theorem states: if \(\Omega_s = 2\pi/T_s > 2\Omega_N\), then \(x_a(t)\) is exactly recoverable from its samples by sinc interpolation:
\[ x_a(t) = \sum_{n=-\infty}^{\infty} x[n]\,\text{sinc}\!\left(\frac{t - nT_s}{T_s}\right), \]where \(\text{sinc}(t) = \sin(\pi t)/(\pi t)\).
The condition \(\Omega_s > 2\Omega_N\) is the Nyquist criterion. The frequency \(\Omega_N\) is the Nyquist frequency (not to be confused with the Nyquist rate \(2\Omega_N\)).
The relationship between discrete-time frequency \(\omega\) and analog frequency \(\Omega\) is
\[ \omega = \Omega\,T_s = \frac{2\pi\Omega}{\Omega_s}. \]The normalized Nyquist frequency \(\omega = \pi\) corresponds to half the sampling rate \(\Omega_s/2\).
4.1.1 Aliasing
When the Nyquist criterion is violated (\(\Omega_s \le 2\Omega_N\)), the copies of \(X_a(\Omega)\) centered at multiples of \(\Omega_s\) overlap, causing aliasing. The aliased spectrum of the sampled sequence is
\[ X(e^{j\omega}) = \frac{1}{T_s}\sum_{k=-\infty}^{\infty} X_a\!\left(\frac{\omega - 2\pi k}{T_s}\right). \]Aliasing is irreversible: once samples are taken at too low a rate, the information lost cannot be recovered. The remedy is to apply an analog anti-aliasing filter (lowpass with cutoff \(\Omega_s/2\)) before sampling.
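Aliasing can be seen numerically: two sinusoids whose frequencies differ by the sampling rate produce identical samples. A sketch with an assumed rate of 8 Hz:

```python
import math

fs = 8.0                       # sampling rate in Hz (assumed for illustration)
f1, f2 = 1.0, 1.0 + fs         # the 9 Hz tone aliases onto the 1 Hz tone
n = range(16)
x1 = [math.cos(2 * math.pi * f1 * k / fs) for k in n]
x2 = [math.cos(2 * math.pi * f2 * k / fs) for k in n]
max_diff = max(abs(a - b) for a, b in zip(x1, x2))
```

Since \(\cos(2\pi(f + f_s)k/f_s) = \cos(2\pi f k/f_s + 2\pi k)\), the two sample sequences agree exactly; no processing of the samples can tell the tones apart.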
4.2 Quantization
Practical analog-to-digital conversion also involves quantization: rounding the sampled amplitude to the nearest representable value. For a uniform \(b\)-bit quantizer with full-scale range \([-X_{\max}, X_{\max}]\), the step size is
\[ \Delta = \frac{2X_{\max}}{2^b}. \]Under the standard model — quantization error is uniformly distributed on \((-\Delta/2, \Delta/2)\) and uncorrelated with the signal — the quantization noise power is
\[ \sigma_q^2 = \frac{\Delta^2}{12} = \frac{X_{\max}^2}{3 \cdot 2^{2b}}. \]For a sinusoid with amplitude \(X_{\max}\), the signal power is \(X_{\max}^2/2\), and the signal-to-quantization-noise ratio is
\[ \text{SQNR} = \frac{X_{\max}^2/2}{X_{\max}^2/(3 \cdot 2^{2b})} = \frac{3}{2}\cdot 2^{2b}. \]In decibels: \(\text{SQNR} \approx 6.02\,b + 1.76\) dB. Each additional bit of resolution adds approximately 6 dB of SQNR — one of the most useful rules of thumb in DSP engineering.
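The 6 dB-per-bit rule can be checked by simulation. A sketch quantizing a full-scale sinusoid with an assumed \(b = 10\) bits (the mid-tread quantizer model and all names here are ours):

```python
import math

def quantize(x, b, xmax):
    """Uniform b-bit mid-tread quantizer over [-xmax, xmax]."""
    delta = 2 * xmax / (2 ** b)
    q = round(x / delta) * delta
    return max(-xmax, min(xmax, q))

b, N = 10, 10000
sig = [math.sin(2 * math.pi * 0.123 * n) for n in range(N)]  # amplitude 1
err = [quantize(s, b, 1.0) - s for s in sig]
p_sig = sum(s * s for s in sig) / N
p_err = sum(e * e for e in err) / N
sqnr_db = 10 * math.log10(p_sig / p_err)
# rule of thumb predicts about 6.02*b + 1.76 = 61.96 dB
```

The non-harmonic frequency 0.123 cycles/sample keeps the quantization error well spread, so the uniform-noise model holds to within a fraction of a dB.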
Chapter 5: The Discrete Fourier Transform
5.1 Motivation: From DTFT to DFT
The DTFT \(X(e^{j\omega})\) is a continuous function of frequency and cannot be directly computed. For computational purposes we need a finite, discrete representation. The Discrete Fourier Transform (DFT) is obtained by sampling the DTFT at \(N\) uniformly spaced frequencies on the unit circle.
The \(N\)-point DFT of a length-\(N\) sequence \(x[n]\) is
\[ X[k] = \sum_{n=0}^{N-1} x[n]\,W_N^{kn}, \quad k = 0, 1, \ldots, N-1, \]where \(W_N = e^{-j2\pi/N}\) is the primitive \(N\)th root of unity. The inverse DFT (IDFT) is
\[ x[n] = \frac{1}{N}\sum_{k=0}^{N-1} X[k]\,W_N^{-kn}, \quad n = 0, 1, \ldots, N-1. \]The DFT sample at index \(k\) corresponds to the analog frequency \(\Omega_k = k\,\Omega_s/N\) or the discrete-time frequency \(\omega_k = 2\pi k/N\). The frequency resolution is \(\Delta\omega = 2\pi/N\), which in analog terms is \(\Delta\Omega = \Omega_s/N = 2\pi/(NT_s)\). To improve frequency resolution, one either increases the record length \(N\) (a longer observation) or decreases \(\Omega_s\) (slower sampling).
5.1.1 Relationship to the DTFT
For a length-\(N\) sequence padded with zeros if necessary, the DFT is a sampled version of the DTFT:
\[ X[k] = X(e^{j\omega})\big|_{\omega = 2\pi k/N}. \]This is the key interpretive equation: the DFT gives exact DTFT values at the analysis frequencies, but interpolation is required to estimate the DTFT between those frequencies.
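A direct \(O(N^2)\) evaluation makes the sampling relationship explicit; the helper names below are our own:

```python
import cmath, math

def dft(x):
    """Direct O(N^2) DFT."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dtft(x, w):
    """DTFT of a finite sequence at a single frequency w."""
    return sum(xn * cmath.exp(-1j * w * n) for n, xn in enumerate(x))

x = [1.0, 2.0, 3.0, 4.0]
X = dft(x)
# DFT bin k equals the DTFT sampled at w = 2*pi*k/N
checks = [abs(X[k] - dtft(x, 2 * math.pi * k / len(x))) for k in range(4)]
```

Between the bins \(2\pi k/N\), the DTFT must be recovered by interpolation (or approximated by zero-padding, Section 5.2.1).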
5.2 Circular Convolution
The DFT converts convolution into pointwise multiplication — but only circular convolution, not the linear convolution relevant to filtering.
The \(N\)-point circular convolution of two length-\(N\) sequences is
\[ (x \circledast h)[n] = \sum_{m=0}^{N-1} x[m]\,h[\langle n - m \rangle_N], \quad n = 0, \ldots, N-1, \]where \(\langle m \rangle_N = m \bmod N\).
Circular convolution wraps the output: values that would extend beyond index \(N-1\) are folded back modulo \(N\). This differs from linear convolution unless specific precautions are taken.
5.2.1 Linear Convolution via the DFT: Zero-Padding
To compute the linear convolution of a length-\(L\) sequence \(x[n]\) with a length-\(M\) sequence \(h[n]\) using the DFT, one zero-pads both sequences to length \(N \ge L + M - 1\) before computing DFTs. The inverse DFT of the product \(X[k]\,H[k]\) then yields the linear convolution.
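The zero-padding recipe can be verified with direct DFT/IDFT evaluations (naive \(O(N^2)\) transforms for clarity; all helper names are ours):

```python
import cmath, math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def fft_conv(x, h):
    """Linear convolution via zero-padded DFTs (N >= L + M - 1)."""
    N = len(x) + len(h) - 1
    X = dft(list(x) + [0.0] * (N - len(x)))
    H = dft(list(h) + [0.0] * (N - len(h)))
    return [c.real for c in idft([a * b for a, b in zip(X, H)])]

def direct_conv(x, h):
    y = [0.0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        for m, hm in enumerate(h):
            y[k + m] += xk * hm
    return y

x, h = [1.0, 2.0, 3.0], [1.0, -1.0]
y_fast = fft_conv(x, h)
y_ref = direct_conv(x, h)
```

With \(N = 3 + 2 - 1 = 4\) the circular wrap-around lands entirely in the zero padding, so the two results agree; with \(N = 3\) they would not.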
Zero-padding also increases the apparent frequency resolution of the DFT display (by inserting more sample points between existing DTFT samples), but it does not improve the fundamental spectral resolution, which is determined by the data length.
5.3 Spectral Leakage and Windowing
5.3.1 The Windowing Phenomenon
In practice, the DFT is applied to a finite-length observation of a signal. If the signal is a pure sinusoid \(x[n] = e^{j\omega_0 n}\) and \(\omega_0\) is not exactly one of the DFT analysis frequencies \(2\pi k/N\), the DFT exhibits spectral leakage: energy from the true frequency “leaks” into neighboring bins.
The mechanism is explicit. Applying an \(N\)-point rectangular window is equivalent to multiplying by \(w[n] = 1\) for \(n = 0, \ldots, N-1\) and zero elsewhere. In the frequency domain, multiplication becomes convolution with the DTFT of the window:
\[ W_R(e^{j\omega}) = \sum_{n=0}^{N-1} e^{-j\omega n} = e^{-j\omega(N-1)/2}\,\frac{\sin(\omega N/2)}{\sin(\omega/2)}. \]This is a Dirichlet kernel (discrete sinc function) with a main lobe of width approximately \(4\pi/N\) radians and sidelobes that decay slowly (only as \(1/\omega\)). When the signal frequency does not fall on a DFT bin, the Dirichlet kernel is evaluated off its peak, and its sidelobes spread energy into all DFT bins.
5.3.2 Window Functions
The remedy is to replace the rectangular window with a smoothly tapered window that has smaller sidelobes, at the cost of a wider main lobe.
- Rectangular: \(w_R[n] = 1\). Main-lobe width \(\approx 4\pi/N\); peak sidelobe level \(-13\) dB.
- Hann (Hanning): \(w_H[n] = 0.5 - 0.5\cos(2\pi n/(N-1))\). Peak sidelobe \(-31\) dB; main-lobe width \(\approx 8\pi/N\).
- Hamming: \(w_{Ha}[n] = 0.54 - 0.46\cos(2\pi n/(N-1))\). Peak sidelobe \(-41\) dB; main-lobe width \(\approx 8\pi/N\). The coefficients \(0.54, 0.46\) are chosen to null the first sidelobe.
- Blackman: \(w_B[n] = 0.42 - 0.5\cos(2\pi n/(N-1)) + 0.08\cos(4\pi n/(N-1))\). Peak sidelobe \(-57\) dB; main-lobe width \(\approx 12\pi/N\).
- Kaiser: \(w_K[n] = I_0(\beta\sqrt{1 - (2n/(N-1) - 1)^2})/I_0(\beta)\), where \(I_0\) is the zeroth-order modified Bessel function. The parameter \(\beta\) controls the sidelobe–main-lobe tradeoff continuously.
The fundamental tradeoff is universal: reducing sidelobe level widens the main lobe and hence reduces the ability to resolve two closely spaced frequencies. There is no window that simultaneously achieves arbitrarily small sidelobes and arbitrarily narrow main lobe — this is a consequence of the uncertainty principle.
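The sidelobe figures can be checked numerically by evaluating each window's DTFT magnitude on a dense grid beyond the first null. A rough numerical sketch (the choice \(N = 64\) and the grid density are arbitrary; the helper names are ours):

```python
import cmath, math

def window_dtft_mag(w, omega):
    return abs(sum(wn * cmath.exp(-1j * omega * n) for n, wn in enumerate(w)))

N = 64
rect = [1.0] * N
hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def peak_sidelobe_db(w, first_null):
    """Max sidelobe level (dB) on a grid from the first null to pi."""
    peak = window_dtft_mag(w, 0.0)
    grid = [first_null + k * (math.pi - first_null) / 2000 for k in range(2001)]
    side = max(window_dtft_mag(w, om) for om in grid)
    return 20 * math.log10(side / peak)

rect_psl = peak_sidelobe_db(rect, 2 * math.pi / N)        # expect about -13 dB
hann_psl = peak_sidelobe_db(hann, 4 * math.pi / (N - 1))  # expect about -31 dB
```

The Hann window buys roughly 18 dB of sidelobe suppression at the cost of a main lobe twice as wide, exactly the tradeoff tabulated above.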
Chapter 6: The Fast Fourier Transform
6.1 Computational Complexity of the DFT
The direct implementation of the \(N\)-point DFT requires \(N^2\) complex multiplications and \(N(N-1)\) complex additions — \(O(N^2)\) operations. For \(N = 2^{10} = 1024\) this is over a million complex multiplications. The Fast Fourier Transform (FFT) reduces this to \(O(N\log N)\): approximately \(\frac{N}{2}\log_2 N\) complex multiplications for a radix-2 algorithm. The speedup at \(N = 1024\) is by a factor of about 200.
6.2 Radix-2 Decimation-in-Time FFT
The Cooley–Tukey radix-2 decimation-in-time (DIT) FFT exploits the periodicity and symmetry of \(W_N\). For \(N = 2^m\), split the DFT sum into even-indexed and odd-indexed samples:
\[ X[k] = \sum_{n=0}^{N/2-1} x[2n]\,W_N^{k(2n)} + \sum_{n=0}^{N/2-1} x[2n+1]\,W_N^{k(2n+1)}. \]Since \(W_N^{2kn} = W_{N/2}^{kn}\), each sum is an \(N/2\)-point DFT:
\[ X[k] = G[k] + W_N^k\,H[k], \quad k = 0, \ldots, N-1, \]where \(G[k] = \text{DFT}_{N/2}\{x[2n]\}\) and \(H[k] = \text{DFT}_{N/2}\{x[2n+1]\}\), both extended periodically to length \(N\) (so \(G[k + N/2] = G[k]\), \(H[k + N/2] = H[k]\)).
Using the symmetry \(W_N^{k + N/2} = -W_N^k\), each pair of outputs shares one product: \(X[k] = G[k] + W_N^k H[k]\) and \(X[k + N/2] = G[k] - W_N^k H[k]\) for \(k = 0, \ldots, N/2 - 1\). This two-element computation — one complex multiplication by \(W_N^k\) and two complex additions — is called a butterfly.
Applying the decomposition recursively for \(\log_2 N\) stages yields the complete radix-2 DIT FFT. The total arithmetic count is:
- Complex multiplications: \(\frac{N}{2}\log_2 N\) (exactly, when the trivial twiddle factors \(W_N^0 = 1\) are counted).
- Complex additions: \(N\log_2 N\).
6.2.1 Bit-Reversal Permutation
The recursion rearranges the input in a specific order: the input \(x[n]\) is placed at position given by bit-reversing the binary representation of \(n\). For \(N = 8\):
| Natural order \(n\) | Binary | Bit-reversed | Scrambled index |
|---|---|---|---|
| 0 | 000 | 000 | 0 |
| 1 | 001 | 100 | 4 |
| 2 | 010 | 010 | 2 |
| 3 | 011 | 110 | 6 |
| 4 | 100 | 001 | 1 |
| 5 | 101 | 101 | 5 |
| 6 | 110 | 011 | 3 |
| 7 | 111 | 111 | 7 |
The output is in natural order.
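The complete radix-2 DIT algorithm, with the bit-reversal permutation up front and in-place butterflies, fits in a few lines. A sketch (iterative rather than recursive, checked against a direct DFT; the function names are ours):

```python
import cmath, math

def fft(x):
    """Iterative radix-2 decimation-in-time FFT (len(x) a power of two)."""
    N = len(x)
    bits = N.bit_length() - 1
    # bit-reversal permutation of the input
    a = [x[int(format(n, '0{}b'.format(bits))[::-1], 2)] for n in range(N)]
    size = 2
    while size <= N:
        half = size // 2
        w_step = cmath.exp(-2j * math.pi / size)
        for start in range(0, N, size):
            w = 1.0
            for j in range(half):
                u = a[start + j]
                t = w * a[start + j + half]
                a[start + j] = u + t          # butterfly
                a[start + j + half] = u - t
                w *= w_step
        size *= 2
    return a

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [complex(n) for n in range(8)]
```

For \(N = 8\) the permutation reproduces the scrambled order \(0, 4, 2, 6, 1, 5, 3, 7\) of the table above, and three butterfly stages (\(\log_2 8 = 3\)) produce the output in natural order.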
6.3 Decimation-in-Frequency FFT
An alternative factorization splits the output indices into even and odd, giving the decimation-in-frequency (DIF) algorithm. For \(N = 2^m\):
\[ X[2k] = \sum_{n=0}^{N/2-1} \left(x[n] + x\!\left[n + \frac{N}{2}\right]\right) W_{N/2}^{kn}, \]\[ X[2k+1] = \sum_{n=0}^{N/2-1} \left(x[n] - x\!\left[n + \frac{N}{2}\right]\right) W_N^n\,W_{N/2}^{kn}. \]The structure is complementary: the input is in natural order and the output requires bit-reversal permutation. The arithmetic count is identical to DIT.
6.4 Extensions: Prime Factor and Split-Radix FFT
When \(N\) is not a power of two, prime-factor algorithms (Good–Thomas) decompose the DFT using the Chinese Remainder Theorem when \(N = N_1 N_2\) with \(\gcd(N_1, N_2) = 1\). The split-radix FFT (Yavne, 1968) uses radix-2 for the even-indexed output and radix-4 for the odd-indexed output, achieving the lowest known exact multiply count for power-of-two transforms.
Chapter 7: FIR Filter Design
7.1 Design Specifications
A digital filter is specified by its desired frequency response \(H_d(e^{j\omega})\). For a lowpass filter with passband edge \(\omega_p\), stopband edge \(\omega_s\), passband ripple \(\delta_p\), and stopband attenuation \(\delta_s\), the specifications are
\[ 1 - \delta_p \le |H(e^{j\omega})| \le 1 + \delta_p \quad \text{for } |\omega| \le \omega_p, \]\[ |H(e^{j\omega})| \le \delta_s \quad \text{for } \omega_s \le |\omega| \le \pi. \]The transition band \([\omega_p, \omega_s]\) is a “don’t-care” region. The four parameters \((\omega_p, \omega_s, \delta_p, \delta_s)\) fully specify the design problem.
7.2 Linear-Phase FIR Filters
A length-\((M+1)\) FIR filter has impulse response \(h[n]\) for \(n = 0, \ldots, M\). The system function is a polynomial in \(z^{-1}\) and therefore has no poles other than at \(z = 0\), guaranteeing BIBO stability.
A linear-phase FIR filter satisfies the symmetry condition \(h[n] = \pm h[M - n]\). Symmetric (\(+\)) filters are called Type I (even \(M\)) or Type II (odd \(M\)); antisymmetric (\(-\)) filters are Type III (even \(M\)) or Type IV (odd \(M\)).
Under the symmetry \(h[n] = h[M-n]\), the inner sum satisfies \(\tilde{H}(\omega) = \sum_{n=0}^{M} h[n]\cos(\omega(n - M/2))\), which is a real cosine sum. Therefore \(H(e^{j\omega}) = e^{-j\omega M/2}\tilde{H}(\omega)\) with \(\tilde{H}(\omega)\) real, giving exactly linear phase.
Linear-phase filters introduce constant group delay \(\tau = M/2\) samples at all frequencies — a critical property for audio and communications applications.
7.3 Window Method
The window method is the simplest approach to FIR design.
Step 1. Specify the ideal (infinite-duration) filter. For a lowpass filter with cutoff \(\omega_c\):
\[ h_d[n] = \frac{\omega_c}{\pi}\,\text{sinc}(\omega_c n / \pi) = \frac{\sin(\omega_c n)}{\pi n}, \quad n \ne 0, \quad h_d[0] = \frac{\omega_c}{\pi}. \]Step 2. Multiply by a window of length \(M+1\):
\[ h[n] = h_d\!\left[n - \frac{M}{2}\right] w[n], \quad n = 0, \ldots, M. \]The shift by \(M/2\) centers the ideal response to achieve causality while preserving linear phase.
Step 3. The frequency response is the convolution of the ideal response with the window’s spectrum. The window determines the transition-band width and stopband attenuation.
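The three steps translate directly into code. A sketch for a Hamming-windowed lowpass design (the function name and the particular choices \(M = 32\), \(\omega_c = 0.4\pi\) are ours):

```python
import math

def fir_lowpass(M, wc):
    """Window-method lowpass FIR design, length M+1, M even assumed,
    using a Hamming window of length M+1."""
    h = []
    for n in range(M + 1):
        m = n - M / 2                      # center the ideal response
        ideal = wc / math.pi if m == 0 else math.sin(wc * m) / (math.pi * m)
        w = 0.54 - 0.46 * math.cos(2 * math.pi * n / M)   # Hamming
        h.append(ideal * w)
    return h

h = fir_lowpass(32, 0.4 * math.pi)
# linear phase: the impulse response is symmetric about M/2
sym_err = max(abs(h[n] - h[32 - n]) for n in range(33))
dc_gain = sum(h)                           # H(e^{j0}); should be close to 1
```

Because the window is symmetric and the shifted ideal response is symmetric, the product inherits the Type I symmetry, so linear phase is preserved by construction.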
7.4 Frequency Sampling Method
In the frequency-sampling method, the desired frequency response is specified at the \(N\) DFT frequencies \(H_d[k] = H_d(e^{j2\pi k/N})\), \(k = 0, \ldots, N-1\). The filter impulse response is the IDFT:
\[ h[n] = \frac{1}{N}\sum_{k=0}^{N-1} H_d[k]\,e^{j2\pi kn/N}. \]The resulting filter matches the specification exactly at the sampled frequencies. The behavior between samples depends on interpolation by the DFT basis functions. By optimizing the values of \(H_d[k]\) in the transition band (allowing them to vary freely), one can sometimes achieve better stopband attenuation than the window method.
7.5 Optimal Equiripple Design: Parks–McClellan Algorithm
The Parks–McClellan algorithm finds the FIR filter of given order \(M\) that minimizes the maximum weighted approximation error in the passband and stopband — a minimax optimization problem.
7.5.1 Chebyshev Approximation Theory
For a Type I linear-phase FIR filter of odd length \(M+1\) (even order \(M\)), the frequency response is
\[ H(e^{j\omega}) = e^{-j\omega M/2} P(\omega), \]where
\[ P(\omega) = \sum_{k=0}^{M/2} a_k\cos(k\omega) \]is a real-valued cosine polynomial. The design problem becomes: find the coefficients \(a_k\) that minimize
\[ \max_{\omega \in F} W(\omega)\left|P(\omega) - D(\omega)\right|, \]where \(F = [0, \omega_p] \cup [\omega_s, \pi]\) is the set of constrained frequencies, \(D(\omega)\) is the desired response (1 in the passband, 0 in the stopband), and \(W(\omega)\) is a positive weight function.
The alternation theorem of Chebyshev approximation states that \(P(\omega)\) is the unique minimax solution if and only if the weighted error \(W(\omega)[P(\omega) - D(\omega)]\) attains its maximum magnitude, with alternating sign, at no fewer than \(M/2 + 2\) frequencies in \(F\). This theorem guarantees that the optimal filter has equal ripple height in the passband and stopband (after weighting), which is why the algorithm produces equiripple filters.
7.5.2 The Remez Exchange Algorithm
The Parks–McClellan algorithm implements the Remez exchange iteration to find the optimal alternation points:
- Initialize a trial set of \(M/2 + 2\) alternation frequencies.
- Compute the optimal ripple \(\delta\) and the polynomial \(P(\omega)\) that interpolates the alternation condition at the trial frequencies (using the barycentric form of Lagrange interpolation, which is numerically stable).
- Find the actual extrema of \(W(\omega)[P(\omega) - D(\omega)]\) over \(F\).
- Replace the trial set with the new extremal frequencies.
- Repeat until convergence (typically 5–10 iterations).
The algorithm converges to the unique optimal solution. Compared to the window method, Parks–McClellan filters achieve lower order for given specifications, at the cost of more computational design effort (a one-time cost).
Chapter 8: IIR Filter Design
8.1 Analog Prototype Filters
IIR digital filters are typically designed by transforming a classical analog prototype. The analog prototype is a continuous-time lowpass filter with system function \(H_a(s)\), where \(s = \sigma + j\Omega\) is the Laplace variable.
8.1.1 Butterworth Filters
The \(N\)th-order Butterworth lowpass prototype has magnitude-squared response
\[ |H_a(j\Omega)|^2 = \frac{1}{1 + (\Omega/\Omega_c)^{2N}}, \]where \(\Omega_c\) is the 3 dB cutoff frequency.
Key properties:
- The magnitude response is maximally flat at \(\Omega = 0\): all derivatives of \(|H_a(j\Omega)|^2\) up to order \(2N-1\) vanish at \(\Omega = 0\).
- The response is monotone in both passband and stopband.
- The poles lie on a circle of radius \(\Omega_c\) in the left half of the \(s\)-plane, at angles \(\pi/2 + (2k-1)\pi/(2N)\) for \(k = 1, \ldots, 2N\); the causal stable system takes the \(N\) left-half-plane poles.
The filter order required to meet specifications \(\Omega_p, \Omega_s, \delta_p, \delta_s\) is
\[ N \ge \frac{\log\!\left[\sqrt{\delta_s^{-2} - 1}\,/\,\sqrt{(1-\delta_p)^{-2} - 1}\right]}{\log(\Omega_s/\Omega_p)}. \](Round up to the next integer.)
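The minimum order follows from imposing the passband constraint \(|H_a(j\Omega_p)| \ge 1 - \delta_p\) and the stopband constraint \(|H_a(j\Omega_s)| \le \delta_s\) on the Butterworth magnitude response. A sketch (the function name and the example specification are ours):

```python
import math

def butterworth_order(wp, ws, dp, ds):
    """Minimum Butterworth order meeting passband ripple dp at analog edge wp
    and stopband attenuation ds at analog edge ws (ws > wp)."""
    eps_p = math.sqrt((1 - dp) ** -2 - 1)   # passband deviation parameter
    eps_s = math.sqrt(ds ** -2 - 1)         # stopband deviation parameter
    return math.ceil(math.log(eps_s / eps_p) / math.log(ws / wp))

# Example: delta_p = 0.1, delta_s = 0.01, one-octave transition band
N = butterworth_order(1.0, 2.0, 0.1, 0.01)
```

For these specifications the required order works out to \(N = 8\); a Chebyshev or elliptic design would meet them with a lower order.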
8.1.2 Chebyshev Type I Filters
The Type I Chebyshev filter has equiripple behavior in the passband and monotone rolloff in the stopband:
\[ |H_a(j\Omega)|^2 = \frac{1}{1 + \varepsilon^2\,T_N^2(\Omega/\Omega_p)}, \]where \(T_N(\cdot)\) is the \(N\)th Chebyshev polynomial and \(\varepsilon^2 = (1-\delta_p)^{-2} - 1\). Chebyshev polynomials satisfy the recurrence
\[ T_0(x) = 1, \quad T_1(x) = x, \quad T_N(x) = 2x\,T_{N-1}(x) - T_{N-2}(x), \]and oscillate between \(-1\) and \(1\) for \(|x| \le 1\), growing as \(\cosh(N\cosh^{-1}(x))\) for \(x > 1\). For the same specifications, a Type I Chebyshev filter requires lower order than a Butterworth.
8.1.3 Chebyshev Type II and Elliptic Filters
The Type II Chebyshev filter is equiripple in the stopband and monotone in the passband (complementary to Type I). The elliptic (Cauer) filter achieves equiripple in both passband and stopband, giving the lowest possible order for given specifications — but at the cost of a more complex pole-zero structure involving Jacobi elliptic functions. For a given filter order, the hierarchy of sharpness is:
\[ \text{Butterworth} < \text{Chebyshev I or II} < \text{Elliptic}. \]
8.2 Analog-to-Digital Transformations
8.2.1 Impulse Invariance
The impulse-invariance method constructs a digital filter whose impulse response is a sampled version of the analog prototype’s impulse response:
\[ h[n] = T_s\,h_a(nT_s). \]For an analog prototype with partial-fraction expansion
\[ H_a(s) = \sum_{k=1}^{N} \frac{A_k}{s - s_k}, \]the impulse-invariance mapping gives
\[ H(z) = \sum_{k=1}^{N} \frac{A_k}{1 - e^{s_k T_s}\,z^{-1}}. \]Each pole at \(s_k\) in the \(s\)-plane maps to a pole at \(e^{s_k T_s}\) in the \(z\)-plane. Because sampling the impulse response aliases the frequency response (shifted copies of \(H_a(j\Omega)\) at multiples of \(\Omega_s\) overlap), impulse invariance is appropriate only for prototypes that are essentially bandlimited, such as lowpass and bandpass designs.
8.2.2 Bilinear Transform
The bilinear transform is the standard method for IIR design, avoiding aliasing entirely. It replaces the Laplace variable \(s\) by
\[ s = \frac{2}{T_s}\,\frac{1 - z^{-1}}{1 + z^{-1}}. \]- The left half of the \(s\)-plane (\(\text{Re}(s) < 0\)) maps to the interior of the unit disk in the \(z\)-plane (\(|z| < 1\)).
- The imaginary axis \(s = j\Omega\) maps to the unit circle \(z = e^{j\omega}\).
- The mapping between analog and digital frequencies is the nonlinear (warping) relation: \[ \Omega = \frac{2}{T_s}\tan\!\left(\frac{\omega}{2}\right). \]
Property 2 guarantees no aliasing: the entire analog frequency axis \((-\infty, \infty)\) maps bijectively onto one period \((-\pi, \pi]\) of the digital frequency axis. Property 3 is the price: the frequency axis is nonlinearly compressed (“warped”). High analog frequencies are mapped to digital frequencies near \(\pm\pi\).
8.2.3 Pre-Warping
To compensate for warping, the analog prototype’s critical frequencies (passband and stopband edges) are pre-warped before the prototype is designed. If the digital filter must have a passband edge at \(\omega_p\), the analog prototype is designed with edge
\[ \Omega_p = \frac{2}{T_s}\tan\!\left(\frac{\omega_p}{2}\right). \]After designing \(H_a(s)\) with this analog specification, applying the bilinear substitution \(s \to \frac{2}{T_s}\frac{1-z^{-1}}{1+z^{-1}}\) gives a digital filter \(H(z)\) with the digital passband edge exactly at \(\omega_p\).
Worked example. Design a first-order digital lowpass filter with \(-3\) dB cutoff at \(\omega_p = 0.3\pi\), taking \(T_s = 1\) so that \(2/T_s = 2\).
Pre-warp: \(\Omega_p = 2\tan(0.15\pi) \approx 2 \times 0.5095 = 1.019\) rad/s.
The first-order analog prototype with cutoff \(\Omega_p\) is \(H_a(s) = \Omega_p / (s + \Omega_p)\).
Apply bilinear substitution \(s = 2(1 - z^{-1})/(1 + z^{-1})\):
\[ H(z) = \frac{\Omega_p}{\frac{2(1-z^{-1})}{1+z^{-1}} + \Omega_p} = \frac{\Omega_p(1 + z^{-1})}{(2 + \Omega_p) + (\Omega_p - 2)z^{-1}}. \]With \(\Omega_p \approx 1.019\): numerator \(\approx 1.019(1 + z^{-1})\), denominator \(\approx 3.019 + (-0.981)z^{-1}\).
Normalizing: \(H(z) = \frac{0.3376(1 + z^{-1})}{1 - 0.3249\,z^{-1}}\). The \(-3\) dB point is exactly at \(\omega = 0.3\pi\).
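The worked design can be reproduced in a few lines; by the exactness of pre-warping, the magnitude at \(\omega = 0.3\pi\) comes out as \(1/\sqrt{2}\) (the variable names below are ours):

```python
import cmath, math

# First-order lowpass via the bilinear transform with pre-warping (T_s = 1).
wc = 0.3 * math.pi                    # desired digital -3 dB cutoff
Wp = 2 * math.tan(wc / 2)             # pre-warped analog cutoff
# H_a(s) = Wp / (s + Wp) with s = 2(1 - z^-1)/(1 + z^-1) gives:
b0 = Wp / (2 + Wp)                    # gain multiplying (1 + z^-1)
a1 = (Wp - 2) / (2 + Wp)              # coefficient of z^-1 in the denominator

def H(z):
    return b0 * (1 + 1 / z) / (1 + a1 / z)

gain_at_cutoff = abs(H(cmath.exp(1j * wc)))   # should be 1/sqrt(2)
```

The coefficients match the normalized form above (\(b_0 \approx 0.3376\), pole at \(z \approx 0.3249\)), and the \(-3\) dB point lands exactly where specified.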
Chapter 9: Multirate Signal Processing
9.1 Downsampling (Decimation)
Downsampling by an integer factor \(M\) produces \(y[n] = x[Mn]\). That is, one in every \(M\) samples is retained and the rest are discarded.
In the frequency domain, downsampling causes the spectrum to be aliased. If \(X(e^{j\omega})\) is the DTFT of \(x[n]\), the DTFT of \(y[n] = x[Mn]\) is
\[ Y(e^{j\omega}) = \frac{1}{M}\sum_{k=0}^{M-1} X\!\left(e^{j(\omega - 2\pi k)/M}\right). \]This is the sum of \(M\) shifted and scaled copies of the original spectrum, each stretched by a factor of \(M\) in frequency. Aliasing occurs if \(x[n]\) has spectral content above \(\pi/M\). The remedy is to apply a lowpass anti-aliasing filter with cutoff \(\pi/M\) before downsampling; the combination is called a decimator.
9.2 Upsampling (Expansion)
Upsampling by an integer factor \(L\) inserts \(L - 1\) zeros between successive samples: \(x_e[n] = x[n/L]\) when \(L\) divides \(n\), and \(x_e[n] = 0\) otherwise. The DTFT of \(x_e[n]\) is
\[ X_e(e^{j\omega}) = X(e^{jL\omega}). \]The spectrum is stretched by \(L\) in frequency, creating \(L-1\) images of the baseband spectrum at multiples of \(2\pi/L\). Removing these images requires a lowpass filter with cutoff \(\pi/L\), applied after upsampling; the combination is an interpolator.
9.3 Noble Identities
The noble identities allow moving filters through downsamplers and upsamplers, enabling more efficient filter implementations.
- Downsampling noble identity: Filtering by \(H(z^M)\) followed by downsampling by \(M\) equals downsampling by \(M\) followed by filtering by \(H(z)\): \[ H(z^M) \to\ \downarrow M \quad \equiv \quad \downarrow M \to H(z). \]
- Upsampling noble identity: Upsampling by \(L\) followed by filtering by \(H(z^L)\) equals filtering by \(H(z)\) followed by upsampling by \(L\): \[ \uparrow L \to H(z^L) \quad \equiv \quad H(z) \to\ \uparrow L. \]
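Both identities can be confirmed sample-for-sample. A numpy sketch with random data and a random FIR \(h[n]\) (the identities hold for any filter, not just lowpass designs):

```python
import numpy as np

rng = np.random.default_rng(2)

def upsample(v, L):
    """Insert L-1 zeros between samples of v."""
    out = np.zeros(L * len(v))
    out[::L] = v
    return out

def expand_filter(h, L):
    """Coefficients of H(z^L): h with L-1 zeros between taps, no trailing zeros."""
    return upsample(h, L)[: L * (len(h) - 1) + 1]

x = rng.standard_normal(30)      # length divisible by M below
h = rng.standard_normal(5)       # arbitrary FIR h[n]

# Noble identity 1:  H(z^M) -> down(M)   ==   down(M) -> H(z)
M = 3
left = np.convolve(x, expand_filter(h, M))[::M]
right = np.convolve(x[::M], h)
err_down = np.max(np.abs(left - right))

# Noble identity 2:  up(L) -> H(z^L)   ==   H(z) -> up(L)
L = 4
left_u = np.convolve(upsample(x, L), expand_filter(h, L))
right_u = upsample(np.convolve(x, h), L)
err_up = np.max(np.abs(left_u - right_u))
```

In both cases the two paths agree exactly; the computational advantage comes from running the short filter \(H(z)\) at the lower of the two rates.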
The noble identities are the algebraic foundation for the polyphase representation, which allows computationally efficient implementation of decimators and interpolators.
9.4 Polyphase Decomposition
Any filter \(H(z)\) can be decomposed into \(M\) polyphase components:
\[ H(z) = \sum_{k=0}^{M-1} z^{-k}\,E_k(z^M), \]where the \(k\)th polyphase component is
\[ E_k(z) = \sum_{n=0}^{\infty} h[nM + k]\,z^{-n}. \]Each \(E_k(z)\) is a decimated version of the prototype impulse response, shifted by \(k\) samples.
9.4.1 Efficient Decimator
For a decimator (anti-aliasing LPF followed by downsampling by \(M\)), direct implementation requires \(N\) multiplications per input sample (where \(N\) is the filter length). Using the polyphase decomposition and the noble identity, the operations can be reordered: downsample first, then apply the polyphase filters \(E_k(z)\) operating at the output rate \(f_s/M\). The computational cost is now \(N\) multiplications per output sample — a factor of \(M\) reduction.
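A minimal numpy sketch of the polyphase decimator, checking that the branch structure reproduces the direct filter-then-downsample output (random taps stand in for a real anti-aliasing design):

```python
import numpy as np

rng = np.random.default_rng(3)
M = 4
h = rng.standard_normal(16)              # prototype FIR (random taps as a stand-in)
x = rng.standard_normal(64)

# Direct decimator: full-rate filtering, then discard M-1 of every M outputs.
direct = np.convolve(x, h)[::M]

# Polyphase decimator: y[n] = sum_k (e_k * x_k)[n], where
#   e_k[n] = h[nM + k]   (k-th polyphase component of h)
#   x_k[n] = x[nM - k]   (delayed-by-k input phase, at the output rate)
branches = []
for k in range(M):
    ek = h[k::M]
    xk = np.concatenate((np.zeros(k), x))[::M]
    branches.append(np.convolve(xk, ek))
n_out = min(len(b) for b in branches)
poly = sum(b[:n_out] for b in branches)

err = np.max(np.abs(direct[:n_out] - poly))
```

The direct path performs all 16 multiplications per *input* sample and then throws three of every four results away; the polyphase path performs the same 16 multiplications per *output* sample, a factor-of-\(M\) saving.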
9.4.2 Efficient Interpolator
Symmetrically, for an interpolator (upsampling by \(L\) followed by LPF), the polyphase decomposition combined with the noble identity yields implementation at the input rate, again saving a factor of \(L\) in computation.
9.5 Filter Banks
A filter bank is a collection of bandpass filters \(H_k(e^{j\omega})\), \(k = 0, \ldots, K-1\), that partition the frequency band \([0, \pi]\) into \(K\) subbands. In an analysis filter bank, the signal is filtered and downsampled in each subband, producing subband signals at reduced rates. A synthesis filter bank upsamples and filters the subband signals to reconstruct the original.
The key design objective for many applications (audio coding, communications) is perfect reconstruction: the reconstructed signal equals the original (possibly with a delay). The theory of perfect-reconstruction filter banks connects to the theory of wavelets.
9.5.1 Two-Channel QMF Filter Bank
The simplest filter bank uses two channels with a lowpass filter \(H_0(z)\) and a highpass filter \(H_1(z) = H_0(-z)\) (quadrature mirror filter), each followed by downsampling by 2. Under the condition that the aliasing cancels in the synthesis bank and the distortion function \(T(z) = \frac{1}{2}\left[H_0(z)F_0(z) + H_1(z)F_1(z)\right]\) satisfies \(T(z) = cz^{-n_0}\), perfect reconstruction is achieved (up to a delay of \(n_0\) samples and a gain of \(c\)).
Chapter 10: Short-Time Fourier Transform and Spectrogram
10.1 Limitation of the DTFT for Nonstationary Signals
The DTFT reveals the global frequency content of a signal but gives no information about when particular frequencies occur. Many real-world signals — speech, music, radar returns, vibration data — are nonstationary: their spectral content changes over time. The Short-Time Fourier Transform (STFT) addresses this by analyzing the signal in successive short segments.
10.2 The STFT
The STFT of \(x[n]\) with analysis window \(w[n]\) is
\[ X(m, \omega) = \sum_{n=-\infty}^{\infty} x[n + mR]\,w[n]\,e^{-j\omega n}, \]where \(m\) is the frame index and \(R\) is the hop size (number of samples between successive frames).
At each frame \(m\), the STFT multiplies the signal by a shifted copy of the window centered at \(n = mR\) and computes the DTFT of the windowed segment. In practice, the DTFT is evaluated at the DFT frequencies by using an \(N\)-point DFT, giving the discrete-time, discrete-frequency STFT
\[ X[m, k] = \sum_{n=0}^{N-1} x[n + mR]\,w[n]\,e^{-j2\pi kn/N}. \]
10.2.1 Time-Frequency Resolution Tradeoff
The STFT is subject to a fundamental time-frequency uncertainty: a short window gives good time resolution but poor frequency resolution; a long window gives good frequency resolution but poor time resolution. Formally, if \(\Delta t\) denotes the time width of the window and \(\Delta\omega\) denotes the bandwidth of its Fourier transform, then
\[ \Delta t \cdot \Delta\omega \ge \frac{1}{2}, \]with equality achieved by Gaussian windows. This is the Heisenberg–Gabor uncertainty principle for discrete-time signals. The practical consequence is that there is no single “ideal” window length — the choice depends on the signal characteristics.
10.3 The Spectrogram
The spectrogram is the squared magnitude of the STFT, \(S[m, k] = |X[m, k]|^2\). It represents the signal’s energy in the time-frequency plane as a function of frame \(m\) and frequency bin \(k\).
The spectrogram is the standard tool for visualizing speech (showing formants, voiced/unvoiced regions, fricatives), music (showing harmonics and their time evolution), and machinery vibration signals. A variant, the log spectrogram \(10\log_{10}(S[m,k] + \epsilon)\) (in dB), compresses the dynamic range for visual clarity.
10.3.1 Overlap and the OLA Reconstruction
When the hop size \(R < N\), consecutive frames overlap. The signal can be reconstructed from its STFT by the overlap-add (OLA) method, provided the window satisfies the Constant Overlap-Add (COLA) constraint:
\[ \sum_{m=-\infty}^{\infty} w[n - mR] = 1 \quad \text{for all } n. \]The Hann window with hop size \(R = N/2\) and the rectangular window with \(R = N\) satisfy COLA.
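Both claims are easy to verify numerically: the periodic Hann window at hop \(R = N/2\) sums to 1 across overlaps, and windowed frames passed through a DFT/inverse-DFT pair reassemble the signal by overlap-add. A numpy sketch (edge frames, which are only partially covered, are excluded from the checks):

```python
import numpy as np

N, R = 8, 4                                  # window length and hop (R = N/2)
n = np.arange(N)
w = 0.5 - 0.5 * np.cos(2 * np.pi * n / N)    # periodic Hann window

# COLA check: overlapped copies of w at hop R must sum to a constant.
span = np.zeros(10 * R + N)
for m in range(10):
    span[m * R : m * R + N] += w
interior = span[N:-N]                        # ignore partially-covered edges
cola_err = np.max(np.abs(interior - 1.0))

# OLA reconstruction: window frames, transform/inverse, overlap-add.
x = np.random.default_rng(4).standard_normal(len(span))
y = np.zeros_like(x)
for m in range((len(x) - N) // R + 1):
    frame = x[m * R : m * R + N] * w
    spec = np.fft.fft(frame)                 # analysis (N-point DFT of the frame)
    y[m * R : m * R + N] += np.fft.ifft(spec).real   # synthesis + overlap-add
recon_err = np.max(np.abs(y[N:-N] - x[N:-N]))
```

Note the *periodic* Hann definition (denominator \(N\), not \(N-1\)); the symmetric variant does not satisfy COLA exactly at \(R = N/2\).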
Chapter 11: Adaptive Filters and the LMS Algorithm
11.1 The Adaptive Filtering Problem
In many applications the desired filter characteristics are not known in advance or change over time: channel equalization in communications (the channel changes with the environment), noise cancellation in audio (the noise path varies), echo cancellation in telephony. Adaptive filters adjust their coefficients automatically based on an error signal, without requiring prior knowledge of signal statistics.
The filter output is \(y[n] = \mathbf{w}^T[n]\,\mathbf{x}[n]\), where \(\mathbf{w}[n]\) is the length-\(M\) coefficient vector and \(\mathbf{x}[n] = [x[n], x[n-1], \ldots, x[n-M+1]]^T\) is the input vector. Given a desired (reference) signal \(d[n]\), the error is \(e[n] = d[n] - y[n]\). The adaptive filter seeks to minimize some measure of \(|e[n]|^2\) by updating \(\mathbf{w}[n]\).
11.2 The Wiener Filter: Optimal Solution
The theoretically optimal (in the minimum mean-square-error sense) filter coefficients for a stationary environment are given by the Wiener–Hopf equation:
\[ \mathbf{R}_{xx}\,\mathbf{w}^* = \mathbf{r}_{dx}, \]where \(\mathbf{R}_{xx} = E[\mathbf{x}[n]\,\mathbf{x}^T[n]]\) is the input autocorrelation matrix and \(\mathbf{r}_{dx} = E[d[n]\,\mathbf{x}[n]]\) is the cross-correlation vector. The Wiener solution is
\[ \mathbf{w}^* = \mathbf{R}_{xx}^{-1}\,\mathbf{r}_{dx}. \]In a nonstationary environment, \(\mathbf{R}_{xx}\) and \(\mathbf{r}_{dx}\) change over time; moreover, computing \(\mathbf{R}_{xx}^{-1}\) requires \(O(M^3)\) operations. The LMS algorithm circumvents both problems.
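A small numerical illustration, assuming a white input and a hypothetical 4-tap system: estimating \(\mathbf{R}_{xx}\) and \(\mathbf{r}_{dx}\) from data and solving the Wiener–Hopf equation recovers the system’s coefficients (to within estimation noise):

```python
import numpy as np

rng = np.random.default_rng(5)
Nsamp, Mtaps = 200_000, 4
c = np.array([0.9, -0.4, 0.2, 0.1])      # hypothetical system to be matched
x = rng.standard_normal(Nsamp)           # white input
d = np.convolve(x, c)[:Nsamp]            # desired signal: x through the system

# Rows of X are delayed copies of x, so X[i][n] = x[n - i].
X = np.array([np.concatenate((np.zeros(i), x[: Nsamp - i])) for i in range(Mtaps)])
R = X @ X.T / Nsamp                      # estimate of R_xx, shape (M, M)
r = X @ d / Nsamp                        # estimate of r_dx, shape (M,)

w_opt = np.linalg.solve(R, r)            # Wiener solution of R w* = r
err = np.max(np.abs(w_opt - c))
```

For a white input, \(\mathbf{R}_{xx} \approx \sigma_x^2\mathbf{I}\) and \(\mathbf{r}_{dx} \approx \sigma_x^2\mathbf{c}\), so \(\mathbf{w}^* \approx \mathbf{c}\); the residual error here is purely finite-sample estimation noise, of order \(1/\sqrt{N}\).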
11.3 The Least Mean Squares (LMS) Algorithm
The LMS algorithm is a stochastic gradient descent algorithm for minimizing \(E[|e[n]|^2]\). Instead of using the true gradient of the mean-square error, it uses an instantaneous estimate: \(\widehat{\nabla}_{\mathbf{w}} |e[n]|^2 = -2e[n]\,\mathbf{x}[n]\).
The coefficient update is
\[ \mathbf{w}[n+1] = \mathbf{w}[n] + \mu\,e[n]\,\mathbf{x}[n], \]where \(\mu > 0\) is the step size (also called the adaptation constant or learning rate); the factor of 2 from the gradient estimate is absorbed into \(\mu\).
The LMS update requires only \(O(M)\) operations per sample — a dramatic reduction from the \(O(M^3)\) Wiener solution.
11.3.1 Convergence and Stability
The weight vector converges in the mean provided \(0 < \mu < 2/\lambda_{\max}\), where \(\lambda_{\max}\) is the largest eigenvalue of \(\mathbf{R}_{xx}\). The mean-square error converges under the more restrictive condition
\[ 0 < \mu < \frac{1}{\text{tr}(\mathbf{R}_{xx})} = \frac{1}{M\,P_x}, \]where \(P_x = E[|x[n]|^2]\) is the input power. The convergence time constant for the mode corresponding to \(\lambda_k\) is approximately \(\tau_k \approx 1/(2\mu\lambda_k)\) iterations.
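A minimal LMS system-identification sketch (hypothetical 4-tap path, noiseless desired signal), with the step size chosen well inside the bound \(1/(M P_x) = 0.25\) for unit-power input:

```python
import numpy as np

rng = np.random.default_rng(6)
M = 4
c = np.array([0.9, -0.4, 0.2, 0.1])      # hypothetical unknown FIR path
x = rng.standard_normal(20_000)          # unit-power white input, P_x = 1
d = np.convolve(x, c)[: len(x)]          # desired signal (noiseless here)

mu = 0.05                                # step size, well below 1/(M*P_x) = 0.25
w = np.zeros(M)
for n in range(M, len(x)):
    xvec = x[n : n - M : -1]             # [x[n], x[n-1], ..., x[n-M+1]]
    y = w @ xvec                         # filter output
    e = d[n] - y                         # a priori error
    w = w + mu * e * xvec                # LMS update: O(M) per sample

misalignment = np.max(np.abs(w - c))
```

With \(\lambda_k \approx P_x = 1\), the time constant is roughly \(\tau \approx 1/(2\mu) = 10\) iterations, so after 20,000 samples the weights have converged to the path coefficients essentially to machine precision.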
11.3.2 Applications of the LMS Algorithm
Echo cancellation. In a hands-free telephone system, the far-end speaker’s voice, played through the near-end loudspeaker, leaks back into the near-end microphone through a room acoustic path \(h[n]\). The adaptive filter models this path: \(\mathbf{x}[n]\) is the far-end speech (which is available), \(d[n]\) is the microphone signal (near-end speech plus echo), and \(e[n]\) is the echo-cancelled output. The LMS algorithm continuously tracks changes in the room acoustics.
Noise cancellation. A primary microphone captures the desired signal plus additive noise. A reference microphone captures noise only (the reference). The adaptive filter models the transfer function from reference to primary noise, and \(e[n]\) is the noise-cancelled output.
Channel equalization. In digital communications, a transmitted sequence \(x[n]\) passes through a dispersive channel with impulse response \(h[n]\), producing \(r[n] = (x * h)[n] + v[n]\) where \(v[n]\) is additive noise. The equalizer filter \(\mathbf{w}[n]\) attempts to invert \(h\): during a training phase, \(d[n]\) is a known sequence; after training, the decision device provides \(d[n]\) as a “tentative decision.”
Chapter 12: The Discrete Hilbert Transform and Analytic Signals
12.1 Analytic Signals
In continuous time, the analytic signal associated with a real signal \(x_a(t)\) is the complex-valued signal \(x_+(t) = x_a(t) + j\,\hat{x}_a(t)\), where \(\hat{x}_a(t)\) is the Hilbert transform of \(x_a(t)\) (convolution with \(1/(\pi t)\)). The analytic signal has a one-sided spectrum: \(X_+(j\Omega) = 0\) for \(\Omega < 0\).
The discrete-time counterpart constructs a sequence \(x_+[n]\) with a one-sided DTFT: \(X_+(e^{j\omega}) = 0\) for \(\omega \in (-\pi, 0)\).
12.2 The Discrete Hilbert Transform
The ideal discrete Hilbert transformer has impulse response
\[ h_{HT}[n] = \begin{cases} \frac{2}{\pi}\,\frac{\sin^2(\pi n/2)}{n}, & n \neq 0, \\ 0, & n = 0, \end{cases} \]or equivalently in the frequency domain
\[ H_{HT}(e^{j\omega}) = \begin{cases} -j & 0 < \omega < \pi, \\ j & -\pi < \omega < 0, \\ 0 & \omega = 0, \pm\pi. \end{cases} \]The discrete Hilbert transformer is an ideal allpass filter with \(\pm 90^\circ\) phase shift. It is noncausal and must be approximated by a linear-phase FIR filter with appropriate delay. The approximation quality depends on the transition-band width near \(\omega = 0\) and \(\omega = \pi\), where the ideal response has jump discontinuities.
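For block processing, the analytic signal can be formed directly in the DFT domain by zeroing the negative-frequency bins (essentially the method used by scipy.signal.hilbert). A numpy sketch on an exactly periodic cosine, whose Hilbert transform is the corresponding sine:

```python
import numpy as np

N = 256
n = np.arange(N)
x = np.cos(2 * np.pi * 12 * n / N)       # real sinusoid, 12 cycles in N samples

# FFT-based analytic signal: keep DC and Nyquist, double the positive bins,
# zero the negative-frequency half.
X = np.fft.fft(x)
Xa = np.zeros(N, dtype=complex)
Xa[0] = X[0]
Xa[1 : N // 2] = 2 * X[1 : N // 2]
Xa[N // 2] = X[N // 2]
xa = np.fft.ifft(Xa)                     # x_+[n] = x[n] + j*xhat[n]

xhat = xa.imag                           # discrete Hilbert transform of x
err_sin = np.max(np.abs(xhat - np.sin(2 * np.pi * 12 * n / N)))
neg_energy = np.max(np.abs(np.fft.fft(xa)[N // 2 + 1 :]))   # should be ~0
```

The one-sided spectrum check (`neg_energy`) confirms the defining property of the analytic signal; for signals that are not exactly periodic in the block, windowing is needed to control leakage.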
12.2.1 Single-Sideband Modulation
The analytic signal is used in single-sideband (SSB) amplitude modulation. Modulating a real lowpass signal \(x[n]\) onto a carrier at frequency \(\omega_c\), the lower-sideband signal is
\[ x_{LSB}[n] = x[n]\cos(\omega_c n) + \hat{x}[n]\sin(\omega_c n), \]and the upper sideband is obtained by replacing \(+\) with \(-\). SSB reduces the transmission bandwidth by a factor of two compared to double-sideband AM, at the cost of requiring a Hilbert transformer.
Chapter 13: Introduction to Filter Banks and Wavelets
13.1 From Filter Banks to Wavelets
The wavelet transform can be understood as a dyadic filter bank: a cascade of two-channel QMF filter banks operating at successively halved sampling rates. At each stage, the lowpass branch is further split; the highpass branches are the wavelet subband signals.
For \(J\) stages of decomposition, the output consists of \(J\) highpass subband signals (each representing detail at a different scale) and one final lowpass subband (the coarse approximation). The scaling function \(\phi(t)\) and wavelet function \(\psi(t)\) of the continuous-time wavelet transform correspond, respectively, to the synthesis lowpass and highpass filters via the two-scale equations:
\[ \phi(t) = \sqrt{2}\sum_k h_0[k]\,\phi(2t - k), \qquad \psi(t) = \sqrt{2}\sum_k h_1[k]\,\phi(2t - k). \]The discrete wavelet transform (DWT) thus inherits all the polyphase efficiency of the filter bank formulation.
13.2 Perfect Reconstruction Conditions
For a two-channel filter bank with analysis filters \(H_0(z), H_1(z)\) and synthesis filters \(F_0(z), F_1(z)\), perfect reconstruction (up to a delay) requires:
Aliasing cancellation:
\[ H_0(-z)\,F_0(z) + H_1(-z)\,F_1(z) = 0. \]
Distortion-free condition:
\[ H_0(z)\,F_0(z) + H_1(z)\,F_1(z) = 2z^{-n_0}. \]A standard choice is \(F_0(z) = H_1(-z)\) and \(F_1(z) = -H_0(-z)\), which cancels aliasing identically; the distortion-free condition then becomes a design constraint on the analysis filters (Johnston’s QMF designs approximate it). For FIR filters, the requirement of exact linear phase and perfect reconstruction leads to biorthogonal wavelet bases (e.g., the CDF 9/7 wavelet used in JPEG 2000), where analysis and synthesis filters are different but dual to each other.
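These conditions can be verified end-to-end with the Haar filters, the simplest FIR pair achieving perfect reconstruction (here with \(n_0 = 1\) and \(c = 1\)). A numpy sketch of the full analysis/synthesis chain:

```python
import numpy as np

s = 1 / np.sqrt(2)
h0 = np.array([s, s])        # analysis lowpass H0(z) (Haar)
h1 = np.array([s, -s])       # analysis highpass H1(z) = H0(-z)
f0 = np.array([s, s])        # synthesis F0(z) =  H1(-z)
f1 = np.array([-s, s])       # synthesis F1(z) = -H0(-z)

def up2(v):
    """Upsample by 2: insert a zero after every sample."""
    out = np.zeros(2 * len(v))
    out[::2] = v
    return out

x = np.random.default_rng(7).standard_normal(64)

# Analysis bank: filter, then downsample by 2 (keep even-indexed samples).
v0 = np.convolve(x, h0)[::2]
v1 = np.convolve(x, h1)[::2]

# Synthesis bank: upsample by 2, filter, and sum the two branches.
y = np.convolve(up2(v0), f0) + np.convolve(up2(v1), f1)

# Perfect reconstruction: y[n] = x[n - 1]  (delay n0 = 1, gain c = 1).
err = np.max(np.abs(y[1 : len(x) + 1] - x))
```

Each branch alone aliases badly; it is only the sum of the two synthesis branches that cancels the aliased terms and leaves a pure delay.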