ECE 351: Compilers
Werner Dietl
Estimated study time: 4 minutes
Table of contents
Sources and References
Equivalent UW courses — CS 444/644 (Compiler Construction)
Primary textbook — Michael L. Scott and Jonathan Aldrich, Programming Language Pragmatics, 5th ed., Morgan Kaufmann, 2025.
Supplementary references — Charles N. Fischer, Ronald K. Cytron, and Richard J. LeBlanc Jr., Crafting a Compiler, Addison-Wesley, 2010; Andrew W. Appel with Jens Palsberg, Modern Compiler Implementation in Java, 2nd ed., Cambridge University Press, 2004; Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman, Compilers: Principles, Techniques, and Tools, 2nd ed., Pearson, 2006.
Equivalent UW Courses
ECE 351 and CS 444/644 are Waterloo’s two compiler courses, and they cover the same pipeline: lexing, parsing, AST construction, semantic analysis, intermediate representation, optimization, and (in CS 444/644) code generation. CS 444 and CS 644 are a single cross-listed course — same lectures and project, just an undergraduate/graduate number split — famous because its group project implements a subset of Java all the way to x86 assembly. A student who has done ECE 351 has seen the front end and a substantial part of the middle end but stops short of full native code generation.
What This Course Adds Beyond the Equivalents
ECE 351’s project target is a subset of VHDL: the labs build a circuit simulator and circuit synthesizer, so the course doubles as an exercise in hardware-description-language semantics and is a natural pairing with the rest of the computer-engineering curriculum. The implementation language is Java, and the course spends proportionally more time on the front end (lexing, recursive descent, LL(1) proofs, term rewriting, visitor patterns) than CS 444/644 does. What it omits relative to CS 444/644 is native code generation — there is no x86 back end — and the deeper dataflow and SSA machinery the CS course develops for its back-end work.
Topic Summary
Course Introduction
Motivation for studying compilers (they are the canonical example of structured program manipulation) and the high-level pipeline: source, tokens, parse tree, AST, IR, target. Sets the stage for each later topic.
Lexical Analysis
Tokens as regular languages, hand-written lexers vs generators, and the relationship between regular expressions and finite automata. Students write a lexer for the project’s VHDL subset in the first lab.
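Since the labs are in Java, a hand-written lexer of the kind the first lab asks for can be sketched as follows. This is a minimal illustration, not the lab's actual VHDL lexer: the token categories (identifier, number, single-character symbol) and the sample input are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal hand-written lexer: identifiers, integer literals, single-char symbols.
public class TinyLexer {
    public static List<String> lex(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            if (Character.isLetter(c)) {          // identifier: letter (letter|digit)*
                int j = i;
                while (j < src.length() && Character.isLetterOrDigit(src.charAt(j))) j++;
                tokens.add("ID:" + src.substring(i, j));
                i = j;
            } else if (Character.isDigit(c)) {    // integer literal: digit+
                int j = i;
                while (j < src.length() && Character.isDigit(src.charAt(j))) j++;
                tokens.add("NUM:" + src.substring(i, j));
                i = j;
            } else {                              // anything else: one-character symbol
                tokens.add("SYM:" + c);
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lex("x <= a and b;"));
    }
}
```

Each token class corresponds to a regular language, which is why the loop only ever needs the current character to decide which branch to take. A production lexer would additionally separate keywords (like `and`) from identifiers with a lookup table.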
Program Understanding and Transformations
Tree walking with the visitor pattern, AST transformations, and program equivalence checking. This is where the course first forces students to think of programs as data structures that can be rewritten mechanically.
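A minimal sketch of the visitor pattern over an expression AST, with hypothetical class names rather than the project's actual node types. The point is that a new traversal (here, evaluation) is added without modifying any node class:

```java
// A tiny expression AST that accepts generic visitors.
interface Expr { <T> T accept(Visitor<T> v); }

final class Num implements Expr {
    final int value;
    Num(int value) { this.value = value; }
    public <T> T accept(Visitor<T> v) { return v.visit(this); }
}

final class Add implements Expr {
    final Expr left, right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    public <T> T accept(Visitor<T> v) { return v.visit(this); }
}

interface Visitor<T> { T visit(Num n); T visit(Add a); }

// One traversal, written entirely outside the node classes.
class Eval implements Visitor<Integer> {
    public Integer visit(Num n) { return n.value; }
    public Integer visit(Add a) { return a.left.accept(this) + a.right.accept(this); }
}

public class VisitorDemo {
    public static void main(String[] args) {
        Expr e = new Add(new Num(1), new Add(new Num(2), new Num(3)));
        System.out.println(e.accept(new Eval())); // evaluates the whole tree
    }
}
```

A pretty-printer or a rewriting pass would be a second `Visitor` implementation against the same nodes, which is exactly the "programs as data structures" discipline this topic introduces.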
Regular Expressions to DFAs
Thompson construction from regex to NFA, subset construction from NFA to DFA, and DFA minimization. Establishes that lexing can be mechanised completely and efficiently in \( O(n) \) time.
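The core of the subset construction — epsilon-closure plus per-symbol moves over sets of NFA states — can be sketched compactly. Here the DFA states are computed lazily during matching rather than built up front, and the NFA for (a|b)*ab is wired by hand rather than produced by Thompson's construction; state numbers and the `'\0'` epsilon marker are conventions of this sketch.

```java
import java.util.*;

// Lazy subset construction: simulate an NFA by tracking the set of live states.
// The hand-built NFA below recognizes (a|b)*ab; state 3 is its accept state.
public class SubsetConstruction {
    // transitions.get(state).get(symbol) = successor states; '\0' marks epsilon.
    static Map<Integer, Map<Character, Set<Integer>>> nfa = new HashMap<>();

    static void edge(int from, char sym, int to) {
        nfa.computeIfAbsent(from, k -> new HashMap<>())
           .computeIfAbsent(sym, k -> new HashSet<>()).add(to);
    }

    static Set<Integer> closure(Set<Integer> states) {   // epsilon-closure
        Deque<Integer> work = new ArrayDeque<>(states);
        Set<Integer> result = new HashSet<>(states);
        while (!work.isEmpty()) {
            int s = work.pop();
            for (int t : nfa.getOrDefault(s, Map.of()).getOrDefault('\0', Set.of()))
                if (result.add(t)) work.push(t);
        }
        return result;
    }

    static Set<Integer> move(Set<Integer> states, char sym) {
        Set<Integer> result = new HashSet<>();
        for (int s : states)
            result.addAll(nfa.getOrDefault(s, Map.of()).getOrDefault(sym, Set.of()));
        return closure(result);                          // each DFA state is a closed set
    }

    public static boolean matches(String input) {
        Set<Integer> current = closure(Set.of(0));       // DFA start state
        for (char c : input.toCharArray()) current = move(current, c);
        return current.contains(3);                      // accept if an accept state is live
    }

    static {
        edge(0, 'a', 0); edge(0, 'b', 0);   // loop on (a|b)*
        edge(0, 'a', 1); edge(1, 'b', 3);   // then the trailing "ab"
    }

    public static void main(String[] args) {
        System.out.println(matches("bbab") + " " + matches("ba"));
    }
}
```

Precomputing and caching every reachable state set yields the classical DFA, which is what makes single-pass O(n) matching possible.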
LL(1) Parsing and Proofs
Top-down recursive-descent parsing, FIRST and FOLLOW sets, the LL(1) predictive-parsing condition, and proofs that a given grammar is (or is not) LL(1). Lab work introduces Parboiled as a PEG-style parser generator.
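A recursive-descent parser for an LL(1) grammar makes the predictive-parsing condition concrete: one token of lookahead, checked against FIRST (and, for epsilon productions, FOLLOW) sets, selects each production. This sketch uses a textbook expression grammar, not the course's VHDL subset, and treats each character as a token for brevity:

```java
// Recursive descent for the LL(1) grammar
//   E -> T E'      E' -> + T E' | epsilon      T -> num | ( E )
public class Ll1Parser {
    private final String input;   // one char per token: a digit, '+', '(', or ')'
    private int pos = 0;

    Ll1Parser(String input) { this.input = input; }
    private char peek() { return pos < input.length() ? input.charAt(pos) : '$'; }
    private void expect(char c) {
        if (peek() != c) throw new RuntimeException("expected " + c + " at " + pos);
        pos++;
    }

    void parseE() { parseT(); parseEPrime(); }

    void parseEPrime() {              // FIRST(+ T E') = {+}; FOLLOW(E') = {), $}
        if (peek() == '+') { expect('+'); parseT(); parseEPrime(); }
        // otherwise take the epsilon production
    }

    void parseT() {                   // FIRST(T) = {num, (}
        if (peek() == '(') { expect('('); parseE(); expect(')'); }
        else if (Character.isDigit(peek())) pos++;
        else throw new RuntimeException("expected num or ( at " + pos);
    }

    public static boolean accepts(String s) {
        try {
            Ll1Parser p = new Ll1Parser(s);
            p.parseE();
            return p.pos == s.length();   // must consume all input
        } catch (RuntimeException e) { return false; }
    }
}
```

Proving the grammar LL(1) amounts to checking that the FIRST sets of each nonterminal's alternatives are disjoint, and disjoint from FOLLOW wherever epsilon is derivable — exactly the branch conditions written above.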
Grammar Design
How to structure a grammar to be parseable and unambiguous, left-recursion elimination, left-factoring, and operator precedence via grammar levels. Shapes the parser front end of the project.
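The standard illustration of both techniques uses the textbook expression grammar (not the project's VHDL grammar): precedence is encoded by grammar levels, and the left recursion that would send a top-down parser into an infinite loop is rewritten away.

```
expr   -> expr + term | term        (lowest precedence: +)
term   -> term * factor | factor    (binds tighter: *)
factor -> num | ( expr )

After left-recursion elimination, suitable for recursive descent:

expr   -> term expr'
expr'  -> + term expr' | ε
term   -> factor term'
term'  -> * factor term' | ε
factor -> num | ( expr )
```

The layered nonterminals guarantee that `*` groups before `+` without any ambiguity, and the primed nonterminals turn left recursion into right recursion while deriving the same language.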
Optimization
Local and peephole optimizations, common subexpression elimination, constant folding and propagation, and an introduction to dataflow analysis. Implemented as tree transformations over the project IR.
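Constant folding as a bottom-up tree rewrite can be sketched in a few lines; the node classes here are illustrative stand-ins for the project IR, and only `+` and `*` are handled:

```java
// Hypothetical IR nodes: literals, variables, and binary operations.
abstract class Node {}
class Lit extends Node { final int v; Lit(int v) { this.v = v; } }
class Var extends Node { final String name; Var(String name) { this.name = name; } }
class BinOp extends Node {
    final char op; final Node l, r;
    BinOp(char op, Node l, Node r) { this.op = op; this.l = l; this.r = r; }
}

public class ConstFold {
    // Fold children first, then the node itself if both operands became literals.
    static Node fold(Node n) {
        if (!(n instanceof BinOp b)) return n;
        Node l = fold(b.l), r = fold(b.r);
        if (l instanceof Lit a && r instanceof Lit c) {
            switch (b.op) {
                case '+': return new Lit(a.v + c.v);
                case '*': return new Lit(a.v * c.v);
            }
        }
        return new BinOp(b.op, l, r);   // rebuild with the (possibly folded) children
    }

    public static void main(String[] args) {
        // (2 + 3) * 4 folds all the way down to a single literal.
        Node e = new BinOp('*', new BinOp('+', new Lit(2), new Lit(3)), new Lit(4));
        System.out.println(((Lit) fold(e)).v);
    }
}
```

On a mixed tree like `(2 + 3) * x`, only the constant subtree folds and the `Var` node survives, which is why the rewrite must rebuild partially folded nodes rather than assume full evaluation.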
Register Allocation
Live ranges, interference graphs, and graph colouring as the standard formulation of register assignment. Gives students a taste of how the back end chooses machine resources.
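The graph-colouring formulation can be sketched with a greedy colourer. This is deliberately simpler than Chaitin-style simplify-and-spill: it just assigns each live range the lowest register not used by an already-coloured neighbour, and reports a spill when none is free.

```java
import java.util.*;

// Greedy colouring of an interference graph with k registers.
// Vertices are live ranges; an edge means the two ranges are live simultaneously.
public class GraphColouring {
    public static Map<String, Integer> colour(Map<String, Set<String>> graph, int k) {
        Map<String, Integer> reg = new HashMap<>();
        for (String v : graph.keySet()) {
            boolean[] used = new boolean[k];
            for (String n : graph.get(v))            // registers of coloured neighbours
                if (reg.containsKey(n)) used[reg.get(n)] = true;
            int c = 0;
            while (c < k && used[c]) c++;
            if (c == k) throw new IllegalStateException(v + " must be spilled to memory");
            reg.put(v, c);                           // lowest free register wins
        }
        return reg;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> g = new LinkedHashMap<>();
        g.put("a", Set.of("b", "c"));                // a triangle: all three interfere
        g.put("b", Set.of("a", "c"));
        g.put("c", Set.of("a", "b"));
        System.out.println(colour(g, 3));            // needs all three registers
    }
}
```

A production allocator also orders the vertices carefully and chooses which range to spill, since greedy colouring in an arbitrary order can fail on graphs that are in fact k-colourable.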
Garbage Collection
Reachability, mark-and-sweep, copying, and generational collectors; trade-offs between throughput, latency, and memory overhead. Ties runtime support back to language design.
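Mark-and-sweep reduces to two loops once reachability is made explicit; this toy model (heap as a list of objects with reference fields) is an illustration of the algorithm, not of any real runtime's object layout:

```java
import java.util.*;

// Toy mark-and-sweep: mark everything reachable from the roots, sweep the rest.
public class MarkSweep {
    static class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();  // outgoing references
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    static void mark(Obj o) {
        if (o.marked) return;                      // already visited
        o.marked = true;
        for (Obj r : o.refs) mark(r);              // trace outgoing references
    }

    // Returns the names of objects that would be reclaimed.
    public static List<String> collect(List<Obj> roots, List<Obj> heap) {
        for (Obj r : roots) mark(r);               // mark phase: trace from the roots
        List<String> freed = new ArrayList<>();
        for (Obj o : heap) {                       // sweep phase: unmarked = garbage
            if (!o.marked) freed.add(o.name);
            o.marked = false;                      // reset mark bits for the next cycle
        }
        return freed;
    }
}
```

Copying and generational collectors start from the same reachability premise but trade the sweep over the whole heap for evacuation of the live objects, which is where the throughput/latency/memory trade-offs in this topic come from.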