ECE 351: Compilers

Werner Dietl

Estimated study time: 4 minutes


Sources and References

Equivalent UW courses — CS 444/644 (Compiler Construction)

Primary textbook — Michael L. Scott and Jonathan Aldrich, Programming Language Pragmatics, 5th ed., Morgan Kaufmann, 2025.

Supplementary references — Charles N. Fischer, Ronald K. Cytron, and Richard J. LeBlanc Jr., Crafting a Compiler, Addison-Wesley, 2010; Andrew W. Appel with Jens Palsberg, Modern Compiler Implementation in Java, 2nd ed., Cambridge University Press, 2004; Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman, Compilers: Principles, Techniques, and Tools, 2nd ed., Pearson, 2006.

Equivalent UW Courses

ECE 351 and CS 444/644 are Waterloo’s two compiler courses, and they cover largely the same pipeline: lexing, parsing, AST construction, semantic analysis, intermediate representation, optimization, and (in CS 444/644 only) code generation. CS 444 and CS 644 are a single cross-listed course (same lectures and project, just an undergraduate/graduate number split), famous because its group project compiles a subset of Java all the way down to x86 assembly. A student who has done ECE 351 has seen the front end and a real chunk of the middle end, but the course stops short of full native code generation.

What This Course Adds Beyond the Equivalents

ECE 351’s project target is a subset of VHDL: the labs build a circuit simulator and circuit synthesizer, so the course doubles as an exercise in hardware-description-language semantics and is a natural pairing with the rest of the computer-engineering curriculum. The implementation language is Java, and the course spends proportionally more time on the front end (lexing, recursive descent, LL(1) proofs, term rewriting, visitor patterns) than CS 444/644 does. What it omits relative to CS 444/644 is native code generation — there is no x86 back end — and the deeper dataflow and SSA machinery the CS course develops for its back-end work.

Topic Summary

Course Introduction

Motivation for studying compilers (they are the canonical example of structured program manipulation) and the high-level pipeline: source, tokens, parse tree, AST, IR, target. Sets the stage for each later topic.

Lexical Analysis

Tokens as regular languages, hand-written lexers vs generators, and the relationship between regular expressions and finite automata. Students write a lexer for the project’s VHDL subset in the first lab.
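A hand-written lexer of the kind the first lab calls for can be sketched in the course's Java as a single scan loop. The token set below (identifiers and keyword-like words, parentheses, and a `<=` signal-assignment operator) is purely illustrative, not the lab's actual VHDL subset:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal hand-written lexer sketch: one pass over the input, longest
// match per token. Keywords come out as plain identifier tokens here.
final class Lexer {
    static List<String> lex(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            if (Character.isLetter(c)) {          // identifier or keyword
                int start = i;
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                tokens.add(src.substring(start, i));
            } else if (c == '(' || c == ')') {
                tokens.add(String.valueOf(c)); i++;
            } else if (c == '<' && i + 1 < src.length() && src.charAt(i + 1) == '=') {
                tokens.add("<="); i += 2;          // signal-assignment operator
            } else {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
        return tokens;
    }
}
```

Each branch corresponds to one token class of the regular language; a real lexer would also record token kinds and source positions.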

Program Understanding and Transformations

Tree walking with the visitor pattern, AST transformations, and program equivalence checking. This is where the course first forces students to think of programs as data structures that can be rewritten mechanically.
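The visitor-pattern idea can be sketched as a rewriting visitor over a toy expression AST; the class names (`Var`, `Not`, `Simplify`) are hypothetical, not the lab's actual classes. The visitor walks the tree bottom-up and mechanically collapses `Not(Not(e))` to `e`:

```java
// Toy AST with the classic double-dispatch visitor hook.
interface Expr { <T> T accept(Visitor<T> v); }

final class Var implements Expr {
    final String name;
    Var(String name) { this.name = name; }
    public <T> T accept(Visitor<T> v) { return v.visit(this); }
}

final class Not implements Expr {
    final Expr operand;
    Not(Expr operand) { this.operand = operand; }
    public <T> T accept(Visitor<T> v) { return v.visit(this); }
}

interface Visitor<T> { T visit(Var v); T visit(Not n); }

// Rewriting visitor: rebuilds each subtree, then applies the rewrite
// rule Not(Not(e)) -> e wherever it matches.
final class Simplify implements Visitor<Expr> {
    public Expr visit(Var v) { return v; }
    public Expr visit(Not n) {
        Expr inner = n.operand.accept(this);
        return (inner instanceof Not) ? ((Not) inner).operand : new Not(inner);
    }
}
```

Because the visitor returns a new tree rather than mutating in place, the same shape extends naturally to the course's other term-rewriting passes.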

Regular Expressions to DFAs

Thompson construction from regex to NFA, subset construction from NFA to DFA, and DFA minimization. Establishes that lexing can be mechanised completely and efficiently in \( O(n) \) time.
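The subset construction can be sketched directly from its definition: each DFA state is the set of NFA states the machine could simultaneously be in. This minimal Java version takes the NFA as a transition map and, for brevity, assumes there are no \( \varepsilon \)-moves (a real pipeline would take \( \varepsilon \)-closures after the Thompson construction):

```java
import java.util.*;

// Subset construction sketch: worklist of DFA states (= sets of NFA
// states); for each, union the NFA moves of its members per symbol.
final class SubsetConstruction {
    // nfa.get(state).get(symbol) = set of successor NFA states
    static Map<Set<Integer>, Map<Character, Set<Integer>>> toDfa(
            Map<Integer, Map<Character, Set<Integer>>> nfa, int start) {
        Map<Set<Integer>, Map<Character, Set<Integer>>> dfa = new HashMap<>();
        Deque<Set<Integer>> work = new ArrayDeque<>();
        work.push(Set.of(start));
        while (!work.isEmpty()) {
            Set<Integer> current = work.pop();
            if (dfa.containsKey(current)) continue;   // already expanded
            Map<Character, Set<Integer>> moves = new HashMap<>();
            for (int s : current) {
                for (var e : nfa.getOrDefault(s, Map.of()).entrySet()) {
                    moves.computeIfAbsent(e.getKey(), k -> new HashSet<>())
                         .addAll(e.getValue());
                }
            }
            dfa.put(current, moves);
            moves.values().forEach(work::push);       // new candidate states
        }
        return dfa;
    }
}
```

Only reachable subsets are ever materialised, which is why the worst-case exponential blow-up is rare in practice; once the DFA exists, lexing is a single table lookup per character, giving the \( O(n) \) bound.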

LL(1) Parsing and Proofs

Top-down recursive-descent parsing, FIRST and FOLLOW sets, the LL(1) predictive-parsing condition, and proofs that a given grammar is (or is not) LL(1). Lab work introduces Parboiled as a PEG-style parser generator.
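The FIRST-set computation behind the LL(1) condition is a small fixed-point iteration. A sketch over a toy grammar encoding (strings with entries in the production map are nonterminals, everything else is a terminal, and "eps" marks the empty string; all of that is this example's convention, not course notation):

```java
import java.util.*;

// FIRST-set sketch: repeatedly propagate terminals through the
// productions until no set changes (a fixed point).
final class FirstSets {
    static Map<String, Set<String>> compute(Map<String, List<List<String>>> g) {
        Map<String, Set<String>> first = new HashMap<>();
        g.keySet().forEach(n -> first.put(n, new HashSet<>()));
        boolean changed = true;
        while (changed) {
            changed = false;
            for (var entry : g.entrySet()) {
                Set<String> f = first.get(entry.getKey());
                for (List<String> alt : entry.getValue()) {
                    boolean allNullable = true;
                    for (String sym : alt) {
                        if (!g.containsKey(sym)) {            // terminal or "eps"
                            if (!sym.equals("eps")) { changed |= f.add(sym); allNullable = false; }
                            break;
                        }
                        Set<String> fs = first.get(sym);      // nonterminal
                        for (String t : fs)
                            if (!t.equals("eps")) changed |= f.add(t);
                        if (!fs.contains("eps")) { allNullable = false; break; }
                    }
                    if (allNullable) changed |= f.add("eps"); // whole alternative nullable
                }
            }
        }
        return first;
    }
}
```

With FOLLOW computed the same way, the LL(1) check is that no nonterminal has two alternatives whose predict sets overlap, which is exactly what the course's grammar proofs establish by hand.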

Grammar Design

How to structure a grammar to be parseable and unambiguous, left-recursion elimination, left-factoring, and operator precedence via grammar levels. Shapes the parser front end of the project.
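As a standard textbook illustration (not the project's grammar), left-recursion elimination rewrites the classic left-recursive expression grammar, which would send a recursive-descent parser into infinite recursion, into an LL(1)-parseable form:

```
E → E + T | T           (left-recursive: E calls E before consuming input)

becomes

E  → T E'
E' → + T E' | ε         (right-recursive: each call consumes a token first)
```

Stacking one such level per precedence tier (expression, term, factor) is how operator precedence is encoded directly in the grammar.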

Optimization

Local and peephole optimizations, common subexpression elimination, constant folding and propagation, and an introduction to dataflow analysis. Implemented as tree transformations over the project IR.
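Constant folding as a tree transformation can be sketched over a hypothetical three-node arithmetic IR; the record names are illustrative, not the course's IR classes:

```java
// Toy IR: numbers, variables, and addition.
sealed interface Node permits Num, Var, Add {}
record Num(int value) implements Node {}
record Var(String name) implements Node {}
record Add(Node left, Node right) implements Node {}

final class Folder {
    // Bottom-up rewrite: fold children first, then fold this node if
    // both operands turned out to be constants.
    static Node fold(Node n) {
        if (n instanceof Add a) {
            Node l = fold(a.left()), r = fold(a.right());
            if (l instanceof Num x && r instanceof Num y)
                return new Num(x.value() + y.value());
            return new Add(l, r);   // partial fold: keep the addition
        }
        return n;                   // leaves are already folded
    }
}
```

Constant propagation is the natural companion pass: substitute known `Var` values first, then this fold collapses the resulting constant subtrees.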

Register Allocation

Live ranges, interference graphs, and graph colouring as the standard formulation of register assignment. Gives students a taste of how the back end chooses machine resources.
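The graph-colouring formulation can be sketched greedily: visit each variable and give it the lowest-numbered register that no interfering, already-coloured neighbour holds. This ignores the simplify/spill ordering a real Chaitin-style allocator uses, and assumes every variable gets a register:

```java
import java.util.*;

// Greedy colouring sketch over an interference graph given as an
// adjacency map: variable -> set of variables live at the same time.
final class RegAlloc {
    static Map<String, Integer> colour(Map<String, Set<String>> interference) {
        Map<String, Integer> reg = new LinkedHashMap<>();
        for (String v : interference.keySet()) {
            Set<Integer> taken = new HashSet<>();
            for (String n : interference.get(v)) {
                Integer c = reg.get(n);          // neighbour already coloured?
                if (c != null) taken.add(c);
            }
            int c = 0;
            while (taken.contains(c)) c++;       // lowest free register number
            reg.put(v, c);
        }
        return reg;
    }
}
```

Variables whose live ranges never overlap can share a register, which is exactly what the colouring exploits; when the colour count exceeds the machine's registers, a real allocator spills a variable to memory and retries.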

Garbage Collection

Reachability, mark-and-sweep, copying, and generational collectors; trade-offs between throughput, latency, and memory overhead. Ties runtime support back to language design.
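Mark-and-sweep can be sketched over an explicit object graph: mark everything reachable from the roots, then sweep up whatever stayed unmarked. The string-keyed heap model below is an illustration, not a real runtime's heap:

```java
import java.util.*;

// Mark-and-sweep sketch. The heap is an adjacency map from object id
// to the ids it references; roots are the ids reachable from the stack
// and globals.
final class MarkSweep {
    static Set<String> collect(Map<String, List<String>> heap, Set<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {                 // mark: graph traversal
            String obj = work.pop();
            if (marked.add(obj))
                work.addAll(heap.getOrDefault(obj, List.of()));
        }
        Set<String> garbage = new HashSet<>(heap.keySet());
        garbage.removeAll(marked);                // sweep: unmarked = garbage
        return garbage;
    }
}
```

Note that reachability, not reference counts, is the criterion: a cycle of objects that no root can reach is collected, which is the classic case reference counting alone misses.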
