ECE 320: Computer Architecture
Hiren Patel
Estimated study time: 4 minutes
Sources and References
Equivalent UW courses — CS 251 (Computer Organization and Design), CS 350 (Operating Systems, for the memory hierarchy and virtual memory overlap)
Primary textbook — J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, 6th Edition, Morgan Kaufmann, 2017.
Supplementary references — Andrew Waterman and Krste Asanović (eds.), The RISC-V Instruction Set Manual, Volume I: Unprivileged ISA; Patterson and Hennessy, Computer Organization and Design RISC-V Edition, 2nd ed., Morgan Kaufmann, 2020.
Equivalent UW Courses
CS 251 is the closest Math-faculty analogue, covering ISA, single-cycle and pipelined datapaths, and an introductory treatment of caches and virtual memory. CS 350 revisits memory hierarchy, page tables, and translation from the operating-systems side, which overlaps meaningfully with the cache and virtual-memory segment of ECE 320. Unlike either CS course, ECE 320 is taught from the Hennessy and Patterson Quantitative Approach volume rather than the introductory Computer Organization and Design text, so it leans toward the upper-year, performance-oriented material.
What This Course Adds Beyond the Equivalents
ECE 320 spends much more time on dynamically scheduled superscalar processors, multicore coherence protocols, memory consistency models, multithreading, and GPU or vector architectures than CS 251 does. These topics are treated only briefly, if at all, in CS 251, and CS 350 does not touch them. The ECE course also uses RISC-V throughout and includes a hardware-focused lab sequence with pipeline deliverables. What it omits relative to the CS pair is the operating-systems perspective on virtual memory (process isolation, demand paging policy), which stays in CS 350, and the assembly-programming emphasis of CS 251.
Topic Summary
Performance and Quantitative Design
Amdahl’s law, CPU time decomposition into instruction count, CPI, and clock cycle time, and the use of benchmarks to compare designs. Speedup from an enhancement applied to a fraction \(f\) of execution is
\[ S = \frac{1}{(1-f) + f/k} \]
where \(k\) is the local speedup of the enhanced component.
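The speedup formula above can be checked with a short worked example (the 40%/10x figures below are illustrative, not from the course):

```python
def amdahl_speedup(f, k):
    """Overall speedup when a fraction f of execution time is
    accelerated by a local factor of k (Amdahl's law)."""
    return 1.0 / ((1 - f) + f / k)

# Example: 40% of execution time is sped up 10x.
# S = 1 / (0.6 + 0.4/10) = 1 / 0.64 = 1.5625
print(amdahl_speedup(0.4, 10))  # 1.5625
```

Note how the unenhanced fraction \(1-f\) bounds the overall speedup: even as \(k \to \infty\), \(S\) cannot exceed \(1/(1-f)\).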
Instruction Set Architectures and RISC-V
General ISA taxonomy (register-memory vs load-store, operand formats, addressing modes) followed by the RISC-V base integer ISA used throughout the labs. Emphasis on encoding regularity and why it simplifies decode hardware.
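The payoff of encoding regularity can be seen by decoding an R-type instruction: every field sits at a fixed bit position, so decode is a handful of shifts and masks rather than a case analysis. A minimal sketch:

```python
def decode_rtype(word):
    """Split a 32-bit RISC-V R-type instruction into its fixed fields.
    The field positions are the same for every R-type instruction,
    which is the regularity that keeps decode hardware simple."""
    return {
        "opcode": word & 0x7F,
        "rd":     (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1":    (word >> 15) & 0x1F,
        "rs2":    (word >> 20) & 0x1F,
        "funct7": (word >> 25) & 0x7F,
    }

# 0x002081B3 is the encoding of `add x3, x1, x2`
fields = decode_rtype(0x002081B3)
print(fields)  # opcode=0x33, rd=3, rs1=1, rs2=2, funct3=0, funct7=0
```

Because rs1 and rs2 occupy the same bits in R-, S-, and B-type formats, the register file can be read before the instruction type is even fully known.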
Single-cycle and Pipelined Datapaths
Construction of a single-cycle RISC-V datapath, then the classic five-stage pipeline (IF, ID, EX, MEM, WB). Structural, data, and control hazards; forwarding paths; stalls and branch prediction. Interrupts and precise exceptions are treated alongside the basic pipeline.
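The effect of hazards on throughput is usually quantified as an effective CPI: the ideal CPI of 1 plus each stall's frequency-weighted penalty. A small sketch with illustrative numbers (the frequencies and penalties below are assumptions, not course data):

```python
def effective_cpi(stall_events):
    """stall_events: list of (frequency per instruction, penalty in cycles).
    Effective CPI = ideal pipelined CPI of 1.0 plus stall contributions."""
    return 1.0 + sum(freq * penalty for freq, penalty in stall_events)

# Assume 5% of instructions incur a 1-cycle load-use stall, and 20% are
# branches of which 10% are mispredicted at a 2-cycle penalty.
cpi = effective_cpi([(0.05, 1), (0.20 * 0.10, 2)])
print(cpi)  # 1.09
```

Forwarding and better branch prediction show up in this model as smaller frequencies or penalties, which is why they dominate the pipeline discussion.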
Memory Hierarchy, Caches, and Virtual Memory
Direct-mapped, set-associative, and fully associative caches; write-through vs write-back and write-allocate policies; AMAT analysis. Virtual memory with page tables, TLBs, and advanced address-translation topics drawn from Appendix L of Hennessy and Patterson.
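AMAT composes naturally across hierarchy levels, since a level's miss penalty is just the AMAT of the level below it. A worked two-level example (the latencies and miss rates are illustrative assumptions):

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time: hit time plus the miss-rate-weighted
    miss penalty. miss_penalty may itself be the AMAT of the next level."""
    return hit_time + miss_rate * miss_penalty

# Assume: L1 hits in 1 cycle and misses 5% of the time; L2 hits in
# 10 cycles and misses 20% of the time to a 100-cycle main memory.
l2_amat = amat(10, 0.20, 100)   # 10 + 0.20 * 100 = 30 cycles
l1_amat = amat(1, 0.05, l2_amat)  # 1 + 0.05 * 30 = 2.5 cycles
print(l1_amat)  # 2.5
```

The same formula drives most cache design trade-offs: associativity lowers the miss rate but can raise the hit time, and the product decides which effect wins.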
Dynamic Scheduling and Instruction-Level Parallelism
Out-of-order execution using scoreboarding and Tomasulo’s algorithm with register renaming and reservation stations. Speculation through reorder buffers and branch prediction to expose more ILP.
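The core idea of renaming can be sketched without the rest of Tomasulo's machinery: each destination register gets a fresh tag (standing in for a reservation-station or physical-register name), which removes WAR and WAW hazards while preserving true RAW dependences. A minimal illustrative sketch:

```python
from itertools import count

def rename(instrs):
    """Rename architectural destination registers to fresh tags, as
    Tomasulo-style renaming does with reservation-station tags.
    Each instruction is (dest, src1, src2); a source reads the most
    recent tag bound to its architectural register."""
    fresh = count()
    alias = {}                      # architectural reg -> current tag
    out = []
    for dest, src1, src2 in instrs:
        s1 = alias.get(src1, src1)  # not-yet-written regs keep their name
        s2 = alias.get(src2, src2)
        alias[dest] = f"t{next(fresh)}"
        out.append((alias[dest], s1, s2))
    return out

# The WAW hazard on x1 and the WAR hazard on x2 disappear after renaming,
# while the RAW dependence (second instruction reads x1) is kept as t0:
print(rename([("x1", "x2", "x3"),
              ("x2", "x1", "x4"),
              ("x1", "x5", "x6")]))
# [('t0', 'x2', 'x3'), ('t1', 't0', 'x4'), ('t2', 'x5', 'x6')]
```

With name dependences gone, only true data flow constrains issue order, which is what lets the reorder buffer and reservation stations expose ILP.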
Multicore Synchronization, Coherence, and Consistency
Snooping and directory cache-coherence protocols (MSI, MESI), hardware primitives for synchronization such as load-linked and store-conditional, and the distinction between coherence and memory consistency. Sequential consistency is contrasted with relaxed models and their programmer-visible reordering.
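The snooping MSI protocol is essentially a per-line state machine driven by local accesses and snooped bus transactions. A minimal sketch of the transition table (data movement and bus arbitration are omitted; this is an illustration, not the full protocol):

```python
# States: M (Modified), S (Shared), I (Invalid).
# Events: local_read / local_write from this core,
#         bus_read / bus_write snooped from another core.
MSI = {
    ("I", "local_read"):  "S",  # read miss: fetch line, enter Shared
    ("I", "local_write"): "M",  # write miss: fetch exclusively
    ("S", "local_read"):  "S",
    ("S", "local_write"): "M",  # upgrade: other sharers get invalidated
    ("S", "bus_write"):   "I",  # another core wrote: invalidate our copy
    ("M", "local_read"):  "M",
    ("M", "local_write"): "M",
    ("M", "bus_read"):    "S",  # supply dirty data, downgrade to Shared
    ("M", "bus_write"):   "I",  # supply dirty data, then invalidate
}

def step(state, event):
    # Events not in the table (e.g. bus_read while Invalid) are ignored.
    return MSI.get((state, event), state)

state = "I"
for ev in ["local_read", "local_write", "bus_read"]:
    state = step(state, ev)
print(state)  # "S": read miss -> S, write upgrade -> M, snooped read -> S
```

MESI adds an Exclusive state so a private read-modify sequence avoids the upgrade bus transaction; the table-driven structure stays the same.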
Multithreading, GPUs, and Vector Processors
Fine-grained, coarse-grained, and simultaneous multithreading. SIMD and vector execution, the SIMT execution model used on modern GPUs, and how data-level parallelism is exposed through wide register files and masked lanes.
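Masked lanes can be illustrated in a few lines: every lane executes the operation, but only lanes whose predicate bit is set commit a result, which is also how SIMT hardware handles divergent branches. A minimal sketch:

```python
def masked_vector_add(a, b, mask):
    """SIMD-style lane-wise add under a predicate mask: lanes where the
    mask is False keep their old value, as with masked vector
    instructions or inactive SIMT threads on a GPU."""
    return [x + y if m else x for x, y, m in zip(a, b, mask)]

# Only lanes 0 and 2 are active; lanes 1 and 3 are masked off.
print(masked_vector_add([1, 2, 3, 4], [10, 10, 10, 10],
                        [True, False, True, False]))
# [11, 2, 13, 4]
```

The cost of divergence follows directly: all lanes step through both sides of a branch, with the mask selecting which side each lane actually commits.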