David Brooks
Haley Family Professor of Computer Science
Maxwell Dworkin 141
33 Oxford Street
Cambridge MA 02138
Phone: 617-495-3989
Fax: 617-496-6404
E-mail: dbrooks@eecs.harvard.edu
Syllabus
Meeting time: Friday 9:00–11:45 AM, MD 119
Introduction
The class reviews the fundamental structures of modern microprocessor
and computer system architecture. Tentative topics include computer
organization, instruction set design, memory system design, pipelining,
and other techniques for exploiting parallelism. We will also cover
system-level topics such as storage subsystems and the basics of
multiprocessor systems. The class focuses on quantitative evaluation
of design alternatives against design metrics such as performance and
power dissipation.
Prerequisites
CS 141 (Computing Hardware) or equivalent, C Programming
Textbook
“Computer Architecture: A Quantitative Approach,” Third Edition,
John L. Hennessy and David A. Patterson, ISBN 1-55860-596-7
Course Readings
Lecture 1: Introduction to Computer Architecture
Lecture 2: CPU Performance and Metrics
Lecture 3: Instruction Set Architecture
Readings:
Ruby B. Lee, "Subword Parallelism with MAX-2," IEEE Micro, 16(4), August 1996, pp. 51-59.
Class Notes
Homework 1
Lecture 4: Implementation and Pipelining
Lecture 5: Exceptions, Multi-cycle Ops, Dynamic Scheduling
Lecture 6: Scoreboarding Example, Tomasulo's Algorithm
Lecture 7: Dynamic Branch Prediction
Readings:
Tse-Yu Yeh and Yale N. Patt, "A Comparison of Dynamic Branch Predictors that use Two Levels of Branch History," 20th International Symposium on Computer Architecture, May 1993.
Class Notes
Lecture 8: Multiple Issue and Speculation
Class Notes
Readings:
G. S. Sohi and S. Vajapeyam, "Instruction Issue Logic for High-performance, Interruptable Pipelined Processors," International Symposium on Computer Architecture, 1987.
J. E. Smith and A. Pleszkun, "Implementing Precise Interrupts in Pipelined Processors," IEEE Transactions on Computers, Volume 37, Issue 5 (May 1988).
Homework 2
Lecture 9: Limits of ILP, Case Studies
Class Notes
Readings:
David W. Wall, "Limits of instruction-level parallelism," Architectural Support for Programming Languages and Operating Systems (ASPLOS), 1991.
Subbarao Palacharla, Norman P. Jouppi, and James E. Smith, "Complexity-Effective Superscalar Processors," 24th International Symposium on Computer Architecture (ISCA-24), June 1997.
Eric Rotenberg, Steve Bennett, and J. E. Smith, "Trace Cache: A Low Latency Approach to High Bandwidth Instruction Fetching," 29th International Symposium on Microarchitecture (MICRO-29), Dec 1996.
Lecture 10: Static Scheduling, Loop Unrolling, and Software Pipelining
Homework 3
Lecture 11: Software Pipelining and Global Scheduling
Sample Midterm from Fall 2017
Lecture 12: Hardware Assisted Software ILP and IA64/Itanium Case Study
Lecture 14: Introduction to Caches
Lecture 15: More on Caches
Homework 4
Lecture 16: More on Caches
Lecture 17: Main Memory
Lecture 18: Virtual Memory
Lecture 19: Multiprocessors
Homework 5
Lecture 20: More Multiprocessors
Class Notes
Readings:
D. M. Tullsen, S. J. Eggers, and H. M. Levy, "Simultaneous Multithreading: Maximizing On-Chip Parallelism," 22nd Annual International Symposium on Computer Architecture, June 1995.
L. Barroso, K. Gharachorloo, R. McNamara, A. Nowatzyk, S. Qadeer, B. Sano, S. Smith, R. Stets, and B. Verghese, "Piranha: A Scalable Architecture Based on Single-Chip Multiprocessing," Proceedings of the 27th Annual International Symposium on Computer Architecture (ISCA'00), June 2000.
Lecture 21: Multithreading and I/O
Lecture 22: More I/O
Lecture 23: Clusters and Wrapup
Class Notes
Readings:
L. Barroso, J. Dean, and U. Holzle, "Web search for a planet: The Google Cluster Architecture," IEEE Micro, 23(2), March-April 2003, pp. 22-28.