EE-748: Advanced Topics in Computer Architecture
Semester: Jan - Apr 2014
Instructor: Virendra Singh
Class Timings:
Office Hours:
Syllabus:
Overview
Superscalar and VLIW architectures. Limits of instruction-level parallelism (ILP).
Simultaneous multithreaded (SMT) architectures. Performance enhancement through
branch prediction and value prediction, BulkSMT, thread-level speculation.
Runahead execution, proactive instruction fetch, multi-core architectures,
data marshaling for multi-core architectures, power-constrained CMPs,
heterogeneous core design, Core Fusion, transactional memories.
Performance evaluation of complex microarchitectures. On-chip interconnects
(Network-on-Chip). Architectural vulnerabilities and reliable architectures.
Patchable design. Secure architectures. Energy-efficient architectures.
Power management. Cache design, energy-efficient cache partitioning,
fast thread migration, thread throttling.
References:
Current literature (papers from ISCA, Micro, HPCA, ICCD, DSN, IEEE Trans. on
Computers, and IEEE Computer Architecture Letters)
Must-read papers (before coming to the first class)
List of representative papers (To be discussed in the class)
21. A. Tumeo et al., "Designing next-generation massively multithreaded architectures for irregular applications", IEEE Computer, Aug 2012
22. Emily Blem et al., "Power struggles: Revisiting the RISC vs. CISC debate on contemporary ARM and x86 architectures", Proc. of HPCA 2013
23. Manish Arora et al., "Redefining the role of the CPU in the era of CPU-GPU integration", IEEE Micro magazine, 2012
24. Onur Kayiran et al., "Neither more nor less: Optimizing thread-level parallelism for GPGPUs", Proc. of PACT 2013
25. Adwait Jog et al., "Orchestrated scheduling and prefetching for GPGPUs", Proc. of ISCA 2013
26. Ankit Sethia et al., "A customized processor for energy efficient scientific computing", IEEE Trans. on Computers, 2012
27. D. Lustig and M. Martonosi, "Reducing GPU offload latency via fine-grained CPU-GPU synchronization", Proc. of HPCA 2013
28. Y. Park et al., "Libra: Tailoring SIMD execution using heterogeneous hardware and dynamic configurability", Proc. of Micro 2012
29. Jason Power et al., "Heterogeneous system coherence for integrated CPU-GPU systems", IEEE Micro 2013
30. D. Zier and B. Lee, "Performance evaluation of dynamic speculative multithreading with the Cascadia architecture", IEEE Trans. on PDS, 2010
31. A. Basu et al., "Efficient virtual memory for big memory servers", Proc. of ISCA 2013
32. S. Pai et al., "Improving GPGPU concurrency with elastic kernels", Proc. of ASPLOS 2013
33. J. Lee et al., "CPU-GPU collaboration for data parallel kernels on heterogeneous systems", Proc. of PACT 2013
Prerequisite: CS-683: Advanced Computer Architecture, or EE-739: Processor Design.
Instructor's consent is mandatory.
Evaluation: Mid-sem exam (10%), quizzes (5%), end-sem exam (15%),
course project (35%), and class participation (35%)
Class Schedule:
13 Jan: Course Introduction
15 Jan: High Performance Computer Architecture Review - 1
22 Jan: High Performance Computer Architecture Review - 2
24 Jan: Future of microprocessors (Paper: The future of microprocessors, Communications of the ACM, 2011)