CMPUT429/CMPE382 Winter 2001

2/6/02




Table of Contents

CMPUT429/CMPE382 Winter 2001

Recap: Who Cares About the Memory Hierarchy?

Levels of the Memory Hierarchy

The Principle of Locality

Memory Hierarchy: Terminology

Cache Measures

Generations of Microprocessors

Simplest Cache: Direct Mapped

1 KB Direct Mapped Cache, 32B blocks

Two-way Set Associative Cache

Disadvantage of Set Associative Cache

Review: Cache performance

Impact on Performance

Example: Harvard Architecture

4 Questions for Memory Hierarchy

Q1: Where can a block be placed in the upper level?

Q2: How is a block found if it is in the upper level?

Q3: Which block should be replaced on a miss?

Q4: What happens on a write?

Write Buffer for Write Through

Review #4/4: TLB, Virtual Memory

Review: Improving Cache Performance

Reducing Misses

3Cs Absolute Miss Rate (SPEC92)

2:1 Cache Rule

3Cs Relative Miss Rate

How Can We Reduce Misses?

1. Reduce Misses via Larger Block Size

2. Reduce Misses via Higher Associativity

Example: Avg. Memory Access Time vs. Miss Rate

Example: Avg. Memory Access Time vs. Miss Rate (cont.)

3. Reducing Misses via a “Victim Cache”

4. Reducing Misses via “Pseudo-Associativity”

5. Reducing Misses by Hardware Prefetching of Instructions & Data

6. Reducing Misses by Software Prefetching Data

7. Reducing Misses by Compiler Optimizations

Merging Arrays Example

Loop Interchange Example

Loop Fusion Example

Blocking Example

Blocking Example (cont.)

Reducing Conflict Misses by Blocking

Summary of Compiler Optimizations to Reduce Cache Misses (by hand)

Summary: Miss Rate Reduction

Review: Improving Cache Performance

Write Policy: Write-Through vs Write-Back

Write Policy 2: Write Allocate vs Non-Allocate (What happens on write-miss)

1. Reducing Miss Penalty: Read Priority over Write on Miss

1. Reducing Miss Penalty: Read Priority over Write on Miss (cont.)

2. Reduce Miss Penalty: Early Restart and Critical Word First

3. Reduce Miss Penalty: Non-blocking Caches to reduce stalls on misses

Value of Hit Under Miss for SPEC

4. Reduce Miss Penalty: Add a Second-Level Cache

Comparing Local and Global Miss Rates

Reducing Misses: Which apply to L2 Cache?

L2 cache block size & A.M.A.T.

Reducing Miss Penalty Summary

What is the Impact of What You’ve Learned About Caches?

Cache Optimization Summary

A Modern Memory Hierarchy

Summary: Caches

Summary: The Cache Design Space

IBM POWER4 Memory Hierarchy

Intel Itanium Processor

Future Intel McKinley Processor

Author: Randy H. Katz

Email: amaral@cs.ualberta.ca

Home Page: www.cs.ualberta.ca
