SECR2033 Computer Organization & Architecture
SECR2033
Computer Organization
and Architecture
Lecture slides prepared by “Computer Organization and Architecture”, 9/e, by William Stallings, 2013.
Module 6 2
Memory
Objectives:
q To master the concepts of hierarchical memory
organization.
q To understand how each level of memory
contributes to system performance, and how
the performance is measured.
q To master the concepts behind cache memory
and understands the basic concept of virtual
memory.
m @ May 2021 1
SECR2033 Computer Organization & Architecture 3
Module 6
Memory
6.1 Introduction
6.2 Main Memory
6.3 Cache Memory
6.4 Virtual Memory
6.5 Summary
Module 6 4
Memory
6.1 Introduction q Overview
q Type of Memory
6.2 Main Memory q The Memory Hierarchy
6.3 Cache Memory
6.4 Virtual Memory
6.6 Summary
m @ May 2021 2
SECR2033 Computer Organization & Architecture
6
Overview
n Most computers are built using the Von Neumann model, which
is centered on memory.
n The programs that perform the processing are stored in
___________.
n The memory unit that Only programs and
communicates directly with the data currently needed
by the processor reside
CPU is called main memory*.
in main memory.
n Devices that provide backup storage
are called auxiliary memory.*
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.233. 5
* Mano, Morris M. (1993). Computer System Architecture (3rd Edition). p.445.
6
n Internal memory is often equated with main memory, but there
are other forms of internal memory:
q The processor requires its own local memory à _______.
q The control unit portion of the processor may also require
its own internal memory à _______________.
q Cache is another form of internal memory.
n The complex subject of computer memory is made more
manageable if we classify memory systems according to their
key characteristics (see next table).
William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.121. 6
m @ May 2021 3
SECR2033 Computer Organization & Architecture
6
Table: Key Characteristics of Computer Memory Systems.
William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.121. 7
6
Type of Memory
Why are there so many
different types of
computer memory?
n The answer à new technologies continue to be introduced in
an attempt to match the improvements in CPU design.
n Even though a large number of memory technologies exist,
there are only two basic types of memory:
RAM (Random Access Memory) ROM (Read-Only Memory)
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.233 8
m @ May 2021 4
SECR2033 Computer Organization & Architecture
(a) RAM (Random Access Memory) 6
n RAM is also the “main memory” or primary memory.
n Used to store programs and data that the The organization
computer needs when executing programs. of RAM is a key
design issue à
However, RAM is volatile, and loses this refers to the
information once the power is turned off. physical
n Two general types of chips used to build the arrangement of
bulk of RAM memory in today’s computers: bits to form
words*.
q SRAM (Static Random Access Memory).
q _______________________________.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.234 9
* William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.124
(b) ROM (Read-Only Memory) 6
n Stores critical information necessary to operate the system,
such as the program necessary to boot the computer.
• ROM is not volatile and always retains its data.
• Maintain information when the power is shut off.
n Some different types of ROM:
q PROM (Programmable ROM)
q EPROM (Erasable PROM)
q ____________________________________
q Flash memory à essentially EEPROM with the added benefit .
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.234 10
m @ May 2021 5
SECR2033 Computer Organization & Architecture
6
The Memory Hierarchy
n Generally speaking, faster memory is more expensive than
slower memory.
n To provide the best performance at the lowest cost, memory
is organized in a hierarchical fashion.
n Small, fast storage elements are kept in the CPU.
à Larger, slower main memory is accessed through the
data bus.
à Larger, (almost) permanent storage in the form of disk
and tape drives is still further from the CPU.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.235 11
Closest to the 6Memories closer
processor.
to the top tend
to be smaller in
size, better
performance.
Figure: The memory hierarchy. 12
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.236
m @ May 2021 6
SECR2033 Computer Organization & Architecture 6Employ in
SRAM** semiconductor
DRAM ** technology
memory*
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.236 13
* William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.127
** Patterson, D.A. and Hennessy, J.L. (2014). Computer Organization and Design: The Hardware/Software Interface (5th Edition). United States: Elsevier, p.378
n To access a particular piece 6
of data, the CPU first sends a
request to its nearest CPU
memory, usually _________.
q If the data is not in cache,
then main memory is
queried.
q If the data is not in main
memory, then the request
goes to disk.
n Once the data is located, then the data, and a number of its
nearby data elements are fetched into cache memory.
14
m @ May 2021 7
SECR2033 Computer Organization & Architecture
Table: The following terminology is used when 6
referring to the memory hierarchy.
Terminology Definition
Hit The data is found at a given memory level.
Miss
______ Rate The data is not found at a given memory level.
______ Rate The percentage (%) of memory accesses found in a
given level of memory.
Hit Time
The percentage (%) of memory accesses not found
Miss Penalty
in a given level of memory à (1 – hit rate)
The time required to access data at a given memory
level.
The time required to process a miss, including the
time that it takes to replace a block of memory plus
the time it takes to deliver the data to the processor.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.235 15
Locality of Reference 6
n An entire blocks of data is copied after a hit because the
principle of locality tells us that once a byte is accessed, it is
likely that a nearby data element will be needed soon.
n There are three basic forms of locality:
1) Temporal locality—Recently accessed items tend to be
accessed again in the near future.
2) Spatial locality—Accesses tend to be clustered in the
address space.
3) Sequential locality—Instructions tend to be accessed
sequentially.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.237 16
m @ May 2021 8
SECR2033 Computer Organization & Architecture
Review Questions 6
6.1.1. What are the advantages of using DRAM for main
memory?
6.1.2 Explain the concept of a memory hierarchy.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.266 17
Module 6 18
Memory
6.1 Introduction q Overview
q Memory Organization
6.2 Main Memory q Memory Capacity
q Memory Interleaving
6.3 Cache Memory
6.4 Virtual Memory
6.6 Summary
m @ May 2021 9
SECR2033 Computer Organization & Architecture
RAM (Read-Access Memory) 6
ROM (Read-Only Memory)
Overview
n The _________________ is the central storage unit in a
computer system.
n The principle technology used for the main memory is based on
semiconductor integrated circuits (RAM) either static or
dynamic operating modes.
n Most of the main memory in a general-purpose computer is
made up of RAM integrated circuit, but a portion of the memory
may be constructed with ROM.
Mano, Morris M. (1993). Computer System Architecture (3rd Edition). p.448 19
6
Memory Organization
n We can envision memory as a matrix of bits.
n Each row, implemented by a register, has a length typically
equivalent to the word size of the machine.
n Each register (more commonly referred to as a memory
location) has a unique address; memory addresses usually
start at zero (0) and progress upward.
N 8-Bit Memory Locations M 16-Bit Memory Locations
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.153 20
m @ May 2021 10
SECR2033 Computer Organization & Architecture
Memory Location 6
n Each part of memory has a separate Address : Memory Words:
memory location, which can be
referred to using a memory address. 000:
001:
n Number of bits for an address uniquely 010:
access to a memory location.
:
2n – 1:
Where n is the bit
of the address
21
6
Memory Capacity
n Memory capacity usually measured in bits:
Total no. of memory locations (words) × size of memory word
n Examples: 3 words (locations/rows)
(a) (b)
Size = 3 words x 8 bits Size = 3 words x 16 bits
= 24 bits = 48 bits
22
m @ May 2021 11
SECR2033 Computer Organization & Architecture
1 Word = 8 bits 6
Memory
Example 1:
Main memory is divided into blocks. 1 Block
The memory word is 8 bit and the = 8 words
size of a block is 8 words. = 23
(a) What is the capacity of the main 1 Block
memory, if the total number of = 8 words
blocks in the memory is 128? = 23
(b) How many blocks in the main
memory if the memory capacity is
32 Kbit?
23
Memory 6
Solution :
m @ May 2021 24
12
SECR2033 Computer Organization & Architecture
6
Memory Interleaving
n A single shared memory module A number of memory
causes sequentialization of access. chips can be grouped
n Memory interleaving à splits together to form a
memory bank
memory across multiple memory
modules (or ________________ ).
Low-Order Interleaving High-Order Interleaving
(LOI) (HOI)
The low-order bits of the The high-order bits of the
address are used to select address are used to select
the bank. e.g. à 00000101 the bank. e.g. à 10100000
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.155 25
LOI (Low-Order Interleaving) 6
Figure: Memory interleaving (LOI) example with 4 banks. Red 26
banks are refreshing and can't be used.
https://en.wikipedia.org/wiki/Interleaved_memory#/media/File:Interleaving.gif (11 Apr 2019)
m @ May 2021 13
SECR2033 Computer Organization & Architecture
6
Example 2: Memory interleaving on 32 addresses.
Figure: High-Order Interleaving (HOI).
Figure: Low-Order Interleaving (LOI). 27
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.156
Address format: 6
n bits 28
Bank address Word in the bank
m bits (n – m) bits
Figure: High-Order Interleaving (HOI).
n bits
Word in the bank Bank address
(n – m) bits m bits
Figure: Low-Order Interleaving (LOI).
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.156
m @ May 2021 14
SECR2033 Computer Organization & Architecture
n Memory capacity is 64 Bytes 6
Example 3: LOI à 26 (Number of bit for memory address = 6)
These 4 bits are n Total bank is 4
same in all banks
à 22 (Number of bit for bank address = 2)
n Bank capacity = 26 – 2= 24 = 16 Bytes
Bank 1 Bank 2 Bank 3 Bank 4
1111 00 60 1111 01 61 1111 10 62 1111 11 63
... ... ... ... ... ... ... ...
4 5 6 7
0001 00 0 0001 01 1 0001 10 2 0001 11 3
0000 00 0000 01 0000 10 0000 11
29
Example 4: LOI 6Given a memory address as 29Ch (10 bits)
and there are 4 memory banks. Determine
the memory bank address and the address
of the word in the bank using LOI.
Solution:
n 4 bank à 22 (Number of bit for bank
address = 2)
n Bank capacity = 210 – 2 = 28 = 256 Bytes
n Memory address = 29Ch = 10 1001 1100
(3FCh) Bank 1 … 01 Bank 2 … 10 Bank 3 (3FFh) Bank 4
... ...
1111 1111 00 1020 … 01 1021 … 10 1022 … 11 1023
... ... … 01 ... … 10 ... ... ...
5 6 … 11 7
0000 0001 00 4 1 2 … 11 3
0000 0000 00 0
(003h) 30
(000h)
m @ May 2021 15
SECR2033 Computer Organization & Architecture
6
LOI: Advantages and Disadvantages
It produces memory A failure of any single
interference. bank would be
catastrophic to the whole
system.
31
n Memory capacity is 64 Bytes 6
Example 5: HOI à 26 (Number of bit for memory address = 6)
These 4 bits are n Total bank is 4
same in all banks
à 22 (Number of bit for bank address = 2)
n Bank capacity = 26 – 2 = 24 = 16 Bytes
Bank 1 Bank 2 Bank 3 Bank 4
00 1111 15 01 1111 31 10 1111 47 11 1111 63
... ... ... ... ... ... ... ...
1 17 33 49
00 0001 0 01 0001 16 10 0001 32 11 0001 48
00 0000 01 0000 10 0000 11 0000
32
m @ May 2021 16
SECR2033 Computer Organization & Architecture
Example 6: HOI 6Given a main memory has 32 Mwords with
the size of each word is 1 Byte, and there
are 16 memory banks. Draw the modular
memory address format if the system is
implemented with HOI.
Solution:
n Main memory à 32 Mwords = 25 Mwords = 25 x 220 bits = 225
(Number of bit for memory address = 25)
n 16 memory bank à 24 (Number of bit for bank address = 4)
n Word in the bank = 25 – 4 = 21 bits
Bank address 25 bits 33
4 bits Word in the bank
21 bits
6
HOI: Advantages
§ Easy memory extension by the addition of one or more
memory modules to a maximum of M - 1.
§ Provides better reliability, since a failed module affects only a
localized area of the address space.
§ This scheme would be used w/o conflict problems in
multiprocessors if the modules are partitioned according to
disjoint or non-interleaving processes (programs should be
disjoint for its success).
34
m @ May 2021 17
SECR2033 Computer Organization & Architecture
6
HOI: Disadvantages
§
§ Scheme will cause memory conflicts in case of pipelined,
vector processors. The sequentially of instructions and data to
be placed in the same module. Since memory cycle time is
much greater than pipelined clock time, a previous memory
request would not have completed its access before the arrival
of the next request, thereby resulting in a delay.
§ Process interacting and sharing instructions and data in
multiprocessor system will encounter considerable conflicts.
§ This technique is useful only in one single user system / single
user multitasking system.
35
Review Questions 6
6.2.1. What is the difference between a byte and a word? What
distinguishes each?
6.2.2. List and explain the two types of memory interleaving and
the differences between them.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.192 36
m @ May 2021 18
SECR2033 Computer Organization & Architecture
Module 6 37
Memory
6.1 Introduction q Overview
6.2 Main Memory q Cache Mapping Schemes
q Replacement Policy
6.3 Cache Memory q Cache Performances
6.4 Virtual Memory
6.6 Summary
6
Overview
n *Cache memory is designed to combine:
q the memory access time of expensive, high-speed memory
with
q the large memory size of less expensive, lower-speed
memory.
n *The cache contains a copy of portions of main memory.
n Unlike main memory, which is accessed by address, cache is
typically accessed by content; hence, it is often called content
addressable memory.
* William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.128 38
m @ May 2021 19
SECR2033 Computer Organization & Architecture
6
Figure: Cache and Main Memory 39
William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.128
6
Cache Mapping Schemes
n The “content” that is addressed in addressable cache
memory is a subset of the bits of a main memory address
called a field.
n The fields into which a memory address is divided provide
a many-to-one mapping between larger main memory and
the smaller cache memory.
n Many blocks of main memory map to a single block of
cache.
n A ________ field in the cache block distinguishes one
cached memory block from another.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.239 40
m @ May 2021 20
SECR2033 Computer Organization & Architecture
6
Line / Slot
Figure: Cache/Main Memory Structure 41
William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.129
6
n Depending on the mapping scheme, cache may has two or
three fields.
n These fields depends on the particular mapping scheme being
used to determines where the data is placed when it is originally
copied into cache.
Schemes
Direct Block Direct Fully Associative Set Associative
Mapping Mapping Mapping Mapping
Figure: Cache mapping schemes 42
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.240
m @ May 2021 21
SECR2033 Computer Organization & Architecture
(a) Direct Mapping 6
n The simplest technique.
n *A cache structure in which each memory location is mapped to
exactly one location in the cache.
n Address formats:
Main memory address : Cache Memory :
Tag Line/Slot Line/Slot Tag Content
Cache Address Cache Word 43
*Patterson, D.A. and Hennessy, J.L. (2014). Computer Organization and Design: The Hardware/Software Interface (5th Edition). United States: Elsevier, p.384
6
Example 7: Direct Mapping.
A main memory contains 8 words while the cache has only 4
words. Using direct address mapping, identify the fields of the
main memory address.
Solution :
• Total memory words = 8 = 23 Main memory = 3 bits
Tag Line/Slot
à Require 3 bits for main
memory address. 1 bit 2 bits
• Total cache words = 4 = 22 44
à Require 2 bits for cache
address (Line/Slot)
• Tag size = 3 – 2 = 1 bit
m @ May 2021 22
SECR2033 Computer Organization & Architecture
Memory Address = 3 bits 6Cache Memory = 4 words = 22
Line/Slot Tag Content
Tag Line/Slot
2 bits 1 bit m bits
1 bit 2 bits
Address Content Tag 1 Line/Slot Tag Content
Tag 0 11 x m bits
1 11 10 x
1 10 01 x
1 01 00 x
1 00
0 11 (Tag + m) bits
0 10
0 01
0 00 m bits
Main Memory Cache Memory 45
6
Example 8: Direct Mapping.
A main memory contains 32 words while the cache has only 8
words. Using direct address mapping, identify the fields of the
main memory address.
Solution :
• Total memory words = 32 = 25 Main memory = 5 bits
Tag Line/Slot
à Require 5 bits for main
memory address. 2 bit 3 bits
• Total cache words = 8 = 23 46
à Require 3 bits for cache
address (Line/Slot)
• Tag size = 5 – 3 = 2 bits
m @ May 2021 23
SECR2033 Computer Organization & Architecture
Example 9: Direct Mapping. 6Size of cache memory is
64Kword and the size of main
memory is 64M × 8 bit word.
Determine the word size of main
memory, cache and the main
memory address format.
Solution :
• Total words in main memory = 64M = 26 × 220 = 226
à Require 26 bits for main memory address.
• Total cache words = 64K = 26 × 210 = 216 à 16 bits for Line/Slot
• Tag size = 26 – 16 = 10 bits
• Size of main memory word = 8 bits
à Size of cache word = Tag + (No. words in cache × size)
= 10 + (1 × 8) = 18 bits
47
6
Main memory = 26 bits Cache Memory = 64K = 216
Tag Line/Slot Line/Slot Tag Content
10 bits 16 bits 16 bits 10 bits 8 bits
Cache word 18 bits
= (Tag + size of content)
= 10 + 8 = 18 bits 48
m @ May 2021 24
SECR2033 Computer Organization & Architecture
Cache Hit and Miss 6*Cache Miss
Tag Line/Slot Line/Slot Tag Content A request for
11 Data data from the
10 Data
01 Data cache that
00 Data cannot be filled
because the
data is not
present in the
cache.
= Cache Hit
Main Memory Cache Memory
*Patterson, D.A. and Hennessy, J.L. (2014). Computer Organization and Design: The Hardware/Software Interface (5th Edition). United States: Elsevier, p.392 49
Example 10: 6A main memory contains 16
Cache hit and miss.
words while the cache has only 4
words. Using direct address
mapping, identify the fields of the
main memory address.
Solution : Main memory = 4 bits
Tag Line/Slot
• Total memory words = 16 = 24 2 bits 2 bits
à Require 4 bits for main Dec/Hex. Line/Slot Tag Content
memory address. 3 11 xx
2 10 xx
• Total cache words = 4 = 22 1 01 xx
0 00 xx
à Require 2 bits for cache
Cache Memory
address (Line/Slot) 50
• Tag size = 4 – 2 = 2 bits
m @ May 2021 25
SECR2033 Computer Organization & Architecture
Memory address: 6
à (Tag | Line/Slot)
Based on previous example, Hex. Address Content (Hex)
consider the following main
memory with contents. F 11 11 11
E 11 10 10
D 11 01 01
C 11 00 A1
B 10 11 F5
A 10 10 F4
9 10 01 F3
Dec/Hex. Line/Slot Tag Content 8 10 00 F1
7 01 11 FF
3 11 xx xx 6 01 10 91
2 10 xx xx 5 01 01 10
4 01 00 19
1 01 xx xx 3 00 11 13
00 xx xx 2 00 10 02
0 1 00 01 01
0 00 00 EE
Cache Memory Main Memory
(File: Module 6 – CacheHitMiss.doc) – Example 1 51
6A main memory
contains 32 words
Example 11: Cache hit and miss operation. while the cache has
only 8 words. Using
(Read Process)
direct address
mapping, identify the
fields of the main
Solution : memory address.
• Total memory words = 32 = 25 à 5 bits
• Total cache words = 8 = 23 à 3 bits (Line/Slot)
• Tag size = 5 – 3 = 2 bits
Main memory = 5 bits
Tag Line/Slot
2 bits 3 bits
(File: Module 6 – CacheHitMiss.doc) – Example 2 52
m @ May 2021 26
SECR2033 Computer Organization & Architecture
Memory address: Hex./Dec 6Address Content (Hex)
à (Tag | Line/Slot) 11 111 11
1F 31
1E 30 11 110 10
1D 29
1C 28 11 101 01
1B 27
1A 26 11 100 A1
19 25
18 24 11 011 F5
17 23
16 22 11 010 F4
15 21
14 20 11 001 F3
13 19
Dec/Hex. Line/Slot Tag Content 12 18 11 000 F1
11 17
7 111 xx xx 10 16 10 111 FF
……
33 10 110 91
22
6 110 xx xx 11 10 101 10
00
5 101 xx xx 10 100 19
10 011 13
4 100 xx xx 10 010 02
011 xx xx 10 001 01
3 10 000 EE
2 010 xx xx ……
xx 00 011 A3
1 001 xx 00 010 A1
0 000 xx xx 00 001 AB
00 000 01
Cache Memory Main Memory
(File: Module 6 – CacheHitMiss.doc) – Example 2 53
Based on the example, show the contents of the cache as it 6
responds to a series of request (decimal address):
22, 26, 22, 26, 16, 3, 16, 18
Generated Address Format Address
Tag Line/Slot
Address Content
10 110
(Dec) (Binary) (Hex) 11 010
10 110
22 10110 91 11 010
10 000
26 11010 F4 00 011 Complete the
10 000 table
22 10110 91 10 010
54
26 11010 F4 Main Memory
16 10000 EE
3 00011 A3
16 10000 EE
18 10010 02
(File: Module 6 – CacheHitMiss.doc) – Example 2
m @ May 2021 27
SECR2033 Computer Organization & Architecture
Based on the example, show the contents of the cache as it 6
responds to a series of request (decimal address):
22, 26, 22, 26, 16, 3, 16, 18
Format Address Format Address Read/Write operation cache
Line/Slot Tag Content Tag Line/Slot Hit Miss Update Read Write
(Hex) 10 110 cache
111 xx xx 11 010
110 x1x0 x91x 10 110 55
101 xx xx 11 010
100 xx xx 10 000
011 x0x0 xAx3 00 011
010 11x1x110 Fx4Fx402 10 000
001 xx xx 10 010
000 x1x0 xExE
Main Memory
Cache Memory
(File: Module 6 – CacheHitMiss.doc) – Example 2
Exercise 6.1: 6Base on previous example, complete the
following tables as they respond to a series
of request (hexadecimal address):
1D, 14, 1E, 1D, 14, 17, 3, 1C
Format Address Format Address Read/Write operation cache
Hit Miss Update Read Write
Line/Slot Tag Content Tag Line/Slot
cache
(Hex)
111 56
110
101
100
011
010
001
000
Cache Memory Main Memory
(File: Module 6 – CacheHitMiss.doc) – Example 2
m @ May 2021 28
SECR2033 Computer Organization & Architecture
(b) Block Direct Mapping 6
n The simplest technique of direct mapping
can maps each block of main memory into
only one possible cache line.
n Address formats: (2n words, e.g. n = 2)
Main memory address : Cache Memory :
Tag Block Word Block Tag Word 0
n bit Word 1
Word 2
Word (2n – 1 )
Cache Address Cache Word
William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.134 57
Block consists of 6
Lines/Slots
Figure: Block direct mapping.
58
m @ May 2021 29
SECR2033 Computer Organization & Architecture
Main memory address : 6
Tag Block
Word
Example 12: 7 bits 7 bits 4 bits
Given the main memory address format using block direct mapping,
if each main memory word is 8 bits, calculate:
a) The main memory capacity.
b) Total cache words.
c) The size of cache words.
59
Main memory address : 6
Tag Block
Word
Solution : 7 bits 7 bits 4 bits
a) Total main memory bits = 7 + 7 + 4 = 18 bits
à Total memory words = 218 = 256Kword
à Total memory capacity = 256Kword x 8 bits = 2Mbit
b) Block = 7 bits
à Total cache words = 27 à 128 words (Block)
c) Total words = 24 = 16 words
à Cache word size = Tag + (No. of words in cache × size)
= 7 + (16 × 8)
= 132 bits
60
m @ May 2021 30
SECR2033 Computer Organization & Architecture
Main memory address : 6
Tag Block
Word
Example 13: 7 bits 4 bits 3 bits
Consider the main memory address.
Suppose a program generates the address 1AA.
à In 14-bit binary, this number is: 00 0001 1010 1010
ü The first 7 bits of the address Padding with bit 0 at
go in the tag field. MSB to complete 14-bit
ü The next 4 bits go in the Main memory address :
block field.
0000011 0101 010
ü The final 3 bits indicate the
word within the block. 61
Main memory address : 6
Example 14: (Cache hit) 1AA 0000011 0101 010
Tag Block Word
Supposed the program It will find the data it is
generates the subsequent looking in cache memory for
address 1AB.
in block 0101, word 011.
à 1AB : 00 0001 1010 1011
Cache Memory :
0101 0000011 000 001 010 011 100 101 110 111
Block Tag Words (23 = 8)
62
m @ May 2021 31
SECR2033 Computer Organization & Architecture
Main memory address : 6
Example 15: (Cache miss) 1AA 0000011 0101 010
Tag Block Word
Supposed the program A miss occurred
generates the address 3AB. (tag is different).
à In 14-bit binary, this number The tag of cache memory for
is: 00 0011 1010 1011 block 0101, must be replaced
with 0000111.
Cache Memory :
0101 00000111 000 001 010 011 100 101 110 111
Block Tag Words (23 = 8)
63
6
Exercise 6.2:
Suppose a system using 15-bit main memory addresses and 64
blocks of cache with each block contains 8 words. If the CPU
generates the main memory address of 1028 (decimal), draw:
a) The main memory addressing format
b) The cache memory with all words fields.
m @ May 2021 64
32
SECR2033 Computer Organization & Architecture 6
Solution :
000000
Block Tag Words (23 = 8)
65
Block Direct Mapping: 6Other cache
Advantages and Disadvantages
mapping schemes
are designed to
prevent this kind of
thrashing.
The technique q
is simple and
inexpensive to q A fixed cache location for any given block.
implement. If a program accesses 2 line/slot/block
that map to same line repeatedly, cache
hit ratio will be low (Known as thrashing).
q Least effective in its utilization - that is, it
may leave some cache lines unused
because a given line/slot has fixed
location.
William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.138 66
m @ May 2021 33
SECR2033 Computer Organization & Architecture
(c) Fully Associative Mapping 6
n Overcomes the disadvantage of direct mapping by permitting
each main memory block to be loaded into any location
(line/slot/block) of the cache.
n A memory address is partitioned into only two fields:
Main memory address :
Tag Word
n With associative mapping, there is flexibility as to which block to
replace when a new block is read into the cache.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.245 67
William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.138
*Patterson, D.A. and Hennessy, J.L. (2014). Computer Organization and Design: The Hardware/Software Interface (5th Edition). United States: Elsevier, p.403
6
§ When the cache is searched for a specific main memory block,
the tag field of the main memory address is compared to all the
valid tag fields in cache;
ü if a match is found, _________.
ü If there is no match, we have a __________ and the block
must be transferred from main memory.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.245 68
m @ May 2021 34
SECR2033 Computer Organization & Architecture
Example 16: 6Consider a memory
configuration with 214 words,
a cache with 16 blocks, and
blocks of 8 words.
• Each block = 8 words = 23 à 3 bits (Word)
• Tag size = 14 – 3 = 11 bits
Main memory address = 14 bits Cache Memory :
Tag Word Tag Word 0
Word 1
11 bits 3 bits Word 2
Word (2n – 1 )
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.245 69
Main memory address : 6
Example 17: Tag Word
11 bits 3 bits
Supposed the program generates
the memory address 666h.
à In 14-bit binary, this number It will find the data it is
is: 00 0110 0110 0110 looking in cache memory for
in tag 0CCh, word 110.
Cache Memory :
0CCh 000 001 010 011 100 101 110 111
Tag Words (23 = 8)
70
m @ May 2021 35
SECR2033 Computer Organization & Architecture
Example 18: 6Consider a memory configuration with
214 words, and blocks of 8 words. For
data in cache as follows, what is the
memory address (hexadecimal).
Cache Memory : (Data)
256h 000 001 010 011 100 101 110 111
Tag Words
à In 14-bit binary, the number 256h is: 0010 0101 0110 101
à The memory address = 001 0010 1011 0101
= 12B5h
71
6
Fully Associative Mapping:
Advantages and Disadvantages
Flexibility scheme in q
term of which block to
be replaced when a Every line’s tag is examined
new block is read into for a match in fully associative
cache (associative mapping),
the cache. cache searching gets
expensive.
William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.138 72
m @ May 2021 36
SECR2033 Computer Organization & Architecture
(d) Set Associative Mapping 6
N à number of
lines in each set
n In this scheme, instead of mapping anywhere in the entire
cache, a memory reference can map only to the subset of
cache slots.
n Set Associative Mapping is a compromise that exhibits the
strengths of both the direct mapping and fully associative
mapping approaches while reducing their disadvantages.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.246 73
William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.140
*Patterson, D.A. and Hennessy, J.L. (2014). Computer Organization and Design: The Hardware/Software Interface (5th Edition). United States: Elsevier, p.403
6
n A set-associative cache with n locations for a block is called an
n-way set-associative cache.
n An n-way set-associative cache consists of a number of sets,
each of which consists of n blocks.
n A block is directly mapped into a set, and then all the blocks in
the set are searched for a match.
n In set-associative cache mapping, the main memory address is
partitioned into three pieces:
Main memory address :
Tag Set Word
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.246 74
Patterson, D.A. and Hennessy, J.L. (2014). Computer Organization and Design: The Hardware/Software Interface (5th Edition). United States: Elsevier, p.403
m @ May 2021 37
SECR2033 Computer Organization & Architecture
Example 19: 2-way Main memory address : 6
Tag Set Word
8 bits 3 bits 3 bits
n Suppose we have a main memory of 214 bytes.
n It is mapped to a 2-way set associative cache having 16 blocks
where each block contains 8 words.
n Since this is a 2-way cache, each set consists of 2 blocks, and
there are 8 sets.
n Thus, we need 3 bits for the set, 3 bits for the word, giving 8
leftover bits for the tag.
75
Cache memory (Hex.): (Hex.) 6Memory Address:
Set Tag Word Tag Word Tag | Set | Word
3FFF
x7x FxFx 7x FxFx 6x 3FFE 11111111111111
7 FF 5 FF 4 3FFD 11111111111110
7 FF 3 FF 2 3FFC 11111111111101
7 FF 1 FF 0 3FFB 11111111111100
6 FF 7 FF 6 3FFA 11111111111011
6 FF 5 FF 4 3FF9
6 FF 3 FF 2 3FF8 11111111111010
6 FF 1 FF 0 11111111111001
11111111111000
3FF7 11111111110111
3FF6 11111111110110
3FF5 11111111110101
3FF4 11111111110100
3FF3 11111111110011
3FF2 11111111110010
3FF1 11111111110001
3FF0 11111111110000
76
m @ May 2021 38
SECR2033 Computer Organization & Architecture
Example 20: 4-way Main memory address : 6
Tag Set Word
9 bits 2 bits 3 bits
n Suppose we have a main memory of 214 bytes.
n It is mapped to a 4-way set associative cache having 16 blocks
where each block contains 8 words.
n Since this is a 4-way cache, each set consists of 4 blocks, and
there are 4 sets.
n Thus, we need 2 bits for the set, 3 bits for the word, giving 9
leftover bits for the tag.
77
(Hex.) Memory Address: (Hex.) Memory Address: 6
3FFF 11111111111111 3FF7 11111111110111
3FFE 11111111111110 3FF6 11111111110110
3FFD 11111111111101 3FF5 11111111110101
3FFC 11111111111100 3FF4 11111111110100
3FFB 11111111111011 3FF3 11111111110011
3FFA 11111111111010 3FF2 11111111110010
3FF9 11111111111001 3FF1 11111111110001
3FF8 11111111111000 3FF0 11111111110000
Cache memory (Hex.):
Set Tag Word Tag Word Tag Word Tag Word
3 FF 7 FF 6 FF 5 FF 4
3 FF 3 FF 2 FF 1 FF 0
2 FF 7 FF 6 FF 5 FF 4
2 FF 3 FF 2 FF 1 FF 0
78
m @ May 2021 39
SECR2033 Computer Organization & Architecture
Example 21: Main memory address : 6
Tag Set Word
8 bits 6 bits 4 bits
Suppose we want to read or write a word at the memory address
357A.
à In 18-bit binary of the main memory, this number is:
00 0011 0101 0111 1010
Tag: 13 Set: 23 Word: 10 Under set-associative
mapping, this
translates to (dec).
à So we search only the two tags in cache set 23 to see if either
one matches tag 13. If so, we have a ____________.
79
000 6
001
010 000 001
010 011
Figure: An eight- 011 100 101
block cache
100
configured as direct 101
mapped, two-way 110 110 111
set associative, 111
four-way set
associative, and
fully associative.
000 001 010 011
100 101 110 111
000 001 010 011 100 101 110 111
Patterson, D.A. and Hennessy, J.L. (2014). Computer Organization and Design: The Hardware/Software Interface (5th Edition). United States: Elsevier, p.404 80
m @ May 2021 40
SECR2033 Computer Organization & Architecture
6
Replacement Policy
n *Once the cache has been filled, when a new block is brought
into the cache, one of the existing blocks must be replaced.
How do we determine
which block in cache
should be replaced?
n The algorithm for determining replacement is called the
replacement policy.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.247 81
* William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.145
6
n A replacement policy is invoked when it becomes necessary to
evict a line/slot/block from cache.
n *There are several popular replacement policies:
à One that is not practical but that can be used as a
benchmark by which to measure all others is the optimal
algorithm.
n Although it is impossible to implement an optimal replacement
algorithm, it is instructive to use it as a benchmark for
assessing the efficiency of any other scheme.
* Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.247 82
William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.145
m @ May 2021 41
SECR2033 Computer Organization & Architecture
6
Schemes
Direct Block Direct Fully Associative Set Associative
Mapping Mapping Mapping Mapping
q No choice is possible q FIFO (First In First Out)
à There is only one q LRU (Least Recently Used)
possible line/slot/block q LFU (Least Frequently Used)
for any particular
q Random
line/slot/block.
Figure: The replacement algorithms for cache memory.
83
6
Table: The replacement algorithms for cache memory.
Algorithms Operational
FIFO
(First In First Out) Replace block that has been in
LRU cache longest.
(Least Recently Used)
Keeps track of the last time that a
LFU block was assessed and evicts
(Least Frequently Used) block that has been unused for
Random the longest period of time.
Replace block which has had
fewest hits.
Picks a block at random and
replaces it with a new block.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.248 84
William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.145
m @ May 2021 42
SECR2033 Computer Organization & Architecture
6
Which replacement
algorithm is the best?
ü The algorithm selected often
depends on how the system
will be used.
ü No single (practical) algorithm
is best for all scenarios.
ü For that reason, designers
use algorithms that perform
well under a wide variety of
circumstances.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.248 85
Cache Write Policies 6
n In addition to determining which victim to select for
replacement, designers must also decide what to do with so-
called ___________ of cache, or blocks that have been
modified in cache.
n Dirty blocks must be written back to memory.
n A write policy determines how this will be done.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.249 86
*William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.145
m @ May 2021 43
SECR2033 Computer Organization & Architecture
6
n *There are two basic write policies:
Write Policies
Write Through Write Back
Figure: Cache write policy techniques
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.249 87
*William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.145
Patterson, D.A. and Hennessy, J.L. (2014). Computer Organization and Design: The Hardware/Software Interface (5th Edition). United States: Elsevier, p.393
Cache Write Policies: 6
(a) Write Through
n A write-through policy updates both the cache and the main 88
memory simultaneously on every write.
q
Every write to the cache requires a main memory
access, essentially slows the system down to main
memory speed.
n However, in real applications, the majority of accesses are
reads so this slow-down is negligible.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.249
m @ May 2021 44
SECR2033 Computer Organization & Architecture
Cache Write Policies: 6n A write-back policy (also called
(b) Write Back copyback) only updates blocks in
main memory when the cache
block is selected as a victim and
must be removed from cache.
q q
q Normally faster than write- q Main memory & cache may
through because time is not not contain the same value
at a given instant of time.
wasted writing information
to memory on each write to q Data in cache may be lost,
cache. if a process terminates
(crashes) before the write
q Memory traffic is also to main memory is done.
reduced.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.249 89
6
Cache Performances
n To improve the performance of cache, one must increase the
hit ratio by using :
q a better mapping algorithm (up to roughly a 20% increase),
q better strategies for write operations (potentially a 15%
increase),
q better replacement algorithms (up to a 10% increase), and
q better coding practices (up to a 30% increase in hit ratio).
n Simply increasing the size of cache may improve the hit ratio by
roughly 1– 4%, but is not guaranteed to do so.
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.249 90
m @ May 2021 45
SECR2033 Computer Organization & Architecture
Average Memory Access Time (AMAT) 6
n To capture the fact that the time to access data for both
hits and misses affects performance, designers sometime
use AMAT as a way to examine alternative cache designs.
AMAT = Time for a hit + (Miss rate Miss penalty)
n AMAT is the average time to access memory considering both
hits and misses and the frequency of different accesses.
Patterson, D.A. and Hennessy, J.L. (2014). Computer Organization and Design: The Hardware/Software Interface (5th Edition). United States: Elsevier, p.402 91
6
Example 22:
Assume that 33% of the instructions in a program are data
accesses. The cache hit ratio is 97% and the hit time is one cycle,
but the miss penalty is 20 cycles.
AMAT = Time for a hit + (Miss rate Miss penalty)
AMAT = 1 cycle + (3% 20 cycles)
= 1 cycle + (0.03 20 cycles)
= 1.6 cycles
If the cache was perfect and never missed, the AMAT would be
one cycle. But even with just a 3% miss rate, the AMAT here
increases 1.6 times!
92
m @ May 2021 46
SECR2033 Computer Organization & Architecture
6
Exercise 6.3:
Find the AMAT for a processor with a 1ns clock cycle time, a miss
penalty of 20 clock cycles, a miss rate of 0.05 misses per
instruction, and a cache access time (including hit detection) of 1
clock cycle. Assume that the read and write miss penalties are the
same and ignore other write stalls .
Patterson, D.A. and Hennessy, J.L. (2014). Computer Organization and Design: The Hardware/Software Interface (5th Edition). United States: Elsevier, p.402 93
Effective Access Time (EAT) 6
n The performance of a hierarchical memory is measured by its
EAT, or the average time per access.
n EAT is a weighted average that uses the hit ratio and the
relative access times of the successive levels of the hierarchy.
H : Cache hit rate This formula can be
AccessC : Access time for cache extended to apply
AccessMM : Access time for main memory to three- or even
four-level memories
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.248 94
William Stallings (2016). Computer Organization and Architecture: Designing for Performance (10th Edition). United States: Pearson Education Limited, p.147
m @ May 2021 47
SECR2033 Computer Organization & Architecture
6
Example 23:
Suppose the cache access time is 10ns, main memory access
time is 200ns, and the cache hit rate is 99%.
The average time for the processor to access an item in this two-
level memory would then be:
Linda Null and Julia Lobur (2003). The Essentials of Computer Organization and Architecture. United States: Jones and Bartlett Publishers. p.248 95
6
Exercise 6.4:
Suppose a single cache fronting a main memory, which has 80
nanosecond access time, and the cache memory has access time 10
nanoseconds. Calculate the effective access time if the hit rate are:
(a) 90% (b) 99%
m @ May 2021 96
48
SECR2033 Computer Organization & Architecture
6
Example 24:
• Suppose a L1 cache with AccessC1 = 4 nanoseconds and HC1 =
0.9
• Suppose a L2 cache with AccessC2 = 10 nanoseconds and HC2 =
0.99
• This is defined to be the number of hits on references that are a
miss at L1.
• Suppose a main memory with AccessMM = 80.0
97
6
m @ May 2021 49
SECR2033 Computer Organization & Architecture
Review Questions 6
6.3.1. What are the differences among direct mapping,
associative mapping, and set-associative mapping?
6.3.2. For a cache mapping, a main memory address is viewed
as consisting of few fields. List and define the fields based
on the following schemes:
(a) Direct mapping.
(b) Fully associative mapping.
(c) Set associative mapping.
William Stallings (2013). Computer Organization and Architecture: Designing for Performance (9th Edition). United States: Pearson Education Limited, p.153. 99
6
6.3.3. A set-associative cache consists of 64 lines, or slots,
divided into four-line sets. Main memory contains 4K
blocks of 128 words each. Show the format of main
memory addresses
6.3.4. A two-way set-associative cache has lines of 16 bytes and
a total size of 8 kB. The 64-MB main memory is byte
addressable. Show the format of main memory addresses.
6.3.5. For the hexadecimal main memory addresses 111111h,
666666h, BBBBBBh, show the following information, in
hexadecimal format :
(a) Direct mapping.
(b) Fully associative mapping.
(c) 2-way Set associative mapping.
William Stallings (2013). Computer Organization and Architecture: Designing for Performance (9th Edition). United States: Pearson Education Limited, p.153. 100
m @ May 2021 50