Published by zamzilah05, 2022-02-07 22:50:46

Computer Arithmetics


BASIC COMPUTER
ARCHITECTURE &
ARITHMETIC

AZURA BINTI HARON @ MAHKTAR
HUDA BINTI AZUDDIN
TAN SIEW HUA

JABATAN KEJURUTERAAN ELEKTRIK
POLITEKNIK SEBERANG PERAI

Basic Computer Architecture & Arithmetic i

Copyright Declaration

©2021, Azura Binti Haron @ Mahktar, Huda Binti Azuddin and Tan Siew Hua
eISBN 978-967-0783-81-9

Published by

Politeknik Seberang Perai
Jalan Permatang Pauh, 13500 Permatang Pauh

Pulau Pinang

Perpustakaan Negara Malaysia Cataloguing-in-Publication Data
Azura Haron@Mahktar, 1977-

COMPUTER ARCHITECTURE & ORGANIZATION : BASIC COMPUTER ARCHITECTURE &
ARITHMETIC / AZURA BINTI HARON@ MAHKTAR, HUDA BINTI AZUDDIN, TAN SIEW HUA.

Mode of access: Internet
eISBN 978-967-0783-81-9
1. Computer architecture.
2. Computer arithmetic.
3. Government publications--Malaysia.
4. Electronic books. I. Huda Azuddin. II. Tan, Siew Hua.
III. Title.
004.22

ii Basic Computer Architecture & Arithmetic


PREFACE

The subject of Computer Architecture and Arithmetic, which is included in the course Computer Architecture and Organization, aims to help students understand a few fundamental concepts with the aid of tutorials. The first section of this e-book discusses the Von Neumann and Harvard architectures. The second section describes computer architecture, while the final section highlights arithmetic operations. The tutorials in this e-book cover discussion and understanding of computer architecture and arithmetic operations. We hope the notes and the tutorials will help students better understand these topics.

iv Basic Computer Architecture & Arithmetic

TABLE OF CONTENTS

1. UNDERSTANDING THE ORGANIZATION OF THE VON NEUMANN
ARCHITECTURE & HARVARD ARCHITECTURE
1.1 DESCRIBE THE ORGANIZATION OF A VON NEUMANN MACHINE
1.2 DESCRIBE VON NEUMANN MACHINE MAJOR FUNCTION & OPERATION UNITS
1.3 PIPELINE TECHNIQUES IN COMPUTER ARCHITECTURE OPERATION
1.4 NUMBER SYSTEM IN COMPUTER ARCHITECTURE
1.5 ARITHMETIC OPERATION FOR NUMBER SYSTEM IN COMPUTER ARCHITECTURE
1.6 FLOATING-POINT UNIT (FPU)
1.7 TUTORIAL

2. UNDERSTANDING COMPUTER ARCHITECTURE

2.1 EXPLAIN THE STRENGTHS & WEAKNESSES OF THE VON NEUMANN
2.2 COMPARE THE VON NEUMANN & HARVARD ARCHITECTURE
2.3 POSSIBLE STEPS TO BE CARRIED OUT FOR AN INSTRUCTION CYCLE OF PIPELINE OPERATION
2.4 FIVE-STEP PIPELINE EXECUTE PROCESS
2.5 RELATE PIPELINE OPERATION TO IMPROVE COMPUTER PERFORMANCE USING BLOCK DIAGRAM
2.6 TUTORIAL

3. ARITHMETIC OPERATION

3.1 INTEGER OPERATION OF BINARY SYSTEM
3.1.1 ADDITION
3.1.2 SUBTRACTION
3.1.3 MULTIPLICATION
3.1.4 DIVISION
3.2 USE COMPLEMENT TO REPRESENT NEGATIVE NUMBER
3.2.1 ADDITION AND SUBTRACTION IN THE FIRST COMPLEMENT
3.2.2 ADDITION AND SUBTRACTION IN THE SECOND COMPLEMENT
3.3 COMPUTE FLOATING POINT REPRESENTATION USING IEEE 754 FORMAT
3.3.1 BINARY REPRESENTATION

3.3.2 ADDITION AND SUBTRACTION
3.3.3 TUTORIAL
REFERENCES




1. UNDERSTANDING THE ORGANIZATION OF THE VON
NEUMANN ARCHITECTURE & HARVARD ARCHITECTURE

1.1 DESCRIBE THE ORGANIZATION OF A VON NEUMANN MACHINE

The mathematician John von Neumann, who was a consultant on the ENIAC project, proposed his idea in the first publication for a new computer, the EDVAC (Electronic Discrete Variable Computer), in 1945. This idea, known as the stored-program concept, was suggested because of the tedious task of inserting and altering programs for the ENIAC. A program in a suitable form could be represented and stored in memory together with the data. When the computer requires the instructions, it can read them from memory. Later, if the program needs to be altered or modified, the values of a portion of memory can be changed. (Stallings, 2013)

John von Neumann and his colleagues began the design of a new stored-program computer
at the Princeton Institute for Advanced Studies in 1946, referred to as the IAS computer.
Figure 1.1 shows the overall structure of the IAS computer. The structure consists of the four
sub-components of the von Neumann architecture, as below:

 A main memory to store both data and instructions
 A control unit that decodes the instructions in memory and initiates their execution
 An arithmetic and logic unit (ALU) capable of operating on binary data
 Input or output (I/O) equipment managed by the control unit


Figure 1.1: Structure of the IAS computer (Stallings, 2013)
1.2 DESCRIBE VON NEUMANN MACHINE MAJOR FUNCTION & OPERATION UNITS
The components of the von Neumann model, shown in Figure 1.2, are described below.

Figure 1.2: The von Neumann machine (Sanjay J. Patel, 2004)
a. Main memory


Main memory consists of a collection of locations. Each location can store both
instructions and data. Every location is indexed by a unique address, which is used
to access the location and its contents, whether instructions or data are stored
there (Pacheco, 2011). Figure 1.3 shows the correlation between memory address and
memory size, where k is the number of address bits and m is the number of bits of
data or instruction stored in each location.

Figure 1.3: Size of a memory module
Communication between memory and the processing unit involves two registers, the MDR
and the MAR. The memory data register (MDR) holds the word to be stored in memory,
or the word most recently read, whether it comes from the I/O unit or is to be sent
to the I/O unit. The memory address register (MAR) gives the address in memory of
the word to be read into, or written from, the MDR. There are two basic operations
within the memory unit. First, data or instructions are fetched (read) from memory
and transferred to the central processing unit, the CPU. Second, data or instructions
from the CPU are written (stored) to memory. Below are the simple steps taken when
the READ and WRITE operations are performed.
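The MAR/MDR protocol described above can be sketched in code. This is a minimal illustrative model, not from the original text: the class and method names are our own, with k = 4 address bits and m = 8 data bits as in Figure 1.3.

```python
class Memory:
    """2**k addressable locations, each holding an m-bit word (illustrative sketch)."""

    def __init__(self, k=4, m=8):
        self.m = m
        self.words = [0] * (2 ** k)
        self.mar = 0   # memory address register
        self.mdr = 0   # memory data register

    def read(self, address):
        # READ: place the address in MAR, then copy the addressed word into MDR
        self.mar = address
        self.mdr = self.words[self.mar]
        return self.mdr

    def write(self, address, word):
        # WRITE: place the address in MAR and the word in MDR, then store it
        self.mar = address
        self.mdr = word & ((1 << self.m) - 1)   # keep the word m bits wide
        self.words[self.mar] = self.mdr

mem = Memory()
mem.write(3, 0b10110001)
print(bin(mem.read(3)))   # -> 0b10110001
```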


b. Control Unit
The control unit is responsible for choosing which instruction in a program should be
performed next. It interprets instructions and guides the processing unit through the
program, determining the sequence of instructions. The control unit (CU) is implemented
as a finite state machine (FSM). The FSM directs all activity in the processing unit
using clock-based, step-by-step and cycle-by-cycle processing. The FSM is controlled by
the clock signal, the instruction register and the reset signal (Kholodov, n.d.).
Very fast storage elements called registers are used in the CPU, holding information
about the state of an executing program. The control unit has two special registers
called the program counter (PC) and the instruction register (IR). The program counter
contains the address of the next instruction to be fetched from memory and executed,
while the instruction register stores the current 8-bit opcode of the instruction being
executed.

The two main functions of the control unit (CU) are, first, to read the instructions from
memory and, second, to decode the instructions, generating signals that tell the other
components the next task to be done.


c. Arithmetic Logic Unit (ALU)
The arithmetic logic unit (ALU) is the functional part of the digital computer that
conducts arithmetic and logic operations on machine words that represent operands
(Avižienis, 2003). The ALU is capable of performing arithmetic operations (e.g. ADD,
SUBTRACT) and logical operations (e.g. AND, OR and NOT). Figure 1.4 depicts the
structure of the ALU.
The size of the ALU's inputs is often referred to as the word length of the computer.
Many processors nowadays have a word length of 32 or 64 bits. The word length, or word
size, is the number of bits normally processed by the ALU in one instruction, and it
also indicates the width of the registers. The processing unit also comprises a set of
registers for temporary storage of data and for memory addressing.

Figure 1.4: The arithmetic logic unit (ALU) (Sanjay J. Patel, 2004)


d. Register (Accumulator)

Most processors nowadays have an accumulator register to automatically store the
outcome of a processing operation from the arithmetic logic unit (ALU). Many modern
microprocessors have a larger number of CPU data registers which can function as
accumulators in complex arithmetic and logic operations. These registers hold
operands, results and data during program execution (Laughton & Warne, 2003).

Without an intermediate register like an accumulator, the CPU would have to write the
result of each calculation (e.g. add, multiply, shift) directly to main memory,
possibly only to read it right back again for use in the next operation. Accessing
main memory is much slower than accessing a temporary register like the accumulator,
because the technology used for the large main memory is slower, but cheaper, than
that of the registers.

e. Input / Output unit

Input/output (I/O) systems connect external devices, the peripherals, to a computer.
In a personal computer, these devices normally include monitors, keyboards, mice,
printers and wireless network adapters. A CPU accesses an input or output device using
the address and data buses, similarly to how it accesses the memory unit (Harris &
Harris, 2016). The I/O unit is a device for moving data into and out of computer memory.

The processing unit needs an intermediary to communicate with the peripherals. The
input and output controller provides the necessary interface to the input or output
devices, handling the low-level, device-dependent details. In addition, the controller
manages the necessary electrical signal interface. Figure 1.5 illustrates the
connection between the system bus from the processing unit and the input or output
devices.

Figure 1.5: The input or output controller

1.3 PIPELINE TECHNIQUES IN COMPUTER ARCHITECTURE
OPERATION

A pipeline is the mechanism used to execute instructions in a RISC processor. Pipelining is a
technique applied to enhance the execution throughput of a CPU by utilizing the processor
resources in a more efficient way. Using a pipeline accelerates execution by fetching the
next instruction while the previous instructions are being decoded and executed
(Sloss, Symes, & Wright, 2004). Figure 1.6 illustrates the ARM7 3-stage pipeline, which
consists of the fetch, decode and execute stages.

Figure 1.6: Pipelined execution of ARM7 instructions (Admin, 2020)


The ARM7 has a three-stage pipeline comprising a fetch, decode and execute sequence. Each
of these operations needs one clock cycle for a standard instruction. Hence, a typical
instruction requires three clock cycles to be entirely executed, described as the latency of
instruction execution. At the same time, the pipeline has a throughput of one instruction
per cycle.

The three instructions are placed into the pipeline consecutively. In the first cycle, the
core fetches the ADD instruction from memory. In the second cycle, the core fetches the SUB
instruction and decodes the ADD instruction. In the third cycle, both the SUB and ADD
instructions are shifted along the pipeline: while the ADD instruction is executed, the SUB
instruction is simultaneously decoded and the CMP instruction is fetched. This method is
known as filling the pipeline. The pipeline permits the core to execute an instruction
every cycle, in parallel. The amount of work completed at each stage is reduced when the
pipeline length is increased, which enables the processor to achieve a higher operating
frequency and consequently increases performance.
Each ARM family pipeline design varies. As an example, Figure 1.7 shows how the ARM9 core
extends the pipeline length to five stages. The ARM9 slots in a memory stage and a write-back
stage, allowing the ARM9 to process on average 1.1 Dhrystone MIPS per MHz—an increase
in instruction throughput of around 13% compared with an ARM7. The maximum core
frequency achievable using an ARM9 is also higher (Sloss, Symes, & Wright, 2004).

Figure 1.7: A five-stage (five clock cycle) ARM9 state pipeline


During normal operation (Arm Developer, Arm Limited, 2002):
 one instruction is being fetched from memory
 the previous instruction is being decoded
 the instruction before that is being executed
 the instruction before that is performing data accesses (if applicable)
 the instruction before that is writing its data back to the register bank.
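The pipeline-fill behaviour described above can be traced with a short sketch. This is our own illustration, not from the text; `pipeline_trace` is a hypothetical helper that reports which instruction occupies each stage in each cycle.

```python
# Trace which instruction occupies each pipeline stage per cycle
# (illustrative sketch of the ARM7-style 3-stage fill described above).
def pipeline_trace(instructions, stages=("Fetch", "Decode", "Execute")):
    trace = []
    for cycle in range(len(instructions) + len(stages) - 1):
        row = {}
        for s, stage in enumerate(stages):
            i = cycle - s            # index of the instruction in this stage
            if 0 <= i < len(instructions):
                row[stage] = instructions[i]
        trace.append(row)
    return trace

for cycle, row in enumerate(pipeline_trace(["ADD", "SUB", "CMP"]), start=1):
    print(f"cycle {cycle}: {row}")
# in the third cycle, ADD is in Execute, SUB in Decode and CMP in Fetch
```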

1.4 NUMBER SYSTEM IN COMPUTER ARCHITECTURE

A number system, also called a system of numeration, is a system for representing numbers.
It expresses a set of values corresponding to quantities. A number system is defined as the
representation of numbers using digits or other symbols in a consistent way. The value of
each digit in a number is determined by the digit itself, the digit's position, and the base
or radix of the number system. Figure 1.8 depicts the four main categories of numbering
systems in wide use. The numbers are represented in a unique manner, and arithmetic
operations such as addition, subtraction and division can be performed on them.

Figure 1.8: Types of number system

Computer hardware is built using logic circuits, alongside many other types of products.
These products are generally classified as digital hardware. The rationale for the term
digital lies in the way information in a computer system is represented: as electronic
signals that correspond to digits of information. Each of these electronic signals can be
thought of as representing one digit of information. Each digit can hold only two possible
values, generally represented as 0 and 1. This means that all information in logic circuits
is denoted as patterns of 0 and 1 digits (Brown & Vranesic, 2009).
In daily life, the use of decimal numbers is quite mundane to us. Nevertheless, since binary
digits are used to represent information inside the computer, binary numbers should become
just as familiar as the other numbering notations. Table 1.1 shows the interrelation
between the four main numbering systems.

Table 1.1: Numbers from 0 to 15 in binary, octal and hexadecimal
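Table 1.1 can be reproduced for any value with Python's built-in base conversions; a short sketch of our own for checking conversions by hand:

```python
# Print decimal, binary, octal and hexadecimal forms side by side,
# as in Table 1.1.
for n in range(16):
    print(f"{n:>2}  {n:04b}  {n:02o}  {n:X}")

# Converting back from each base to decimal:
assert int("1111", 2) == 15   # binary
assert int("17", 8) == 15     # octal
assert int("F", 16) == 15     # hexadecimal
```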

1.5 ARITHMETIC OPERATION FOR NUMBER SYSTEM IN COMPUTER
ARCHITECTURE

Arithmetic operations for numbering system are usually involved in:
 Counting
 Addition
 Subtraction
 Multiplication
 Division

Addition of positive numbers is the same for all four number representations. Essentially,
the addition of unsigned numbers is the same as well, but there are notable differences
when negative numbers are involved in the operation. The problems become noticeable if we
consider operands with different combinations of signs. These difficulties are easily
solved using 1st complement or 2nd complement addition in place of sign-and-magnitude
addition.
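The replacement of sign-and-magnitude subtraction by 2's complement addition can be sketched as follows. This is our own illustration, assuming an 8-bit word (the text does not fix a width), with hypothetical function names.

```python
BITS = 8
MASK = (1 << BITS) - 1

def twos_complement(x):
    # Negate x by taking its 2's complement within the word width
    return (-x) & MASK

def subtract(a, b):
    # a - b computed as a + (2's complement of b), discarding the carry-out
    return (a + twos_complement(b)) & MASK

print(subtract(0b0101, 0b0011))   # 5 - 3 -> 2
print(bin(subtract(3, 5)))        # 3 - 5 -> 0b11111110, i.e. -2 in 8 bits
```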

1.6 FLOATING-POINT UNIT (FPU)

A floating-point unit (FPU), particularly a math coprocessor, is a part of a computer system
specifically designed to perform operations on floating-point numbers: common operations
such as addition, subtraction, multiplication, division, square root and bit shifting.
Several older systems could also carry out various transcendental functions such as
trigonometric or exponential calculations; in most modern processors, however, these are
performed by software library routines.
When a CPU executes a program that calls for a floating-point operation, there are three
ways to carry it out: a floating-point unit emulator, an add-on FPU, or an integrated FPU.
Floating-point numbers allow an arbitrary number of places to the right of the radix point,
for example 0.5 x 0.25 = 0.125. This equation can also be expressed in scientific notation,
as in the example below:
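The example itself appears as an image in the original; our reading of it, under the usual binary scientific-notation convention, is that each factor is rewritten as a significand times a power of two:

```python
# 0.5 = 1.0 x 2^-1 and 0.25 = 1.0 x 2^-2, so their product is
# 1.0 x 2^-3 = 0.125 (exponents add, significands multiply).
a = 1.0 * 2**-1      # 0.5 in binary scientific notation
b = 1.0 * 2**-2      # 0.25
print(a * b == 1.0 * 2**-3)   # True
```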

The IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for
floating-point computation adopted in 1985 by the Institute of Electrical and Electronics
Engineers (IEEE) and revised in 2008; its scope and purpose are displayed in Figure 1.9
(Saurabh, 2021). The standard was developed to simplify the portability of programs from
one processor to another and to encourage the development of sophisticated, numerically
oriented programs. It has been widely implemented and is used on virtually all contemporary
processors and arithmetic coprocessors. IEEE 754-2008 covers both binary and decimal
floating-point representations (Stallings, 2013). IEEE 754 floating point is the most
common representation today for real numbers on computers, including Intel-based PCs,
Macs, and most Unix platforms.

Figure 1.9: Scope and purpose Standard IEEE 754 (Saurabh, 2021)
There are several ways to represent floating-point numbers, but IEEE 754 is the most
efficient in most cases. IEEE 754 has three basic components: the sign, the (biased)
exponent and the mantissa.

Based on the above three components, IEEE 754 numbers are separated into two categories,
known as single precision and double precision, shown in Figure 1.10 and Figure 1.11.

Figure 1.10: Single Precision IEEE 754 Floating-Point Standard


Figure 1.11: Double Precision IEEE 754 Floating-Point Standard

Based on the figures above, the comparison of the single- and double-precision bit
allocations of the IEEE 754 Floating-Point Standard is described in Table 1.2 below. In
both formats the sign bit is allocated to the most significant bit, followed by the
biased exponent and the normalized mantissa bits.

Table 1.2: Comparison between Single and Double Precision
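The single-precision field layout above can be inspected directly with the standard library's `struct` module; this sketch (with a hypothetical helper name) unpacks a float into the three components:

```python
import struct

def float_to_fields(x):
    # Reinterpret the 32-bit single-precision pattern as an integer,
    # then slice out the IEEE 754 fields described in Table 1.2.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31                  # bit 31
    exponent = (bits >> 23) & 0xFF     # bits 23-30, biased by 127
    mantissa = bits & 0x7FFFFF         # bits 0-22
    return sign, exponent, mantissa

sign, exp, man = float_to_fields(-0.125)
print(sign, exp - 127, hex(man))   # 1 -3 0x0, i.e. -1.0 x 2^-3
```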

1.7 TUTORIAL

a. Describe the major components of the Von Neumann model.
Answer:
There are 4 major components in the Von Neumann model, as below:
 A main memory to store both data and instructions
 A control unit that decodes the instructions in memory and initiates their
execution
 An arithmetic and logic unit (ALU) capable of operating on binary data
 Input or output (I/O) equipment managed by the control unit


b. Describe the von Neumann model as proposed in EDVAC 1945.

Answer:
The von Neumann model was proposed by the mathematician John von Neumann, who was
a consultant on the ENIAC project. He proposed his idea in the first publication for a
new computer, the EDVAC (Electronic Discrete Variable Computer), in 1945, suggesting
the stored-program concept because of the tedious task of inserting and altering
programs for the ENIAC. There are four sub-components in the von Neumann architecture,
known as the control unit, the ALU, the memory and the input-output unit.

c. Identify the purpose of pipeline techniques in computer operation.

Answer:
Pipelining is a technique applied to enhance the execution throughput of a CPU by
utilizing the processor resources in a more efficient approach. Using a pipeline
accelerates execution process by fetching the next instruction as the previous
instructions are being decoded and executed.

d. Define the allocation bits of Single Precision IEEE 754 Floating-Point Standard.

Answer:
In the IEEE 754 Floating-Point Standard for single precision, the normalized mantissa
is allocated to bits 0 – 22, the biased exponent to bits 23 – 30, and the sign bit to
the MSB, bit 31. The sketch below shows the division of the assigned bits.


2. UNDERSTANDING COMPUTER ARCHITECTURE

Figure 2.1: What is computer architecture (Admin, 2020)

2.1 EXPLAIN THE STRENGTHS & WEAKNESSES OF THE VON
NEUMANN

The basic difference between the von Neumann architecture and others is that von Neumann
shares memory between instructions and data.
Strengths
 Requires less hardware.
 Separate data buses are not required.
Weaknesses
 A von Neumann architecture has only one bus, used for both data transfers and
instruction fetches. Data transfers and instruction fetches must therefore be scheduled;
they cannot be performed at the same time, so the whole execution process is slower, or
the bus can become extremely congested.


2.2 COMPARE THE VON NEUMANN & HARVARD ARCHITECTURE

The Harvard architecture has the program memory and data memory as separate memories,
accessed over separate buses. In the von Neumann architecture, program and data are
fetched from the same memory using the same bus.

The Harvard (RISC) architecture utilizes two buses: a data bus and a separate address bus.
RISC stands for Reduced Instruction Set Computer.
The von Neumann (CISC) architecture employs a single bus. CISC stands for Complex
Instruction Set Computer.

2.3 POSSIBLE STEPS TO BE CARRIED OUT FOR AN INSTRUCTION CYCLE
OF PIPELINE OPERATION

To improve the performance of a CPU we have two options:
a. Improve the hardware by introducing faster circuits.
b. Arrange the hardware such that more than one operation can be performed at the
same time.


Since there is a limit on the speed of hardware and the cost of faster circuits is quite
high, we have to adopt the second option.

Pipelining: Pipelining is the process of arranging the hardware elements of the CPU so
that its overall performance is increased. In a pipelined processor, more than one
instruction is executed simultaneously.

Let us see a real life example that works on the concept of pipelined operation (Saurabh,
2021). Consider a water bottle packaging plant. Let there be 3 stages that a bottle should pass
through, Inserting the bottle (I), Filling water in the bottle (F), and Sealing the bottle(S). Let us
consider these stages as stage 1, stage 2 and stage 3 respectively. Let each stage take 1 minute
to complete its operation.

Now, in a non-pipelined operation, a bottle is first inserted in the plant, after 1 minute it is
moved to stage 2 where water is filled. Now, in stage 1 nothing is happening. Similarly, when
the bottle moves to stage 3, both stage 1 and stage 2 are idle.

But in pipelined operation, when the bottle is in stage 2, another bottle can be loaded at stage
1. Similarly, when the bottle is in stage 3, there can be one bottle each in stage 1 and stage 2.
So, after each minute, we get a new bottle at the end of stage 3. Hence, the average time
taken to manufacture 1 bottle is:

Without pipelining: 9/3 minutes = 3 minutes per bottle (3 bottles in 9 minutes)

With pipelining: 5/3 minutes ≈ 1.67 minutes per bottle (3 bottles in 5 minutes)


Thus, pipelined operation increases the efficiency of a system.
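The bottle-plant arithmetic generalizes: a k-stage pipeline finishes n items in k + n − 1 stage-times instead of k × n. A short sketch of our own, with hypothetical function names:

```python
def unpipelined_time(stages, items, stage_time=1):
    # Each item passes through every stage before the next item starts
    return stages * items * stage_time

def pipelined_time(stages, items, stage_time=1):
    # Fill the pipeline once, then one item completes per stage-time
    return (stages + items - 1) * stage_time

print(unpipelined_time(3, 3))   # 9 minutes for 3 bottles
print(pipelined_time(3, 3))     # 5 minutes, i.e. 5/3 ~ 1.67 minutes per bottle
```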

The process of fetching the next instruction while the current instruction is being
executed is called "pipelining". Pipelining is supported by the processor to increase the
speed of program execution and thus increase throughput. In pipelining, several operations
take place simultaneously rather than serially.

a. Three-stage pipeline
The three stages used in the pipeline are:
(i) Fetch: In this stage the processor fetches or reads the instruction from the memory.
(ii) Decode: This stage determines which instruction is to be executed.
(iii) Execute: In this stage the processor processes the instruction and writes the result
back to the desired register.

The pipeline has three stages, namely fetch, decode and execute, as shown in Figure 2.2 below.

 FETCH: the instruction is fetched from memory
 DECODE: the registers used in the instruction are decoded
 EXECUTE: the instruction is processed

Figure 2.2: Pipeline 3 stages
If these three stages of execution are overlapped, we will achieve a higher speed of execution.


Figure 2.3 shows the three-stage pipelined execution of instructions over time.

Figure 2.3: Single-cycle instruction execution for a 3-stage pipeline

In the first cycle, the processor fetches instruction 1 from memory. In the second cycle,
the processor fetches instruction 2 from memory and decodes instruction 1. In the third
cycle, the processor fetches instruction 3 from memory, decodes instruction 2 and executes
instruction 1. In the fourth cycle, the processor fetches instruction 4, decodes
instruction 3 and executes instruction 2. Each instruction thus takes three cycles to
complete, yet the pipeline delivers a throughput of one instruction per cycle.

As more cycles are required to fill the pipeline, the system latency also increases. The
data dependency between the stages can also increase as the number of pipeline stages
grows. So, the instructions need to be scheduled while writing the code to decrease data
dependency.

2.4 FIVE-STEP PIPELINE EXECUTE PROCESS

Most modern CPUs are driven by a clock. The CPU consists internally of logic and registers
(flip-flops). When the clock signal arrives, the flip-flops take their new values, and the
logic then requires a period of time to decode them. Then the next clock pulse arrives, the
flip-flops again take their new values, and so on. By breaking the logic into smaller
pieces and inserting flip-flops between the pieces of logic, the delay before the logic
gives valid outputs is reduced. In this way the clock period can be reduced. For example,
the classic RISC pipeline is broken into five stages with a set of flip-flops between each
pair of stages.


Pipeline Stages

A RISC processor has a 5-stage instruction pipeline to execute all the instructions in the
RISC instruction set. The 5 stages of the RISC pipeline, with their respective operations,
are listed below:

 Stage 1 (Instruction Fetch)
In this stage the CPU reads the instruction from the memory address whose value is present
in the program counter.

 Stage 2 (Instruction Decode)
In this stage, the instruction is decoded and the register file is accessed to get the
values of the registers used in the instruction.

 Stage 3 (Instruction Execute)
In this stage, ALU operations are performed.

 Stage 4 (Memory Access)
In this stage, memory operands named in the instruction are read from or written to memory.

 Stage 5 (Write Back)
In this stage, the computed or fetched value is written back to the register named in the
instruction.

2.5 RELATE PIPELINE OPERATION TO IMPROVE COMPUTER
PERFORMANCE USING BLOCK DIAGRAM

a. Three-stage pipeline

3-stage pipeline:
Instruction Fetch (IF) – Instruction Decode (ID) – Instruction Execute (EX)

Non-pipelined processor:
Number of instructions (3) * Number of stages (3) = 9 cycles

Using a 3-stage non-pipelined processor, it will take 9 cycles to complete 3 instructions.
Assuming 1 cycle takes 200 ns, the 3 instructions will take a total of 1800 ns.

 9 * 200ns = 1800ns
Pipelined processor:
Start-up latency (2) + Number of instructions (3) = 5 cycles

If the 3-stage pipeline technique is used, it will take only 5 cycles to complete all 3
instructions, so the total time is 1000 ns.

 5 * 200ns = 1000ns
 This shows how pipeline techniques can improve CPU performance.

b. Five-stage pipeline
5-stage pipeline:
Instruction Fetch (IF) – Instruction Decode (ID) – Instruction Execute (EX) –
Memory Access (ME) – Write Back (WB)

Non-pipelined processor:
Number of instructions (5) * Number of stages (5) = 25 cycles

Using a 5-stage non-pipelined processor, it will take 25 cycles to complete 5 instructions.
Assuming 1 cycle takes 200 ns, the 5 instructions will take a total of 5000 ns.
Pipelined processor:
Start-up latency (4) + Number of instructions (5) = 9 cycles

If the 5-stage pipeline technique is used, it will take only 9 cycles to complete all 5
instructions, so the total time is 1800 ns. This shows how pipeline techniques can
improve CPU performance.
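The cycle counts in (a) and (b) follow the same formula, which can be checked with a short sketch (the function name is ours):

```python
def cycles(stages, instructions, pipelined):
    if pipelined:
        # Fill latency (stages - 1) plus one completed instruction per cycle
        return stages + instructions - 1
    return stages * instructions

CYCLE_NS = 200
for stages, n in [(3, 3), (5, 5)]:
    print(stages, "stages:",
          cycles(stages, n, False) * CYCLE_NS, "ns unpipelined vs",
          cycles(stages, n, True) * CYCLE_NS, "ns pipelined")
```

Running this reproduces the figures above: 1800 ns vs 1000 ns for three stages, and 5000 ns vs 1800 ns for five.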

2.6 TUTORIAL

a. Explain the strengths & weaknesses of the Von Neumann architecture.


Answer
Strengths

i. Requires less hardware, since memory is shared between instructions and data.
ii. Only one bus is used for both data transfers and instruction fetches, so separate
data buses are not required.
Weaknesses

i. A von Neumann architecture has only one bus, used for both data transfers and
instruction fetches. Data transfers and instruction fetches must therefore be scheduled;
they cannot be performed at the same time, so the whole execution process is slower, or
the bus can become extremely congested.

b. List the differences between Von Neumann & Harvard architecture.
Answer
The basic difference is that the von Neumann architecture shares a single memory (and
bus) between instructions and data, whereas the Harvard architecture keeps program
memory and data memory separate and accesses them over separate buses.

c. Discuss a real-life example that works on the concept of pipelined operation.
Answer
Any related example is accepted. Example: water bottle packaging plant

d. Calculate the time required to complete 4 instructions using the 3-stage pipelining
and non-pipelining techniques.
Answer

3 stage pipeline:


Instruction Fetch (IF) - Instruction Decode (ID) – Instruction Execute (EX)

By using 3 stages non-pipelining processor:
It will take 12 cycles to complete 4 instruction.
Assume 1 cycles will take 200ns, so 4 instructions will take a total of 2400ns.

 12 * 200ns = 2400ns

But if the 3-stage pipeline technique is used, it will take only 6 cycles to complete
all the instructions. So, the total time is 1200ns

 6 * 200ns = 1200ns

e. Calculate the time requirement to complete the 4 instructions using 5 stages pipelining and
non-pipelining technique using block diagram.
Answer

5 stage pipeline:
Instruction Fetch (IF) - Instruction Decode (ID) – Instruction Execute (EX) – Memory
Access (ME) – Write Back (WB)
By using a 5-stage non-pipelined processor:

It will take 20 cycles to complete 4 instructions.
Assume 1 cycle takes 200ns, so 4 instructions will take a total of 4000ns.
 20 * 200ns = 4000ns

But if a 5-stage pipelined processor is used:


It will take only 8 cycles to complete all the instructions.
So, the total time is 1600ns
 8 * 200ns = 1600ns


3. ARITHMETIC OPERATION

3.1 INTEGER OPERATION OF BINARY SYSTEM

 Addition and Subtraction
 Multiplication and Division

3.1.1 ADDITION

 The four basic rules for adding binary digits are as follows:

0+0=0 carry 0
0+1=1 carry 0
1+0=1 carry 0
1+1=0 carry 1
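These four rules are exactly what a ripple-carry adder applies at each bit position. A minimal sketch in Python (the function name and the string-based representation are illustrative choices, not from the text):

```python
def binary_add(a, b):
    """Add two unsigned binary strings bit by bit, applying the four rules."""
    n = max(len(a), len(b))
    a, b = a.zfill(n), b.zfill(n)      # pad to equal length
    result, carry = [], 0
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry
        result.append(str(total % 2))  # sum bit: 0+0=0, 0+1=1, 1+0=1, 1+1=0
        carry = total // 2             # carry bit: 1 only when two or more 1's meet
    if carry:
        result.append('1')
    return ''.join(reversed(result))

print(binary_add('1011', '0111'))  # 10010  (11 + 7 = 18)
```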


3.1.2 SUBTRACTION

 The four basic rules for subtracting binary digits are as follows:

0–0=0
1–1=0
1–0=1
0 – 1 = 1 borrow 1

3.1.3 MULTIPLICATION

 Multiplication is achieved by adding a list of shifted multiplicands according to the
digits of the multiplier.
 The four basic rules for multiplying binary digits are as follows:

0*0=0
1*1=1
1*0=0
0*1=0


 Example: multiply 101₂ (5) by 11₂ (3)

101
x 11
-------
101 (101 x 1)
+ 1010 (101 x 1, shifted left one place)
-------
1111 = 15₁₀

3.1.4 DIVISION

 Follows the same rules as in decimal division. For the sake of simplicity, throw away the
remainder.
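The shift-and-add view of multiplication, and integer division with the remainder discarded, can be sketched as follows (a simplified illustration using Python's built-in integers; the function names are my own):

```python
def binary_multiply(a, b):
    # Add one copy of the multiplicand, shifted left, for each 1 in the multiplier.
    product = 0
    for i, bit in enumerate(reversed(b)):
        if bit == '1':
            product += int(a, 2) << i
    return bin(product)[2:]

def binary_divide(a, b):
    # Integer division: the remainder is thrown away, as in the text.
    return bin(int(a, 2) // int(b, 2))[2:]

print(binary_multiply('101', '11'))  # 1111  (5 * 3 = 15)
print(binary_divide('1111', '10'))   # 111   (15 / 2 = 7, remainder discarded)
```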


3.2 USING COMPLEMENTS TO REPRESENT NEGATIVE NUMBERS

 Addition and subtraction in the one's (1's) complement system
 Addition and subtraction in the two's (2's) complement system

3.2.1 ADDITION AND SUBTRACTION IN ONE'S (1'S) COMPLEMENT

Negative Integers – One’s (1’s) Complement
 Computers generally use a system called “complementary representation” to store
negative integers.
 Two basic types are ones and twos complement, of which 2’s complement is the most
widely used.
 The number range is split into two halves, to represent the positive and negative
numbers.
 Negative numbers begin with 1, positive with 0.
 To perform the 1’s complement operation on a binary number,
replace 1’s with 0’s and 0’s with 1’s (i.e. complement it)
+6 represented by: 00000110
-6 represented by: 11111001
 Advantages: arithmetic is easier (cheaper/faster electronics)
 Fairly straightforward addition
– Add any carry from the Most Significant (left-most) Bit to Least Significant (right-
most) Bit of the result
 For subtraction
– form 1’s complement of number to be subtracted and then perform addition
 Disadvantages: still two representations for zero


 00000000 and 11111111 (in 8-bit representation)

3.2.2 ADDITION AND SUBTRACTION IN TWO'S (2'S) COMPLEMENT

 To perform the 2’s complement operation on a binary number
– replace 1’s with 0’s and 0’s with 1’s (i.e. the one’s complement of the number)
– add 1
+6 represented by: 00000110
-6 represented by: 11111010

 Advantages:
– Arithmetic is very straightforward
– The final carry out of the MSB is simply ignored

 only one representation for zero (00000000)

 Two’s Complement

To convert an integer to 2’s complement, Take the binary form of the number
00000110 (6 as an 8-bit representation)

 Flip the bits (find the 1’s complement):
1 1 1 1 1 0 0 1 (1’s complement of 6)

 Add 1:

1 1 1 1 1 0 0 1 (1’s complement of 6)

+ 1

1 1 1 1 1 0 1 0 (2’s complement of 6)


 Justification of representation: 6 + (-6) = 0?
0 0 0 0 0 1 1 0 (6)

+ 1 1 1 1 1 0 1 0 (2’s complement of 6)

1 0 0 0 0 0 0 0 0 (the carry out of the MSB is discarded, leaving 00000000 = 0)

 Properties of Two’s Complement
The 2’s comp of a 2’s comp is the original number
0 0 0 0 0 1 1 0 (6)
1 1 1 1 1 0 1 0 (2’s complement of 6)

0 0 0 0 0 1 0 1 (flip the bits: 1’s complement)
+ 1
0 0 0 0 0 1 1 0 (2’s comp of 2’s comp of 6)

The sign of a number is given by its MSB. The bit patterns:
 00000000 represents zero

 0nnnnnnn represents positive numbers
 1nnnnnnn represents negative numbers


 Addition
Addition is performed by adding corresponding bits
0 0 0 0 0 1 1 1 (7)
+ 0 0 0 0 0 1 0 1 (+5)
0 0 0 0 1 1 0 0 (12)

 Subtraction
Subtraction is performed by adding the 2’s complement
Ignore the final carry out of the MSB
0 0 0 0 1 1 0 0 (12)
+ 1 1 1 1 1 0 1 1 (-5)
1 0 0 0 0 0 1 1 1 (7)

 Interpretation of Negative Results
 0 0 0 0 0 1 0 1 (5)

+ 1 1 1 1 0 1 0 0 (-12)
1 1 1 1 1 1 0 0 1 (-)
 Result is negative
 MSB of result is 1 so it is a negative number in 2’s complement form


 Take the 2’s comp of the result to find its magnitude, since the 2’s comp of a 2’s
comp is the original number

 Result: negative 7
 the 2’s complement of 11111001 is 00000111, or 7₁₀
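The flip-and-add-one rule and the sign interpretation above can be summarised in a short sketch (the 8-bit width and the helper names are illustrative assumptions):

```python
def to_twos_complement(value, bits=8):
    """Encode a signed integer as a two's-complement bit string."""
    # Masking with 2^bits - 1 has the same effect as flipping the bits and adding 1.
    return format(value & ((1 << bits) - 1), '0{}b'.format(bits))

def from_twos_complement(bit_string):
    """Decode a two's-complement bit string back to a signed integer."""
    value = int(bit_string, 2)
    if bit_string[0] == '1':          # MSB set: the number is negative
        value -= 1 << len(bit_string)
    return value

print(to_twos_complement(-6))            # 11111010
print(from_twos_complement('11111001'))  # -7
```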

3.3 COMPUTE FLOATING POINT REPRESENTATION USING IEEE 754
FORMAT

a) Binary representation
b) Addition and subtraction

 The IEEE has established a standard for floating-point numbers
 The IEEE-754 single precision floating point standard uses an 8-bit exponent (with a
bias of 127) and a 23-bit significand.
 The IEEE-754 double precision standard uses an 11-bit exponent (with a bias of
1023) and a 52-bit significand.
 In both the IEEE single-precision and double-precision floating-point standards, the
significand has an implied 1 to the LEFT of the radix point.
o The format for a significand using the IEEE format is: 1.xxx…
o For example, 4.5 = 0.1001₂ x 2^3, which in IEEE format is 4.5 = 1.001₂ x 2^2.
The 1 is implied, which means it does not need to be listed in the significand
(the stored significand would include only 001).
 Example: Express -3.75 as a floating point number using IEEE single precision.
 First, let’s normalize according to IEEE rules:
o -3.75 = -11.11₂ = -1.111₂ x 2^1
o The bias is 127, so we add 127 + 1 = 128 (this is our exponent)

o The first 1 in the significand is implied, so the stored significand is 111
(padded with 0’s to 23 bits)
o Since we have an implied 1 in the significand, this equates to

o -(1).111₂ x 2^(128 – 127) = -1.111₂ x 2^1 = -11.11₂ = -3.75.

3.3.1 BINARY REPRESENTATION

 Example: 85.125₁₀
o 85 = 1010101₂
o 0.125 = .001₂
o 85.125 = 1010101.001₂
= 1.010101001₂ x 2^6
o sign = 0

1. Single precision:
 biased exponent 127+6=133
 133 = 10000101
 Normalised mantissa = 010101001
 add 0's to complete the 23 bits
 The IEEE 754 Single precision is:
 = 0 10000101 01010100100000000000000
 This can be written in hexadecimal form 42AA4000

2. Double precision:
• biased exponent 1023+6=1029


• 1029 = 10000000101
• Normalised mantissa = 010101001
• Add 0's to complete the 52 bits
• The IEEE 754 Double precision is:
• = 0 10000000101 0101010010000000000000000000000000000000000000000000
• This can be written in hexadecimal form 4055480000000000
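The hand encoding of 85.125 can be cross-checked with Python's standard struct module, which packs a float into its IEEE 754 byte pattern (the helper names are illustrative):

```python
import struct

def single_precision_hex(x):
    # '>f' packs x as a big-endian IEEE 754 single-precision value (4 bytes).
    return struct.pack('>f', x).hex().upper()

def double_precision_hex(x):
    # '>d' packs x as a big-endian IEEE 754 double-precision value (8 bytes).
    return struct.pack('>d', x).hex().upper()

print(single_precision_hex(85.125))  # 42AA4000
print(double_precision_hex(85.125))  # 4055480000000000
```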

3.3.2 ADDITION AND SUBTRACTION

 Steps to add/subtract two floating-point numbers
1. Compare the magnitudes of the two exponents and make suitable alignment to the
number with smaller magnitude of exponent.
2. Perform the addition/subtraction.
3. Perform normalization by shifting the resulting mantissa and adjusting the resulting
exponent.

 Example: add 1.1100 x 2^4 and 1.1000 x 2^2
1. Alignment: 1.1000 x 2^2 has to be aligned to 0.0110 x 2^4
2. Addition: add the two numbers to get 10.0010 x 2^4
3. Normalization: the final normalized result is 1.0001 x 2^5 (assuming 4 bits are
kept after the radix point)

 Example
X = 0100 0010 0000 1111 0000 0000 0000 0000
Y = 0100 0001 1010 0100 0000 0000 0000 0000


X: S = 0; e = 1000 0100 = 132, so the true exponent is 132 – 127 = 5
Y: S = 0; e = 1000 0011 = 131, so the true exponent is 131 – 127 = 4
X = 1.0001111 x 2^5
Y = 1.0100100 x 2^4 = 0.1010010 x 2^5 (aligned to X’s exponent)
X + Y = 1.0001111
+ 0.1010010
---------------------
1.1100001 x 2^5
===============
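The worked example can be checked by decoding the two bit patterns back to decimal (a verification sketch; decode_single is my own helper name):

```python
import struct

def decode_single(bit_string):
    # Interpret a 32-bit pattern as an IEEE 754 single-precision value.
    return struct.unpack('>f', int(bit_string, 2).to_bytes(4, 'big'))[0]

x = decode_single('01000010000011110000000000000000')
y = decode_single('01000001101001000000000000000000')
print(x, y, x + y)  # 35.75 20.5 56.25, and 56.25 = 1.1100001 x 2^5
```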

3.3.3 TUTORIAL

a. Convert decimal to binary

i. 53₁₀

Answer

53₁₀: 53 / 2 = 26 remainder 1

26 / 2 = 13 remainder 0

13 / 2 = 6 remainder 1

6 / 2 = 3 remainder 0

3 / 2 = 1 remainder 1

1 / 2 = 0 remainder 1

= 110101₂ (6 bits, reading the remainders from last to first)


= 00110101₂ (8 bits)
(Note: bit = binary digit)

ii. 0.81₁₀

Answer

0.81₁₀: 0.81 x 2 = 1.62

0.62 x 2 = 1.24

0.24 x 2 = 0.48

0.48 x 2 = 0.96

0.96 x 2 = 1.92

0.92 x 2 = 1.84

= 0.110011₂ (approximately)

b. Convert binary to decimal
i. 111001₂
Answer
111001₂ (6 bits)
= (1 x 2^5) + (1 x 2^4) + (1 x 2^3) + (0 x 2^2) + (0 x 2^1) + (1 x 2^0)
= 32 + 16 + 8 + 0 + 0 + 1
= 57₁₀
ii. 00011010₂


Answer
00011010₂ (8 bits)
= 2^4 + 2^3 + 2^1
= 16 + 8 + 2
= 26₁₀
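The two procedures used above, repeated division by 2 for the integer part and repeated multiplication by 2 for the fraction, can be sketched as follows (the function names are illustrative):

```python
def decimal_to_binary(n):
    """Integer part: divide by 2 repeatedly, reading remainders last to first."""
    bits = ''
    while n > 0:
        bits = str(n % 2) + bits
        n //= 2
    return bits or '0'

def fraction_to_binary(f, places=6):
    """Fraction part: multiply by 2 repeatedly, collecting the integer digits."""
    bits = ''
    for _ in range(places):
        f *= 2
        bits += str(int(f))
        f -= int(f)
    return bits

print(decimal_to_binary(53))     # 110101
print(fraction_to_binary(0.81))  # 110011, i.e. 0.81 ≈ 0.110011 in binary
```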

c. Get the representation of ones complement (6 bits) for the following numbers:-
i. +7₁₀
Answer
(+7)₁₀ = 000111₂
ii. -10
Answer
(+10)₁₀ = 001010₂
So,
(-10)₁₀ = 110101₂

d. Express 3.75 as a floating point number using IEEE single precision
Answer
3 = 11₂
0.75 = .11₂
3.75 = 11.11₂
11.11₂ = 1.111₂ x 2^1


Sign = 0
Single precision:
• biased exponent 127+1=128
• 128 = 10000000
• Normalised mantissa = 111

• we will add 0's to complete the 23 bits
• The IEEE 754 Single precision is:
• = 0 10000000 11100000000000000000000

e. Express 7.5 as a floating point number using IEEE single precision
Answer
7 = 111₂
0.5 = .1₂
7.5 = 111.1₂
111.1₂ = 1.111₂ x 2^2
Sign = 0
Single precision:
• biased exponent 127+2=129
• 129 = 10000001
• Normalised mantissa = 111
• we will add 0's to complete the 23 bits
• The IEEE 754 Single precision is:


• = 0 10000001 11100000000000000000000

f. Compute 7.5 + 3.75 using IEEE 754 format.

Answer

3.75 = 11.11₂ = 1.111₂ x 2^1
7.5 = 111.1₂ = 1.111₂ x 2^2

So, add 1.111₂ x 2^1 + 1.111₂ x 2^2
1. Alignment: 1.111 x 2^1 has to be aligned to 0.1111 x 2^2
2. Addition: add the two numbers to get 10.1101 x 2^2
3. Normalization: 10.1101 x 2^2 = 1.01101 x 2^3

Working the same sum at exponent 1:
1.1110
+ 11.1100
101.1010 x 2^1 = 1.011010 x 2^3
Single precision:
• Sign = 0
• biased exponent 127+3=130
• 130 = 10000010
• Normalised mantissa =011010
• we will add 0's to complete the 23 bits
• The IEEE 754 Single precision is:
• = 0 10000010 01101000000000000000000


g. Compute X + Y using IEEE 754 format.

Answer

X = 0 10000001 11100000000000000000000

Y = 0 10000000 11100000000000000000000

X + Y:

X: S = 0; e = 1000 0001 = 129, so the true exponent is 129 – 127 = 2; X = 1.1110 x 2^2
Y: S = 0; e = 1000 0000 = 128, so the true exponent is 128 – 127 = 1; Y = 1.1110 x 2^1

X = 1.1110 x 2^2 = 11.110 x 2^1 (aligned to Y’s exponent)
Y = 1.1110 x 2^1

1.1110
+ 11.1100
101.1010 x 2^1 = 1.011010 x 2^3
Biased exponent is 127 + 3 = 130 = 1000 0010
Answer = 0 10000010 01101000000000000000000


REFERENCES

Admin. (2020). What is computer architecture. Cybozu Inc.

Arm Developer, Arm Limited. (2002, September 30). ARM9EJ-S Technical Reference Manual.
Retrieved September 1, 2021, from
https://developer.arm.com/documentation/ddi0222/b/ch01s01s01

Avižienis, A. (2003). Arithmetic-logic unit (ALU). Encyclopedia of Computer Science, 77-81.

Brown, S., & Vranesic, Z. (2009). Fundamentals of Digital Logic with VHDL Design. McGraw
Hill.

Harris, S. L., & Harris, D. M. (2016). Digital Design and Computer Architecture. Elsevier.

Kholodov, I. (n.d.). CIS-77 Introduction to Computer Systems. Retrieved September 1, 2021,
from http://www.c-jump.com/CIS77/CIS77syllabus.htm

Laughton, M., & Warne, D. (2003). Electrical Engineer's Reference Book. Newnes.

Pacheco, P. S. (2011). Chapter 2 - Parallel Hardware and Parallel Software. In An Introduction
to Parallel Programming (pp. 15-20). Elsevier.

Sanjay J. Patel, Y. P. (2004). Introduction to Computing Systems: From Bits and Gates.
McGraw-Hill.

Saurabh, S. (2021, Jun 28). Computer Organization and Architecture | Pipelining | Set 1
(Execution, Stages and Throughput). Retrieved September 3, 2021, from
geeksforgeeks.org: https://www.geeksforgeeks.org/computer-organization-and-
architecture-pipelining-set-1-execution-stages-and-throughput/

Sloss, A. N., Symes, D., & Wright, C. (2004). ARM System Developer's Guide Designing and
Optimizing System Software. Elsevier Inc.

Stallings, W. (2013). COMPUTER ORGANIZATION AND ARCHITECTURE DESIGNING FOR
PERFORMANCE. Pearson.

Wolf, M. (2012). Computers as Components. Elsevier Inc.


