(a) A program is run on a 2.1 GHz processor. The program’s code consists of 10,000
instructions, with the following instruction mix and clock cycle count:
Instruction Type      Instruction Count           Clock Cycle Count
                      (# of such instructions)    (# of clock cycles / instr.)
Integer arithmetic    5000                        1
Data transfer         3000                        2
Floating point        1500                        2
Control transfer      500                         2
Determine this program’s (i) effective CPI and (ii) the percentage of execution time spent on floating-point (FP) operations.
(b) A new computer with an enhanced architecture runs 16 times faster than the original
machine, but the enhancement is usable only 1/8 of the time. What is the overall speedup?
(c) A program takes 15 seconds to execute on a single 1.8 GHz processor. 30% of the
program is sequential. Assuming zero latency and perfect parallelism in the remaining
code, how long should the code take on a 20-core machine?
(d) A computer uses instruction pipelining with P=4 stages named FD, DO, EX, WO.
Each stage takes T seconds to perform its task. Suppose you have N=9 instructions in
your program which has a branch from instruction 4 to 8. Draw a figure to show what
happens over time as each instruction is processed. Indicate the branch penalty.
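The quantities asked for in parts (a) to (d) can be checked with a short Python sketch. It is not an official solution, and it rests on several assumptions that the question text does not state: part (b) is read as Amdahl's law with an enhanced fraction of 1/8, part (c) as Amdahl's law with a parallel fraction of 0.7, and part (d) assumes the branch is taken and resolved at the end of EX, so that instructions 5 and 6 are fetched and then flushed while instructions 7 is never fetched. The function names are hypothetical.

```python
def effective_cpi(mix):
    """mix: list of (instruction_count, cycles_per_instruction) pairs."""
    total_instr = sum(n for n, _ in mix)
    total_cycles = sum(n * c for n, c in mix)
    return total_cycles / total_instr


def time_fraction(mix, index):
    """Share of total clock cycles spent in the instruction class at `index`."""
    total_cycles = sum(n * c for n, c in mix)
    n, c = mix[index]
    return n * c / total_cycles


def amdahl_speedup(fraction_enhanced, factor):
    """Amdahl's law: overall speedup when a fraction of the time is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / factor)


def pipeline_cycles(stages, executed, flushed=0):
    """Cycles to finish `executed` instructions, plus `flushed` wasted slots."""
    return stages + executed - 1 + flushed


# Part (a): integer, data transfer, floating point, control transfer.
mix = [(5000, 1), (3000, 2), (1500, 2), (500, 2)]
cpi = effective_cpi(mix)
fp_share = time_fraction(mix, 2)

# Part (b): ASSUMPTION: "usable 1/8 of the time" is the enhanced fraction.
speedup_b = amdahl_speedup(1 / 8, 16)

# Part (c): 70% of the program is parallel across 20 cores.
time_c = 15 / amdahl_speedup(0.7, 20)

# Part (d): ASSUMPTION: instructions 1, 2, 3, 4, 8, 9 complete (6 in total)
# and 2 fetched instructions (5 and 6) are flushed after the branch resolves.
cycles_d = pipeline_cycles(stages=4, executed=6, flushed=2)

print(cpi, fp_share, speedup_b, time_c, cycles_d)
```

Under these assumptions the sketch gives a CPI of 1.5, an FP share of 20%, a speedup of about 1.13 for (b), about 5.03 seconds for (c), and 11 cycles (11T) for (d), against 12T for nine instructions with no branch. A different choice of branch-resolution stage changes only the `flushed` argument.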
Question 2.
Consider the following code sequence:
i:   R7 ← R12 + R5
i+1: R8 ← R7 - R12
i+2: R5 ← R8 + R7
(a) List the RAW dependencies, if any:
(b) List the WAR dependencies, if any; rewrite code showing how to avoid them:
(c) List the WAW dependencies, if any; rewrite code showing how to avoid them:
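The three dependency types can be found mechanically by comparing each instruction's destination and source registers with those of every later instruction. The sketch below is a minimal illustration, not an official solution: it assumes each instruction has the form "destination ← source op source", it does not account for an intervening write that would hide an earlier dependency, and the function name `find_hazards` and the spare register `R20` used for renaming are hypothetical.

```python
def find_hazards(program):
    """program: list of (dest, (src1, src2)) tuples in program order.

    Returns three lists of (earlier_index, later_index, register) tuples.
    """
    raw, war, waw = [], [], []
    for j, (dest_j, srcs_j) in enumerate(program):
        for k in range(j + 1, len(program)):
            dest_k, srcs_k = program[k]
            if dest_j in srcs_k:      # later instruction reads an earlier result
                raw.append((j, k, dest_j))
            if dest_k in srcs_j:      # later instruction overwrites an earlier source
                war.append((j, k, dest_k))
            if dest_k == dest_j:      # both instructions write the same register
                waw.append((j, k, dest_k))
    return raw, war, waw


# Indices 0, 1, 2 stand for instructions i, i+1, i+2.
program = [
    ("R7", ("R12", "R5")),
    ("R8", ("R7", "R12")),
    ("R5", ("R8", "R7")),
]
raw, war, waw = find_hazards(program)

# Register renaming: write the last result to a spare register instead of R5.
# ASSUMPTION: R20 is free and later code is updated to read R20.
renamed = program[:2] + [("R20", ("R8", "R7"))]
raw_r, war_r, waw_r = find_hazards(renamed)

print(raw, war, waw)
print(raw_r, war_r, waw_r)
```

For this sequence the sketch reports RAW dependencies on R7 (i to i+1 and i to i+2) and on R8 (i+1 to i+2), one WAR dependency on R5 (i to i+2), and no WAW dependencies. Renaming the destination of i+2 removes the WAR dependency and leaves the RAW dependencies unchanged, which is expected since RAW dependencies are true data dependencies.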