Computer Organization and Structure

Homework #2

Due: 2003/11/4

1.      We wish to compare the performance of two different machines: M1 and M2. The following measurements have been made on these machines:

 Program Time on M1 Time on M2 1 10 seconds 5 seconds 2 3 seconds 4 seconds

Which machine is faster for each program and by how much? Consider the two machines and programs as the above. The following additional measurements were made:

 Program Instructions executed on M1 Instructions executed on M2 1 200 x 106 160 x 106

Find the instruction execution rate (instructions per second) for each machine when running program 1. If the clock rates of machines M1 and M2 are 200MHz and 300MHz, respectively, find the clock cycles per instruction (CPI) for program 1 on both machines using the data as the above.

2.      Consider two different implementations, M1 and M2, of the same instruction set. There are four classes of instructions (A, B, C, and D) in the instruction set. M1 has a clock rate of 500 MHz. The average number of cycles for each instruction class on 1 is as follows:

 Class CPI for this class A 1 B 2 C 3 D 4

M2 has a clock rate of 750 MHz. The average number of cycles for each instruction class on 1 is as follows:

 Class CPI for this class A 2 B 2 C 4 D 4

Assume that peak performance is defined as the fastest rate that a machine can execute an instruction sequence chosen to maximize that rate. What are the peak performances of M1 and M2 expressed as instructions per second?

3.      We are interested in two implementations of a machine one with and one without special floating-point hardware. Consider a program, P, with the following mix of operations:

 floating-point multiply floating-point add floating-point divide integer instructions 10% 15% 5% 70%

Machine MFP (Machine with Floating Point) has floating-point hardware and can therefore implement the floating-point operations directly. It requires the following number of clock cycles for each instruction class:

 floating-point multiply floating-point add floating-point divide integer instructions 6 4 20 2

Machine MNFP (Machine with No Floating Point) has no floating-point hardware and so must emulate the floating-point operations using integer instructions. The integer instructions all take 2 clock cycles. The number of integer instructions needed to implement each of the floating-point operations is as follows:

 floating-point multiply floating-point add floating-point divide 30 20 50

Both machines have a clock rate of 1000 MHz. Find the native MIPS ratings for both machines.

4.      The table below shows the number of floating-point operations executed in two different programs and the runtime for those programs on three different machines:

 Program Floating-point operations Execution time in seconds Computer A Computer B Computer C Program 1 10,000,000 1 10 20 Program 2 100,000,000 1000 100 20

Which machine is fastest according to total execution time? How much faster is it than the other two machines?

5.      The following code fragment processes an array and produces two important values in registers \$v0 and \$v1. Assume that the array consists of 5000 words indexed 0 through 4999, and its base address is stored in \$a0 and its size (5000) in \$a1. Describe in one sentence what this code does. Specifically, what will be returned in \$v0 and \$v1?

lw           \$t4, 0(\$t4)

lw           \$t3, 0(\$t3)

bne         \$t3, \$t4, skip

bne         \$t1, \$a1, inner

slt           \$t2, \$t5, \$v0

bne         \$t2, \$zero, next

bne         \$t0, \$a1, outer

Assume that the above code is run on a machine with a 500 MHz clock that requires the following number of cycles for each instruction:

In the worst case, how many seconds will it take to execute this code?

6.      Show the single MIPS instruction or minimal sequence of instructions for this C statement:

x  = x  + c;

Assume that c corresponds to register \$t0 and the array x has a base address of 4,000,000ten.

7.      Consider the following fragment of C code:

for (i = 0; i <= 100; i = i + 1) { a [i] = b [i] + c; }

Assume that a and b are arrays of words and the base address of a is in \$a0 and the base address of b is in \$a1. Register \$t0 is associated with variable i and register \$s0 with c. Write the code for MIPS. How many instructions are executed during the running of this code? How many memory data references will be made during execution?

8.      Write a procedure, bfind, in MIPS assembly language. The procedure should take a single argument that is a pointer to a null-terminated string in register \$a0. The bfind procedure should locate the first b character in the string and return its address in register \$v0. If there are no b’s in the string, then bfind should return a pointer to the null character at the end of the string. For example, if the argument to bfind points to the string “imbibe,” then the return value will be a pointer to the third character of the string.