Exam 1: Computer Abstractions and Technology
Consider two different implementations, M1 and M2, of the same instruction set. There are three classes of instructions (A, B, and C) in the instruction set. M1 has a clock rate of 80 MHz and M2 has a clock rate of 100 MHz. The average number of cycles for each instruction class and their frequencies (for a typical program) are as follows:
Class    M1 cycles    M2 cycles    Frequency
A        1            2            60%
B        2            3            30%
C        4            4            10%
(a) Calculate the average CPI for each machine, M1 and M2.
(b) Calculate the average MIPS ratings for each machine, M1 and M2.
(c) Which machine has the smaller MIPS rating? Which individual instruction class CPI do you need to change, and by how much, for this machine to perform as well as or better than the machine with the higher MIPS rating? (You can only change the CPI for one of the instruction classes on the slower machine.)

a.
For Machine M1:
Clocks per Instruction = (60/100)* 1 + (30/100)*2 + (10/100)*4
= 1.6
For Machine M2:
Clocks per Instruction = (60/100)*2 + (30/100)*3 + (10/100)*4
= 2.5
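The weighted-average CPI calculation above can be sketched in Python. The function and variable names are illustrative, and the per-class cycle counts are the ones used in the worked solution.

```python
def average_cpi(cycles, freqs):
    """Weighted average CPI: sum over classes of (frequency * cycles)."""
    return sum(f * c for f, c in zip(freqs, cycles))

# Instruction classes A, B, C (frequencies as fractions of all instructions)
freqs = [0.60, 0.30, 0.10]
m1_cycles = [1, 2, 4]
m2_cycles = [2, 3, 4]

cpi_m1 = average_cpi(m1_cycles, freqs)
cpi_m2 = average_cpi(m2_cycles, freqs)
print(f"M1 CPI = {cpi_m1:.1f}, M2 CPI = {cpi_m2:.1f}")
```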
b.
For Machine M1:
Average MIPS rating = Clock Rate/(CPI * 10^6)
= (80 * 10^6) / (1.6 * 10^6)
= 50.0
For Machine M2:
Average MIPS rating = Clock Rate/(CPI * 10^6)
= (100 * 10^6) / (2.5 * 10^6)
= 40.0
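As a quick check of the MIPS arithmetic, here is a minimal sketch. The helper name `mips_rating` is an illustrative choice, and the CPI values are the ones computed in part (a).

```python
def mips_rating(clock_hz, cpi):
    """MIPS = clock rate / (CPI * 10^6)."""
    return clock_hz / (cpi * 1e6)

mips_m1 = mips_rating(80e6, 1.6)    # M1: 80 MHz, CPI 1.6
mips_m2 = mips_rating(100e6, 2.5)   # M2: 100 MHz, CPI 2.5
print(f"M1: {mips_m1:.1f} MIPS, M2: {mips_m2:.1f} MIPS")
```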
c.
Machine M2 has the smaller MIPS rating (40.0 versus 50.0). If we change the CPI of instruction class A on M2 from 2 to 1, M2 gets a better MIPS rating than M1:
Clocks per Instruction = (60/100)*1 + (30/100)*3 + (10/100)*4
= 1.9
Average MIPS rating = Clock Rate/(CPI * 10^6)
= (100 * 10^6) / (1.9 * 10^6)
= 52.6
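To see how much each class CPI on M2 would have to drop, one can solve for the largest single-class CPI that still reaches M1's 50 MIPS. This is a sketch under the same cycle counts and frequencies as the solution; the names are illustrative.

```python
freqs = {"A": 0.60, "B": 0.30, "C": 0.10}
m2_cpi = {"A": 2, "B": 3, "C": 4}

# Overall CPI that M2 (100 MHz) needs in order to reach 50 MIPS
target_cpi = 100e6 / (50.0 * 1e6)

def max_class_cpi(cls):
    """Largest CPI for one class that meets target_cpi, others unchanged."""
    others = sum(freqs[k] * m2_cpi[k] for k in freqs if k != cls)
    return (target_cpi - others) / freqs[cls]

limits = {cls: max_class_cpi(cls) for cls in freqs}
for cls, limit in limits.items():
    print(f"class {cls}: CPI must be <= {limit:.2f}")
```

A negative limit means that no change to that class alone can close the gap. Under these numbers that is the case for class C, while class A must fall to about 1.17 or below, so a whole-number CPI of 1 is enough.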
Suppose a program segment consists of a purely sequential part which takes 25 cycles to execute, and an iterated loop which takes 100 cycles per iteration. Assume the loop iterations are independent, and cannot be further parallelized. If the loop is to be executed 100 times, what is the maximum speedup possible using an infinite number of processors (compared to a single processor)?
The sequential part takes 25 cycles. Each of the 100 loop iterations takes 100 cycles and can be executed independently, so with an unlimited number of processors all iterations run at the same time. Applying Amdahl's law,
Execution time after improvement =
(Execution time affected by improvement)/(Amount of Improvement) +
Execution time unaffected
= (100*100)/100 + 25
= 100 + 25
= 125 cycles
Execution time on a single processor = 100*100 + 25 = 10025 cycles
Speedup = 10025/125 = 80.2
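The speedup is the single-processor time divided by the parallel time. A minimal sketch of that ratio, with illustrative variable names:

```python
sequential_cycles = 25
cycles_per_iteration = 100
iterations = 100

# One processor runs the sequential part, then every iteration in turn
t_single = sequential_cycles + cycles_per_iteration * iterations

# With unlimited processors all iterations run at once, so the loop
# costs one iteration's worth of cycles
t_parallel = sequential_cycles + cycles_per_iteration

speedup = t_single / t_parallel
print(f"{t_single} cycles -> {t_parallel} cycles, speedup = {speedup:.1f}")
```

With these numbers the baseline is 10025 cycles and the parallel time is 125 cycles, for a speedup of about 80.2.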
(Amdahl's law question) Suppose you have a machine which executes a program consisting of 50% floating point multiply, 20% floating point divide, and the remaining 30% are from other instructions.
(a) Management wants the machine to run 4 times faster. You can make the divide run at most 3 times faster and the multiply run at most 8 times faster. Can you meet management's goal by making only one improvement, and which one?
(b) Dogbert has now taken over the company, removing all the previous managers. If you make both the multiply and divide improvements, what is the speed of the improved machine relative to the original machine?
a.
Amdahl's Law states:
Execution time after improvement =
(Execution time affected by improvement)/(Amount of Improvement) +
Execution time unaffected
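Assuming initially that the floating point multiply, floating point divide and the other instructions had the same CPI, so that the instruction percentages are also shares of the original execution time (taken as 100 units),
Execution time after improvement with Divide = (20)/3 + (50 + 30) = 86.67, a speedup of 100/86.67 = 1.15
Execution time after improvement with Multiply = (50)/8 + (20 + 30) = 56.25, a speedup of 100/56.25 = 1.78
Running 4 times faster requires an execution time of 100/4 = 25. Neither improvement reaches it, so management's goal cannot be met by making only one improvement. Of the two, the multiply improvement gives the larger gain.
b. If we make both improvements,
Execution time after improvement = (50)/8 + (20)/3 + (30) = 42.92
The speedup relative to the original machine = (100)/(42.92) = 2.33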
The design team for a simple, single-issue processor is choosing between a pipelined and a non-pipelined implementation. Here are some design parameters for the two possibilities:
(a) For a program with 20% ALU instructions, 10% control instructions and 75% memory instructions, which design will be faster? Give a quantitative CPI average for each case.
(b) For a program with 80% ALU instructions, 10% control instructions and 10% memory instructions, which design will be faster? Give a quantitative CPI average for each case.


Just as we defined the MIPS rating, we can define the MFLOPS rating, which stands for millions of floating-point operations per second. If Machine A has a higher MIPS rating than Machine B, does Machine A necessarily have a higher MFLOPS rating than Machine B?
How did the development of the transistor affect computers? What did the transistor replace?
A two-part question:
(Part A)
Assume that a design team is considering enhancing a machine by adding MMX (multimedia extension instruction) hardware to a processor. When a computation is run in MMX mode on the MMX hardware, it is 10 times faster than the normal mode of execution. Call the percentage of time that could be spent using the MMX mode the percentage of media enhancement.
(a) What percentage of media enhancement is needed to achieve an overall speedup of 2?
(b) What percentage of the run-time is spent in MMX mode if a speedup of 2 is achieved? (Hint: You will need to calculate the new overall time.)
(c) What percentage of media enhancement is needed to achieve one-half the maximum speedup attainable from using the MMX mode?
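One way to set up Part A is to invert Amdahl's law for the enhanced fraction. The sketch below assumes the "percentage of media enhancement" is a fraction of the original run time and that the maximum speedup is the full factor of 10; the names are illustrative.

```python
def fraction_needed(target_speedup, factor):
    """Solve target = 1 / ((1 - f) + f / factor) for f."""
    return (1 - 1 / target_speedup) / (1 - 1 / factor)

mmx_factor = 10.0

# (a) fraction of the original time that must be enhanceable
f_a = fraction_needed(2.0, mmx_factor)

# (b) share of the new, shorter run time that is spent in MMX mode
new_time = 1.0 / 2.0            # original time normalized to 1
mmx_time = f_a / mmx_factor
share_b = mmx_time / new_time

# (c) half of the maximum speedup (10 / 2 = 5)
f_c = fraction_needed(mmx_factor / 2, mmx_factor)

print(f"(a) {f_a:.1%}  (b) {share_b:.1%}  (c) {f_c:.1%}")
```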
A two-part question:
(Part B)
If processor A has a higher clock rate than processor B, and processor A also has a higher MIPS rating than processor B, explain whether processor A will always execute faster than processor B.
Suppose that there are two implementations of the same instruction set architecture. Machine A has a clock cycle time of 20 ns and an effective CPI of 1.5 for some program, and machine B has a clock cycle time of 15 ns and an effective CPI of 1.0 for the same program. Which machine is faster for this program, and by how much?
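For the two-machine comparison, both machines run the same instruction set and the same program, so the instruction count cancels and only the time per instruction matters. A minimal sketch, with illustrative names:

```python
def time_per_instruction_ns(cycle_time_ns, cpi):
    """Average time per instruction = clock cycle time * CPI."""
    return cycle_time_ns * cpi

t_a = time_per_instruction_ns(20.0, 1.5)   # machine A
t_b = time_per_instruction_ns(15.0, 1.0)   # machine B

ratio = t_a / t_b
print(f"A: {t_a:.0f} ns, B: {t_b:.0f} ns, B is {ratio:.1f}x faster")
```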
Computer A has an overall CPI of 1.3 and can be run at a clock rate of 600 MHz. Computer B has a CPI of 2.5 and can be run at a clock rate of 750 MHz. We have a particular program we wish to run. When compiled for computer A, this program has exactly 100,000 instructions. How many instructions would the program need to have when compiled for computer B in order for the two computers to have exactly the same execution time for this program?
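A sketch of one way to set this up: execution time is instruction count * CPI / clock rate, so equating the two times gives the instruction count for B. Variable names are illustrative.

```python
def execution_time(instructions, cpi, clock_hz):
    """CPU time = instruction count * CPI / clock rate."""
    return instructions * cpi / clock_hz

time_a = execution_time(100_000, 1.3, 600e6)

# Solve instructions_b * 2.5 / 750e6 == time_a for instructions_b
instructions_b = time_a * 750e6 / 2.5
print(f"time on A = {time_a * 1e6:.2f} us, instructions on B = {instructions_b:.0f}")
```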
Suppose that we can improve the floating point instruction performance of a machine by a factor of 15 (the same floating point instructions run 15 times faster on this new machine). What percent of the instructions must be floating point to achieve a speedup of at least 4?
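The required fraction follows from solving Amdahl's law for f. The sketch below assumes the floating point percentage is measured as a share of the original execution time; the function name is illustrative.

```python
def min_fraction(target_speedup, factor):
    """Smallest f with 1 / ((1 - f) + f / factor) >= target_speedup."""
    return (1 - 1 / target_speedup) / (1 - 1 / factor)

f_fp = min_fraction(4.0, 15.0)
achieved = 1 / ((1 - f_fp) + f_fp / 15.0)
print(f"floating point share needed: {f_fp:.1%} (speedup {achieved:.2f})")
```

Under that assumption the floating point share works out to roughly 80.4%.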
A designer wants to improve the overall performance of a given machine with respect to a target benchmark suite and is considering an enhancement X that applies to 50% of the original dynamically-executed instructions, and speeds each of them up by a factor of 3. The designer's manager has some concerns about the complexity and the cost-effectiveness of X and suggests that the designer should consider an alternative enhancement Y. Enhancement Y, if applied only to some (as yet unknown) fraction of the original dynamically-executed instructions, would make them only 75% faster. Determine what percentage of all dynamically-executed instructions should be optimized using enhancement Y in order to achieve the same overall speedup as obtained using enhancement X.
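One possible setup is sketched below. It assumes that the stated fractions behave as shares of execution time (every instruction takes the same time) and that "75% faster" means a speedup factor of 1.75. Both are assumptions, not statements from the question.

```python
def overall_speedup(fraction, factor):
    """Amdahl's law for one enhancement."""
    return 1 / ((1 - fraction) + fraction / factor)

def fraction_for(target_speedup, factor):
    """Fraction that must be enhanced to reach target_speedup."""
    return (1 - 1 / target_speedup) / (1 - 1 / factor)

speedup_x = overall_speedup(0.50, 3.0)    # enhancement X
f_y = fraction_for(speedup_x, 1.75)       # enhancement Y, assumed 1.75x
print(f"X speedup = {speedup_x:.2f}, Y must cover {f_y:.1%}")
```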
Imagine that you are able to perform benchmarking "races" to compare two computers you are thinking about buying. Come up with a list of 5 benchmark programs or usage scenarios you would use to create your own personalized benchmark suite. Justify each program you select. For the benchmark suite as a whole, discuss a method for calculating a weighted average of the different program run-times.
Consider the SPEC benchmark. Name two factors that influence the resulting performance on any particular architecture.