Computer Architecture: A Quantitative Approach --- Hennessy & Patterson, 1996.

- the virtual elimination of assembly programming reduced the need for object-code compatibility;
- designer: determine the important attributes while staying within cost constraints;
- architecture: instruction set + organization + hardware;
- complexity in design increases time to market;
- memory needed by the average program grows by a factor of 1.5-2.0 per year;
- transistor density increases 50% per year;
- disk density is improving 50% per year;
- learning curve: manufacturing costs decrease over time;
- cost of IC:

    Cost IC = (Cost die + Cost testing die + Cost packaging and final test) / Final test yield

- from 20 to 180 good dies per wafer, depending on chip dimensions;
- cost of a computer: 60% in the processor board;
- measuring programs: real programs, kernels, toy benchmarks, and synthetic benchmarks;
- kernel code is extracted from real programs, while synthetic code is created artificially to match an execution profile;
- baseline performance specifies compiler and flags;
- Arithmetic Mean (AM): SUM(Time_i) / n
- if performance is expressed as a rate, use the Harmonic Mean (HM): n / SUM(1 / Rate_i)
- program frequency may be indicated by using weight factors;
- Weighted Arithmetic Mean (WAM): SUM(Weight_i x Time_i)
- Weighted Harmonic Mean (WHM): 1 / SUM(Weight_i / Rate_i)
- average normalized execution time can be expressed as either an arithmetic or a geometric mean;
- Geometric Mean (GM): nth_root(PRODUCT(Execution_time_ratio_i))
- GM is consistent no matter which machine is taken as the reference;
- the arithmetic mean should not be used to average normalized execution times.
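The summarizing means above can be sketched in Python (a minimal sketch; the function names are mine, not the book's):

```python
import math

def arithmetic_mean(times):
    # AM: SUM(Time_i) / n -- use for raw execution times.
    return sum(times) / len(times)

def harmonic_mean(rates):
    # HM: n / SUM(1 / Rate_i) -- use when performance is a rate.
    return len(rates) / sum(1.0 / r for r in rates)

def weighted_arithmetic_mean(weights, times):
    # WAM: SUM(Weight_i x Time_i); weights should sum to 1.
    return sum(w * t for w, t in zip(weights, times))

def weighted_harmonic_mean(weights, rates):
    # WHM: 1 / SUM(Weight_i / Rate_i); weights should sum to 1.
    return 1.0 / sum(w / r for w, r in zip(weights, rates))

def geometric_mean(ratios):
    # GM: nth root of PRODUCT(ratio_i) -- use for normalized times.
    return math.prod(ratios) ** (1.0 / len(ratios))
```

The GM's reference-machine consistency can be seen by normalizing the same two programs against machine A and against machine B: the ratio lists are element-wise reciprocals, so the two GMs are reciprocals of each other, and the relative ranking of the machines does not change.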
- different reference machines will give different results;
- the problem with GM is that it does not predict execution time;
  * READ PAGE 28;
  * in general, there is no workload for three or more machines that
    will match the performance predicted by GM;
- make the common case fast;
- Amdahl's Law:

    Speedup = Exec time old / Exec time new
    Speedup = 1 / ((1 - Fraction enh) + Fraction enh / Speedup enh)

- CPU time = Instruction Count x CPI x Clock Cycle Time = IC x CPI x CCT
- CPU clock cycles = SUM(CPI_i x IC_i)
- it is usually easier to use the CPU time equation than Amdahl's Law, because CPI, IC, and CCT are easy to measure;
- CPI may be calculated by summing contributing factors, e.g.: CPI = CPI_pipeline + CPI_memory + ...
- locality of reference is usually the property to exploit;
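Amdahl's Law and the CPU time equation above can be sketched together (a minimal sketch; the function names and the per-class form of the CPU time equation are mine):

```python
def amdahl_speedup(fraction_enh, speedup_enh):
    # Speedup = 1 / ((1 - Fraction enh) + Fraction enh / Speedup enh)
    # fraction_enh: fraction of the ORIGINAL execution time that the
    # enhancement applies to; speedup_enh: speedup of that fraction.
    return 1.0 / ((1.0 - fraction_enh) + fraction_enh / speedup_enh)

def cpu_time(instr_counts, cpis, clock_cycle_time):
    # CPU clock cycles = SUM(CPI_i x IC_i), per instruction class;
    # CPU time = CPU clock cycles x CCT.
    cycles = sum(ic * cpi for ic, cpi in zip(instr_counts, cpis))
    return cycles * clock_cycle_time
```

For example, speeding up half of the execution time by a factor of 2 gives an overall speedup of only 1 / (0.5 + 0.25) ~= 1.33, which is the "make the common case fast" point in numbers.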