Performance Results for Prediction Cache #3
-------------------------------------------

Each of the following eight tables contains the IPC results for four
cache configurations: a blocking cache, a non-blocking cache, prediction
cache #3, and the perfect memory subsystem. The prediction cache is
compared against the non-blocking cache, which is the base case. The
latency tolerated (LatTol) is the improvement of the prediction cache
over the base case, relative to the improvement of the perfect memory
subsystem over the base case, expressed as a percentage. The final
column is the raw percentage improvement in IPC of the prediction cache
over the base case.

The results are given for each of the following benchmarks:

    1  compress
    2  espresso
    3  linpacks
    4  sc
    5  spice2g6
    6  timefft
    7  uncompress
    8  wave5
    9  xlisp

Each table represents a different processor model and memory subsystem.
The processor model is interpreted as follows:

    d32 = 32 instruction reorder buffer
    d64 = 64 instruction reorder buffer
    e4  = 4 instructions can issue (execute) per cycle
    c1  = 1 cache port
    c2  = 2 cache ports

The memory models are:

    L1 = 8 cycle miss penalty, 4 cycle bus occupancy
    L2 = 50 cycle miss penalty, 8 cycle bus occupancy


IPC table for machine type d32e4c1, and L1 cache

Bench   Block  Nonblk    Pred  Perfect  LatTol  Improvement
        cache   cache      #3   memory            Pred/base
-----   -----  ------    ----  -------  ------  -----------
    1   1.065   1.416   1.418    1.642    0.88        0.14%
    2   1.266   1.537   1.621    1.740   41.37        5.46%
    3   1.097   1.491   1.662    1.737   69.51       11.46%
    4   1.358   1.526   1.581    1.712   29.56        3.60%
    5   0.845   1.030   1.044    1.163   10.52        1.35%
    6   1.471   1.529   1.598    1.636   64.48        4.51%
    7   1.162   1.378   1.402    1.603   10.66        1.74%
    8   1.298   1.466   1.587    1.693   53.30        8.25%
    9   1.024   1.105   1.129    1.210   22.85        2.17%


IPC table for machine type d32e4c1, and L2 cache

Bench   Block  Nonblk    Pred  Perfect  LatTol  Improvement
        cache   cache      #3   memory            Pred/base
-----   -----  ------    ----  -------  ------  -----------
    1   1.076   1.284   1.380    1.712   22.42        7.47%
    2   1.197   1.479   1.562    1.773   28.23        5.61%
    3   0.659   1.212   1.351    1.807   23.36       11.46%
    4   1.399   1.518   1.551    1.736   15.13        2.17%
    5   0.598   0.911   0.945    1.190   12.18        3.73%
    6   1.634   1.606   1.657    1.667   83.60        3.17%
    7   1.223   1.329   1.456    1.660   38.36        9.55%
    8   0.976   1.240   1.415    1.756   33.91       14.11%
    9   0.691   0.874   0.970    1.221   27.66       10.98%


IPC table for machine type d32e4c2, and L1 cache

Bench   Block  Nonblk    Pred  Perfect  LatTol  Improvement
        cache   cache      #3   memory            Pred/base
-----   -----  ------    ----  -------  ------  -----------
    1   1.203   1.570   1.574    2.001    0.92        0.25%
    2   1.390   1.700   1.809    1.986   38.11        6.41%
    3   1.382   1.843   2.225    2.579   51.90       20.72%
    4   1.560   1.731   1.796    2.051   20.31        3.75%
    5   0.940   1.137   1.157    1.349    9.43        1.75%
    6   2.249   2.110   2.287    2.657   32.35        8.38%
    7   1.411   1.667   1.711    2.117    9.77        2.63%
    8   1.552   1.649   1.845    2.150   39.12       11.88%
    9   1.217   1.301   1.340    1.493   20.31        2.99%


IPC table for machine type d32e4c2, and L2 cache

Bench   Block  Nonblk    Pred  Perfect  LatTol  Improvement
        cache   cache      #3   memory            Pred/base
-----   -----  ------    ----  -------  ------  -----------
    1   1.197   1.427   1.546    2.067   18.59        8.33%
    2   1.302   1.630   1.734    2.017   26.87        6.38%
    3   0.748   1.481   1.688    2.681   17.25       13.97%
    4   1.613   1.762   1.808    2.073   14.79        2.61%
    5   0.639   0.994   1.036    1.364   11.35        4.22%
    6   2.653   2.570   2.707    2.742   79.65        5.33%
    7   1.500   1.638   1.852    2.214   37.15       13.06%
    8   1.096   1.391   1.616    2.186   28.30       16.17%
    9   0.772   0.992   1.120    1.506   24.90       12.90%


IPC table for machine type d64e4c1, and L1 cache

Bench   Block  Nonblk    Pred  Perfect  LatTol  Improvement
        cache   cache      #3   memory            Pred/base
-----   -----  ------    ----  -------  ------  -----------
    1   1.054   1.413   1.414    1.620    0.48        0.07%
    2   1.268   1.567   1.634    1.745   37.64        4.27%
    3   1.094   1.504   1.675    1.730   75.66       11.36%
    4   1.350   1.589   1.613    1.700   21.62        1.51%
    5   0.857   1.056   1.069    1.184   10.15        1.23%
    6   1.467   1.555   1.610    1.631   72.36        3.53%
    7   1.157   1.374   1.397    1.592   10.55        1.67%
    8   1.366   1.594   1.724    1.809   60.46        8.15%
    9   1.013   1.106   1.123    1.197   18.68        1.53%


IPC table for machine type d64e4c1, and L2 cache

Bench   Block  Nonblk    Pred  Perfect  LatTol  Improvement
        cache   cache      #3   memory            Pred/base
-----   -----  ------    ----  -------  ------  -----------
    1   1.062   1.320   1.419    1.686   27.04        7.50%
    2   1.196   1.553   1.605    1.771   23.85        3.34%
    3   0.658   1.443   1.496    1.797   14.97        3.67%
    4   1.390   1.575   1.593    1.722   12.24        1.14%
    5   0.603   0.969   0.998    1.213   11.88        2.99%
    6   1.634   1.625   1.660    1.667   83.33        2.15%
    7   1.215   1.361   1.461    1.646   35.08        7.34%
    8   1.014   1.524   1.638    1.881   31.93        7.48%
    9   0.686   0.933   0.993    1.207   21.89        6.43%


IPC table for machine type d64e4c2, and L1 cache

Bench   Block  Nonblk    Pred  Perfect  LatTol  Improvement
        cache   cache      #3   memory            Pred/base
-----   -----  ------    ----  -------  ------  -----------
    1   1.205   1.588   1.591    2.005    0.71        0.18%
    2   1.402   1.783   1.853    2.011   30.70        3.92%
    3   1.396   2.454   2.527    2.627   42.19        2.97%
    4   1.578   1.887   1.908    2.082   10.76        1.11%
    5   0.966   1.191   1.210    1.403    8.96        1.59%
    6   2.445   2.494   2.701    2.935   46.93        8.29%
    7   1.404   1.681   1.718    2.101    8.80        2.20%
    8   1.722   1.902   2.129    2.490   38.60       11.93%
    9   1.213   1.351   1.378    1.485   20.14        1.99%


IPC table for machine type d64e4c2, and L2 cache

Bench   Block  Nonblk    Pred  Perfect  LatTol  Improvement
        cache   cache      #3   memory            Pred/base
-----   -----  ------    ----  -------  ------  -----------
    1   1.199   1.503   1.634    2.071   23.06        8.71%
    2   1.307   1.729   1.798    2.028   23.07        3.99%
    3   0.756   1.899   1.980    2.794    9.05        4.26%
    4   1.633   1.871   1.897    2.103   11.20        1.38%
    5   0.651   1.073   1.110    1.422   10.60        3.44%
    6   2.966   2.916   3.044    3.077   79.50        4.38%
    7   1.489   1.676   1.859    2.191   35.53       10.91%
    8   1.180   1.790   1.974    2.547   24.30       10.27%
    9   0.770   1.063   1.155    1.499   21.10        8.65%
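

Recomputing the derived columns

The LatTol and Improvement columns can be recomputed from the Nonblk,
Pred, and Perfect IPC columns. The sketch below is one reading of the
column descriptions at the top of this file, not code from the original
study: the formulas are inferred from the prose and checked against a
few rows, and the function names are hypothetical. Because the tables
list IPC rounded to three decimals, recomputed values may differ from
the printed ones in the last digit.

```python
def latency_tolerated(nonblk_ipc, pred_ipc, perfect_ipc):
    """Percent of the base-to-perfect IPC gap closed by the prediction cache.

    Assumed formula: 100 * (Pred - Nonblk) / (Perfect - Nonblk).
    """
    gap = perfect_ipc - nonblk_ipc
    if gap == 0:
        # The base case already matches perfect memory; the ratio is undefined.
        raise ValueError("non-blocking and perfect IPC are equal")
    return 100.0 * (pred_ipc - nonblk_ipc) / gap


def improvement(nonblk_ipc, pred_ipc):
    """Raw percent IPC improvement of the prediction cache over the base case.

    Assumed formula: 100 * (Pred - Nonblk) / Nonblk.
    """
    return 100.0 * (pred_ipc - nonblk_ipc) / nonblk_ipc


if __name__ == "__main__":
    # compress (benchmark 1) on d32e4c1 with the L1 memory model.
    nonblk, pred, perfect = 1.416, 1.418, 1.642
    print(f"LatTol      = {latency_tolerated(nonblk, pred, perfect):.2f}")
    print(f"Improvement = {improvement(nonblk, pred):.2f}%")
```

For the compress row used above, the two functions reproduce the printed
0.88 and 0.14%. A row such as linpacks on the same machine (1.491, 1.662,
1.737) gives an Improvement about 0.01 higher than the printed 11.46%,
which is consistent with the table having been generated from unrounded
IPC values.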