James E. Bennett


Affiliations

Where I study: Stanford University, in the Stanford Architecture and Arithmetic Group.

Faculty advisors: Michael Flynn, Mendel Rosenblum, and Monica Lam.

Other handy links.


Project

Latency Tolerant Architectures

Synopsis:

Processor cycle times are currently much faster than memory cycle times, and the trend has been for this gap to increase over time. The problem of increasing memory latency, relative to processor speed, has been dealt with by adding high speed cache memory. However, depending on the miss rate, memory latency can still have a significant performance impact. Since the trend of increasing memory latency is expected to continue, the performance impact will become even more significant with time.
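As a rough illustration of why the miss rate matters, here is a minimal sketch using the standard textbook formula for average memory access time (hit time plus miss rate times miss penalty). The function name and all of the numbers are assumptions chosen for illustration; they are not measurements from this project.

```python
def average_memory_access_time(hit_time, miss_rate, miss_penalty):
    """Average cycles per memory access: hit_time + miss_rate * miss_penalty.

    All parameters are hypothetical inputs, with times in processor cycles.
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


if __name__ == "__main__":
    # Hold the cache fixed (1-cycle hit, 5% miss rate) and let the
    # miss penalty grow, as happens when the processor/memory gap widens.
    for penalty in (10, 50, 100):
        amat = average_memory_access_time(1, 0.05, penalty)
        print(f"miss penalty {penalty:3d} cycles -> {amat:.2f} cycles per access")
```

With the cache held fixed, the average access time in this model grows linearly with the miss penalty, which is the effect described above.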

Researchers have proposed a variety of techniques for dealing with memory latency, many of which have been implemented. These techniques fall into four categories: dynamic scheduling, hardware prefetching, software prefetching, and support for multiple contexts. Various combinations of these techniques are possible as well. I would like to investigate the performance of these techniques in the context of modern uniprocessor design.

A more detailed description (my thesis proposal) is available. The simulator I developed to study these issues is now publicly available from MXS source.


Publications

Two Case Studies in Latency Tolerant Architectures. Technical report CSL-TR-94-639, Stanford University, Computer Systems Laboratory, October 1994.

Performance Factors for Superscalar Processors. Technical report CSL-TR-95-661, Stanford University, Computer Systems Laboratory, February 1995.

Reducing Cache Miss Rates Using Prediction Caches. Technical report CSL-TR-96-707, Stanford University, Computer Systems Laboratory, October 1996.

These technical reports are available from the electronic library project.

Prediction Caches for Superscalar Processors, Micro-30 proceedings, December 1997. Here are some slides from my talk at Micro-30 (in PowerPoint format): Title and Graphs. Some listeners requested additional information on the performance of prediction caches, so here are some tables of IPC numbers for prediction caches on a variety of machine models and memory configurations: Pred. Cache Performance.


Computer Architecture Related Pages

The World Wide Web Computer Architecture Page at the University of Wisconsin is an attempt at a global directory of computer architecture information.

There is a CPU information center with pointers and information about current microprocessors.

Digital Equipment's research labs.

Performance evaluation benchmarks: SPECmarks.

Computer societies, conferences, etc.: IEEE Computer Society and ACM.

The FLASH project, here at Stanford, is a shared memory multiprocessor, a follow-on to the DASH project. It uses the SimOS simulator for OS development.

The Center for Reliable and High Performance Computing, CRHC, has a home page with links to their work on the Impact compiler.

The Computer Architecture Group at MIT has a variety of dataflow projects, such as the J-machine, the M-machine, and Alewife (arguably). There is also some work on FPGAs called Virtual Wires.

Carver Mead at Caltech has a group called the Physics of Computation. They are working on "making chips to emulate functions of the nervous system, like retinas and cochleas". There is a set of design tools available there called the Chipmunk design tools.

The Data Diffusion Machine, DDM, at the University of Bristol is a virtual shared memory multiprocessor. The technique used goes by the acronym COMA (Cache Only Memory Architecture). Their home page also has a pointer to a list of related research projects, which is great.

The Wisconsin Wind Tunnel project, WWT, is working on a new interface for parallel computation called Tempest, which is being implemented on a CM-5.


If you want me to add stuff to this page, mail your suggestions to:

jbennett@cs.stanford.edu