Chapter 9: Hardware/Software Trade-offs
---

- main limitations of directory-based, cache-coherent systems:
    1. high waiting time at memory operations: sequential consistency (SC) is the consistency model of choice, but it requires a processor to wait for its previous memory operation to complete before issuing the next one. This has a much higher impact in directory-based than in bus-based systems; it is also bad for compilers, which cannot reorder memory operations to shared data if the programmer assumes SC;
    2. limited capacity for replication: as data is replicated only in the caches, the system may suffer from capacity misses and artifactual communication;
    3. high design and implementation costs;
- addressing limitation (1):
    - let the processor proceed to the next memory operation without waiting, while the system still guarantees that the effects of writes are seen in order;
    - use a weaker consistency model;
- addressing limitation (2):
    - cache shared data in main memory (pages, objects, chunks, etc.);
    - manage main memory also as a hardware cache, providing replication and coherence at the block level as well (this is called Cache-Only Memory Architecture, or COMA);
- addressing limitation (3):
    - integrate the communication assist and network less tightly into the processing node (at the cost of increased communication latency and assist occupancy);
    - provide replication and coherence in software instead of hardware;
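
To make limitation (1) concrete, the sketch below enumerates the outcomes of a small two-processor litmus test, first under SC and then under a weaker model that lets a read bypass an earlier write to a different address. It is only an illustrative toy, not taken from the chapter: the litmus test itself and the names `PROGRAMS`, `run`, and `outcomes` are assumptions made for this example, and the "memory" is a plain Python dictionary rather than a model of any real machine.

```python
from itertools import permutations

# Hypothetical litmus test: shared locations x and y both start at 0.
#   P0: x = 1; r0 = y          P1: y = 1; r1 = x
PROGRAMS = {
    0: [("write", "x", 1), ("read", "y", "r0")],
    1: [("write", "y", 1), ("read", "x", "r1")],
}


def run(schedule):
    """Apply one global order of (processor, op_index) pairs to a single memory."""
    memory = {"x": 0, "y": 0}
    regs = {}
    for proc, idx in schedule:
        kind, loc, arg = PROGRAMS[proc][idx]
        if kind == "write":
            memory[loc] = arg
        else:
            regs[arg] = memory[loc]
    return regs["r0"], regs["r1"]


def outcomes(keep_program_order):
    """Collect every reachable (r0, r1) pair.

    keep_program_order=True models SC: each processor's operations take
    effect in program order. False models a weaker ordering in which a
    read may take effect before the same processor's earlier write
    (the two accesses go to different addresses).
    """
    ops = [(p, i) for p in PROGRAMS for i in range(2)]
    results = set()
    for schedule in permutations(ops):
        if keep_program_order:
            pos = {op: k for k, op in enumerate(schedule)}
            # Skip orders where a read overtakes its own processor's write.
            if any(pos[(p, 0)] > pos[(p, 1)] for p in PROGRAMS):
                continue
        results.add(run(schedule))
    return results


sc_outcomes = outcomes(keep_program_order=True)
relaxed_outcomes = outcomes(keep_program_order=False)
print("SC:     ", sorted(sc_outcomes))
print("relaxed:", sorted(relaxed_outcomes))
```

In this toy model the SC run never produces `(0, 0)`, while the relaxed run does. That extra outcome is what a programmer assuming SC does not expect, which is why neither the hardware nor the compiler may reorder these accesses under SC, and why a weaker model has to state explicitly which orders it still preserves.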