Can High Bandwidth and Latency Justify Large Cache Blocks in Scalable Multiprocessors?
ROCHESTER UNIV NY DEPT OF COMPUTER SCIENCE
Pagination or Media Count:
An important architectural design decision affecting the performance of coherent caches in shared-memory multiprocessors is the choice of block size. There are two primary factors that influence this choice the reference behavior of application programs and the remote access bandwidth and latency of the machine. Several studies have shown that increasing the block size can lower the miss rate and reduce the number of invalidations. However, increasing the block size can also increase the miss rate by, for example, increasing false sharing or the number of cache evictions. Large cache blocks can also generate network contention. Given that we anticipate enormous increases in both network bandwidth and latency in large-scale, shared-memory multiprocessors, the question arises as to what effect these increases will have on the choice of block size. We use analytical modeling and execution-driven simulation of parallel programs on a large-scale shared-memory machine to examine the relationship between cache block size and application performance as a function of remote access bandwidth and latency. We show that even under assumptions of high remote access bandwidth, the best application performance usually results from using cache blocks between 32 and 128 bytes in size.
- Computer Hardware