Software Performance Modeling in PC Clusters
Abstract:
Execution of coarse-grain parallel programs in PC clusters promises supercomputer performance in low-cost hardware environments. However, the overhead associated with data distribution, synchronization, and peripheral access can easily eliminate any performance gain promised by the individual cluster capacity. Application-specific system performance analysis is required both to engineer PC cluster hardware and to evaluate the cost-effectiveness of parallelizing software components. This paper presents a distributed system performance model and software analysis methodology suitable for estimating the execution times of large-grain parallel application programs in clusters of PC hardware. The performance model emphasizes the use of application hardware performance results readily available in most systems. These are combined with single-thread application software resource requirements to estimate the achievable execution rates in target clusters. A case study is presented that analyzes a video-realistic battlefield simulator implemented in a PC cluster running Linux. Benchmark results and performance estimates for specific candidate hardware configurations are calculated and compared with actual results.