Hybrid MPI-OpenMP versus MPI Implementations: A Case Study
Conference paper preprint
POLYTECHNIC UNIV OF PUERTO RICO HATO REY
In this paper we explore the performance of a hybrid (mixed-mode) MPI-OpenMP parallel C implementation versus a direct MPI implementation. This case study provides enough detail to serve as a point of departure for further research, or as an educational resource for additional code development, on mixed-mode versus direct MPI implementations. The hardware test bed was a 64-processor cluster of 16 multi-core nodes with four cores per node. The algorithm being benchmarked is a parallel cyclic convolution algorithm that requires no inter-node communication and closely matches our particular cluster architecture. In this case study, a time-domain cyclic convolution algorithm was used in each parallel subsection. Time-domain implementations are slower than frequency-domain implementations, but they give the exact integer result when performing very large, purely integer cyclic convolutions.
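As an illustration of the kind of kernel the abstract describes, the following is a minimal sketch of a direct time-domain cyclic convolution in C, computing y[k] as the sum over j of x[j] * h[(k - j) mod n]. It is not the authors' code: the function name `cyclic_convolve`, the use of `int64_t`, and the placement of the OpenMP directive are assumptions made for this example. Only the shared-memory (OpenMP) part is shown; how the work is split across MPI ranks is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (name and signature are assumptions for this sketch).
 * Direct O(n^2) cyclic convolution of two length-n integer sequences:
 *   y[k] = sum_{j=0}^{n-1} x[j] * h[(k - j) mod n]
 * Only integer arithmetic is used, so the result is exact as long as the
 * accumulated sums fit in int64_t. */
void cyclic_convolve(const int64_t *x, const int64_t *h, int64_t *y, size_t n)
{
    assert(x != NULL && h != NULL && y != NULL);

    /* Each output element is independent of the others, so the outer loop
     * can be shared among the threads of one node. The directive is only
     * seen by the compiler when OpenMP is enabled. */
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (long k = 0; k < (long)n; ++k) {
        int64_t acc = 0;
        for (size_t j = 0; j < n; ++j) {
            /* (k - j) mod n, written so the index never goes negative. */
            size_t idx = ((size_t)k + n - j) % n;
            acc += x[j] * h[idx];
        }
        y[k] = acc;
    }
}
```

For example, convolving x = {1, 2, 3} with h = {4, 5, 6} yields y = {31, 31, 28}. In a mixed-mode setting, one plausible arrangement would be for each MPI rank to call such a routine on its own subsection while OpenMP threads share the outer loop. For very large inputs the accumulator width would need to be chosen to avoid overflow.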
- Computer Programming and Software
- Computer Hardware
- Computer Systems