Performance of Barotropic Ocean Models on Shared and Distributed Memory Computers
Naval Research Lab, Stennis Space Center, MS
The efficiency of explicit time-integration schemes for barotropic models of the Mediterranean was investigated in the context of the vectorization and parallelization approaches employed on different architectures. The main focus of interest was the scalability and MFlops rate of the codes as a function of domain size.

For simulations with real winds, mesh sizes ranged from 25 km down to 1.8 km, on grids of 180x64 to 2048x1024; the coarse resolution resolves only major straits, such as that of Sicily, while the high resolution resolves even narrow straits, such as Gibraltar and Messina. Since the memory requirements of these grids reached only about 70 Mbytes, we also performed simulations with idealized, precomputed winds, for which mesh sizes ranged down to 280 m and the total memory requirement reached 4 Gbytes. The analysis and interpretation of the latter results for the Mediterranean have not yet been performed.

The explicit scheme consisted of the leapfrog scheme for the Coriolis, pressure-gradient, and advection terms, with the diffusion terms lagged in time. The platforms utilized included the CM-5E with CMF; the Cray C90 and T90 with f90 -O3 autotasking; the Cray T3E with HPF and MPI; the SGI Origin2000 with f77 -pfa -O2 (Power Fortran), HPF, and MPI; the IBM SP2 with HPF; and the Sun Global Works with HPF. The MPI version of the code employed a 2-D tiling decomposition, and parallel runs were performed on up to 512 processors on the T3E and up to 64 processors on the SGI Origin. On the T3E, 512 processors achieved an 82% scaling efficiency (of an ideal 100%) relative to 32 processors, although speedup was less than linear for smaller processor counts.
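The explicit scheme described above (leapfrog for the wave and advection dynamics, diffusion lagged one time level for stability) can be sketched on a toy problem. The following is a minimal 1-D linearized shallow-water example with illustrative, hypothetical parameters; it is not the report's code, only an instance of the same time-stepping idea.

```python
import numpy as np

def step_leapfrog(u_old, u_now, eta_old, eta_now, dt, dx, g, H, nu):
    """One leapfrog step for linearized shallow water on a periodic grid.

    The pressure-gradient and divergence terms use the current level n
    (leapfrog), while the diffusion term is evaluated at the lagged
    level n-1, as in the scheme described in the abstract.
    """
    # Centered spatial derivatives (periodic boundaries via np.roll).
    detadx = (np.roll(eta_now, -1) - np.roll(eta_now, 1)) / (2 * dx)
    dudx = (np.roll(u_now, -1) - np.roll(u_now, 1)) / (2 * dx)
    # Laplacian at the *lagged* time level n-1.
    lap_u = (np.roll(u_old, -1) - 2 * u_old + np.roll(u_old, 1)) / dx**2
    u_new = u_old - 2 * dt * g * detadx + 2 * dt * nu * lap_u
    eta_new = eta_old - 2 * dt * H * dudx
    return u_new, eta_new

def run(nx=128, nsteps=200):
    dx = 1.0e3                        # grid spacing [m] (illustrative)
    g, H, nu = 9.81, 100.0, 10.0      # hypothetical parameters
    dt = 0.1 * dx / np.sqrt(g * H)    # well inside the gravity-wave CFL limit
    x = dx * np.arange(nx)
    eta_now = 0.1 * np.sin(2 * np.pi * x / (nx * dx))
    u_now = np.zeros(nx)
    # Simple start-up: duplicate the initial level so the first "leapfrog"
    # step degenerates to a forward step (a common way to seed the scheme).
    eta_old, u_old = eta_now.copy(), u_now.copy()
    for _ in range(nsteps):
        u_new, eta_new = step_leapfrog(u_old, u_now, eta_old, eta_now,
                                       dt, dx, g, H, nu)
        u_old, u_now = u_now, u_new
        eta_old, eta_now = eta_now, eta_new
    return u_now, eta_now
```

Because the divergence term is a centered difference on a periodic grid, the discrete mean of eta is conserved step to step, which makes a convenient sanity check when experimenting with the scheme.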
- Physical and Dynamic Oceanography
- Computer Systems
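The 2-D tiling decomposition mentioned in the abstract can be sketched as follows. The block-partition rule below (remainder rows spread over the first tiles) is an assumption for illustration; the report does not specify how its MPI code distributes remainders.

```python
def tile_bounds(n, p, coord):
    """Near-even 1-D block partition of n points over p blocks.

    The first (n % p) blocks receive one extra point (an assumed,
    common convention; half-open bounds [lo, hi)).
    """
    base, extra = divmod(n, p)
    lo = coord * base + min(coord, extra)
    hi = lo + base + (1 if coord < extra else 0)
    return lo, hi

def decompose_2d(nx, ny, px, py):
    """Map each of px*py ranks to a rectangular tile of an nx-by-ny grid.

    Returns {rank: ((i0, i1), (j0, j1))}; ranks are laid out row-major
    in a px-by-py process grid.
    """
    tiles = {}
    for rank in range(px * py):
        ci, cj = rank % px, rank // px
        tiles[rank] = (tile_bounds(nx, px, ci), tile_bounds(ny, py, cj))
    return tiles
```

For example, the 2048x1024 grid on the 512 T3E processors, arranged as a hypothetical 32x16 process grid, would yield 64x64 tiles per rank; each rank would then exchange halo rows and columns with its neighbors at every leapfrog step.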