High-Performance Data-Parallel Input/Output.
AIR FORCE INST OF TECH WRIGHT-PATTERSON AFB OH
First-generation commercial multiple-CPU computers provided little support for parallel disk I/O, offering neither a high-performance parallel disk system nor a reasonable programming interface. Today, advances in disk arrays, coupled with the striping of data across powerful I/O nodes, allow systems such as the Thinking Machines CM-5, Intel Paragon, Meiko CS-2, and IBM SP-2 to provide reasonable disk bandwidth to parallel applications. Unfortunately, disk bandwidth is necessary, but not sufficient, to support parallel I/O operations. Existing parallel file systems are proving inadequate in two important areas: programmability and performance. Both inadequacies can largely be traced to the fact that nearly all parallel file systems evolved from Unix and rely on a Unix-oriented, single-stream approach to file I/O. A growing number of researchers agree that this approach is not ideal for supporting multiprocessor systems. This dissertation addresses these issues in the context of distributed-memory parallel computers such as the SP-2 and CS-2. The processors on such a machine are connected by a fast network, and parallel file access is provided by a subset of processors acting as I/O nodes. File data are striped across the I/O nodes, which can communicate with each other and with the remaining compute nodes over the parallel network. A generic example of such a system is shown in Figure 1.1.
- Computer Hardware