Parallel Programming Enhancements for Processing Hydrographic Data
NAVAL RESEARCH LAB STENNIS SPACE CENTER MS
Parallel programming techniques have been used for years to develop processing software that does the same work in a fraction of the time. The research in this paper focuses on the I/O problem that arises when a parallel application writes to a single physical disk. We present the original ideas that led to the first version of the parallel software, the subsequent versions derived from lessons learned from benchmark results, and the speedup results for each version. The platform was a custom-built Linux Beowulf cluster running a standard Linux kernel and the MPICH message-passing library. The underlying purpose of the software is to process hydrographic data that has a complicated, multi-tiered format. The processing involves reading tens to hundreds of files containing raw data, filtering out extraneous data values, and writing the filtered data to a single file used in additional processing. The problem is not computationally intensive; it is bound by the system's file-writing capability. Each subsequent version of the parallel software uses a more advanced software architecture to exploit the strengths of the system's hardware and write the output file in the most time-efficient manner. Results show that the more of the data organization the software performed before writing, the better the speedup. The critical factor in writing data efficiently was the limitation of writing over a single I/O controller. Our parallel software is particularly useful where system specifications do not allow for parallel file systems or for writing data over multiple I/O controllers.
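The read, filter, and single-file write pattern described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses Python threads in place of MPICH processes, treats each input file as an in-memory list of numbers, and uses a simple range check in place of the real filtering. The names filter_records, process_inputs, and write_single_output are hypothetical. The sketch assumes that the data is filtered in parallel, collected in input order, and then emitted by one writer in a single buffered pass, so that only one stream of writes reaches the disk.

```python
from concurrent.futures import ThreadPoolExecutor
import io


def filter_records(records, lo, hi):
    """Keep only values inside [lo, hi]; a stand-in for the real filtering step."""
    return [r for r in records if lo <= r <= hi]


def process_inputs(inputs, lo, hi, workers=4):
    """Filter every input in parallel and return the results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as `inputs`.
        return list(pool.map(lambda recs: filter_records(recs, lo, hi), inputs))


def write_single_output(filtered_chunks, out):
    """Organize all chunks in memory, then issue one write from one writer."""
    lines = [f"{value}\n" for chunk in filtered_chunks for value in chunk]
    out.write("".join(lines))
    return len(lines)
```

For example, write_single_output(process_inputs(inputs, 0, 100), fh) filters every list in inputs and writes the surviving values to the open file handle fh with one call. Concentrating the writing in one process is only one possible design; the abstract does not say which arrangement each software version used.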
- Computer Programming and Software