Adaptive Resource Management for Deployable HPC Systems

Technical Report | Accession Number: ADA378149

Abstract:

The project's goal was to develop techniques for continual reallocation of resources to maintain application performance despite statically unpredictable changes in resource demands. Research targeted multiple-application systems executing on High Performance Computing (HPC) platforms. This project, Adaptive Resource Management (ARM), built on the results of a previous program called Adaptive Resource Allocation (ARA). In ARA, Honeywell developed techniques for dynamic reallocation of resources to single parallel applications, structured as multi-pipelines, executing on a high-performance parallel machine. ARM extended the ARA results to systems with multiple applications and multiple machines connected over a network. In October 1997, DARPA merged the technical effort on this project with the RTARM project funded under Quorum. This did not affect the core statement of work for ARM, but it led to an extension of its completion date. ARM focused on developing an approach based on adaptation models and addressed best-effort resource allocation in an environment with partitionable rather than shared resources. Parallel HPC platforms were de-emphasized in favor of general distributed computing platforms. Results from ARM are being integrated into RTARM. The layered architecture of ARM has given way to a hierarchical architecture characterized by uniformity across different levels. The MPI-based communication infrastructure in ARM has given way to a CORBA ORB infrastructure. While the ARM implementation was targeted to Unix machines connected over Ethernet, the target platform for RTARM consists of Windows NT machines networked over ATM.

Security Markings

DOCUMENT & CONTEXTUAL SUMMARY

Distribution:
Approved For Public Release

RECORD

Collection: TR