Accession Number : ADA557017


Title : High Performance Computing Multicast


Descriptive Note : Final rept. Jun 2010-Sep 2011


Corporate Author : CORNELL UNIV ITHACA NY


Personal Author(s) : Birman, Kenneth ; Freedman, Daniel ; van Renesse, Robert ; Weatherspoon, Hakim ; Marian, Tudor


Full Text : https://apps.dtic.mil/dtic/tr/fulltext/u2/a557017.pdf


Report Date : Feb 2012


Pagination or Media Count : 33


Abstract : This investigation of High Performance Computing (HPC) Multicast for High-Speed Publication-Subscription (Pub-Sub) sought to deliver both insight into and implementations of high-performance multicast solutions that enable better utilization of cloud resources. These solutions combine improved scalability with increased consistency, ensuring that expected and necessary system conditions are met for the many critical national-asset applications that are likely to move to the cloud in the next decade. In the context of this effort, the applicability of the oft-invoked Consistency, Availability and Partition tolerance (CAP) theorem was explored within specific environments of commonly deployed clouds, and novel insights were developed into the tradeoffs implied by CAP's conclusion that a replicated service can possess just two of the three properties. It was determined that there are replicated services for which the applicability of CAP is unclear: specifically, the scalable soft-state services that run in the first tier of a single cloud-computing data center. The challenge is that such services live in a single data center and run on redundant networks. Partitioning events involve single machines or small groups and are treated as node failures; thus, the CAP proof does not apply in a formal sense, because the theorem is proven by forcing a replicated service to respond to conflicting requests during a partitioning failure, triggering inconsistency. Nonetheless, most developers believe in a generalized CAP folk theorem, holding that scalability and elasticity are incompatible with strong forms of consistency. We designed, implemented, and benchmarked the Isis2 platform: a first-tier consistency alternative that replicates data, combines agreement on update ordering with amnesia freedom, and supports both good scalability and fast response.
A team of students was led in the application of Isis2 to build a large-scale distributed computer-vision landmark-recognition system,


Descriptors : *FAULT TOLERANCE , *HIGH PERFORMANCE COMPUTING , CLOUD COMPUTING , COMMUNICATIONS PROTOCOLS , DISTRIBUTED COMPUTING , LOCAL AREA NETWORKS , SOFTWARE ENGINEERING


Subject Categories : Computer Programming and Software


Distribution Statement : APPROVED FOR PUBLIC RELEASE