Accession Number : AD1041979


Title : CLEAR: Cross-Layer Exploration for Architecting Resilience


Descriptive Note : Conference Paper


Corporate Author : Stanford University Stanford United States


Personal Author(s) : Cheng,Eric ; Mirkhani,Shahrzad ; Szafaryn,Lukasz G ; Cher,Chen-Yong ; Cho,Hyungmin ; Skadron,Kevin ; Stan,Mircea R ; Lilja,Klas ; Abraham,Jacob A ; Bose,Pradip ; Mitra,Subhasish


Full Text : https://apps.dtic.mil/dtic/tr/fulltext/u2/1041979.pdf


Report Date : 01 Mar 2017


Pagination or Media Count : 4


Abstract : CLEAR is a first-of-its-kind framework that overcomes a major challenge in the design of digital systems that are resilient to reliability failures: achieving desired resilience targets at minimal cost (energy, power, execution time, area) by combining resilience techniques across various layers of the system stack (circuit, logic, architecture, software, algorithm). CLEAR automatically and systematically explores the large space of techniques and their combinations (586 cross-layer combinations in this paper), derives cost-effective solutions, and provides guidelines for designing new techniques. Carefully optimized combinations of circuit-level hardening, logic-level parity checking, and micro-architectural recovery provide highly cost-effective soft error resilience for general-purpose processor cores. A 50x improvement in silent data corruption rate is achieved at 2.1% energy cost for out-of-order cores (6.1% for in-order cores), with no speed impact. Selective circuit-level hardening alone, guided by thorough application benchmark analysis, also provides cost-effective solutions (1% additional energy cost for the same 50x improvement).


Descriptors : digital systems , reliability , algorithms , complex systems , energy , computer programs , circuits , resilience , FAILURE (ELECTRONICS) , cost effectiveness , costs , fault tolerance , computing system architectures , computer architecture


Subject Categories : Computer Hardware


Distribution Statement : APPROVED FOR PUBLIC RELEASE