
Proceedings Paper

Hierarchical Approaches To Fault Tolerance In Processor Arrays
Author(s): Y X Wang; Jose A. B Fortes

Paper Abstract

Because processor arrays have only limited connections between neighboring processors, fault-tolerance schemes may require additional interconnect, switching and control hardware to allow for reconfiguration when faults occur. In general, the larger the reconfiguration capability, the greater the probability that a processor array can survive a given distribution of faults. In other words, the coverage of the reconfiguration procedure increases directly with the amount of extra hardware required to support it. However, this is true only if the added hardware does not itself fail. For this reason, and depending on factors such as the size of the processor array and the size of each processor, distinct reconfiguration schemes may be best suited to different arrays. Also, in general, previously proposed schemes may still result in unacceptably low reliabilities in very large processor arrays. This paper proposes a class of reconfiguration schemes that are hierarchical in nature. According to this approach, a processor array is logically partitioned into smaller subarrays. Once faults occur, reconfiguration takes place within each subarray where faults are present, if possible; otherwise, the full subarray is replaced by a spare subarray. Arrays of this type are referred to as bi-level fault-tolerant processor arrays and, by allowing several levels of reconfiguration, multi-level arrays can be defined similarly. While several levels of reconfiguration are possible, the case of two levels is emphasized in this paper. Also, the reconfiguration schemes used at each level are not necessarily identical. This class of hierarchical reconfiguration schemes provides much higher reliability than previously proposed ones, particularly in the case of very large arrays.
To design a hierarchical reconfiguration scheme for a given processor array, it is necessary to choose the size of the subarrays for every level in the hierarchy as well as the reconfiguration scheme at that level. A design methodology is provided which solves these problems mathematically, i.e. it enables the choice of the subarray size and the reconfiguration scheme to be used at each level so as to obtain a processor array with optimal reliability.
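The bi-level idea described above can be illustrated with a rough reliability sketch. This is not the paper's model: it assumes independent processor failures, treats each level as a simple k-out-of-n system (any spare can replace any faulty unit), and lumps all reconfiguration hardware of a subarray into a single survival probability. The names `k_of_n`, `bilevel_reliability`, `p`, `q`, `m`, `s`, `K` and `S` are all hypothetical and chosen for illustration only.

```python
from math import comb


def k_of_n(k, n, p):
    """Probability that at least k of n independent units work,
    each unit working with probability p (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def bilevel_reliability(p, m, s, K, S, q=1.0):
    """Reliability of a two-level array under simplifying assumptions.

    p : probability that a single processor works (assumed)
    m : processors a subarray needs; s : spare processors per subarray
    K : subarrays the array needs; S : spare subarrays
    q : probability that a subarray's added switching/control
        hardware works (assumed single lumped factor)
    """
    # Level 1: a subarray survives if enough of its processors work
    # and its own reconfiguration hardware has not failed.
    r_sub = q * k_of_n(m, m + s, p)
    # Level 2: the array survives if enough subarrays survive,
    # with failed subarrays replaced by spare subarrays.
    return k_of_n(K, K + S, r_sub)


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    no_spares = bilevel_reliability(p=0.99, m=16, s=0, K=16, S=0)
    two_level = bilevel_reliability(p=0.99, m=16, s=2, K=16, S=1, q=0.999)
    print(f"no spares:  {no_spares:.4f}")
    print(f"bi-level:   {two_level:.4f}")
```

Sweeping `m` and `s` for a fixed total array size in such a model gives a crude picture of the trade-off the abstract mentions: more spares raise coverage, but a lower `q` (less reliable support hardware) can erode the gain.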

Paper Details

Date Published: 20 April 1988
PDF: 10 pages
Proc. SPIE 0880, High Speed Computing, (20 April 1988); doi: 10.1117/12.944047
Y X Wang, Purdue University (United States)
Jose A. B Fortes, Purdue University (United States)

Published in SPIE Proceedings Vol. 0880:
High Speed Computing
David P. Casasent, Editor(s)

© SPIE.