Examples

From Mobius Wiki
Revision as of 20:43, 14 March 2014 by Duyenle2 (talk | contribs) (Created page with "== <span style="font-size:120%">Fault-Tolerant Multiprocessor System</span> == This section presents an example of a system that can be modeled using Möbius. It starts with ...")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Fault-Tolerant Multiprocessor System

This section presents an example of a system that can be modeled using Möbius. It starts with a description of the system, and then guides you through one way to build a model of the system and solve it using both simulation and numerical solution. The example is intended to take you step-by-step through the process of creating and solving a model in Möbius, and to exhibit many of the capabilities and features of the tool.


System Description

The system under consideration is a highly redundant fault-tolerant multiprocessor system adapted from [1] and shown in <xr id="fig:ex_multiproc" />. At the highest level, the system consists of multiple computers. Each computer is composed of 3 memory modules, of which 1 is a spare module; 3 CPU units, of which 1 is a spare unit; 2 I/O ports, of which 1 is a spare port; and 2 non-redundant error-handling chips.


<figure id="fig:ex_multiproc">

Multiproc.png


<xr id="fig:ex_multiproc" nolink />: Fault-tolerant multiprocessor system.
</figure>


Internally, each memory module consists of 41 RAM chips (2 of which are spare chips) and 2 interface chips. Each CPU unit and each I/O port consists of 6 non-redundant chips. The system is considered operational if at least 1 computer is operational. A computer is classified as operational if, of its components, at least 2 memory modules, at least 2 CPU units, at least 1 I/O port, and the 2 error-handling chips are functioning. A memory module is operational if at least 39 of its 41 RAM chips, and its 2 interface chips, are working.

Where there is redundancy (available spares) at any level of system hierarchy, there is a coverage factor associated with the component failure at that level. For example, following the parameter values used by Lee et al.[1], if one CPU unit fails, with probability 0.995 the failed unit will be replaced by the spare unit, if available, and the corresponding computer will continue to operate. On the other hand, there is also a 0.005 probability that the fault recovery mechanism will fail and the corresponding computer will cease to operate. <xr id="tab:ex_coverage" /> shows the redundant components and their associated fault coverage probability. Finally, the failure rate of every chip in the system, as in [1], is assumed to be 100 failures per billion hours1.

1 0.0008766 failures per year.


<figtable id="tab:ex_coverage">

<xr id="tab:ex_coverage" nolink />: Coverage probabilities.
Redundant Component Fault Coverage Probability
RAM Chip 0.998
Memory Module 0.95
CPU Unit 0.995
I/O Port 0.99
Computer 0.95
</figtable>


Möbius

Möbius

Motivation

Solution

Graph

Edit Möbius Documentation

“” –

<equation id="eqn:binom" shownumber>

f(k)=\binom{n}{k}p^k(1-p)^{n-k}\quad k=0,1,\dots,n

</equation>

Sort of like <xr id="eqn:binom" />, but not really.


References

  1. 1.0 1.1 1.2 D. Lee, J. Abraham, D. Rennels, and G. Gilley. A numerical technique for the evaluation of large, closed fault-tolerant systems. In Dependable Computing for Critical Applications, pages 95–114. Springer-Verlag, Wien, 1992.

Fault-Tolerant Multiprocessor System[edit]

This section presents an example of a system that can be modeled using Möbius. It starts with a description of the system, and then guides you through one way to build a model of the system and solve it using both simulation and numerical solution. The example is intended to take you step-by-step through the process of creating and solving a model in Möbius, and to exhibit many of the capabilities and features of the tool.


System Description[edit]

The system under consideration is a highly redundant fault-tolerant multiprocessor system adapted from [1] and shown in <xr id="fig:ex_multiproc" />. At the highest level, the system consists of multiple computers. Each computer is composed of 3 memory modules, of which 1 is a spare module; 3 CPU units, of which 1 is a spare unit; 2 I/O ports, of which 1 is a spare port; and 2 non-redundant error-handling chips.


<figure id="fig:ex_multiproc">

Multiproc.png


<xr id="fig:ex_multiproc" nolink />: Fault-tolerant multiprocessor system.
</figure>


Internally, each memory module consists of 41 RAM chips (2 of which are spare chips) and 2 interface chips. Each CPU unit and each I/O port consists of 6 non-redundant chips. The system is considered operational if at least 1 computer is operational. A computer is classified as operational if, of its components, at least 2 memory modules, at least 2 CPU units, at least 1 I/O port, and the 2 error-handling chips are functioning. A memory module is operational if at least 39 of its 41 RAM chips, and its 2 interface chips, are working.

Where there is redundancy (available spares) at any level of system hierarchy, there is a coverage factor associated with the component failure at that level. For example, following the parameter values used by Lee et al.[1], if one CPU unit fails, with probability 0.995 the failed unit will be replaced by the spare unit, if available, and the corresponding computer will continue to operate. On the other hand, there is also a 0.005 probability that the fault recovery mechanism will fail and the corresponding computer will cease to operate. <xr id="tab:ex_coverage" /> shows the redundant components and their associated fault coverage probability. Finally, the failure rate of every chip in the system, as in [1], is assumed to be 100 failures per billion hours1.

1 0.0008766 failures per year.


<figtable id="tab:ex_coverage">

<xr id="tab:ex_coverage" nolink />: Coverage probabilities.
Redundant Component Fault Coverage Probability
RAM Chip 0.998
Memory Module 0.95
CPU Unit 0.995
I/O Port 0.99
Computer 0.95
</figtable>


Möbius

Möbius[edit]

Motivation[edit]

Solution[edit]

Graph

Edit Möbius Documentation

“” –

<equation id="eqn:binom" shownumber>

f(k)=\binom{n}{k}p^k(1-p)^{n-k}\quad k=0,1,\dots,n

</equation>

Sort of like <xr id="eqn:binom" />, but not really.


References[edit]

  1. 1.0 1.1 1.2 D. Lee, J. Abraham, D. Rennels, and G. Gilley. A numerical technique for the evaluation of large, closed fault-tolerant systems. In Dependable Computing for Critical Applications, pages 95–114. Springer-Verlag, Wien, 1992.