1.5 Reliability Characterization
The reliability of a device, $R(t)$, expressed as a function of the time $t$ elapsed since manufacture, is the probability that the device functions correctly up to that time instant. We shall revisit this definition in Chapter 6 for built-in self-repairable RAMs.
The mean time between failures (MTBF) is defined as the reciprocal of the failure rate $\lambda$. In Chapter 6, we derive a related quantity, the mean time to fail (MTTF), by integrating the reliability function over time $t$, with $t$ ranging from 0 to $\infty$. For reliability functions that have a negative exponential form, these two timing parameters are equal. For example, if $R(t) = e^{-\lambda t}$, then MTTF = MTBF = $1/\lambda$. If MTBF = $10^9$ hours for a single device, then its failure rate is defined as 1 FIT (failure in time).
Reliability is typically characterized by a bathtub curve, shown in Figure 1.8. This curve illustrates three phases of failure in product life: (1) infant mortality; (2) operational failure; and (3) wear-out. In the infant mortality phase, failure rates are very high. During the operational life, failure rates are comparatively lower and level off to a constant value. During the wear-out phase, the failure rate increases once again.
Figure 1.8. The bathtub curve
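The relationships among MTTF, MTBF, and FIT stated earlier in this section can be checked numerically. The following is a minimal sketch, assuming a constant failure rate so that the reliability function is a negative exponential. The function names, the trapezoidal integration, and the truncation horizon are illustrative choices, not taken from the text.

```python
import math

# 1 FIT corresponds to one failure per 10^9 device-hours.
HOURS_PER_FIT_UNIT = 1.0e9


def reliability(t, lam):
    """R(t) = exp(-lam * t) for a constant failure rate lam (per hour)."""
    return math.exp(-lam * t)


def mttf_numeric(lam, steps=200_000, horizon_in_mtbf=40.0):
    """Approximate MTTF as the integral of R(t) from 0 to infinity.

    The infinite upper limit is truncated at horizon_in_mtbf / lam
    (an assumption; the neglected tail is about exp(-40) of the total),
    and the integral is evaluated with the trapezoidal rule.
    """
    t_max = horizon_in_mtbf / lam
    dt = t_max / steps
    total = 0.5 * (reliability(0.0, lam) + reliability(t_max, lam))
    for i in range(1, steps):
        total += reliability(i * dt, lam)
    return total * dt


def fit_from_mtbf(mtbf_hours):
    """Convert an MTBF in hours to a failure rate in FIT."""
    return HOURS_PER_FIT_UNIT / mtbf_hours


if __name__ == "__main__":
    lam = 1.0e-9              # failures per hour
    mtbf = 1.0 / lam          # reciprocal of the failure rate
    mttf = mttf_numeric(lam)  # integral of the reliability function
    print(f"MTBF = {mtbf:.6e} h, numeric MTTF = {mttf:.6e} h")
    print(f"failure rate = {fit_from_mtbf(mtbf):.3f} FIT")
```

For the exponential case, the numerically integrated MTTF should agree with $1/\lambda$ to within the integration error. For other reliability functions the two quantities need not coincide.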
Infant mortality depends on the quality of the manufacturing process and the amount of burn-in testing done. As discussed before, burn-in accelerates infant mortality failures and tends to flatten the bathtub curve. The useful life of a device is characterized by a constant failure rate during field use. Failures that occur during useful life are sporadic in nature and are caused by process and design defects, such as electrical shorts and opens, mismatch between device parameters, and parametric and timing faults.
In the wear-out region, failure rates increase continuously with time. Systems are designed with reliability specifications that ensure the system will not enter the wear-out region within its active lifetime.
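The three phases of the bathtub curve can be illustrated with a simple hazard-rate model. The sketch below is one common way to approximate such a curve, not a model given in the text. It sums a decreasing Weibull hazard (infant mortality), a constant hazard (useful life), and an increasing Weibull hazard (wear-out). All parameter values are hypothetical and chosen only to make the shape visible.

```python
import math


def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1), t > 0.

    shape < 1 gives a decreasing rate, shape == 1 a constant rate,
    and shape > 1 an increasing rate.
    """
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def bathtub_hazard(t,
                   infant=(0.5, 1.0e6),    # hypothetical (shape, scale in hours)
                   useful_rate=1.0e-6,     # hypothetical constant rate per hour
                   wearout=(5.0, 1.0e5)):  # hypothetical (shape, scale in hours)
    """Failure rate at time t (hours) as the sum of three contributions."""
    return (weibull_hazard(t, *infant)
            + useful_rate
            + weibull_hazard(t, *wearout))


if __name__ == "__main__":
    # Sample the curve at a few points spanning the three phases.
    for hours in (1.0, 1.0e2, 1.0e3, 1.0e4, 5.0e4, 1.0e5):
        print(f"t = {hours:>9.0f} h   failure rate = {bathtub_hazard(hours):.3e} per hour")
```

With these assumed parameters, the failure rate starts high, settles near the constant useful-life rate, and rises again as the wear-out term takes over. In this model, a burn-in period would correspond to starting field use at a later time `t`, after the infant mortality term has decayed.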