MTBF Degradation of CPU's After Sustained Overtemp
(OP)
I am seeking reliability White Papers or other relevant info on MTBF calculations or "seat-of-the-pants" assessment of long term degradation of microprocessors used in common PC servers after sustained ambient overtemperature conditions.
Background: An operating server room's HVAC system failed over a weekend. Recorded ambient and CPU temps approached 80-100 degrees C. After the HVAC was restored, all systems appeared to function normally.
Question: For a given life expectancy of uP's, is it possible to assess or calculate a percentage loss-of-life reliability quotient?
Thanks, Cliff Michael
cliffmichael@netscape.net
918 625-1563
RE: MTBF Degradation of CPU's After Sustained Overtemp
Now, we're playing with statistics here, and calculating the actual impact on a single device (or even on your roomful of devices) is of course impossible. Given that a server's useful life is probably under 5 years, this 5-month impact is probably not much to worry about, since the design life is 7 to 20 years, depending on the supplier...
Other components, such as electrolytic capacitors, are far more likely than the processor to show degradation from that overtemp condition. So watch your power supplies...
RE: MTBF Degradation of CPU's After Sustained Overtemp
Sorry for any confusion.
Mike
RE: MTBF Degradation of CPU's After Sustained Overtemp
Classic failure rate analysis is governed by the Arrhenius equation. All non-catastrophic overtemperature excursions therefore act as accelerations of the basic failure rate, which is why burn-in can be used to accelerate failures and to predict what the failure rate will be under given conditions.
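For a rough feel of the numbers, here is a minimal sketch of the Arrhenius acceleration factor, AF = exp((Ea/k) * (1/T_use - 1/T_stress)), with temperatures in kelvin. The 0.7 eV activation energy, the 50 C normal operating temperature, and the 60-hour outage are illustrative assumptions only, not values from this thread or from any datasheet; the function names are made up for the example.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def acceleration_factor(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor of the stress temperature relative to
    the use temperature. Inputs are in eV and degrees C."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


def equivalent_hours(hours_at_stress, ea_ev, t_use_c, t_stress_c):
    """Hours at the use temperature that would consume the same life as
    hours_at_stress spent at the stress temperature."""
    return hours_at_stress * acceleration_factor(ea_ev, t_use_c, t_stress_c)


if __name__ == "__main__":
    # All inputs below are assumed for illustration.
    ea = 0.7             # activation energy in eV (assumed)
    normal_temp = 50.0   # normal operating temperature in C (assumed)
    outage_hours = 60.0  # roughly one weekend (assumed)
    for overtemp in (80.0, 100.0):
        af = acceleration_factor(ea, normal_temp, overtemp)
        eq = equivalent_hours(outage_hours, ea, normal_temp, overtemp)
        print(f"{overtemp:.0f} C: AF = {af:.1f}, "
              f"about {eq / 24:.0f} days of normal operation consumed")
```

The result is very sensitive to the assumed activation energy, which differs by failure mechanism, so treat the output as an order-of-magnitude estimate at best.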
TTFN
RE: MTBF Degradation of CPU's After Sustained Overtemp
The real killers for electronics reliability are temperature cycling and vibration. These create mechanical failures in areas we don't usually pay enough attention to: solder joints and plated-through holes to inner layers, for example.
RE: MTBF Degradation of CPU's After Sustained Overtemp
http://www.electronics-cooling.com/html/2001_feb_techbrief.html
ko (www.ecooling.biz)
RE: MTBF Degradation of CPU's After Sustained Overtemp
Reliability of semiconductor devices is affected by both temperature and electric field. Hot-electron effects, for instance, are only mildly affected by temperature, but raise the voltage and watch out! Punch-through in short-channel devices (and similar problems like leakage) is voltage-driven; that's the main reason VDD keeps going down in new generations of devices... they'd fail in a second operating at 5V!
Mike
--
Mike Kirschner
Design Chain Associates, LLC
http://www.designchainassociates.com