[clug] Effect of operating temps on life of integrated circuits

Michael Cohen scudette at gmail.com
Sat Jan 23 01:28:35 MST 2010

There was a great paper written by Google about HDD failure rates:


They looked at lots of factors but wrt temperature:

Previous studies have indicated that temperature deltas as low as
15C can nearly double disk drive failure rates [4].

Their analysis is very interesting, but the gist of it is:

We can conclude that at moderate temperature ranges it is likely that there
are other effects which affect failure rates much more strongly than
temperatures do.

AFAIK their study is unique because it was run on real equipment in
real data centers over many years. Accelerated failure tests may not
apply to long-term failure.


On Sat, Jan 23, 2010 at 6:32 PM, steve jenkin <sjenkin at canb.auug.org.au> wrote:
> I've been trying to track down any estimates/rules-of-thumb about the
> effect of increased junction temperature on expected lifetime of IC's
> like CPU's and RAM - and was hoping the Collective Wisdom of CLUG (and
> some better Google-Fu than mine) could come up with quotable figures.
> There's a good paper "Calculation of Semiconductor Failure Rates" by
> William J. Vigrass that lists a bunch of failure mechanisms and models,
> but no values.
> [PDF's available through google]
> Related pages, quick reads, apart from 1st:
> <http://www.tmworld.com/article/320758-What_Causes_Semiconductor_Devices_to_Fail_.php>
> <http://www.tmworld.com/article/324462-Models_Predict_Failure_Rates.php>
> <http://www.tmworld.com/article/317633-Using_Models_to_Predict_Semiconductor_Failures.php>
> <http://www.tmworld.com/article/318052-The_Effect_of_Temperature_on_Failure_Rate.php>
> The "Arrhenius Equation" is commonly used as an empirical method for
> estimating MTTF (lifetime) from Accelerated testing. [The proposition is
> that silicon devices fail for physico-chemical reasons, so lifetimes are
> determined by rates of chemical reaction.]
> Arrhenius observed that rates of chemical reactions approx doubled for
> each 10°C rise of temp [why milk 'goes off' quickly when left out].
> Individual reactions can be described by two constants (a pre-exponential
> factor and an 'Activation Energy'); the Acceleration Factor between two
> temperatures follows from them.
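
A minimal sketch of the Arrhenius relation described above, in Python. The
activation energy of 0.7 eV is purely an illustrative assumption (real values
depend on the failure mechanism and are not given in this thread), and the
function name is hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def acceleration_factor(t_use_c, t_stress_c, activation_energy_ev):
    """Ratio of the reaction (failure) rate at t_stress_c to that at t_use_c."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_use_k - 1.0 / t_stress_k
    )
    return math.exp(exponent)


# Assumed activation energy, for illustration only.
EA_EV = 0.7
for t_hot_c in (60, 70, 80):
    af = acceleration_factor(50, t_hot_c, EA_EV)
    print(f"50C -> {t_hot_c}C: rate x{af:.2f}, expected lifetime / {af:.2f}")
```

With this assumed value, a 10°C rise near 50°C gives a factor of roughly 2,
in line with the rule of thumb; a different activation energy gives a
different factor.
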
> While I've come up with pointers to the theory, I wanted some more
> specific numbers to use/chuck around :-)
> How much lifetime do you sacrifice by running your CPU hotter?
> Inversely, how slack can you be with your server room cooling and still
> expect a 5 year life?
> The closest I've come is this nice little table. "TEC" ==
> "Thermoelectric cooler" (peltier devices)
> <http://www.rmtltd.ru/subpages/app_tips_tec_high_temp.htm>
> Operating temperature, °C:   25       70       85       125      150      200
> TEC lifetime, hrs:           9.9E+05  6.0E+04  2.7E+04  4.6E+03  1.8E+03  3.6E+02
> 5 years ~= 45,000 hrs, or 4.5E+04,
> which falls between 70°C and 85°C in the table above.
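
As a cross-check of the 5-year figure against the table, here is a rough
sketch that interpolates between table rows. It assumes lifetime follows an
Arrhenius-like curve between adjacent rows (ln(lifetime) linear in 1/T), which
is my assumption and not something the source page states; the function name
is hypothetical. Note the table is for TECs, not CPUs or RAM.

```python
import math

# Table values as quoted above: (temperature in degrees C, lifetime in hours).
TEC_LIFETIME_HRS = [
    (25, 9.9e5), (70, 6.0e4), (85, 2.7e4),
    (125, 4.6e3), (150, 1.8e3), (200, 3.6e2),
]


def temperature_for_lifetime(target_hrs, table=TEC_LIFETIME_HRS):
    """Interpolate ln(lifetime) linearly in 1/T (kelvin) between bracketing rows."""
    for (t_lo, life_lo), (t_hi, life_hi) in zip(table, table[1:]):
        if life_hi <= target_hrs <= life_lo:
            inv_lo = 1.0 / (t_lo + 273.15)
            inv_hi = 1.0 / (t_hi + 273.15)
            frac = (math.log(life_lo) - math.log(target_hrs)) / (
                math.log(life_lo) - math.log(life_hi)
            )
            inv_t = inv_lo + frac * (inv_hi - inv_lo)
            return 1.0 / inv_t - 273.15
    raise ValueError("target lifetime is outside the table range")


five_years_hrs = 5 * 365 * 24  # 43,800 hours
print(five_years_hrs, round(temperature_for_lifetime(five_years_hrs), 1))
```

The result lands between the 70°C and 85°C rows of the table.
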
> BTW, this piece implies MTBF is median life (50%-ile), elsewhere MTTF is
> quoted at 1-std. deviation (63.2%).
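
One way to relate the two figures, assuming a constant failure rate (an
exponential lifetime model, which is my assumption and not something stated in
the thread): the fraction failed by t = MTTF is 1 - 1/e, about 63.2%, while
the median (50%) life is MTTF * ln 2. The MTTF value below is hypothetical.

```python
import math


def fraction_failed(t_hrs, mttf_hrs):
    """Cumulative failure probability under an exponential lifetime model."""
    return 1.0 - math.exp(-t_hrs / mttf_hrs)


mttf_hrs = 100_000.0  # hypothetical MTTF, for illustration only
median_life_hrs = mttf_hrs * math.log(2)

print(f"failed by MTTF:   {fraction_failed(mttf_hrs, mttf_hrs):.1%}")
print(f"failed by median: {fraction_failed(median_life_hrs, mttf_hrs):.1%}")
print(f"median / MTTF:    {median_life_hrs / mttf_hrs:.3f}")
```

Under that model the median life is only about 69% of the MTTF, so the two
numbers are not interchangeable.
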
> Thanks in advance for any insight/help/pointers.
> regards
> steve
> PS:
> If anyone wants to waste time reading, there's a nice 'Peltier Guide':
> <http://www.heatsink-guide.com/peltier.htm>
> One of the 'gotchas' is their inefficiency: for every Watt moved, they
> can consume 1-2 Watts. If you have a 60W CPU, you end up burning
> another 120W in cooling, needing a total of 180W dissipated by the
> heatsink! :-(
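
The heat budget above as a small worked example. The 1-2 W consumed per W
moved is the range quoted in the post; the function name is hypothetical.

```python
def peltier_heat_budget(cpu_watts, watts_consumed_per_watt_moved):
    """Return (power drawn by the Peltier, total heat the heatsink must shed)."""
    cooler_watts = cpu_watts * watts_consumed_per_watt_moved
    return cooler_watts, cpu_watts + cooler_watts


for ratio in (1, 2):
    cooler, total = peltier_heat_budget(60, ratio)
    print(f"{ratio} W per W moved: Peltier draws {cooler} W, heatsink sheds {total} W")
```

The 120W / 180W figures correspond to the pessimistic end of the range (2 W
per W moved); at 1 W per W moved the heatsink would see 120W in total.
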
> Another cute factoid I ran into about Fans:
> Noise varies with *5th* power of fan speed.
> Running fans at half-speed is much quieter :-)
> Not sure about the power-draw of fans w.r.t. speed.
> I'd have thought cube of speed, but that article said "square law" :-(
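
A quick sketch of what those scaling laws imply at half speed. The 5th-power
noise law is taken as quoted; for power, both the cube law (what the standard
fan affinity laws give for shaft power) and the "square law" from the article
are shown. Function names are hypothetical.

```python
import math


def noise_change_db(speed_ratio, exponent=5):
    """Change in sound power level (dB) if sound power scales as speed**exponent."""
    return 10 * exponent * math.log10(speed_ratio)


def power_ratio(speed_ratio, exponent):
    """Relative power draw if power scales as speed**exponent."""
    return speed_ratio ** exponent


half = 0.5
print(f"noise at half speed: {noise_change_db(half):+.1f} dB")
print(f"power at half speed, cube law:   {power_ratio(half, 3):.3f} of full")
print(f"power at half speed, square law: {power_ratio(half, 2):.3f} of full")
```

Under the 5th-power law, half speed is about 15 dB quieter; power drops to
1/8 under a cube law or 1/4 under a square law.
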
> --
> Steve Jenkin, Info Tech, Systems and Design Specialist.
> 0412 786 915 (+61 412 786 915)
> PO Box 48, Kippax ACT 2615, AUSTRALIA
> sjenkin at canb.auug.org.au http://members.tip.net.au/~sjenkin
