[Beowulf] Re: Cooling vs HW replacement

Greg Lindahl lindahl at pathscale.com
Sun Jan 23 14:35:48 PST 2005

On Sun, Jan 23, 2005 at 11:30:30AM -0500, Robert G. Brown wrote:

> So I reiterate -- MTBF for hard disks, as reported by the manufacturer,
> is a nearly useless number.

It is useful if you use it for what it's meant to be used for: the
failure rate in the bottom of the bathtub. I don't know why you were
thinking of using it for anything else, like disk lifetime, or infant
mortality. I have found that my actual failure rates have been 2X-3X
the rate implied by the manufacturer's MTBF, but you always have to
worry about dust, power surges, and excess heat incidents in real
machine rooms.
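
To make the bottom-of-the-bathtub reading concrete, here is a rough
sketch of the arithmetic. It assumes a constant failure rate of 1/MTBF
(the exponential model) and 24x7 operation. The 1,000,000 hour MTBF,
the 1000-drive fleet, and the function names are made up for
illustration and are not any vendor's figures; the derate factor stands
in for the 2X-3X seen in real machine rooms.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760 hours of continuous operation


def annual_failure_fraction(mtbf_hours: float) -> float:
    """Fraction of units expected to fail within one year, assuming a
    constant failure rate of 1/MTBF (exponential lifetime model)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def expected_failures_per_year(n_units: int, mtbf_hours: float,
                               derate: float = 1.0) -> float:
    """Expected failures per year in a fleet where failed units are
    replaced. `derate` multiplies the quoted failure rate to account
    for real-world conditions (an assumption, not a vendor parameter)."""
    return n_units * derate * HOURS_PER_YEAR / mtbf_hours


if __name__ == "__main__":
    mtbf = 1_000_000  # hypothetical quoted MTBF, in hours
    print(annual_failure_fraction(mtbf))
    print(expected_failures_per_year(1000, mtbf))
    print(expected_failures_per_year(1000, mtbf, derate=3.0))
```

Under these assumptions a 1,000,000 hour MTBF does not mean a disk
lasts 114 years; it means a bit under 1% of a large population fails
per year while the population is in the flat part of the curve.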

MTBF for just about everything is computed the same way, and most gizmos
have the same bathtub-shaped failure curve.
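
One common way to sketch a bathtub curve is as the sum of three hazard
terms: a decreasing Weibull term for infant mortality, a constant term
for the flat bottom, and an increasing Weibull term for wear-out. The
code below is only an illustration of that shape. Every parameter value
is invented, and this is not how any manufacturer computes its number.

```python
def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Weibull hazard rate h(t) = (k/s) * (t/s)**(k-1), for t > 0.
    shape < 1 gives a falling rate, shape > 1 a rising one."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_hours: float) -> float:
    """Failures per unit-hour at age t_hours (illustrative values only)."""
    infant = weibull_hazard(t_hours, shape=0.5, scale=1e9)   # early failures
    bottom = 1e-6                                            # 1/MTBF, MTBF = 1e6 h
    wearout = weibull_hazard(t_hours, shape=6.0, scale=8e4)  # end of life
    return infant + bottom + wearout


if __name__ == "__main__":
    for age in (100, 20_000, 60_000):
        print(age, bathtub_hazard(age))
```

With these made-up parameters the rate at 20,000 hours sits close to
the constant term, which is the only region the quoted MTBF describes.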

> They present us with an obviously globally false number that is
> almost unbearably optimistic and cheery.

Operator error, I'm afraid.

-- greg
