[Beowulf] Re: failure trends in a large disk drive population

Chris Samuel csamuel at vpac.org
Sun Feb 18 14:29:54 PST 2007


On Sat, 17 Feb 2007, Jim Lux wrote:

> I think it's pretty obvious that Google has figured out how to
> partition their workload in a "can use any number of processors" sort
> of way, in which case, they probably should be buying the cheap
> drives and just letting them fail (and stay failed.. it's probably
> cheaper to replace the whole node than to try and service one)...

IIRC they have also figured out how to be fault tolerant by sending each query
out to multiple systems for each part of the DB being queried, so if one of
those fails the others will still respond.
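
A minimal sketch of that idea in Python, purely illustrative and not a
description of how Google actually does it. It assumes each part of the DB
(a "shard") has several replicas, each modelled here as a plain callable;
the names query_shard, search, healthy and dead are all made up for the
example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def query_shard(replicas, query):
    """Send the same query to every replica of one shard, return the first success.

    `replicas` is a list of callables (hypothetical stand-ins for remote nodes).
    """
    errors = []
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(replica, query) for replica in replicas]
        for future in as_completed(futures):
            try:
                # First replica to answer without raising wins.
                # Note: leaving the `with` block still waits for the
                # remaining calls to finish; a real system would cancel them.
                return future.result()
            except Exception as exc:
                # A failed node is simply ignored as long as another answers.
                errors.append(exc)
    raise RuntimeError(f"all {len(replicas)} replicas failed: {errors}")


def search(shards, query):
    """Query every shard (each a list of replicas) and collect one answer per shard."""
    return [query_shard(replicas, query) for replicas in shards]


# Hypothetical replicas: one that works, one that is down.
def healthy(query):
    return f"results for {query!r}"


def dead(query):
    raise ConnectionError("node down")
```

With shards = [[dead, healthy], [healthy, dead]], search(shards, "beowulf")
still returns an answer for both shards even though one node in each has
failed; only a shard whose replicas are all down raises an error.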

Apparently they use more reliable hardware for things like the advertising
service.

-- 
 Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
 Victorian Partnership for Advanced Computing http://www.vpac.org/
 Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia
