[Beowulf] Interesting google server design

Robert G. Brown rgb at phy.duke.edu
Thu Apr 2 03:41:24 PDT 2009


On Wed, 1 Apr 2009, Ellis Wilson wrote:

>
> Beyond that, building an entire cluster of that size into a large
> shipping container is genius - given access and resources to a crane and
> proper machining expertise.  But then again if you're building a cluster
> of that size I suppose the crane, shipping container and a few welders
> are not going to be your biggest worries.
>
> Also interesting is that they use Gigabyte - I haven't been entirely
> impressed with them and that is for purely desktop use.  Perhaps their
> server-grade boards are of high enough quality to make them worthwhile at
> that scale.

IIRC Google doesn't use "server grade" anything.  They use OTC parts and
do a running computation on failure rates to optimize price-performance
dynamically.  This is truly industrial-scale production.  For them,
servicing/replacing a system is cheap:  Box dies.  Employee notes this,
grabs box from Big Stack of Boxes, carries it to dead box, removes dead
box, replaces it with new working box, presses power switch, walks away.
Problem solved.  All boxes are handmade, so dead box goes into
boneyard/parts pile to be recycled.  I'm guessing that they do a very,
very limited amount of diagnostics on dead box -- enough to determine
whether the problem is the power supply, disk(s), battery, or
motherboard -- and if they can tell, they take out the offender, replace
it from the stock shelf, test the rebuilt box, and put it back in the
Big Stack of Boxes.  Dead part is probably tallied and junked.  MAYBE
they have some sort of return system to vendors (at the very least they
probably get Sam's Club pricing:-) or maybe they just dump it in the
trash and move on.

Because there is zero penalty to node failure other than those fifteen
minutes of human time and maybe a spare part distributed along an
assembly line that handles (I'm guessing) tens of failures an hour,
there is absolutely no advantage to them in using tier-1 parts.  All
they care about is that the stack of parts they reach for is at the
sweet spot of MTBF per dollar spent per unit of "service" delivered by
the device.  All beowulfers should take note -- this is a perfect
exemplar of a principle all cluster builders should use, although of
course for different problems the optimization landscape will differ as
well (some problems are NOT tolerant of single node failure:-).
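
For what it's worth, here is a toy sketch of the arithmetic behind that
sweet spot.  All of the part names, prices, MTBF figures, and the
per-swap labor cost below are numbers invented purely to illustrate the
calculation -- not anything Google publishes:

    # Toy sweet-spot calculation: expected cost per delivered unit of
    # "service", folding in part price, observed MTBF, and the human
    # time spent swapping a dead box.  All figures are made up.

    LABOR_COST_PER_SWAP = 15.0      # ~15 minutes of tech time, in dollars
    HOURS_PER_YEAR = 24 * 365

    # (name, price in dollars, observed MTBF in hours, relative service rate)
    candidates = [
        ("cheap OTC board",      90.0, 25_000, 1.00),
        ("mid-range board",     140.0, 45_000, 1.00),
        ("tier-1 server board", 320.0, 90_000, 1.05),
    ]

    def cost_per_service_unit(price, mtbf_hours, service_rate, years=3):
        """Expected dollars per service-unit-hour over the planning horizon."""
        hours = years * HOURS_PER_YEAR
        expected_failures = hours / mtbf_hours
        # initial purchase plus expected replacements (part + swap labor)
        total_cost = price + expected_failures * (price + LABOR_COST_PER_SWAP)
        delivered = hours * service_rate    # ignore swap downtime; it's tiny
        return total_cost / delivered

    for name, price, mtbf, rate in candidates:
        print(f"{name:22s} {cost_per_service_unit(price, mtbf, rate):.6f} $/unit-hour")

The point is simply that once a failure costs only a few dollars of
labor, the cheap part's higher failure rate stops mattering; the part
with the lowest expected cost per unit of service wins, not the part
with the best MTBF.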

   rgb

>
> Ellis
>
> Robert G. Brown wrote:
>> On Wed, 1 Apr 2009, Bill Broadley wrote:
>>
>>> http://news.cnet.com/8301-1001_3-10209580-92.html
>>
>> Very interesting.  "UPS" built right into the box, and the whole thing
>> basically handmade out of OTC components.  Power supply outside of the
>> box (which makes great sense from a cooling point of view!).
>>
>>    rgb
>>

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu




