[Beowulf] The Walmart Compute Node?
deadline at eadline.org
Thu Nov 8 16:25:37 PST 2007
> Can you actually show which software you run to get those gflops?
As I said, HPL with Goto BLAS, pretty standard.
> The truth currently is that quadcores are far superior for number
> crunching, because of the power draw in the long run and a processing
> speed that is far higher than dual cores, though far from 2x.
> for real low power number crunching of course you don't put in
> harddrives, that wastes energy for nothing as well as money.
> boot from usb obviously. Core2 is far superior, and barcelona core
> based chips still have to show up;
PXE boot is a bit more obvious (Warewulf).
> AMD's older K8 is nowhere near core2 in the speed you need for
> double precision floating point number crunching.
> In all cases the power draw of those cpu's, regardless whether it's
> intel or AMD, is eating up way more watts than they quote on the
> internet for TDP's.
> Calculating with those tdp's is quite impossible.
> A quadcore 2.4GHz machine, without a hard drive, measured
> with a good multimeter and a reasonably power-efficient
> power supply, came in at around 170 watts.
> If you replace that cpu by some other cpu, you might perhaps get away
> with a tad less, but still close to that 170 watt.
> For number crunching for say a year, 170 watt hammers in bigtime into
> energy costs.
The Norbert cluster (4 nodes, 3 without hard drives) requires
250 Watts when idle and 371-483 Watts when loaded running HPL,
measured at the wall socket with a Kill-A-Watt meter.
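
To put wall-socket numbers like these into yearly terms, here is a rough sketch. The wattages are the ones quoted in this thread; the electricity price is a made-up placeholder (not a real rate), and the loads are assumed to run around the clock.

```python
# Rough yearly energy estimate for a constant electrical load.
HOURS_PER_YEAR = 24 * 365  # ignores leap years

def yearly_kwh(watts):
    """Energy in kWh used by a constant load of `watts` over one year."""
    return watts * HOURS_PER_YEAR / 1000.0

def yearly_cost(watts, price_per_kwh):
    """Yearly cost of a constant load at a flat price per kWh."""
    return yearly_kwh(watts) * price_per_kwh

# ASSUMPTION: placeholder price, substitute your own utility rate.
PRICE_PER_KWH = 0.10

# Wattages taken from this thread.
loads = {
    "single quadcore node (170 W)": 170,
    "Norbert idle (250 W)": 250,
    "Norbert HPL, low end (371 W)": 371,
    "Norbert HPL, high end (483 W)": 483,
}

for name, watts in loads.items():
    print(f"{name}: {yearly_kwh(watts):.0f} kWh/year, "
          f"{yearly_cost(watts, PRICE_PER_KWH):.2f} per year at the assumed rate")
```

Real usage will sit somewhere between the idle and loaded figures, depending on how busy the cluster is.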
Managing power is certainly possible when the three
diskless nodes are not in use. There are two strategies:
1) When not in use, put the nodes in standby mode (draws about 5 Watts)
and use Wake-on-LAN to start them up. Because they use PXE boot
and a RAM disk, they boot in about a minute. If you use a scheduler
like SGE or Torque it should be possible to have the nodes
started when needed.
2) The new tickless kernels may be able to drop the nodes
into power saving Intel C states (the lowest is about 1.5 Watts)
when not in use (although tickless kernels (2.6.22+) only work
with 32-bit processors right now).
The first method is more general and the second is specific to Intel
processors (and is new and untested). Overall it should be possible
to have a small cluster use about as much power as a desktop when idle.
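
For the first strategy, a minimal sketch of the Wake-on-LAN side might look like the following. It builds the standard magic packet (six 0xFF bytes followed by the target MAC address repeated 16 times) and sends it as a UDP broadcast. The function names, the broadcast address, and the port are my own choices for illustration; how a scheduler such as SGE or Torque would call this is site specific and left out.

```python
import socket

def build_magic_packet(mac):
    """Return the 102-byte Wake-on-LAN magic packet for a MAC address."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("expected a 6-byte MAC address, got %r" % mac)
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16

def wake_node(mac, broadcast="255.255.255.255", port=9):
    """Broadcast a magic packet on the local network.

    ASSUMPTION: the limited broadcast address and UDP port 9 are common
    choices; adjust them for your cluster's private network.
    """
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))

# Example (hypothetical MAC address, not a real node):
# wake_node("00:11:22:33:44:55")
```

The node's BIOS and network card must have Wake-on-LAN enabled for the packet to have any effect.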
> So putting a dualcore chip inside is a ridiculous thought in itself.
$52 (US) per GFLOPS and 11 Watts per GFLOPS speak for themselves.
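
For anyone who wants to check the arithmetic, here is a small sketch using the Norbert numbers from this thread ($2500, 45.55 and 47.7 HPL GFLOPS, 483 Watts peak under HPL). Which combination of figures produced the rounded values above is my assumption.

```python
# Price/performance and power/performance for Norbert.
SYSTEM_COST_USD = 2500.0    # hardware budget quoted in this thread
HPL_GFLOPS_STOCK = 45.55    # HPL result from the table
HPL_GFLOPS_TWEAKED = 47.7   # HPL result after tweaking
LOADED_WATTS_PEAK = 483.0   # upper end of the measured HPL power draw

def per_gflops(total, gflops):
    """Divide a total (dollars or watts) by HPL GFLOPS."""
    return total / gflops

# ASSUMPTION: cost is divided by the tweaked result, power by the stock one.
dollars = per_gflops(SYSTEM_COST_USD, HPL_GFLOPS_TWEAKED)
watts = per_gflops(LOADED_WATTS_PEAK, HPL_GFLOPS_STOCK)

print(f"{dollars:.1f} dollars per HPL GFLOPS")
print(f"{watts:.1f} Watts per HPL GFLOPS")
```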
> On Nov 8, 2007, at 9:36 PM, Douglas Eadline wrote:
>> Having some experience with low cost hardware, if you are
>> doing number crunching, multi-core seems to provide the
>> best bang for the buck. The following is the HPL performance that
>> you can get for $2500. The Kronos and Microwulf clusters
>> are detailed on http://clustermonkey.net; Norbert is the subject
>> of a November Linux Magazine article.
>> Cluster Name/                     Clock        Release   HPL
>> Processor                         Speed (MHz)  Date      Performance
>> Kronos/Sempron 2500+ (8)          1750         7/2004    14.90 GFLOPS (Atlas)
>> Microwulf/Athlon64 X2 3800+ (4)   2000         8/2005    26.25 GFLOPS (Goto)
>> Norbert/Core Duo E6550 (4)        2333         7/2007    45.55 GFLOPS (Goto)
>> If you draw a line (3 points, I know) you get to 80 GFLOPS
>> by 2010. Actually, with some tweaking I got Norbert
>> up to 47.7 HPL GFLOPS. And notice I qualify the performance
>> as "HPL GFLOPS", as YMMV.
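
The straight-line extrapolation can be sketched as an ordinary least-squares fit through the three table entries. Converting the release dates to fractional years (month 7 of 2004 becomes 2004.5, and so on) is my assumption, as is the choice of a plain linear fit; with only three points this is a back-of-the-envelope estimate, nothing more.

```python
# Least-squares line through (release date, HPL GFLOPS) for the three clusters.
points = [
    (2004 + (7 - 1) / 12, 14.90),  # Kronos,    7/2004
    (2005 + (8 - 1) / 12, 26.25),  # Microwulf, 8/2005
    (2007 + (7 - 1) / 12, 45.55),  # Norbert,   7/2007
]

def fit_line(pts):
    """Return (slope, intercept) of the least-squares line through pts."""
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def year_reaching(gflops, slope, intercept):
    """Fractional year at which the fitted line reaches `gflops`."""
    return (gflops - intercept) / slope

slope, intercept = fit_line(points)
print(f"trend: about {slope:.1f} HPL GFLOPS gained per year")
print(f"80 HPL GFLOPS reached around {year_reaching(80.0, slope, intercept):.1f}")
```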
>> With really low cost systems one important aspect is the
>> interconnect. The PCIe buses on low end motherboards allow
>> one to use inexpensive PCIe (Intel) Ethernet cards vs
>> 32-bit PCI. Some of the on-board GigE implementations are
>> not very good.
>>> Recently, as you probably noticed, Walmart began selling a $200 Linux PC
>>> (apparently the OS is just Ubuntu 7.10 with a small window manager
>>> instead of Gnome or KDE). Now Slashdot points to
>>> http://www.linuxdevices.com/news/NS5305482907.html, the MB being sold
>>> separately for $60 ("development board"). It has 1.5GHz CPU,
>>> unpopulated memory (slots for 2GB), one 10/100 connection. Does this
>>> look to y'all like fair FLOPS/$ for a kitchen project? I'm thinking 6
>>> of them as compute nodes per 8 port router, with a bigger head node
>>> for fileserving. (actually I'll use a spare room but you know what I
>>> mean). An arrangement like this might have faster RAM access per core,
>>> compared to multicore, since each core has no competition for its
>>> memory, right?
>>> Beowulf mailing list, Beowulf at beowulf.org
>>> To change your subscription (digest mode or unsubscribe) visit