dual AMD clusters
siegert at sfu.ca
Fri Jun 22 12:04:38 PDT 2001
On Thu, Jun 21, 2001 at 16:45:27 -0400, Dan Kirkpatrick wrote:
> We're about to build our PoP (Pile of Pentiums) beowulf cluster #2 and are
> trying to build this one even better since our budget is 4x that of the 1st.
> There's a wide variety of needs, some small code, some large (200mb-1gb+).
> Mostly threaded, some in the future may be parallel.
> We've got a few questions for the list...
> 3. Processor choice
> What about Dual Athlon's? Are they actually available and reliable? We've
> gotten the feeling that they're fairly new, run hotter (more cooling/larger
> case needed), and more expensive, but for the calculations we're doing, it
> may be more power for the money in the end if they are actually available
> and reliable.
I have almost completed my test of a dual AMD cluster. The test cluster
has two dual-processor nodes. The master node has 5 NICs (2 onboard 3c980,
3 3c905B); the internal node has 4 NICs (2 onboard 3c980, 2 3c905B). On each
node, 3 NICs are used in a channel-bonded configuration and one is used for
NFS traffic. The remaining one on the master node is for the internet
connection.
I am using a 2.4.5 kernel on an otherwise RH7.1 system.
I have encountered a few problems, all but one of which have been solved:
1) Pay close attention to the memory chips that Tyan has approved. Other
chips may not work.
2) The latest BIOS upgrade solves problems with booting the nodes.
3) There is a bug in the 2.4.5 kernel (somewhere in the apic code) that
brings network connections to a grinding halt. Using the -ac versions
(e.g., 2.4.5-ac17) solves this problem.
4) lm_sensors fails to recognize the hardware monitoring chips on the
Tyan motherboard.
Otherwise the system has been rock solid: no crashes, very good performance.
Furthermore, Tyan is going to release a no-SCSI version of the motherboard
soon - this will make the dual AMD system very competitive - I finally
made up my mind to go that way.
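
The kernel problem in item 3 above is easy to miss when nodes are installed
one at a time, so a per-node check can help. The following is a minimal
sketch in Python, not part of the original setup: the function name
needs_ac_patch is hypothetical, and it assumes the release string has the
form reported by "uname -r" (e.g., 2.4.5 or 2.4.5-ac17).

```python
import platform
import re


def needs_ac_patch(release: str) -> bool:
    """Return True if `release` looks like a 2.4.5 kernel without an -ac patch.

    Hypothetical helper: it only recognises the 2.4.5 series mentioned in the
    post and says nothing about any other kernel version.
    """
    match = re.fullmatch(r"2\.4\.5(-.*)?", release.strip())
    if match is None:
        # Not a 2.4.5 kernel at all, so this check does not apply.
        return False
    suffix = match.group(1) or ""
    # An -acNN suffix (e.g. -ac17) marks the patched series.
    return re.match(r"-ac\d+", suffix) is None


if __name__ == "__main__":
    release = platform.release()
    print(release, "needs -ac patch:", needs_ac_patch(release))
```

Run on each node, this prints the running kernel release and whether it
falls into the affected 2.4.5 case.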
If only somebody could show me how to patch lm_sensors to detect the
hardware monitoring chips on the Tyan motherboard:
1) I was able to insert the i2c-amd756.o module (after changing the
PCI_DEVICE_ID to 7413).
2) sensors-detect now shows a Winbond W83782D chip, but "modprobe -k w83781d"
brings the box to a full stop (only a hard reset helps). This obviously
needs some work - does anybody know more about this?
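
Before changing the PCI_DEVICE_ID in a driver, it may be worth confirming
which ID the board actually reports. The sketch below is an illustration
only: it assumes the 7413 mentioned above is the hexadecimal device ID as
printed by "lspci -n", and that the vendor ID is AMD's 1022. The function
name has_device and the sample line are hypothetical.

```python
import re

AMD_VENDOR_ID = "1022"    # AMD's PCI vendor ID
SMBUS_DEVICE_ID = "7413"  # device ID from the post (assumed hexadecimal)


def has_device(lspci_n_output: str,
               vendor: str = AMD_VENDOR_ID,
               device: str = SMBUS_DEVICE_ID) -> bool:
    """Return True if any line of `lspci -n` output lists vendor:device."""
    pattern = re.compile(
        r"\b%s:%s\b" % (re.escape(vendor), re.escape(device)),
        re.IGNORECASE,
    )
    return any(pattern.search(line) for line in lspci_n_output.splitlines())


if __name__ == "__main__":
    # Hypothetical sample line; real output will differ from board to board.
    sample = "00:07.3 0680: 1022:7413 (rev 01)"
    print(has_device(sample))
```

On a live node the text could instead come from
subprocess.run(["lspci", "-n"], capture_output=True, text=True).stdout.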
Academic Computing Services phone: (604) 291-4691
Simon Fraser University fax: (604) 291-4242
Burnaby, British Columbia email: siegert at sfu.ca
Canada V5A 1S6