dual AMD clusters

Martin Siegert siegert at sfu.ca
Fri Jun 22 12:04:38 PDT 2001


On Thu, Jun 21, 2001 at 16:45:27 -0400, Dan Kirkpatrick wrote:
> We're about to build our PoP (Pile of Pentiums) beowulf cluster #2 and are 
> trying to build this one even better since our budget is 4x that of the 1st 
> one...
> There's a wide variety of needs, some small code, some large (200mb-1gb+).
> Mostly threaded, some in the future may be parallel.
> 
> We've got a few questions for the list...
<snip>
> 3. Processor choice
> What about Dual Athlon's?  Are they actually available and reliable?  We've 
> gotten the feeling that they're fairly new, run hotter (more cooling/larger 
> case needed), and more expensive, but for the calculations we're doing, it 
> may be more power for the money in the end if they are actually available 
> and reliable.

I have almost completed my test of a dual AMD cluster. The test cluster
has two dual nodes. The master node has 5 NICs (2 onboard 3c980, 3 3c905B);
the internal node has 4 NICs (2 onboard 3c980, 2 3c905B). On each node, 3 NICs
are used in a channel-bonded configuration and one is used for NFS traffic.
The remaining one on the master node is for the internet connection.
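
For anyone who wants to try a similar bonded setup, here is a minimal
sketch of the usual sequence with the bonding driver: load the module,
bring up the bond interface, then enslave the NICs. It only builds the
command strings; it does not run them. The interface names (eth1-eth3),
the bond name (bond0), and the addresses are assumptions for illustration,
not the values used on the test cluster.

```python
# Sketch only: builds the shell commands for a 3-NIC bonded interface.
# Interface names, the bond name, and the addresses are assumptions,
# not the values used on the test cluster.

def bonding_commands(bond="bond0", slaves=("eth1", "eth2", "eth3"),
                     address="192.168.1.1", netmask="255.255.255.0"):
    """Return the commands to load the bonding driver and enslave the NICs."""
    if not slaves:
        raise ValueError("need at least one slave interface")
    return [
        "modprobe bonding",                                  # load the driver
        f"ifconfig {bond} {address} netmask {netmask} up",   # bring up the bond
        f"ifenslave {bond} " + " ".join(slaves),             # attach the NICs
    ]

if __name__ == "__main__":
    for cmd in bonding_commands():
        print(cmd)
```

The NFS and internet NICs would be configured separately as ordinary
interfaces and are left out of the sketch.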

I am using a 2.4.5 kernel on an otherwise RH7.1 system.

I have encountered a few problems, all but one of which have been solved:
1) Pay close attention to the memory chips that Tyan has approved. Other
   chips may not work.
2) The latest BIOS upgrade solves problems with booting the nodes.
3) There is a bug in the 2.4.5 kernel (somewhere in the APIC code) that
   brings network connections to a grinding halt. Using the -ac versions
   (e.g., 2.4.5-ac17) solves this problem.
4) lm_sensors fails to recognize the hardware monitoring chips on the
   Tyan motherboard.
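
Regarding point 3, a quick way to confirm that a node is actually booted
into an -ac kernel is to look at the release string. The following is a
rough sketch, assuming the release string has the form "x.y.z-acN" as in
the example above; the helper name ac_patch_level is made up for this
illustration.

```python
import re

# Sketch only: the "-acN" suffix convention is taken from the example
# version above (2.4.5-ac17); the helper name is hypothetical.
_AC_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-ac(\d+)$")

def ac_patch_level(release):
    """Return the -ac patch number of a kernel release string, or None."""
    m = _AC_RE.match(release.strip())
    return int(m.group(4)) if m else None

if __name__ == "__main__":
    import platform
    # platform.release() reports the same string as `uname -r`.
    level = ac_patch_level(platform.release())
    print("running an -ac kernel" if level is not None else "not an -ac kernel")
```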

Otherwise the system has been rock solid: no crashes, very good performance.
Furthermore, Tyan is going to release a no-SCSI version of the motherboard
soon, which will make the dual AMD system very competitive. I have finally
made up my mind to go that way.

If only somebody could show me how to patch lm_sensors to detect the
hardware monitoring chips on the Tyan motherboard:
1) I was able to insert the i2c-amd756.o module (after changing the
   PCI_DEVICE_ID to 7413).
2) sensors-detect now shows a Winbond W83782D chip, but "modprobe -k w83781d"
   brings the box to a full stop (only a hard reset helps). This obviously
   needs some work - does anybody know more about this?
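
Before patching the driver on another board, it may be worth checking
that the device with ID 7413 is really present. Below is a minimal
sketch that scans the text of "lspci -n" for a vendor:device pair. The
vendor ID 1022 is AMD's PCI vendor ID; pairing it with 7413 is an
assumption based on the i2c-amd756 change described above, and the sample
listing is invented for illustration, not captured from the test cluster.

```python
import re

AMD_VENDOR_ID = "1022"   # AMD's PCI vendor ID
DEVICE_ID = "7413"       # the device ID mentioned above (pairing is assumed)

def has_pci_device(lspci_n_output, vendor=AMD_VENDOR_ID, device=DEVICE_ID):
    """True if any line of `lspci -n` output lists vendor:device."""
    pattern = re.compile(r"\b%s:%s\b" % (vendor, device), re.IGNORECASE)
    return any(pattern.search(line) for line in lspci_n_output.splitlines())

# Made-up sample in the style of `lspci -n`; not real output from the cluster.
SAMPLE = """\
00:00.0 0600: 1022:700c (rev 11)
00:07.3 0680: 1022:7413 (rev 01)
"""

if __name__ == "__main__":
    print(has_pci_device(SAMPLE))
```

On a real node, the output of "lspci -n" would be passed in place of
SAMPLE.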

Cheers,
Martin

========================================================================
Martin Siegert
Academic Computing Services                        phone: (604) 291-4691
Simon Fraser University                            fax:   (604) 291-4242
Burnaby, British Columbia                          email: siegert at sfu.ca
Canada  V5A 1S6
========================================================================



