scyld on an ASUS A7V266-C

Jorge M. Pacheco pacheco at cii.fc.ul.pt
Thu Jun 27 16:04:06 PDT 2002


   Dear ALL,

Just to let you know the happy ending of a cluster upgrade story.

As I stated before, I had a small scyld beowulf cluster with AMD's 
XP-1600+ & SDRAM PC133 running flawlessly for 6 months, 24h/day.
In view of this fantastic performance, we decided to buy some extra 
nodes, and we got a good deal with brand new XP 2000+ & DDR PC2100.
We expected this to be the trivial upgrade for a scyld beowulf,
namely, that all we needed to do was floppy-boot each node (no
cdroms, please) and beoboot-install the operating system on the HDs...
WRONG.
We ran into quite a few problems. After some tweaking we could add
the new nodes, but they turned out to be quite resistant to running
any program you submitted to them - needless to say, MPI programs
would simply collapse.
Furthermore, slave node behaviour depended sensitively on memory
timings & other BIOS setup parameters...
All the while, pings of all sorts from the main node to the new
nodes invariably gave 0% packet loss.
Strange, eh? Also, take into account that the only odd thing at boot
was the complaint "neighbour table overflow"... this would make you
suspect a network problem...

Well, the truth is that, as an act of desperation, I decided to install 
THE SAME scyld beowulf software on one of the new machines and
turn this new machine into the main node.
Installation was perfect & smooth; at the end, all new nodes could be 
added without a single complaint. Moreover, programs would now run 
nicely (serial & parallel), so everything went back to normal.
And what about the old nodes?
Very well, I tried and... they did work fine. No complaints whatsoever.

So... if you decide to expand your nice & stable scyld beowulf
machine and start getting strange complaints, try setting the
fastest & most up-to-date hardware as main node, and all the rest as
daughter-nodes.
If it works for you the same way it worked for us, you're bound to 
become a happy human again.

			Cheers,  J. M. Pacheco



