[Beowulf] How to Diagnose Cause of Cluster Ethernet Errors?
smulcahy at aplpi.com
Mon Apr 2 07:28:37 PDT 2007
Things I look out for in switches in general are reliability and build
quality. I've had some cheaper switches which worked but got worryingly
warm to the touch. The 3Com switches we use in our office in general tend to
be solid and don't seem to heat up as much as some of the SMCs. Having
said that, I've heard good things (here mostly) about some specific SMC
I generally don't pay for managed switches unless I have clear needs to
work with my traffic at that level. I don't have those needs for a small
office or department environment.
For a cluster, given the budgets typically involved, I'm inclined to err
on the side of a switch with a good reputation and a more extensive
feature-set than I actually need since it is such a critical piece of
For clusters, the overall bandwidth of the switch is also a huge issue.
It's still not clear to me how reliable manufacturers' figures for switch
bandwidth are, though. The ProCurve we have in our cluster seems to be
performing well, and as I said, I've heard good things about some of the
SMCs (tigers?) but short of going with what others are using
successfully I'm not sure. Has anyone tested a dozen switches in a lab
for backplane bandwidth?
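
As a rough sanity check on a quoted figure, the arithmetic below may help. It is only a sketch: it assumes vendors quote switching capacity with both directions of each full-duplex port counted, and the function names and the 24-port gigabit example are hypothetical, not taken from any datasheet. A quoted figure below this threshold would suggest the switch cannot carry line rate on every port at once; a figure at or above it still says nothing about how the switch behaves under real cluster traffic.

```python
def nonblocking_capacity_gbps(ports: int, port_speed_gbps: float) -> float:
    """Capacity needed for every port to send and receive at line rate.

    Assumption: capacity is counted in both directions (full duplex),
    hence the factor of 2.
    """
    return ports * port_speed_gbps * 2


def is_nonblocking(quoted_capacity_gbps: float, ports: int,
                   port_speed_gbps: float) -> bool:
    """True if the quoted capacity covers all ports at line rate."""
    return quoted_capacity_gbps >= nonblocking_capacity_gbps(ports, port_speed_gbps)


if __name__ == "__main__":
    # Hypothetical 24-port gigabit switch with two made-up quoted capacities.
    needed = nonblocking_capacity_gbps(24, 1.0)
    print(f"needed for non-blocking: {needed} Gbps")
    for quoted in (32.0, 48.0):
        print(f"quoted {quoted} Gbps -> non-blocking: {is_nonblocking(quoted, 24, 1.0)}")
```
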
I'm sure the more experienced members will have more concrete pointers
but maybe my comments give you a starting point - it's an interesting,
and very relevant, question.
Jon Forrest wrote:
Stephen Mulcahy, Applepie Solutions Ltd, Innovation in Business Center,
GMIT, Dublin Rd, Galway, Ireland. http://www.aplpi.com