reliable network for cluster

Brian Beuning bbeuning at mindspring.com
Wed Jul 4 17:20:11 PDT 2001


Many people on this list seem to be interested in the computational
power of clusters.  If their network goes down, they might just restart
the problem they were crunching.

For some of us, the beauty of clusters is their reliability.  If one
node fails, the rest of the nodes can continue processing.  Of course,
a cluster is only as reliable as the network that connects the nodes.
Channel Bonding seems great for getting more bandwidth but only helps
with half of the network reliability issue.  (It lets a node send
packets out in a broken network, but does not necessarily let clients
get to the box using Channel Bonding.)

What I would like is a way for multiple NICs in a node to share the
same IP address, with each NIC connected to a different switch/hub,
and to have the routing figure out which paths are up and down.  It
should also do load balancing (that is, scale the network).  Since our
clusters all use commodity hardware, paying for esoteric network
routers is not part of the plan.
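
To make the request concrete, here is a minimal sketch of the behaviour
I am after, written in Python purely for illustration.  The names
(Link, MultipathSender, eth0/eth1) are hypothetical, and the up/down
flag is set by hand; a real setup would presumably learn link state
from some kind of link monitoring.  It models only the choice of
outgoing path among the links that are up.  It does not address the
inbound half, i.e. how clients and switches find a working path to the
shared address.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """One NIC plus the switch/hub it is cabled to (hypothetical model)."""
    name: str
    up: bool = True


@dataclass
class MultipathSender:
    """Round-robin over whichever links are currently marked up."""
    links: list
    _next: int = 0  # running counter used to rotate through live links

    def pick(self) -> Link:
        # Failover: ignore any link whose path is known to be down.
        live = [link for link in self.links if link.up]
        if not live:
            raise RuntimeError("no path to the cluster network is up")
        # Load balancing: spread successive sends across the live links.
        link = live[self._next % len(live)]
        self._next += 1
        return link
```

With two links up, successive calls to pick() alternate between them;
if one is marked down, all traffic moves to the survivor until it
comes back.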

How do you folks solve this issue?

Thanks,
Brian Beuning

More information about the Beowulf mailing list