[Beowulf] Experience of using multiple network devices on a node in cluster
Michael T. Prinkey
mprinkey at aeolusresearch.com
Mon May 16 07:14:33 PDT 2005
In many of the clusters I've built, I've used extra NICs to connect
nearest-neighbor nodes to build a 1D torus. The extra cost was very low,
but the result was much-improved performance in several codes. For 1D
domain decompositions, this is an excellent fit, with most of the traffic
crossing the crossover links and avoiding the switch entirely.
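
As a rough illustration of why a 1D decomposition fits this wiring, here is
a minimal sketch of which messages would stay on the crossover cables. The
function names (ring_neighbors, link_for) and the assumption that nodes are
numbered 0..n_nodes-1 around the ring are hypothetical, chosen only for the
example; they are not taken from any particular cluster setup.

```python
def ring_neighbors(rank, n_nodes):
    """Return the (left, right) neighbours of `rank` on a 1D torus.

    Assumes nodes are numbered 0..n_nodes-1 and each node has one
    crossover cable to each neighbour (hypothetical numbering).
    """
    if not 0 <= rank < n_nodes:
        raise ValueError("rank out of range")
    return (rank - 1) % n_nodes, (rank + 1) % n_nodes


def link_for(src, dst, n_nodes):
    """Classify the path a message takes: local, crossover cable, or switch."""
    if src == dst:
        return "local"
    return "crossover" if dst in ring_neighbors(src, n_nodes) else "switch"


if __name__ == "__main__":
    n = 8
    for rank in range(n):
        left, right = ring_neighbors(rank, n)
        # A 1D halo exchange only talks to left/right, so it never
        # needs the switch in this model.
        print(rank, link_for(rank, left, n), link_for(rank, right, n))
```

In this model only traffic to non-adjacent ranks (collectives, for example)
falls back to the switched network.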
With gigabit, the improvement was less pronounced than with Fast Ethernet
(FE). Latency is the real hindrance for us now.
On Mon, 16 May 2005, Mark Hahn wrote:
> > We have implemented clusters using one interface for parallel traffic
> > (Score) and one for general purpose/NFS traffic.
> segregating traffic is a common suggestion, but I don't really understand
> why it would be sensible. a node is unlikely to be running some mixture
> of MPI and IO jobs, at least the normal kind of node (dual).
> control/monitoring really ought to be minimal in bandwidth (per-node), no?
> failing to use both ports seems like a shame to me - host ports are
> smarter than switch ports, and let you build extremely high bisection
> networks. for instance:
> - NxM grid of nodes.
> - the N nodes across each row plug into the m-th "row" switch.
> - the M nodes down each column plug into the n-th "column" switch.
> - each switch plugs into its peers: the m-th row switch
>   plugs into the M-1 other row switches.
> each route is 1-2 switch hops; nodes only do the initial bit of routing
> (which port to use). this mainly makes sense where you have cheap switches,
> but more nodes than FNN can use (and bigger switches are expensive.)
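
The row/column scheme quoted above can be sketched as a small routing
function. This is only one reading of the description: nodes are addressed
as (n, m) with n the column index and m the row index, the switch labels
("row0", "col2") are made up for the example, and the choice to leave on the
row port when source and destination share neither row nor column is an
assumption (the column port would give the same hop count).

```python
def switch_route(src, dst):
    """Return the switches crossed between two nodes given as (n, m).

    n is the column index and m the row index (assumed convention).
    Row switches are assumed to be fully connected to each other.
    """
    (n1, m1), (n2, m2) = src, dst
    if src == dst:
        return []                      # same node, no network hop
    if m1 == m2:
        return [f"row{m1}"]            # same row: one hop via the row switch
    if n1 == n2:
        return [f"col{n1}"]            # same column: one hop via the column switch
    # Otherwise leave on the row port and cross one row-to-row peer link;
    # the destination hangs off the second row switch.
    return [f"row{m1}", f"row{m2}"]


if __name__ == "__main__":
    N, M = 4, 3                        # hypothetical grid size
    nodes = [(n, m) for n in range(N) for m in range(M)]
    worst = max(len(switch_route(a, b)) for a in nodes for b in nodes)
    print("worst-case switch hops:", worst)
```

The only decision a node makes here is which of its two ports to send on,
which matches the "initial bit of routing" in the description.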