[Beowulf] fast interconnects

Eugen Leitl eugen at leitl.org
Tue May 23 09:58:15 PDT 2006


On Tue, May 23, 2006 at 10:52:18AM -0400, Mark Hahn wrote:

> > Switches (crossbars) don't scale for very many ports. Especially if you have to
> 
> I'm a little mystified by this comment, since you can build arbitrary
> topologies using xbar primitives.  for instance, quadrics fabrics are 
> usually full-bisection fat-trees, and are composed of (iirc) 8x xbar
> switching chips.  myrinet is <handwave>similar</handwave>.

No disagreement. But if you have a large number of nodes, all connected
to a single crossbar (regardless of how it's implemented internally), then
the wire length goes up, and with it the signal propagation delay. You can't
have 1000 ports on one die, so you have to use multiple chips, with
each connection burning power and costing time. A neuron has a connectivity
of some 10^4, and a few manage to handle some 10^5 convergent fibres.
Blue Gene is designed to scale to some 10^6 nodes (admittedly, not all
of them discrete), so here you're hitting the limits of a global
crossbar. So you would have to go to a mixture: a switch with a moderate
number of ports (some 4-32) integrated into your processing element, and
mostly local wiring. If the wire is your FIFO, effectively the entire
cluster mesh between message source and message destination is your medium.
Very much like photons across vacuum, only a smart vacuum which makes
routing decisions.
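
To put a rough number on the wire-as-FIFO idea: the bits sitting in a
wire at any instant are just propagation delay times bit rate. A minimal
sketch, where the cable length, bit rate and signal velocity are all
assumed values picked for illustration, not measurements of any real
interconnect:

```python
# Rough estimate of how many bits are "stored" in a wire at any instant.
# All numbers below are illustrative assumptions, not measurements.

def bits_in_flight(length_m, bit_rate_bps, velocity_m_per_s):
    """Bits occupying the wire = propagation delay * bit rate."""
    delay_s = length_m / velocity_m_per_s
    return delay_s * bit_rate_bps

# Assumed: signal velocity of about 2e8 m/s (roughly 2/3 c) in cable.
SIGNAL_VELOCITY = 2.0e8

# Assumed: a 10 m cable running at 10 Gbit/s.
cable = bits_in_flight(10.0, 10e9, SIGNAL_VELOCITY)
print(f"10 m cable at 10 Gbit/s holds about {cable:.0f} bits")
```

Under those assumptions a single 10 m link already buffers a few hundred
bits, so a multi-hop path through the mesh holds a good deal more.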
 
> > do cut-through switching at those high speeds. It would be good if each NIC
> > would come with an integrated switch, with enough ports to wire at least a 3d
> > torus (where you route/switch messages via Bresenham).
> 
> while I like multi-port designs, they do become huge investments in wire.

True, but TANSTAAFL. If you want to have a giant cross-section, you
have to wire it somehow. Most of the brain is wire, and most of it
is local (few connections are long-range). The brain also does the
FIFO-within-the-wire thing (120 m/s spike propagation).
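
For the Bresenham routing on a 3d torus mentioned above, here is a
minimal sketch of one way to read it. Torus links are axis-aligned, so
instead of diagonal steps each hop moves along one axis, and a
Bresenham-style error accumulator spreads the hops evenly across the
axes. The names torus_delta and bresenham_route are made up for this
example, and the scheme is an assumption about what such routing could
look like, not a description of any existing NIC or switch:

```python
def torus_delta(src, dst, size):
    """Shortest signed offset from src to dst on a ring of `size` nodes."""
    d = (dst - src) % size
    return d - size if d > size // 2 else d

def bresenham_route(src, dst, dims):
    """Return the list of nodes visited from src to dst, one hop per step.

    Hypothetical scheme: each axis accumulates error in proportion to its
    share of the total hops; the axis furthest behind takes the next hop.
    """
    deltas = [torus_delta(s, d, n) for s, d, n in zip(src, dst, dims)]
    remaining = [abs(d) for d in deltas]
    steps = [1 if d > 0 else -1 for d in deltas]
    total = sum(remaining)
    err = [0] * len(dims)
    pos = list(src)
    path = [tuple(pos)]
    for _ in range(total):
        for i in range(len(dims)):
            err[i] += abs(deltas[i])
        # Only axes that still have hops left are candidates.
        axis = max((i for i in range(len(dims)) if remaining[i] > 0),
                   key=lambda i: err[i])
        err[axis] -= total
        remaining[axis] -= 1
        pos[axis] = (pos[axis] + steps[axis]) % dims[axis]  # wrap around
        path.append(tuple(pos))
    return path

# Example: an assumed 8x8x8 torus; the y axis wraps around (0 -> 7).
route = bresenham_route((0, 0, 0), (3, 7, 1), (8, 8, 8))
print(len(route) - 1, "hops:", route)
```

Each node only needs the destination coordinates and a few integers of
error state to pick the outgoing port, which is why this kind of routing
is cheap enough to do in the switch itself.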
 
> > In regards to keeping the wires short, does this IBM trick of keeping all
> > wires equal-length work well on 3d lattices, and above? This would seem to
> > be a must for those coming (hopefully) Hypertransport motherboards with 
> > connectors.
> 
> afaict, everyone now takes the approach of putting eq/deskew/etc logic
> on every pair or small number of wires.  HT certainly does that (including
> a clock per 8 data bits).  I think pci-e is based on separate analog
> processing for each lane.

I think I've seen external PCI-E connectors somewhere, and I've also
seen a switch mentioned in some dead-tree magazine.

-- 
Eugen* Leitl http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820            http://www.ativel.com
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE