[Beowulf] torus versus (fat) tree topologies
Underwood, Keith D kdunder at sandia.gov
Fri Nov 12 20:10:39 PST 2004
> Going through a crossbar costs about 100-150 ns these days. I would expect
> the cost of one hop to be roughly the same in a torus. Now, the
> interesting number is the number of hops for a given number of nodes.
> With a 32-port crossbar, you have one hop for 32 nodes. For 1280 nodes,
> you can build a Clos topology of diameter 5, thus 5 hops. With a 3D
> Torus, you route on a hypercube, so you have one hop for 8 nodes. I leave
> the pleasure of computing the number of hops for a 1280-node 3D torus to
> a volunteer :-)

The latency is more like 30-50 ns per hop on a torus (a smaller switch has lower latency) and could be even lower. On Red Storm, a 1280-node system would be 14 cabinets (13.3, actually), so let's say 16 cabinets (each cabinet is 1x4x24 for X, Y, Z). Assuming a bi-directional torus in all dimensions (8x8x24), your worst-case hop count is 4+4+12 = 20 hops, or 1000 ns at 50 ns per hop, for 1536 nodes vs. 750 ns (5 hops at 150 ns) with your 1280 nodes.

The dirty secret of most Fat tree style topologies (that nobody likes to talk about) is the ugly step functions (that I think someone mentioned). With a torus, if you want an extra cabinet, you buy an extra cabinet and cable it in. With a Clos topology, you can do the same thing, until you hit the limit of the number of nodes that can be supported by the switch configuration. Then you get to buy a whole extra layer of switches (at high cost) and engage in a re-cabling nightmare.

Nonetheless, for certain classes of bisection-bandwidth-limited problems, Clos/Fat tree topologies can provide a better solution, as long as the link speeds are high enough and the system is large enough that comparable bisection bandwidth couldn't be found in a torus. For example, an 8x8x8 torus with 3 GB/s links would have higher bisection bandwidth than a Clos network of the same number of nodes and 500 MB/s links (if I did my math right ;-)

Keith
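The hop-count and bisection arithmetic in this message can be checked with a short Python sketch. It is not from the original post. The function names are made up for illustration, and it rests on several assumptions: every torus dimension has wraparound links and more than two nodes, the upper ends of the quoted latency ranges are used (50 ns per torus hop, 150 ns per crossbar hop), the Clos network is taken to have full bisection bandwidth, and links are counted in one direction only.

```python
from math import prod


def torus_diameter(dims):
    """Worst-case hop count in a torus with wraparound in every dimension.

    With wraparound, the farthest node along a ring of size d is d // 2 hops away.
    """
    return sum(d // 2 for d in dims)


def torus_bisection_links(dims):
    """Links cut when the torus is halved across its longest dimension.

    Assumes each dimension has more than two nodes, so the cut crosses
    every ring in that dimension exactly twice (once directly, once via
    the wraparound link).
    """
    cross_section = prod(dims) // max(dims)
    return 2 * cross_section


def clos_bisection_links(nodes):
    """Assumption: a full-bisection Clos, i.e. one link per node in one half."""
    return nodes // 2


# Worst-case latency, using the upper end of each quoted per-hop range.
torus_hops = torus_diameter((8, 8, 24))   # 4 + 4 + 12 = 20
torus_ns = torus_hops * 50                # 1000 ns for 1536 nodes
clos_ns = 5 * 150                         # 750 ns for 1280 nodes

# Bisection bandwidth for 512 nodes, in GB/s, counted in one direction.
torus_bw = torus_bisection_links((8, 8, 8)) * 3.0   # 128 links * 3 GB/s = 384.0
clos_bw = clos_bisection_links(8 * 8 * 8) * 0.5     # 256 links * 0.5 GB/s = 128.0

print(f"torus 8x8x24: {torus_hops} hops, {torus_ns} ns worst case")
print(f"Clos, diameter 5: {clos_ns} ns worst case")
print(f"bisection, 512 nodes: torus {torus_bw} GB/s vs. Clos {clos_bw} GB/s")
```

Under these assumptions the torus comes out at three times the Clos bisection bandwidth, which agrees with the claim in the message. A Clos network built with less than full bisection would widen the gap.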
