[Beowulf] Infiniband: beyond 24 ports
Joe Landman landman at scalableinformatics.com
Mon Aug 25 15:38:50 PDT 2008
- Previous message: [Beowulf] Infiniband: How to go beyond the 24-port barrier?
- Next message: [Beowulf] Infiniband: beyond 24 ports
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Gus Correa wrote:
> Hello Rocks network experts
>
> Consider a cluster with 24 compute nodes, one head node, and one storage
> node.
> Imagine that one wants to install Infiniband (IB) and use it for MPI
> and/or
> for NFS or parallel file system services.
> IB switches larger than 24 ports are said to be significantly more
> expensive than the 24-port ones.
>
> Questions:
>
> 1) What is the cost-effective yet efficient way to connect this cluster
> with IB?

Understand that cost-effective is not necessarily the highest performance. This is where over-subscription comes in.

> 2) How many switches are required, and of which size?

With the new 36 port switch chips from Mellanox, you should need 1. For a reasonable oversubscription (16 down, 8 up), you would need 3 switches ... 1 master and 2 leaf switches.

> 3) How should these switches be connected to the nodes and to each
> other, which topology?

Hard to draw in ascii art.

> 4) Does the same principle and topology apply to Ethernet switches?

Sort of, though in Ethernet switches there is usually less stress on oversubscription of links. If you are building a gigabit MPI cluster, you really want the switch ports as flat as possible. Daisy chaining is fine for offices, it is a bad idea for MPI networks.

>
> If anyone has a pointer to an article or a link to web page that
> explains this,
> just send it to me please, don't bother to answer the questions.
> My (in)experience is limited to small clusters with a single switch,
> but hopefully the information will help other folks in the same situation.
>
> I saw a 24+1-node IB cluster with the characteristics above -
> except that the head node seems to double as storage node.
> The cluster has *four* 24-port IB switches. One switch has 24 ports
> connected, two others have 16 ports connected, and the last one has 17
> ports connected.
> Hard to figure out the topology just looking at the connectors and the
> tightly bundled cables.
> In my naive thoughts the job could be done with two switches only.

You could if you don't really care about bandwidth and oversubscription. Since these nets are designed for high performance it makes sense to try to run them at high speed, and only oversubscribe if you must, and only by the amount you need. Extra contention usually means timing jitter/delays/slower runs.

If your storage node can handle multiple IBs in, it might not be a bad idea in some cases.

If you are looking to use the high speed net for storage, please be aware that 2.6.25 and later kernels contain support for NFS over RDMA (needed on both client and server). We have test kernels we are using with JackRabbit for this. Over SDR IB, we see ~460 MB/s for a link that gets ~750 MB/s using the ib_rdma_bw tool. Compare this to NFS over IPoIB, and you will get about 250 MB/s or so there.

Other modalities for high speed storage are possible.

Joe

>
> Thank you
> Gus Correa
>

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
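
For readers who want to check the "16 down, 8 up" arithmetic from the reply above, here is a minimal sketch in Python. It assumes a simple two-tier layout built from identical switches; the function name, parameters, and returned keys are hypothetical and chosen only for illustration, and the spine count is a lower bound that ignores how uplinks are spread when more than one spine is needed.

```python
import math


def two_tier_plan(nodes, switch_ports=24, down_ports=16):
    """Rough switch count for a leaf/spine fabric of identical switches.

    Illustrative only: the name, parameters, and returned keys are
    assumptions made for this sketch, not taken from any vendor tool.
    """
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    if not 0 < down_ports < switch_ports:
        raise ValueError("down_ports must be between 1 and switch_ports - 1")

    # Everything fits on one switch: no inter-switch links to oversubscribe.
    if nodes <= switch_ports:
        return {"leaves": 0, "spines": 1, "switches": 1, "oversubscription": 1.0}

    up_ports = switch_ports - down_ports        # uplinks per leaf switch
    leaves = math.ceil(nodes / down_ports)      # leaf switches needed for the nodes
    uplinks = leaves * up_ports                 # cables going up to the spine layer
    spines = math.ceil(uplinks / switch_ports)  # lower bound on spine switches
    return {
        "leaves": leaves,
        "spines": spines,
        "switches": leaves + spines,
        "oversubscription": down_ports / up_ports,
    }


if __name__ == "__main__":
    print(two_tier_plan(26))                   # 24 compute + head + storage
    print(two_tier_plan(26, switch_ports=36))  # one 36-port switch
```

For the 26-node cluster on 24-port switches with 16 ports down and 8 up, this gives 2 leaf switches plus 1 master at 2:1 oversubscription, i.e. 3 switches in total; with a 36-port switch, a single switch covers all 26 nodes.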
