[Beowulf] building Infiniband 4x cluster questions

Mon Nov 7 17:53:55 PST 2011

The latency numbers are more or less the same across the IB vendors for SDR, DDR and QDR. Mellanox is the only vendor with FDR IB for now, and with PCIe 3.0 latencies are below 1 us (RDMA much below...). The question is what you are going to use the system for - which apps.
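
A side note on terminology: the "one-way ping-pong" figures quoted in this thread are conventionally half of the averaged round-trip time of a small message. Below is a minimal sketch of that arithmetic in Python. It is an illustration only: it runs over a local socket pair, not IB verbs or MPI, so the absolute number it reports says nothing about any HCA, and the function names (recv_exact, echo_server, one_way_latency_us) are made up for this example.

```python
import socket
import threading
import time


def recv_exact(sock, nbytes):
    """Read exactly nbytes from sock (stream sockets may return short reads)."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo_server(sock, nbytes, iterations):
    """The 'pong' side: send every received message straight back."""
    for _ in range(iterations):
        sock.sendall(recv_exact(sock, nbytes))


def one_way_latency_us(iterations=10000, nbytes=8, warmup=100):
    """Return the average half round-trip time in microseconds."""
    ping, pong = socket.socketpair()
    total = iterations + warmup
    server = threading.Thread(target=echo_server, args=(pong, nbytes, total))
    server.start()
    payload = b"x" * nbytes
    start = time.perf_counter()
    try:
        for i in range(total):
            if i == warmup:
                # Restart the clock once the warm-up exchanges are done.
                start = time.perf_counter()
            ping.sendall(payload)
            recv_exact(ping, nbytes)
        elapsed = time.perf_counter() - start
    finally:
        ping.close()   # unblocks the echo thread if the loop failed early
        server.join()
        pong.close()
    # One ping-pong is two one-way trips, hence the division by 2.
    return elapsed / iterations / 2 * 1e6


if __name__ == "__main__":
    print(f"one-way latency: {one_way_latency_us():.2f} us")
```

On real hardware the same half-round-trip calculation is what the usual MPI or verbs-level latency tests report, which is why numbers from different vendors are only comparable when message size and switch hops match.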

Gilad

> -----Original Message-----
> From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org]
> On Behalf Of Vincent Diepeveen
> Sent: Monday, November 07, 2011 3:58 PM
> To: jhh3851 at yahoo.com
> Cc: beowulf at beowulf.org
> Subject: Re: [Beowulf] building Infiniband 4x cluster questions
> 
> 
> On Nov 8, 2011, at 12:44 AM, Joseph Han wrote:
> 
> > To further complicate the issue, if latency is the key driving factor
> > for older hardware, I think that the chips with the InfiniPath/PathScale
> > lineage tend to have lower latencies than the Mellanox InfiniHost
> > line.
> >
> > When in the DDR time frame, I measured Infinipath ping-pong latencies
> > 3-4x better than that of DDR Mellanox silicon.  Of course, the
> > Infinipath silicon will require different kernel drivers than those
> > from Mellanox (ipath versus mthca).  These were QLogic specific HCA's
> > and not the rebranded Silverstorm HCA's sold by QLogic.  (Confused
> > yet?)  I believe that the model number was QLogic 7240 for the DDR
> > version and QLogic 7140 for the SDR one.
> >
> > Joseph
> >
> 
> The manufacturer's claim is 1.2 us one-way pingpong for the QLE7240. Of
> course, to get to that number they possibly needed to use their
> grandmother's analogue stopwatch, but even 1.2 us ain't bad :)
> 
> 95 dollar on ebay.
> 
> Anyone having even better news?
> 
> Vincent
> 
> >
> >
> > Message: 2
> > Date: Mon, 07 Nov 2011 14:21:51 -0600
> > From: Greg Keller <Greg at Keller.net>
> > Subject: Re: [Beowulf] building Infiniband 4x cluster questions
> > To: beowulf at beowulf.org
> > Message-ID: <4EB83DDF.5020902 at Keller.net>
> >
> >
> > > Date: Mon, 07 Nov 2011 13:16:00 -0500
> > > From: Prentice Bisbal<prentice at ias.edu>
> > > Subject: Re: [Beowulf] building Infiniband 4x cluster questions
> > > Cc: Beowulf Mailing List<beowulf at beowulf.org>
> > > Message-ID:<4EB82060.3050300 at ias.edu>
> > >
> > > Vincent,
> > >
> > > Don't forget that between SDR and QDR, there is DDR.  If SDR is too
> > > slow, and QDR is too expensive, DDR might be just right.
> > And for DDR a key thing is, when latency matters, "ConnectX" DDR is
> > much better than the earlier "Infinihost III" DDR cards.  We have
> > hundreds of each and the ConnectX cards make a large difference for some codes.
> > Although nearly antique now, we actually have plans for the ConnectX
> > cards in yet another round of updated systems.  This is the 3rd
> > Generation system I have been able to re-use the cards in (Harpertown,
> > Nehalem, and now Single Socket Sandy Bridge), which makes me very
> > happy.  A great investment that will likely live until PCI-Gen3 slots
> > are the norm.
> > --
> > Da Bears?!
> >
> > > --
> > > Goldilocks
> > >
> > >
> > > On 11/07/2011 11:58 AM, Vincent Diepeveen wrote:
> > >> >  hi Prentice,
> > >> >
> > >> >  I had noticed the diff between SDR up to QDR,  the SDR cards are
> > >> > affordable, the QDR isn't.
> > >> >
> > >> >  The SDR's are all $50-$75 on ebay now. The QDR's i didn't find
> > >> >  cheap prices in that pricerange yet.
> > >> >
> > >> >  If i would want to build a network that's low latency and had a
> > >> >  budget of $800 or so a node, of course i would build a Dolphin
> > >> >  SCI network, as that's probably the lowest-latency card sold,
> > >> >  for $675 or so apiece.
> > >> >
> > >> >  I do not really see a rival latency wise to Dolphin there. I bet
> > >> >  most manufacturers selling clusters don't use it as they can
> > >> >  make $100 more profit or so selling other networking stuff, and
> > >> >  universities usually swallow that.
> > >> >
> > >> >  So price totally dominates the network. As it seems now,
> > >> >  infiniband 4x is not going to offer enough performance.
> > >> >  The one-way pingpong latencies over a switch that i see for it
> > >> >  are not very convincing. I see remote writes to RAM are nearly
> > >> >  10 microseconds for 4x infiniband, and that card is the only one
> > >> >  affordable.
> > >> >
> > >> >  The old QM400's i have here are one-way pingpong 2.1 us or so,
> > >> >  and QM500-B's are plentiful on the net (of course big
> > >> >  disadvantage: needs PCI-X), which are 1.3 us or so there and
> > >> >  have SHMEM. Not seeing a cheap switch for the QM500's though,
> > >> >  nor cables.
> > >> >
> > >> >  You see price really dominates everything here. Small cheap
> > >> >  nodes you cannot build if the port price, thanks to expensive
> > >> >  network card, more than doubles.
> > >> >
> > >> >  Power is not the real concern for now - if a factory already
> > >> >  burns a couple of hundred megawatts, a small cluster somewhere
> > >> >  in the attic eating a few kilowatts is not really a problem :)
> > >> >
> >
> >
> >
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
> > Computing To change your subscription (digest mode or unsubscribe)
> > visit http://www.beowulf.org/mailman/listinfo/beowulf
> 