[Beowulf] Mellanox UFM question
jeffrey.c.becker at nasa.gov
Tue Sep 15 10:55:03 PDT 2015
On 09/15/2015 10:30 AM, Jörg Saßmannshausen wrote:
> Hi Jeff,
> no, not yet.
> What I want to avoid is: I try the OFED subnet manager and it does not work
> and then I have to wait until I get the licence. This project has enough
> delays right now and I don't want to add to it. Hence my question.
> Having said that: are you happy with the OFED one?
Just to be clear, we are using the subnet manager from Mellanox OFED,
and yes, we are happy with it. However, note that Mellanox made several
improvements over opensm (from OFED), and in fact their subnet manager is
one of the few parts of Mellanox OFED that is NOT open (ibutils2 is
another); it is distributed as a binary only. Although we have been told
that the subnet manager improvements will eventually make it to opensm,
I don't know when that might happen.
> All the best
> On Tuesday 15 September 2015 Jeff Becker wrote:
>> Hi Jörg,
>> Have you tried using the subnet manager from Mellanox OFED (which is
>> free)? That's what we use on our big heterogeneous cluster at NASA.
>> On 09/15/2015 08:55 AM, Jörg Saßmannshausen wrote:
>>> Dear all,
>>> I am a bit confused and I was wondering whether somebody on the list
>>> could give me a bit of advice here.
>>> I was previously using QLogic for my QDR InfiniBand network. I got one
>>> master switch which got the licence for the InfiniBand installed and
>>> things appear to work ok. At least I cannot detect any problems despite
>>> adding switches and nodes to the fabric.
>>> Now, we recently purchased a new cluster with 20 cores per node and here
>>> I decided to go for FDR to be a bit more future proofed as well. So I
>>> got the 'normal' licence from Mellanox for the cluster. I got one
>>> licence per node so I assumed that was ok.
>>> Now, we are in the process of setting up another cluster with a mixture of
>>> older and newer hardware. Again I have decided to opt for the FDR simply
>>> to be a bit more future proofed. And this is where the confusion comes in.
>>> Apparently I now need the UFM (Unified Fabric Manager) from Mellanox to
>>> run the InfiniBand. However, the normal licence is only for up to 16
>>> cores per node and I would need the more expensive enhanced licence.
>>> From what I and a colleague of mine can see the UFM is nothing more than
>>> the required subnet manager plus some diagnostic tools.
>>> There are three questions here:
>>> - do we really need the enhanced UFM licence or is that just a way to
>>> make money?
>>> - would the open source subnet manager work as well and would the open
>>> source diagnostic tools be ok?
>>> - why do I need to pay for a licence for each node? Somehow I cannot
>>> recall having done that in the past.
>>> Unfortunately, InfiniBand is not my strong side and thus I would
>>> appreciate any advice here.
>>> All the best from a meanwhile sunny London
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf