[Beowulf] Defective Mellanox EDR Switches

Ryan Novosielski novosirj at rutgers.edu
Wed Jun 6 15:48:13 PDT 2018


Something to be aware of, potentially, if you happen to own any of this equipment:

100% of our Mellanox SwitchIB2 SB7890 EDR externally-managed switches that were manufactured on 2016-11-28 have failed. I’ve been told there was a manufacturing defect related to capacitors in the voltage regulators.

Mellanox apparently didn’t see fit to notify us, even after diagnosing one of our switches, and has been slow to offer specifics about the remedy or about which manufacturing dates are affected, so hopefully this information is of use to someone else. From what I gathered from Mellanox, there may be a software update that fixes this, but I’ve not been able to find anything specific yet.

The symptom is that all switch port lights turn amber and all connectivity is lost. A power cycle corrects the problem until the next time it happens.

--
____
|| \\UTGERS,       |---------------------------*O*---------------------------
||_// the State     |         Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ     | Office of Advanced Research Computing - MSB C630, Newark
    `'