[Beowulf] any creative ways to crash Linux?: does a shared NIC IPMI always remain responsive?

Rahul Nabar rpnabar at gmail.com
Mon Oct 26 09:48:53 PDT 2009


On Mon, Oct 26, 2009 at 11:34 AM, Bogdan Costescu <bcostescu at gmail.com> wrote:
> On Mon, Oct 26, 2009 at 4:50 PM, Rahul Nabar <rpnabar at gmail.com> wrote:

> The BMC is a CPU running some firmware. It's a low power one though,
> as it doesn't usually have to do too many things and it should not
> consume significant power while the main system is off. Some BMCs even
> run a ssh or http daemon to allow an easier interaction.

To me, the additional services seem to be one of the root causes of
problems. Complexity just means more places for things to go wrong. It
may not be super difficult to implement ARP and IPMI, but once you start
adding ssh and http you are writing some fairly complex daemons, I
suppose.

I just discovered that my BMC will also serve http pages if I point a
browser at its IP address. While this might be "cool", it just
increases the load on the BMC and leaves more scope for coding bugs.
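
For anyone who wants to check what else their BMC is listening on, a
TCP connect probe is probably enough. This is only a minimal sketch:
the function name probe_tcp_ports is made up for illustration, and the
ports you pass in (22, 80, 443 for ssh, http, https) are my assumption
about what a BMC might expose. Note that IPMI over LAN itself runs on
UDP port 623, so a TCP probe like this says nothing about whether IPMI
is responsive.

```python
import socket


def probe_tcp_ports(host, ports, timeout=1.0):
    """Return {port: bool}: True if a TCP connect to host:port succeeds.

    Hypothetical helper for illustration only. False means the port was
    refused, filtered, or timed out; the three cases are not distinguished.
    """
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError (including timeouts) on failure.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results


# Example call (192.0.2.10 is a documentation placeholder, not a real BMC):
#   probe_tcp_ports("192.0.2.10", [22, 80, 443])
```

Any port that comes back True is one more daemon running on the BMC.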

>> It's funny that in spite of this the
>> IPMI gets hung sometimes (like Gerry says in his reply). I guess I can
>> just attribute that to bad firmware coding in the BMC.
>
> Sometimes the BMC can simply become overloaded. I've been told that
> some BMCs can't cope with a high network load, especially with
> broadcast packets.

I see. Any pointers on relieving this load? Any tweaks or precautions?

> I have always considered the BMC as a blackbox or
> appliance, good for only one thing, so maybe someone with a better
> understanding of its inner architecture can provide some more
> details...

Yes, it's a black box for me too!

-- 
Rahul


