[Beowulf] Best Practices SOL vs Cyclades ACS
Marian Marinov mm at yuhu.biz
Fri Oct 9 22:33:23 PDT 2009
- Previous message: [Beowulf] Best Practices SOL vs Cyclades ACS
- Next message: [Beowulf] Best Practices SOL vs Cyclades ACS
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Saturday 10 October 2009 08:09:45 Mark Hahn wrote:
> > We have more than 400 machines. Every month there is one machine that we
> > can not reboot using IPMI or the SOL is not working.
>
> we have something like 2500 nodes, mostly HP dl145g2's, and have a
> BMC-wedge probably 6-12 times/year. can I ask what brand/model has such
> flakey IPMI? if you run "ipmi mc reset" on the node, does it resolve the
> problem? I wonder whether flakiness might also correspond to some config or
> usage pattern. (ours dhcp from a local server - actually all the traffic
> is local.)

These are all Dell machines used for shared hosting.

These problems usually appear during a DoS/DDoS attack or under very high
system resource usage (for example, a load over 100 on a machine with 4 cores).
Our problem is that in such situations IPMI is sometimes unreliable: you can
neither connect to the serial console nor reboot the machine.

--
Best regards,
Marian Marinov
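
For anyone wanting to script the check Mark describes, here is a minimal Python sketch. It assumes that "ipmi mc reset" refers to ipmitool's `mc reset` subcommand, that the BMC speaks IPMI v2.0 (`lanplus`), and that the password is supplied through the IPMI_PASSWORD environment variable (ipmitool's `-E` option). The host and user names, and the helper names `remote_cmd` and `bmc_responds`, are hypothetical.

```python
import subprocess

def remote_cmd(host, user, *args):
    """Build an out-of-band ipmitool command line.

    -I lanplus selects the IPMI v2.0 LAN interface (an assumption about the BMC);
    -E tells ipmitool to read the password from the IPMI_PASSWORD env variable.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *args]

def bmc_responds(host, user, timeout=15, run=subprocess.run):
    """Return True if the BMC answers a harmless 'chassis power status' query."""
    try:
        result = run(remote_cmd(host, user, "chassis", "power", "status"),
                     capture_output=True, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        # No answer in time, or ipmitool is not installed on this machine.
        return False
    return result.returncode == 0

# In-band cold reset of the BMC; this must be run on the affected node itself,
# through the local IPMI driver, so it only helps if the OS is still reachable.
LOCAL_BMC_RESET = ["ipmitool", "mc", "reset", "cold"]

if __name__ == "__main__":
    # Hypothetical BMC address and user name.
    if not bmc_responds("bmc-node17", "admin"):
        print("BMC not responding; on the node, try:", " ".join(LOCAL_BMC_RESET))
```

Note that the in-band reset depends on getting a shell on the node, which may not be possible in the overload situations described above.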
