[Beowulf] Node Drop-Off
Mark Hahn hahn at physics.mcmaster.ca
Sun Nov 12 21:15:19 PST 2006
> I have a compute node that has started dropping off. When I say drop off, I
> mean the node (while running a job) will lose all connectivity and the
> machine does not respond. I have viewed the logs and can find no reason for
> the node to cease functioning.

if you connect a console to such a node, is it simply panic'ed?

> Has anyone ever seen such behavior?

I have the occasional node which turns itself off under load. the IPMI
reports power being off, so it's distinct from panics. the IPMI
system-error-log doesn't show any reason. we (and the vendor) regard
this as grounds for repair (usually the power supply).

regards, mark hahn.
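
The distinction drawn above (node powered off according to IPMI, versus node still powered but hung or panicked) can be sketched as a small triage script. This is only an illustration, not anything from the original thread: it assumes `ipmitool` is installed, that the node's BMC is reachable over the network with the `lanplus` interface, and that `chassis power status` prints a line such as "Chassis Power is off". The function names and the BMC host, user, and password arguments are hypothetical.

```python
import subprocess


def parse_power_state(output: str) -> str:
    """Map the text printed by `ipmitool chassis power status` to 'on', 'off', or 'unknown'."""
    text = output.strip().lower()
    if text.endswith("is on"):
        return "on"
    if text.endswith("is off"):
        return "off"
    return "unknown"


def classify(responds_to_ping: bool, power_state: str) -> str:
    """Rough triage following the reasoning in the message above."""
    if responds_to_ping:
        return "reachable"
    if power_state == "off":
        # Node turned itself off: treated in the thread as a hardware fault.
        return "powered off (suspect hardware, e.g. power supply)"
    if power_state == "on":
        # Still powered but silent: a console may show a panic.
        return "powered on but unresponsive (possible panic or hang; check console)"
    return "unknown (could not read power state from the BMC)"


def bmc_power_output(bmc_host: str, user: str, password: str) -> str:
    """Ask the BMC for the chassis power state (assumes ipmitool and a lanplus-capable BMC)."""
    result = subprocess.run(
        ["ipmitool", "-I", "lanplus", "-H", bmc_host, "-U", user, "-P", password,
         "chassis", "power", "status"],
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout
```

A caller would pass the result of `bmc_power_output(...)` through `parse_power_state` and then `classify`, together with whatever reachability check the site already uses. The event log itself can be read separately with `ipmitool sel list`, though as noted above it may not show a cause.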
