[Beowulf] Logging MCE information on next warm boot?
Henning Fehrmann henning.fehrmann at aei.mpg.de
Mon Jan 25 23:58:40 PST 2010
Hi David,

On Mon, Jan 25, 2010 at 10:46:31AM -0800, David Mathog wrote:
> Is it possible to have the Machine Check Exception (MCE) information
> saved to disk automatically on the next warm boot?
>
> Long form:
>
> A K7 node crashed yesterday and left an MCE on the screen which I copied
> down as:
>
> CPU 0 machine check exception 0000000000000007
> Bank 1 F000000000000853
> Bank 2 940040000000017A at 00000000001511C0
> Kernel panic, not syncing, Unable to Continue
>
> Copying all of those numbers down is very error prone. As I understand
> it the MCE values stay in the registers of the CPU after the crash, and
> may be retrieved at the next warm boot (via a front panel reset, for
> instance). But this save seems not to happen automatically, or at least
> I could not find anything that looked like an MCE dump in /var/log or
> /var/log/kernel when the system came up. So I want to set things up, if
> possible to save this information to disk.

We loaded the netconsole module. This works as a loadable module at least
with the 2.6.27 kernel. AFAIK, for older kernels one has to compile it
into the kernel.

It sends printk messages to a remote syslog-ng server, which collects the
information. I don't know how much netconsole manages to send in the case
of a panic.

netconsole needs parameters:

    modprobe netconsole netconsole=own_port@own_ip/NIC,remote_port@remote_IP/remote_mac

Cheers,
Henning
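The netconsole parameter string above has six fields, and it is easy to get the order or separators wrong. The following is only a sketch of how it could be assembled; the helper name netconsole_param and all addresses, ports, and the MAC are hypothetical placeholders, not values from this thread. The script prints the modprobe command rather than running it, since loading the module needs root.

```shell
#!/bin/sh
# Assemble the netconsole= parameter from its six fields.
# Field order: own_port@own_ip/NIC,remote_port@remote_IP/remote_mac
# (netconsole_param is a hypothetical helper, not part of any package.)
netconsole_param() {
    printf '%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Hypothetical example values; replace them with those of your own
# node (sender) and log server (receiver).
PARAM=$(netconsole_param 6665 192.168.1.10 eth0 6666 192.168.1.1 00:11:22:33:44:55)

# Dry run: show the command that would be executed as root.
echo "modprobe netconsole netconsole=$PARAM"
```

On the receiving side, the chosen remote port has to match a UDP source configured in the log server; that configuration is not shown here.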
