[Beowulf] Kill zombies after a parallel run
Toon Knapen toon.knapen at fft.be
Mon May 8 05:31:09 PDT 2006
- Previous message: [Beowulf] Bonding Ethernet cards / [was] 512 nodes Myrinet cluster Challenges
- Next message: [Beowulf] Linux Cluster Forum/BWBUG meeting this afternoon
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Peter Jakobi kindly just gave me the following reply:

There's

- zap: the kill example in Larry Wall's Perl book
  # interactive verification
  # any regular expression matching any
  # string in the output of ps -ef, ...
  # I tend to keep hacking my ancient copy of this,
  # so currently my copy can be run non-interactively,
  # kill children, kill per tty (carefully craft your
  # regex, otherwise DO NOT use with -y to randomly
  # kill wrong processes!!!), or list/nice processes
  # instead of killing.
  # for a short while, I've put a copy here:
  http://www.oa.shuttle.de/kefk/tmp/zap

Non-interactive and a bit heavy-handed:

- killall; by name, can also kill according to PGID (process groups)

- killproc; by name of executable; -G incl. children in current
  process group or session (check that these are identical?).
  -g to kill incl. other processes in the group.
  # you are also able to get the list of processes
  # that use a specific file via lsof, then pass the
  # pids to kill. Quickly, but pid reuse hopefully
  # doesn't occur within a few secs. You'd need to
  # check the kernel to be certain that this is
  # the case (any other kernel behaviour I'd consider
  # a bug).

- skill/snice
  # adds selection by tty, command, ... . But still
  # only command binary name in the sense of killproc.

> I think what the OP is asking is how to kill (automagically) all
> processes in a parallel run once one process has crashed (due to a
> segmentation fault or something similar).

> Generally, if one process (in the whole bunch of processes) crashes,
> all other processes will wait eternally from the moment they try to
> communicate with the crashed process, or at MPI_Finalize. So how can
> one kill all remaining processes?
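
As a rough illustration of the process-group approach mentioned for killall (kill by PGID), here is a minimal Python sketch. It is not taken from zap or from any of the tools above. The name run_and_reap and its grace parameter are made up for this example. It assumes the job is started by a launcher on the local node and that the launcher's children stay in the launcher's process group; ranks on remote nodes, or ranks that the launcher moves into their own groups, are not covered.

```python
import os
import signal
import subprocess
import sys
import time


def run_and_reap(cmd, grace=2.0, **popen_kwargs):
    """Run cmd in its own session, then signal whatever is left in its group.

    Hypothetical helper for illustration only. Returns the exit status of
    the process that was started directly (the group leader).
    """
    # start_new_session=True makes the child call setsid(), so it becomes
    # the leader of a fresh process group whose PGID equals its PID.
    proc = subprocess.Popen(cmd, start_new_session=True, **popen_kwargs)
    pgid = proc.pid
    rc = proc.wait()

    # The leader is gone (finished or crashed). Any survivors in the group
    # get SIGTERM first and SIGKILL after the grace period.
    # Caveat: if the group is already empty, the PGID could in principle
    # have been reused, which is the pid-reuse concern raised above.
    for sig in (signal.SIGTERM, signal.SIGKILL):
        try:
            os.killpg(pgid, sig)
        except ProcessLookupError:
            break  # nobody left in the group
        time.sleep(grace)
    return rc
```

A call such as run_and_reap(["mpirun", "-np", "4", "./a.out"]) would then clean up local stragglers after the launcher exits; the command line here is only a placeholder. Inside the MPI program itself, calling MPI_Abort from an error handler is the usual way to ask the MPI runtime to terminate the other ranks, although that does not help when a rank dies from a signal before it can make the call.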
