[Beowulf] using Nagios to monitor compute nodes: NRPE vs check_by_ssh
Rahul Nabar rpnabar at gmail.com
Mon Dec 22 17:28:51 PST 2008
I just installed Nagios to monitor my 256 compute nodes centrally. It works like a charm for all the public services (ping, ssh, etc.), but now I am getting more ambitious and want to monitor the private services too (disk usage, process load, Torque, PBS, etc.).

I am unsure whether to (1) use the NRPE plugin, which seems like a pain to deploy onto all 256 nodes, or (2) go the check_by_ssh route. I already have passwordless logins from the master nodes to the slave nodes.

I'd prefer (2) because it is more secure and seems easier to deploy, but I'm a bit afraid that it will overtax my central server.

Any suggestions? Are other users here running Nagios?

-- Rahul
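To put a rough number on the "overtax my central server" worry, here is a back-of-envelope sketch of how many SSH sessions check_by_ssh would open. Only the 256 nodes come from the post. The four checks per node (one each for disk, load, Torque, PBS), the 300-second check interval, and the 0.5-second session time are assumptions picked for illustration, and the function names are hypothetical. Treat it as an estimate, not a measurement.

```python
def ssh_checks_per_second(nodes: int, checks_per_node: int, interval_s: float) -> float:
    """Average number of SSH sessions the central server opens per second."""
    return nodes * checks_per_node / interval_s


def sessions_in_flight(rate_per_s: float, session_s: float) -> float:
    """Little's law: average concurrent sessions = arrival rate * session duration."""
    return rate_per_s * session_s


if __name__ == "__main__":
    # 256 nodes is from the post; the other values are assumptions.
    rate = ssh_checks_per_second(nodes=256, checks_per_node=4, interval_s=300)
    print(f"{rate:.2f} SSH sessions per second")
    print(f"{sessions_in_flight(rate, session_s=0.5):.2f} sessions in flight on average")
```

Plugging in your own check count, interval, and measured SSH handshake time gives a first-order idea of whether the master node would be busy or mostly idle under option (2).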
