[Beowulf] using Nagios to monitor compute nodes: NPRE vs check_by_ssh
Alex Younts ayounts at tinkergeek.com
Tue Dec 23 11:05:07 PST 2008
- Previous message: [Beowulf] using Nagios to monitor compute nodes: NPRE vs check_by_ssh
- Next message: [Beowulf] using SNMP to monitor disk usage and load factors on compute-nodes
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
We have quite a few different PBS servers running PBSPro 9.x. Our Nagios
box has a bare install of the PBSPro and we wrote a check script that runs
"pbsnodes -s $cluster-head-node $nodehostname" and checks to see if PBS
thinks the node is happy. (We determine which PBS server to hit up based
on the host name of the node.)

Alex Younts

On Tue, Dec 23, 2008 at 1:24 PM, Rahul Nabar <rpnabar at gmail.com> wrote:
> On Mon, Dec 22, 2008 at 10:23 PM, Alex Younts <ayounts at tinkergeek.com> wrote:
>> At my employer, we use a variety of monitoring tools for our various
>> clusters. Our nagios box is a VM with a single processor and 512MB of
>> memory. Currently, we monitor 1700 hosts, each with three or four
>> service checks a piece (two of which SSH to nodes to run scripts). We
>> check services about every 30 minutes.
>
> Thanks Alex! I will give that a shot now! Are there any torque / pbs /
> maui monitoring Nagios scripts out there? I wanted to avoid
> reinventing the wheel if at all possible!
>
> --
> Rahul
>
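
For anyone wanting a starting point, the check script described at the top of this message could be sketched roughly as follows. This is not the actual script, only a minimal illustration under stated assumptions: the prefix-to-server table (`SERVER_BY_PREFIX`), the example host names, and the choice of which PBS states map to WARNING or CRITICAL are all hypothetical and need adjusting per site. It assumes `pbsnodes` prints a `state = ...` line for the node, possibly with several comma-separated states.

```python
"""Sketch of a Nagios check that asks PBS whether a node is healthy."""
import re
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumption: which PBS node states count as unhealthy at this site.
CRITICAL_STATES = {"down", "state-unknown", "stale"}
WARNING_STATES = {"offline"}

# Hypothetical mapping from node-name prefix to the PBS server to query.
SERVER_BY_PREFIX = {
    "alpha": "alpha-head.example.org",
    "beta": "beta-head.example.org",
}


def server_for(node):
    """Pick the PBS server from the node's host name, or None if unknown."""
    for prefix, server in SERVER_BY_PREFIX.items():
        if node.startswith(prefix):
            return server
    return None


def parse_states(pbsnodes_output):
    """Return the list of states from the 'state = a,b' line, or []."""
    match = re.search(r"^[ \t]*state[ \t]*=[ \t]*(.+)$",
                      pbsnodes_output, re.MULTILINE)
    if not match:
        return []
    return [s.strip() for s in match.group(1).split(",") if s.strip()]


def classify(states):
    """Map PBS states to a (Nagios exit code, message) pair."""
    if not states:
        return UNKNOWN, "UNKNOWN: no state reported by pbsnodes"
    text = ",".join(states)
    if set(states) & CRITICAL_STATES:
        return CRITICAL, "CRITICAL: PBS state is " + text
    if set(states) & WARNING_STATES:
        return WARNING, "WARNING: PBS state is " + text
    return OK, "OK: PBS state is " + text


def main(argv):
    """Run the check for argv[1] and return the Nagios exit code."""
    if len(argv) != 2:
        print("UNKNOWN: usage: check_pbs_node <nodehostname>")
        return UNKNOWN
    node = argv[1]
    server = server_for(node)
    if server is None:
        print("UNKNOWN: no PBS server configured for " + node)
        return UNKNOWN
    try:
        result = subprocess.run(["pbsnodes", "-s", server, node],
                                capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        print("UNKNOWN: could not run pbsnodes: " + str(exc))
        return UNKNOWN
    code, message = classify(parse_states(result.stdout))
    print(message)
    return code
```

To use it as a plugin, the file would end with `import sys; sys.exit(main(sys.argv))` so that Nagios sees the exit code. Keeping the parsing and classification in separate functions makes it possible to test them against saved `pbsnodes` output without a live PBS server.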
