[Beowulf] Beowulf and Ganglia config help needed

dgmr at optonline.net dgmr at optonline.net
Wed Jul 6 18:39:02 PDT 2005


Hello all
I have some general Beowulf/Ganglia configuration woes that I am seeking help with!
 
1> I have two Beowulf-style clusters.
I would like to use cluster A to monitor cluster B. Cluster A has 18 nodes; cluster B has 90 nodes.
 
Monitoring on cluster A is no problem. But on cluster B, for whatever reason, the gmetad running on the headnode only "sees" about half of the gmonds running on the compute nodes. I know the gmonds are running on each of the 90 compute nodes, as a simple ps tells me so. Further, I can go to each compute node in turn, connect to localhost port 8649, and see the spewage of XML. Yet the gmetad on the headnode only sees about half of the compute nodes. Any idea why?
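
In case it helps anyone reproduce the check, here is a rough sketch of how the per-node test on port 8649 could be scripted instead of done by hand. It assumes each gmond writes its XML dump to any TCP client on port 8649 and then closes the connection, and that hosts appear as HOST elements with a NAME attribute. The function names and the node names in the usage note are made up for illustration.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the whole XML dump a gmond sends before it closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # an empty read means gmond closed the socket
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_in_dump(xml_bytes):
    """Return the sorted NAME attributes of every HOST element in a dump."""
    root = ET.fromstring(xml_bytes)
    return sorted(h.get("NAME", "") for h in root.iter("HOST"))


def survey(nodes, port=8649):
    """Map each node to the hosts its gmond reports (None if unreachable)."""
    result = {}
    for node in nodes:
        try:
            result[node] = hosts_in_dump(fetch_gmond_xml(node, port))
        except (OSError, ET.ParseError):
            result[node] = None
    return result
```

Something like survey(["node01", "node02"]) (hypothetical node names) would then return, per node, the list of hosts that node's gmond knows about. Comparing the list from the headnode with the list from a compute node should show which hosts each gmond is actually reporting.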
 
2> Does a gmetad need to be running on the headnode of cluster B if I wish to monitor cluster B from cluster A? Also, in general, should a gmond be running on my headnodes? I have seen that when a gmond is running on the headnode as well, the corresponding gmetad ignores all the other gmonds and only reports the one on the headnode.
 
3> On cluster B, should the data_source line in the gmetad.conf file list the IP addresses of all the corresponding compute nodes? I seem to get a variety of results and behaviors depending on what I put there.
 
4> The Ganglia conf files seem much happier if I use IP addresses instead of FQDNs. Is this really the case?
 
5> In general, what should be on the data_source line of my gmetad.conf file? The IP addresses of every single gmond running on my compute nodes?
 
If you have some general docs on how to correctly set up Ganglia on a grid of Beowulf clusters, that would be great to have!
Thanks for any and all help!
Sincerely
Dan Roberts
 



