[Beowulf] monitor performance

Mark Hahn hahn at physics.mcmaster.ca
Tue Jun 21 11:26:09 PDT 2005


> can anyone please recommend some software to monitor the performance of
> a cluster? such as the usage of resources by different parts of the
> application code, the time used, etc., so that i can find the
> "bottleneck" of a job and improve the performance?

if you're talking about MPI, you want an MPI profiler,
of which there are several.  or are you talking about
the looser definition of cluster (random stuff running on
random nodes)?  for that I would probably do it by hand:
collect and timestamp 'ps' output from each node every
10 seconds or so, stitch it together by time, and think hard.
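
a rough sketch of the by-hand approach, in python.  none of this is
from a real tool: the node names, the passwordless ssh setup, the
helper names (parse_ps, sample, stitch, busiest) and the 50% cpu
cutoff are all assumptions made up for illustration.  it samples
'ps' on each node, stamps each row with the collection time, and
merges everything into one timeline.

```python
import subprocess
import time

# POSIX ps: an empty header ("pid=") suppresses the header line.
PS_CMD = ["ps", "-e", "-o", "pid=", "-o", "pcpu=", "-o", "pmem=", "-o", "comm="]


def parse_ps(text, node, stamp):
    """Turn raw ps output into (stamp, node, pid, cpu, mem, command) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.strip().split(None, 3)
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        pid, cpu, mem, comm = fields
        rows.append((stamp, node, int(pid), float(cpu), float(mem), comm))
    return rows


def sample(node, timeout=5):
    """Run ps on one node over ssh (assumes passwordless ssh is set up)."""
    stamp = time.time()  # stamped on the collecting host, so clocks agree
    result = subprocess.run(
        ["ssh", node] + PS_CMD,
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return parse_ps(result.stdout, node, stamp)


def stitch(samples):
    """Merge per-node samples into one timeline ordered by time, then node."""
    rows = [row for one_sample in samples for row in one_sample]
    return sorted(rows, key=lambda r: (r[0], r[1]))


def busiest(timeline, min_cpu=50.0):
    """Keep only rows at or above an (arbitrary) cpu percentage cutoff."""
    return [r for r in timeline if r[3] >= min_cpu]


if __name__ == "__main__":
    nodes = ["node01", "node02"]  # hypothetical hostnames
    for _ in range(6):            # about a minute of samples
        timeline = stitch([sample(n) for n in nodes])
        for row in busiest(timeline):
            print(*row, sep="\t")
        time.sleep(10)
```

stamping on the collecting host rather than on each node sidesteps
clock skew between nodes, at the cost of including ssh latency in
the timestamp.  the "think hard" part is still up to you.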

regards, mark hahn.



