Cluster Monitoring software?
plesher at ibb.gatech.edu
Wed Oct 25 15:53:01 PDT 2000
How much of the graphing/visual aids are in the open source version?
I've worked with the non-open-source version and the IRIX version and really
liked it, especially watching the 3D views of, say, all the CPUs' loads in the
cluster. It really made it easy to see how things are going.
On Thu, 26 Oct 2000, Ken McDonell wrote:
> On Wed, 25 Oct 2000, Patrick Lesher wrote:
> > SGI has a really nice package called Performance Co-Pilot.
> > It monitors all kinds of different things and they are always adding
> > more.
> > You can download it from their oss web site (
> > http://oss.sgi.com/projects/pcp/ )
> > If I remember correctly, this version isn't as complete as what you can
> > purchase from them or what comes in their ACE package, but I can't
> > remember what the differences are.
> The things that PCP brings to the table for monitoring cluster performance
> are:
> - centralized monitoring and management of distributed processing
>   (PCP uses very efficient TCP/IP protocols to move the data about)
> - a unified API to access _all_ performance data (from the h/w,
>   the o/s, the service layers and the applications) ... this
>   includes all of the metrics Joseph was asking about, and lots
>   more ... the same API works for different operating systems,
>   so monitoring tools are insulated from the details of dredging
>   interesting numbers from dark corners of each o/s
> - the available performance data can be easily extended via a
>   plugin architecture
> - real-time and historical data sources are unified under the
>   same API
> - an inference engine for detecting common performance scenarios
>   and raising arbitrary alarms (can be used with both real-time and
>   historical data sources)
> There is an open source stripchart monitoring tool developed by Michal
> Kara (details from the News page off the oss.sgi.com projects page).
> Other SGI-developed monitoring tools (including 3-D visualization of
> performance) are not open sourced, but are sold as part of the SGI
> Linux solutions (ACE is one example).
> We are always keen to communicate with people who'd be interested in
> adapting or expanding PCP into new performance monitoring scenarios.