[Beowulf] Background Survey for thesis papers

Eric Thibodeau kyron at neuralbs.com
Wed Mar 29 05:55:50 PST 2006

On Tuesday 28 March 2006 05:14, Björn Lindberg wrote:
> The subjects of the three thesis papers are: 1) Availability, 2) Capacity
> handling, 3) Resource Managing. And now we are carrying out a background
> survey within these three subjects. These thesis papers will be used to
> create SLA (Service Level Agreement) and help out queuing jobs into the
> queue handler. In our background survey we have some questions.
> We are supposed to get continuous measurable variables. What variables
> would you measure?
> Which parameters should be taken into consideration regarding these
> applications? E.g. swap size, CPU load, bandwidth, memory, etc.
> What kind of tools would you use to get a hold of this data, and are there
> any applications that could gather all data necessary?
> It would be most appreciated if you could add links to homepages or papers
> that could help us with our tasks.

Since I don't see you mentioning it, I'd suggest you start by googling for 
ganglia and Beowulf. You'll hit quite a few live Beowulf clusters being 
monitored with ganglia. Visit a few of them: ganglia is customizable, so 
different metrics of interest may show up from one site to the next. This 
should at least get rid of the "blank page" phenomenon. After that come the 
1001 journal articles in IEEE Xplore on computer metrics in an HPC 
environment ;)

Some links ;)
http://ganglia.sourceforge.net/ (click on the live demos...but google will 
give you lots more)

Ganglia is nice for a quick, coarse- to medium-grained overview of your 
system. Non-intrusive, fine-grained monitoring might require more 
sophisticated equipment (a network packet analyser for net performance comes 
to mind...)
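
To give an idea of how the numbers behind those ganglia pages can be pulled 
out as continuous variables, here is a rough Python sketch. It assumes a gmond 
daemon that dumps its XML state to any TCP client on port 8649 (the usual 
default, but check your gmond.conf) and that the default metric names 
load_one, mem_free, swap_free and bytes_in are enabled. The function names are 
mine, not part of ganglia.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read the XML dump gmond sends to a TCP client (assumed default port)."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def parse_metrics(xml_data,
                  wanted=("load_one", "mem_free", "swap_free", "bytes_in")):
    """Return {host name: {metric name: float value}} for the wanted metrics.

    The metric names are assumptions about what the cluster publishes;
    non-numeric values are skipped.
    """
    root = ET.fromstring(xml_data)
    result = {}
    for host in root.iter("HOST"):
        metrics = {}
        for metric in host.iter("METRIC"):
            name = metric.get("NAME")
            if name not in wanted:
                continue
            try:
                metrics[name] = float(metric.get("VAL"))
            except (TypeError, ValueError):
                continue
        result[host.get("NAME")] = metrics
    return result
```

Calling parse_metrics(fetch_gmond_xml("headnode")) at a fixed interval, with 
"headnode" standing in for whichever machine runs gmond, would give you a time 
series per node to start from.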

rgb's work in progress has a chapter on metrics and the tools to measure 
them, as well as what to watch out for...

Eric Thibodeau
Neural Bucket Solutions Inc.
