xmlsysd, wulfstat (cluster monitor apps, beta)
Robert G. Brown rgb at phy.duke.edu
Tue Apr 30 16:03:57 PDT 2002
Dearest DBUG (and beowulf list) persons,

Announcing xmlsysd and its companion application, wulfstat.

xmlsysd is a lightweight, throttleable daemon that runs either as a forking daemon or out of xinetd (the latter by default). When one connects to it, it accepts a very simple command language that basically a) configures it to deliver certain kinds of /proc and system-call-derived information, generally throttling it so it doesn't return anything you aren't interested in; and b) causes it to wrap up that information in an xml-formatted message and return it to the caller.

Security is managed in any of several ways -- by ipchains or iptables, using tcp wrappers, or using xinetd's internal ip-level security features (or using ssl or ssh tunnels, for the truly paranoid or those who want to monitor across a WAN).

wulfstat is a companion client application that uses the xmlsysd daemons running on a collection of cluster nodes or LAN workstation hosts to gather information about the nodes or hosts and present it in a simple tabular form accessible from a tty (e.g. xterm, konsole), updating the table every N seconds (default 5). Think of it as vmstat, procinfo, ifconfig, uptime, free, date, the upper part of the top command, and a bit more all rolled into a single application, so that you can monitor whole connected sets of this information across an entire cluster with some reasonable granularity.

Such a tool has obvious uses -- for cluster users, it allows them to monitor host load averages, look for idle resources, monitor memory usage, obtain information at a glance about remote cpu type, clock, and cache size, monitor network loads, and even see what fraction of a cluster node's up time has been spent "doing work" instead of idle. Most of this is equally useful to systems administrators seeking to monitor LAN host activity -- crashing systems are often signalled by anomalous consumption of memory or a steady rise in cpu usage, for example.
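To make the connect/command/reply pattern and the tabular display concrete, here is a minimal Python sketch of what a wulfstat-like client might do. It is not taken from the wulfstat sources: the port number, the command string, and the XML element names (`load1`, `memfree`) are all placeholders invented for illustration, since the announcement does not specify them. Closing the write side of the socket to mark the end of the commands is likewise an assumption. Consult the xmlsysd man page for the real command language and message layout.

```python
import socket
import xml.etree.ElementTree as ET

# Placeholders only: the announcement gives neither a port nor command names.
HYPOTHETICAL_PORT = 9999
HYPOTHETICAL_COMMANDS = ["send"]


def query_node(host, port=HYPOTHETICAL_PORT,
               commands=HYPOTHETICAL_COMMANDS, timeout=5.0):
    """Send command lines to one daemon and return the raw reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for cmd in commands:
            sock.sendall((cmd + "\n").encode("ascii"))
        # Assumption: half-closing the socket tells the daemon we are done.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_reply(xml_text):
    """Flatten the leaf elements of an XML reply into a {tag: text} dict."""
    root = ET.fromstring(xml_text)
    return {el.tag: (el.text or "").strip()
            for el in root.iter() if len(el) == 0}


def format_table(stats_by_host, columns):
    """Render one row per host, with '-' for any value a host did not report."""
    lines = [f"{'host':<12}" + "".join(f"{c:>10}" for c in columns)]
    for host in sorted(stats_by_host):
        stats = stats_by_host[host]
        lines.append(f"{host:<12}"
                     + "".join(f"{stats.get(c, '-'):>10}" for c in columns))
    return "\n".join(lines)
```

A polling loop would call `query_node` for each host every N seconds, pass each reply through `parse_reply`, and redraw the output of `format_table`. Keeping the parsing and formatting separate from the network call means they can be exercised against canned XML without a running daemon.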
The toolset has now been in use for some time and has been reasonably stable for several weeks (in spite of my constant poking at it to add new features or fix tiny problems). I am therefore releasing it as version 0.1.0 BETA for wider testing, although at the moment it seems to be doing fine in production.

It is expected that wulfstat is just the first of a number of monitoring applications that will be developed that use the daemon. The daemon, for example, can also be used to monitor tasks on remote nodes by username and/or taskname and/or run status, although the application that actually permits name and task lists to be managed on the user side and the returned results properly displayed has yet to be written. Full GUI and/or web applications should also be straightforward to build, although this time I learned my lesson and built the tty application FIRST (for xmlsysd's predecessor, procstatd, I built a GUI application and have regretted it ever after).

It is also expected that at least a few more features will be added to the daemon (it lacks lm-sensors support at this point, for example).

The daemon >>should<< have just enough power to form the basis for a load balancing or job distribution system -- it can certainly efficiently provide realtime monitoring of many of the components upon which a queuing decision might be based, including load, memory and network utilization, non-root tasks running or waiting to run, and even CPU type, clock, and cache. It does not run as a privileged user, however, and is not designed to manage the actual distribution or control of jobs.

Still, I expect and hope that wulfstat and xmlsysd together will be immediately useful to cluster people who install them. The included documentation should be adequate although not overwhelming -- there are man pages for both xmlsysd and wulfstat that are very nearly up to date -- and I'm available to help with installations that don't seem to work correctly.
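As an illustration of the kind of queuing decision such monitoring data could feed, here is a small Python sketch that picks a node from already-collected statistics. Nothing in it is part of xmlsysd or wulfstat: the `NodeStats` fields, the `pick_node` name, and the ranking rule (lowest load first, then most free memory) are assumptions chosen for the example, and a real scheduler would weigh many more factors.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    """Hypothetical per-node summary assembled from monitoring replies."""
    host: str
    load1: float        # 1-minute load average
    mem_free_kb: int    # free memory in kB


def pick_node(nodes, min_free_kb=0):
    """Return the least-loaded node with enough free memory, or None."""
    candidates = [n for n in nodes if n.mem_free_kb >= min_free_kb]
    if not candidates:
        return None
    # Lowest load wins; ties go to the node with more free memory,
    # then to the alphabetically first host name for determinism.
    return min(candidates, key=lambda n: (n.load1, -n.mem_free_kb, n.host))
```

The function only makes a recommendation. Actually starting the job on the chosen node would have to be done by some other, suitably privileged mechanism, since the daemon itself does not manage jobs.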
The one "gotcha" of wulfstat is that it does require libxml2 (and hence probably RH 7.2 or better) to run -- you will need to ensure that this RPM is installed on the hosts where wulfstat is to run. xmlsysd similarly requires libxml to run on the cluster nodes.

I would greatly appreciate feedback and bug reports, if any, from anybody who chooses to install it and give it a try. To retrieve it in RPM form, you can use the URLs below:

http://www.phy.duke.edu/brahma/xmlsysd-0.1.0-beta.i386.rpm
http://www.phy.duke.edu/brahma/xmlsysd-0.1.0-beta.src.rpm
http://www.phy.duke.edu/brahma/wulfstat-0.1.0-beta.i386.rpm
http://www.phy.duke.edu/brahma/wulfstat-0.1.0-beta.src.rpm

If anybody needs it in tarball form (not in source or binary rpm form) they should contact me directly. I can easily generate one (or it can be extracted from the source rpm) but I can't guarantee the instructions for installation or configuration -- they are encapsulated already in the RPMs.

rgb

--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
