[Beowulf] What services do you run on your cluster nodes?
Robert G. Brown rgb at phy.duke.edu
Tue Sep 23 09:39:45 PDT 2008

On Tue, 23 Sep 2008, Perry E. Metzger wrote:

>
> "Robert G. Brown" <rgb at phy.duke.edu> writes:
>> You can run xmlsysd as either an xinetd process or forking daemon
>> (the former is more secure, perhaps, the latter makes it stand alone
>> and keeps one from having to run xinetd:-).
>
> Arguably, running processes under inetd can make them more secure, not
> less, in so far as they do not need their own network listening and

As I said.

> daemon management code (reducing code size means less code to audit),
> and the processes can be run as non-root even if they need to listen
> on so-called "privileged" ports (a vile invention, but never mind, one
> has to live with its existence.) All this presumes inetd runs
> correctly, of course, which clearly is an assumption that may or may
> not be warranted.

And xinetd can be configured to use at least TCP wrappers and limits on
incoming addresses to restrict access to specific LAN or WAN addresses.
Not much by way of security, admittedly, but par for the course.

As a forking daemon you can just run it by hand from userspace and it
works just fine, as it needs no special privileges to run and do its job
(which is far more limited than that of e.g. ganglia or bproc). Being
able to "just run it" -- especially with a verbose debugging mode turned
on -- makes development and debugging far more "pleasant" than it might
otherwise be, and it means I can instantly monitor (Linux) systems on
which I don't have root privileges without having to install packages.

The two modes share most of their code. xinetd mode just reads/writes
to stdin/stdout/stderr instead of to/from sockets, so a single
command-line flag can easily switch between the two setup modes. Not
really a lot of extra code to audit, actually, and most of the socket
code is pretty much boilerplate straight out of e.g. Stevens.
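
The point about the two modes sharing their code can be illustrated with a minimal Python sketch. This is not xmlsysd's code: the `--inetd` flag, the port number 9999 and the toy "ping"/"quit" protocol are all invented here for illustration. The only idea taken from the message is that one handler works on a pair of streams, and the mode decides whether those streams are stdin/stdout or a socket.

```python
import socket
import sys


def serve_stream(reader, writer):
    """Answer newline-terminated requests read from one text stream.

    The command set ("ping", "quit") is hypothetical.
    """
    for line in reader:
        command = line.strip()
        if command == "quit":
            break
        reply = "pong" if command == "ping" else "unknown command"
        writer.write(reply + "\n")
        writer.flush()


def run_inetd_mode():
    # Under (x)inetd the accepted connection is already attached to
    # stdin/stdout, so this mode needs no socket code at all.
    serve_stream(sys.stdin, sys.stdout)


def run_daemon_mode(port):
    # Stand-alone mode: create the listening socket ourselves and pass each
    # accepted connection to the same handler. The fork per connection is
    # left out to keep the sketch short, so clients are served one at a time.
    with socket.create_server(("", port)) as listener:
        while True:
            conn, _addr = listener.accept()
            with conn:
                reader = conn.makefile("r", encoding="utf-8", newline="\n")
                writer = conn.makefile("w", encoding="utf-8", newline="\n")
                with reader, writer:
                    serve_stream(reader, writer)


def main(argv):
    # One hypothetical flag selects the mode; 9999 is an arbitrary port.
    if "--inetd" in argv:
        run_inetd_mode()
    else:
        run_daemon_mode(port=9999)
```

Calling `main(sys.argv[1:])` from a script entry point would start either mode. Because the handler never touches a socket directly, it can also be exercised by hand with in-memory streams, which is the same convenience described above for running the daemon from userspace while debugging.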

>> It costs you one fork to run the initial daemon in the latter case, and
>> a fork per connection BUT the connections are persistent TCP connections
>> and hang out indefinitely.
>
> Actually, it need not cost a fork per connection to run a daemon under
> inetd. One can run a TCP wait service instead of the usual TCP nowait
> service. That means that the daemon still needs to know how to do
> accept, of course.

It need not, but it does...;-) Boilerplate out of Stevens (and a couple
of other references that I started with), as I said. As also noted,
there are things I'd mess with if I were to take up working on it again
because it became wildly popular, or if somebody else also adopted it as
their personal hobby and joined the club of developer/maintainers. This
is one of them. I honestly don't know whether my current solution
scales optimally; it certainly is adequate for the limited number of
connections one creates under normal usage, which is usually "one" but
has been as high as "four or five" on the clusters I've deployed on.

However, since the connection is persistent, the overhead of the fork
itself is irrelevant. One fork amortized over hours to days of socket
utilization? Not like a UDP daemon, connectionless, constantly making
and breaking. There's a nontrivial amount of work involved in
initializing the daemon -- opening files in /proc, creating data
structures -- so constantly creating and freeing, opening and closing
is devoutly to be avoided. Instead it is all bundled in with the one
piece of startup "overhead", so it too becomes irrelevant.

However, doing it all REALLY just one time might be even better,
although one would probably have to rewrite the core information parser
to avoid collisions (two requests for information on different ports
before one is complete) and so on. Be interesting to try it, especially
on a really big cluster where one could actually see scaling problems
with the design.
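
As a rough illustration of the "wait"-style alternative discussed above, here is a Python sketch of a single process that adopts an already-listening socket, calls accept itself, and multiplexes every persistent connection without forking. It is an assumption-laden sketch, not how xmlsysd works (xmlsysd forks per connection): the function names are invented, and the handler is expected to answer one request from one `recv`. Under an inetd wait service the listening socket would normally arrive as file descriptor 0.

```python
import selectors
import socket


def adopt_listener(fd):
    """Wrap a listening socket inherited from a super-server.

    The family and type are detected from the descriptor (Python 3.7+).
    """
    return socket.socket(fileno=fd)


def poll_once(sel, listener, handle_request, timeout=1.0):
    """One pass of the event loop: accept new clients, answer ready ones."""
    for key, _events in sel.select(timeout):
        sock = key.fileobj
        if sock is listener:
            # The daemon does its own accept, as a wait service requires.
            conn, _addr = listener.accept()
            sel.register(conn, selectors.EVENT_READ)
            continue
        data = sock.recv(4096)
        if data:
            sock.sendall(handle_request(data))
        else:
            # Empty read means the peer closed its persistent connection.
            sel.unregister(sock)
            sock.close()


def serve_forever(listen_fd, handle_request):
    # Expensive initialization (opening files, building data structures)
    # would happen once here, before the loop, and be shared by all clients.
    listener = adopt_listener(listen_fd)
    sel = selectors.DefaultSelector()
    sel.register(listener, selectors.EVENT_READ)
    while True:
        poll_once(sel, listener, handle_request)
```

Because requests are handled one at a time inside a single loop, two requests cannot collide in the parser, at the price of one slow request delaying the others.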

The other place wulfstat is weak (not so much xmlsysd per se) is in
managing connections that drop because e.g. an xmlsysd host it is
monitoring crashes. This is, of course, a common problem in parallel
application design. With TCP it is not easy to distinguish a hung or
crashed host from an asynchronous delay, and one has to manage e.g.
TIME_WAIT states and so on that can eat into your socket budget on
retries. One really would like to be able to tell positively if a host
is up "out of band", as well. Alas, ICMP systems/socket commands (one
easy way to see if a host is up and its network is responsive) seem to
be privileged (hence ping is suid root), and creating a monitoring tool
that was suid root seemed unwise, to me anyway. nmap supposedly
contains a "non-ping ping" command that doesn't require privileges, and
one day I'll have time to dig it out and see if I can use it in
wulfstat. Otherwise every ATTEMPT to connect to the TCP service itself
costs one at least a timeout of some sort, which leads in turn to
blocking problems or the need to thread the application and try
reconnections in the background while the monitoring tool runs the
existing "good" connections in the foreground.

This is where xmlsysd/wulfstat is interesting. It is an example -- a
fairly rare example -- of a master-slave parallel application outside of
HPC per se, that uses raw networking and could be adapted to doing
certain classes of HPC computation. After all, the nodes could be doing
ANY work requested of them by the requesting client; they just happen to
be doing the work of rewinding the files that are needed to produce the
requested message, encapsulating the results in XML, and writing them to
the output socket. Could just as easily compute a strip of the
Mandelbrot set, do a random number generator test, render a graphical
frame, or any other nicely partitioned master-slave structured task.
Not as easy to code as e.g. PVM, but arguably less overhead and more,
if rawer, control.

   rgb

-- 
Robert G. Brown                            Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Web: http://www.phy.duke.edu/~rgb
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977
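
On the question of telling whether a monitored host is up without ICMP privileges, one unprivileged approach is an ordinary TCP connect with a short timeout, sketched below in Python. Whether this matches the nmap feature mentioned in the message, or would suit wulfstat, is an assumption; the function name and the result labels are invented here. The useful property is that a refused connection is itself a positive answer: the host's network stack replied, so the machine is up even though nothing is listening on that port.

```python
import socket


def probe_host(host, port, timeout=2.0):
    """Classify a host with a plain, unprivileged TCP connect.

    Returns one of:
      "open"        - the service accepted the connection
      "refused"     - the host answered, but nothing listens on the port
      "timeout"     - no answer in time (down, filtered, or merely slow)
      "unreachable" - any other network error
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except (socket.timeout, TimeoutError):
        return "timeout"
    except OSError:
        return "unreachable"
```

A "timeout" result stays ambiguous, which is exactly the difficulty described above, and each probe can still block for up to `timeout` seconds, so a monitoring tool would probably want to run such probes in a background thread rather than in its display loop.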
