[Beowulf] Cluster install and admin approach (newbie question)

Mark Hahn hahn at mcmaster.ca
Fri Aug 28 08:07:32 PDT 2009


> * if the /var filesystem is shared, race conditions happen (all nodes
> want to write on the same files). I had this problem and moved to a
> local /var filesystem.

indeed, shared /var is simply a bug.  non-shared NFS /var is viable,
but generally pointless.

> * if /var is local (which it may be, because the disks do exist), the
> whole point of a central point for easy admin vanishes, because I would

eh?

> have to create all the /var structure that packages need to work, on
> each node (would be easier to do: "for $node; ssh $install_cmd; done",
> than guessing which dirs I need to create or files to copy).

but if your nodes are nfs-root, you won't be installing anything on them:
you'll be installing on the nfs-root.
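
for the "which dirs do I need to create" worry: a minimal sketch, assuming the
node image on the server has a populated /var and each node has its own empty
/var (local disk or tmpfs). rather than guessing, copy only the directory
skeleton from the image at boot. the function name and the example paths are
hypothetical, not part of any existing tool.

```python
import os
import shutil


def clone_dir_skeleton(src_var, dst_var):
    """Recreate the directory tree of src_var under dst_var, without files.

    Assumption: src_var is the /var of the shared node image and dst_var is
    the node's private /var. Symlinked directories are not descended into
    (os.walk default), so they are skipped here.
    """
    created = []
    for root, _dirs, _files in os.walk(src_var):
        rel = os.path.relpath(root, src_var)
        target = dst_var if rel == "." else os.path.join(dst_var, rel)
        os.makedirs(target, exist_ok=True)
        # Copies permission bits and timestamps; ownership is NOT copied.
        shutil.copystat(root, target)
        created.append(target)
    return created
```

something like clone_dir_skeleton("/ro/var", "/var") from an early init script
would do it, with both paths being placeholders for wherever the image and the
private /var are actually mounted.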

> * if /var is tmpfs, all forensics are certainly gone after failure
> (Murphy told me this one ;).

syslog is very happy to log over the network.
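
in practice this is one forwarding rule in the node's syslog daemon config (in
classic syslogd/rsyslog syntax, a line such as "*.* @loghost" forwards
everything over UDP). to show what goes over the wire, here is a small sketch
using Python's standard logging.handlers.SysLogHandler. the host name
"loghost", the helper name and the logger name are assumptions for
illustration only.

```python
import logging
import logging.handlers
import socket


def make_remote_logger(loghost, port=514, name="node"):
    """Return a logger that sends each record to a remote syslog server.

    Assumption: loghost is the head node running a syslog daemon that
    accepts UDP on the given port (514 is the conventional syslog port).
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(
        address=(loghost, port),
        socktype=socket.SOCK_DGRAM,  # plain UDP: nothing is written locally
    )
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger
```

usage would be along the lines of make_remote_logger("loghost").info("raid
degraded"). the message leaves the node immediately, so it survives even if
the node's /var is tmpfs and the node dies right after.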

> Everything I read on the subject underlines the advantages of
> diskless approaches but fails to warn about this problem and/or to solve
> it. On the other hand, the distributed-approach tools (where every
> node is autonomous) seem to be stalled (such as SystemImager, which is
> used in the OSCAR project) or discontinued, or truly overblown for my
> reference scale (IBM's xCAT); so it really seems that I'm missing

there's also OneSIS.

> something.
>
> The question is: what do you do about this?

setting up your own nfs-root cluster is a simple exercise.  if you're not
very familiar with *nix booting/daemons/init scripts, it will take a few 
tries to get the config right, but the end result is pretty simple and
robust.  remote syslog, preferably with console-over-net (ipmi sol,
netconsole) means that there's nothing interesting on the local /var.
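
on the receiving side, a syslog daemon or even netcat normally does the job.
purely as a sketch of how little is involved, here is a tiny UDP collector for
the head node that appends whatever each node sends to a per-node file. port
6666 is netconsole's default target port; the function names and the
/var/log/nodes directory are hypothetical.

```python
import os
import socket


def collect_once(sock, log_dir):
    """Receive one datagram and append it to <log_dir>/<sender-ip>.log."""
    data, (src_ip, _src_port) = sock.recvfrom(65535)
    path = os.path.join(log_dir, src_ip + ".log")
    with open(path, "ab") as fh:
        # Normalise the line ending; syslog senders may add a trailing NUL.
        fh.write(data.rstrip(b"\x00\n") + b"\n")
    return path


def serve(bind_addr="0.0.0.0", port=6666, log_dir="/var/log/nodes"):
    """Run forever, filing each node's console/log output by source IP."""
    os.makedirs(log_dir, exist_ok=True)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    while True:
        collect_once(sock, log_dir)
```

calling serve() on the head node leaves one file per node IP, so the last
console lines before a crash end up on the server rather than on the node's
own /var.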


