[Beowulf] best architecture / tradeoffs
dgs at gs.washington.edu
Tue Aug 30 17:45:57 PDT 2005
> This is a joke reply, right:-)
> Let's see. Yes, there are tokens, which often expire mid-computation.
> There is kerberos, a lot of overhead indeed in a firewalled private
> internal network. There is AFS's "I didn't really mean it" attitude
> towards implementing e.g. fflush on mounted AFS volumes, where writeback
> occurs only when a file is closed (making it impossible to use shared
> files across nodes for a variety of purposes without adding all sorts of
> file stats and overhead).
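For what it's worth, the usual workaround for the write-on-close
behaviour described above is to stop holding the shared file open and
instead open, write, and close it for every record. Below is a minimal
sketch of that idea, assuming (as the quoted text says) that data only
reaches the file server when the file is closed. The function name
append_and_publish and the file name in the usage note are made up for
illustration, and the os.fsync call is a belt-and-braces step whose
effect on any given AFS client is an assumption, not something
verified here.

```python
import os


def append_and_publish(path, data):
    """Append bytes to path, then close the file so other nodes can see them.

    Hypothetical helper: it relies on close-time writeback and never keeps
    the file open between records. Returns the file size after the append.
    """
    with open(path, "ab") as f:
        f.write(data)
        f.flush()             # push Python's user-space buffer to the kernel
        os.fsync(f.fileno())  # ask the OS to commit; AFS effect is assumed
    # Leaving the "with" block closes the file; on a write-on-close
    # filesystem this is the step that stores the data on the server.
    return os.path.getsize(path)
```

A node would call append_and_publish("shared.log", b"step 1\n") after
each unit of work. The cost is one open/close per record, which is the
sort of extra overhead the quoted text complains about.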
Renewing tokens mid-computation isn't hard to handle. What's
difficult is a job that sits in a queue for a couple of days, then
comes up to run with an expired token. And as I wrote, the root
volume is mounted read-only with "system:anyuser" permissions, in
AFS-speak. You need some writable local space for parts of '/var',
but that's about all.
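To make the mid-computation renewal concrete, here is a minimal
sketch. It assumes a Kerberos 5 site where the job holds a renewable
ticket, "kinit -R" renews it, and "aklog" turns it into a fresh AFS
token; sites using other tools would substitute their own commands.
The names renew_afs_token and compute_with_renewal, and the interval
parameter, are hypothetical. It does nothing for the queued job whose
token has already expired before it starts.

```python
import subprocess
import time


def renew_afs_token(run=subprocess.call):
    """Renew the Kerberos ticket, then fetch a new AFS token from it.

    Assumes kinit and aklog are on PATH and the ticket is renewable.
    Returns True only if both commands exit with status 0.
    """
    if run(["kinit", "-R"]) != 0:
        return False
    return run(["aklog"]) == 0


def compute_with_renewal(steps, interval, now=time.time, renew=renew_afs_token):
    """Run each callable in steps, renewing between steps when due.

    interval is the number of seconds allowed between renewals; it should
    be well under the token lifetime. Returns the number of renewals done.
    """
    last = now()
    renewals = 0
    for step in steps:
        if now() - last >= interval:
            if not renew():
                raise RuntimeError("token renewal failed")
            last = now()
            renewals += 1
        step()
    return renewals
```

The renewal check only happens between steps, so each step has to be
shorter than the token lifetime minus the interval.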
> The one thing I do like about AFS is its "real" ACLs. Unix's file perms
> suck as a mechanism for enabling shared/group work, although they do
> give sysadmins that warm fuzzy feeling of approximate control.
> Seriously, we've kicked AFS around as a cluster FS before here, and I've
> even used it in some computations I did a decade or so ago (where I
> learned the hard way about the hahaha attitude towards implementing
> fflush()) but it doesn't seem to be robust or at all easy to manage for
> this sort of thing -- overkill in one place (security), underkill
> someplace else (behaving like a rational/reliable filesystem).
> Although RO AFS -- that one I haven't heard of, and obviously fflush
> problems are irrelevant. So it might not be a joke after all:-) Don't
> you need a pretty special kernel to make that work?
Anything that'll run AFS.