[Beowulf] What services do you run on your cluster nodes?

Robert G. Brown rgb at phy.duke.edu
Thu Sep 25 15:34:50 PDT 2008


On Thu, 25 Sep 2008, Greg Lindahl wrote:

> On Thu, Sep 25, 2008 at 03:20:15PM -0400, Robert G. Brown wrote:
>
>> The fundamental problem is (as Don said as well) that as far as I know
>> there ARE NO really good solutions to the problem of the representation,
>> encapsulation, and transmission of hierarchical data structures in a
>> portable and efficient way.  If you know of one, please correct me
>
> As I said previously, there aren't. But in many cases, domain-specific
> files work better. For example, many people use Python's ConfigParser
> for configuration files. That's much, much more human-readable and
> writable than XML, and it doesn't need XML's hierarchy features, nor
> the huge libraries that go with reading XML.

Sure.  And there are lots of places where configurations are written in,
say, /etc, with an unbelievable historical jumble of methods.  In this
one tabs mean something.  In that one whitespace will do, but trailing
whitespace kills everything.  In still another you MUST use tabs (and to
the human eye all mixes of spaces and tabs look alike).  /etc/passwd is
colon delimited, but didn't have enough fields, so sysadmins have for
generations overloaded fields with sets of fields.  /proc overloads
fields all the time (used to be worse than it is now -- somebody
obviously noticed that doing this is insane and cut back a bit).
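
To make the overloading concrete, here is a minimal Python sketch.  The
sample record is invented for illustration (it is not from any real
system); only the colon-delimited layout and the habit of packing a set
of subfields into one field come from the description above.

```python
# Hypothetical passwd-style record.  The fifth field has been overloaded
# with comma-separated subfields, which the file format never declared.
line = "jdoe:x:1001:1001:John Doe,Room 12,555-0100,555-0199:/home/jdoe:/bin/bash"

# First-level parse: seven colon-delimited fields.
fields = line.split(":")
name, passwd, uid, gid, gecos, home, shell = fields

# Second-level parse: every consumer has to know, out of band, that this
# one field is itself a comma-delimited list.
subfields = gecos.split(",")

print(len(fields), subfields)
```

Note that the second delimiter is a private convention: nothing in the
file tells a parser that the fifth field needs another pass.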

> A way to express hierarchy is NOT an obvious win, because you often
> have to change your source code that's consuming the data if you
> change your hierarchy. Sure, you can express it, but adding new levels
> can be annoyingly expensive.

This IS the point.  Changing the hierarchy -- the basic data objects --
is always expensive.  Period.  No markup language or parser will hide
the fact and if one thinks that it is or should be possible to do so,
well, it isn't.  Maybe in some special (simple) case, but in general,
absolutely not.

So yeah, XML isn't a magic bullet.  I think half of the anger people
seem to feel for it is because they think it somehow should be.  It is a
particular prestructured branch in a decision tree when dealing with
providing a portable view of your data that can be parsed with canned
tools in nearly any language.  IF you are manipulating data that can
reasonably be structured in a tree, then XML provides you with a set of
rules for constructing a reasonably portable data representation, one
that is remarkably similar to a filesystem tree but with user specified
variability in twig and leaf attributes.  It maps nearly perfectly into
linked lists (the actual libxml interface uses linked lists afaict
internally for nearly everything).
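
The filesystem-tree analogy can be sketched in a few lines.  This uses
Python's standard xml.etree.ElementTree rather than libxml, and the tag
and attribute names in the sample document are made up for illustration;
treat it as a sketch of the idea, not of any particular tool's output.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; tag and attribute names are invented.
doc = """<host name="node01">
  <cpu id="0"><user>1234</user><idle>98765</idle></cpu>
  <cpu id="1"><user>2345</user><idle>87654</idle></cpu>
</host>"""

def walk(node, prefix=""):
    """Yield (path, attributes, text) for each element, depth first."""
    path = f"{prefix}/{node.tag}"
    text = (node.text or "").strip()
    yield path, dict(node.attrib), text
    for child in node:  # children are visited in document order
        yield from walk(child, path)

root = ET.fromstring(doc)
entries = list(walk(root))
for path, attrs, text in entries:
    print(path, attrs, text)
```

Each element shows up as a path, much like a file in a directory tree,
with its attributes distinguishing otherwise identical twigs (the two
cpu elements here).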

Now, are linked lists perfect?  No, of course not.  Are they useful?
Don't be silly.

So stop thinking of XML as a language specification or a hierarchy or
anything else.  Think of it as a way to parse or store or transmit
linked lists.  It sucks for data (like matrices) that aren't naturally
or normally linked list material.  Its readability (or lack thereof)
isn't the point -- you can choose tag names like <a> or <f> or <s112> or
use human readable names, but they are never going to be more than
mnemonics for variables and neither more nor less readable than the
contents of the underlying data structs or linked lists they represent
in actual code.  That's not the point -- the point is that they are easy
to PARSE so one can easily write and debug a PARSER (or encoder) without
having to e.g. count lines, make decisions about what to use as a
delimiter, realize later that whitespace was a poor choice because some
fields contain spaces and switch to colons, only to have THAT bomb
because some data contains colons.
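
A small Python sketch of that delimiter trap, under stated assumptions:
the record and its field names are hypothetical, and the XML side uses
the standard library's xml.etree.ElementTree and xml.sax.saxutils.escape
rather than any specific tool mentioned in this thread.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical record: a command line and a timestamp, both containing
# characters that ad hoc formats like to use as delimiters.
record = {"cmd": "tar -xf a.tar", "time": "15:34:50"}

# Whitespace-delimited: the command splits into three pieces.
ws_fields = " ".join(record.values()).split()

# Colon-delimited: now the timestamp splits into three pieces instead.
colon_fields = ":".join(record.values()).split(":")

# XML: field boundaries are explicit tags, so neither character is special.
xml = "<rec>" + "".join(
    f"<{key}>{escape(value)}</{key}>" for key, value in record.items()
) + "</rec>"
parsed = {child.tag: child.text for child in ET.fromstring(xml)}

print(len(ws_fields), len(colon_fields), parsed == record)
```

Both delimited forms come back with four fields where two went in; the
XML form round-trips the record unchanged.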

>> In this PARTICULAR case, I prefer to let the output speak for itself:
>
> Looks like unreadable crap to me. If this is a good example, it
> doesn't lend much of a positive spin to your rant.

Oh, please.  Unreadable crap?  Compared to what, the original contents
of /proc that it encapsulates?  I >>know<< you've looked at /proc.  And
the point, once again, is that it is actually (to anyone seeking to write a
UI) READABLE crap that is EASY TO PARSE.  Compared specifically to
parsing out the original information that is in /proc, it is (believe
me) a complete and total piece of cake.

    rgb

-- 
Robert G. Brown                            Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Web: http://www.phy.duke.edu/~rgb
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977


