[Beowulf] Re: Rackable / SGI

Jason Riedy jason at acm.org
Sun Apr 5 08:00:23 PDT 2009


And Greg Lindahl writes:
> An example would be HDFS, the Hadoop Distributed FS.

Last time I checked, it strongly encouraged marshaling into text.
The libraries didn't have an obvious way to handle binary data,
but that was from a somewhat cursory read by someone less than
fully interested in Java contortions.
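
To make the text-versus-binary concern concrete, here is a minimal
sketch in plain Python.  It does not use Hadoop or any HDFS API; the
sample values and the one-number-per-line layout are assumptions
chosen only to show what marshaling doubles into text costs compared
with packing them as raw 8-byte values.

```python
import struct

# Hypothetical sample data: a few doubles, as a matrix row might hold.
values = [0.1, 1.0 / 3.0, 2.0 ** 0.5]

# Text marshaling: one decimal string per line (line-oriented records).
text_blob = "\n".join(repr(v) for v in values).encode("ascii")

# Binary marshaling: fixed-width little-endian IEEE 754 doubles.
bin_blob = struct.pack(f"<{len(values)}d", *values)

# Both forms round-trip exactly in Python 3, since repr() of a float
# emits enough digits to recover the same value.
text_back = [float(s) for s in text_blob.decode("ascii").split("\n")]
bin_back = list(struct.unpack(f"<{len(bin_blob) // 8}d", bin_blob))

print(len(text_blob), "bytes as text;", len(bin_blob), "bytes as binary")
```

The text form is larger and needs parsing on every read, while the
binary form is fixed-width and can be sliced by offset.  Whether that
matters depends on the record sizes involved.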

Certainly worth checking again.  Someone is implementing a
ScaLAPACK-like matrix package on Hadoop, so there must be some
method...  HDFS is designed around a particular usage pattern.  I'm
not sure how well its methods map onto non-reduction computations,
and I'm also thinking of setups with separate compute and storage
nodes.  (Although obviously moving computation to the storage can be
a huge win.)

There also are tons of distributed key/value stores that would
be interesting file systems for some applications.  There's one
based on Tokyo Cabinet that I should try for data analysis,
LightCloud from Plurk.

But yes, I mostly was asking about POSIX-ish file systems.  I
suppose I could have asked about pNFS as another layer to add a
POSIX-ish feel to the others.  (Hm.  An HDFS layout?)

> A similar situation exists in the node management space, where
> existing solutions like CFengine were pretty much ignored by HPC
> people.

Ha!  Cfengine was pretty much ignored by *everyone*, including
its author for quite some time.  Promising (pun intended) the
next great advance and not passing current maintenance to others
loses users quickly.

Also, cfengine is (or was, when last I used it) designed to be a
pull-based system that polls a configuration server.  The design
was more focused on asynchronous updates, and I think most HPC
folks would prefer a push model that updates everyone "at once."
Cfengine had a push system, but to me it didn't feel like a good
fit with the rest.
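
The pull/push difference can be sketched as a toy model in Python.
This is not cfengine's actual protocol or API; ConfigServer, Node,
publish, poll, and push_all are hypothetical names used only to show
why polling leaves nodes at mixed versions for a while, whereas a
push updates everyone in one step.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical managed host tracking which config version it runs."""
    name: str
    version: int = 0
    policy: str = ""

    def apply(self, version, policy):
        self.version, self.policy = version, policy

    def poll(self, server):
        # Pull model: the node checks in on its own schedule.
        if server.version > self.version:
            self.apply(server.version, server.policy)


@dataclass
class ConfigServer:
    """Hypothetical central host holding one versioned policy."""
    version: int = 0
    policy: str = ""
    nodes: list = field(default_factory=list)

    def publish(self, policy):
        # Record the new policy; nobody is contacted yet.
        self.version += 1
        self.policy = policy

    def push_all(self):
        # Push model: the server contacts every known node right away.
        for node in self.nodes:
            node.apply(self.version, self.policy)


server = ConfigServer()
server.nodes = [Node(f"n{i}") for i in range(4)]

server.publish("ntp=on")
server.nodes[0].poll(server)  # only n0 has polled so far
versions_after_pull = sorted({n.version for n in server.nodes})

server.push_all()  # everyone updated "at once"
versions_after_push = sorted({n.version for n in server.nodes})
```

After the single poll the cluster runs a mix of versions 0 and 1;
after the push every node is on version 1.  A real push still has to
cope with nodes that are down, which is where polling is more
forgiving.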

I'm more shocked that no one has written up using cfengine for
managing laptops.  It seems a perfect model.  With the more open
development model, perhaps it'll come back.  But its competitors
are more "web 2.0 cool."

Jason


