Locality and caching in parallel/distributed file systems

Mark Hahn hahn at physics.mcmaster.ca
Tue Dec 3 10:33:52 PST 2002

> node, and gives local I/O speed to simultaneous reads and writes (to the
> same/different files) across the single namespace.  This def may not be

concurrent writes are always nontrivial; would you be happy assuming
that the app always knows what it's doing, so the OS doesn't have to?
for instance, if you write at offset 14k in a file, do you need to keep
in mind that it falls within the fourth page-sized block (4K pages being
the fundamental pagecache unit on ia32 systems), which might correspond
to, say, the second 8K filesystem block?
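
to make the arithmetic concrete, here is a rough sketch, assuming 4K pages
(ia32) and the 8K filesystem block size used as the example above.  the
helper name containing_unit is made up for illustration; it is not any
kernel or filesystem API.

```python
PAGE_SIZE = 4 * 1024      # ia32 page size, the pagecache unit
FS_BLOCK_SIZE = 8 * 1024  # assumed filesystem block size ("say, 8K")

def containing_unit(offset, unit_size):
    """Return (zero-based index, start byte, end byte exclusive)
    of the fixed-size unit that holds the given byte offset."""
    index = offset // unit_size
    start = index * unit_size
    return index, start, start + unit_size

offset = 14 * 1024
page = containing_unit(offset, PAGE_SIZE)
block = containing_unit(offset, FS_BLOCK_SIZE)
print(page)   # (3, 12288, 16384): zero-based page 3, i.e. the fourth page
print(block)  # (1, 8192, 16384): zero-based block 1, i.e. the second block
```

so a small write at 14k touches a page covering 12k-16k and a filesystem
block covering 8k-16k; another writer anywhere in those ranges is sharing
a cache unit with you, which is where the consistency question comes from.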

> The idea in building any scalable resource (net, computing, disk, etc)
> is to avoid single points of information flow.  Maintaining metadata for
> file systems represents exactly that.  You get hot-spot formation, and
> start having to do interesting gymnastics to overcome it (if it is at
> all possible to overcome).

even strict consistency doesn't imply there is necessarily a bottleneck,
since, for instance, your filesystem will probably not be one big, flat 
directory.  Coda (and probably others) have examined weaker consistency.
