[Beowulf] I/O workload of an application in distributed file system
hahn at mcmaster.ca
Thu Nov 22 07:15:25 PST 2007
> system (e.g. database, webserver and so on). But I want to find more
> info on distributed file systems (e.g. checkpoint read/write).
our experience with filesystems is that you can model checkpoints
as large, multithreaded, sequential IO. but while that may be an
important IO mode, it's not the dominant one - most of our IO
is almost certainly smallish, metadata-heavy stuff: users compiling,
doing 'ls' over and over while waiting for their job to produce output, etc.
with that in mind, my opinion is that cluster IO testing should be
a combination of:
- parallel streaming IO to separate files - resembling a checkpoint,
or an IO-intensive app reading, or an app where the user forgot to
turn off debugging.
- smallish metadata-heavy traffic, e.g. timing an unpack/build/clean
cycle like time(tar zxf;make;make clean).
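
as a rough illustration of the two modes above, here is a minimal Python
sketch. it is not a real benchmark: the function names (stream_write,
parallel_streaming, metadata_churn), the file names, and the default sizes
are all made up for this example, and it drives IO with threads on a single
node, whereas a real cluster test would run writers on many nodes at once.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def stream_write(path, total_bytes, block_size=1 << 20):
    """Write total_bytes sequentially to path in block_size chunks."""
    block = b"\0" * block_size
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # push data out of the page cache before timing stops
    return written


def parallel_streaming(directory, writers=4, bytes_per_writer=8 << 20):
    """Checkpoint-like load: every writer streams to its own file.

    Returns (total_bytes_written, elapsed_seconds). Files are removed afterwards.
    """
    paths = [os.path.join(directory, f"ckpt.{i}") for i in range(writers)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=writers) as pool:
        total = sum(pool.map(lambda p: stream_write(p, bytes_per_writer), paths))
    elapsed = time.perf_counter() - start
    for p in paths:
        os.remove(p)
    return total, elapsed


def metadata_churn(directory, files=1000):
    """Metadata-heavy load: create, stat, and delete many tiny files.

    Returns (operation_count, elapsed_seconds). Assumes directory starts empty.
    """
    start = time.perf_counter()
    ops = 0
    for i in range(files):
        with open(os.path.join(directory, f"small.{i}"), "w") as f:
            f.write("x")
        ops += 1
    for name in os.listdir(directory):  # roughly what a repeated 'ls -l' does
        os.stat(os.path.join(directory, name))
        ops += 1
    for i in range(files):
        os.remove(os.path.join(directory, f"small.{i}"))
        ops += 1
    return ops, time.perf_counter() - start


if __name__ == "__main__":
    # Pass dir=... to TemporaryDirectory to point at the filesystem under test.
    with tempfile.TemporaryDirectory() as d:
        nbytes, secs = parallel_streaming(d)
        print(f"streaming: {nbytes / secs / 1e6:.1f} MB/s")
        ops, secs = metadata_churn(d)
        print(f"metadata:  {ops / secs:.0f} ops/s")
```

to aim it at a distributed filesystem, give tempfile.TemporaryDirectory a
dir= argument on that mount. reporting the two numbers separately matters,
since a filesystem can do well on one and badly on the other.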