[Beowulf] GlusterFS 1.2-BENKI (GNU Cluster File System) -
hahn at mcmaster.ca
Fri Feb 9 15:21:15 PST 2007
>>> On IB - nfs works only with IPoIB, whereas glusterfs does SDP (and ib-verbs,
>>> from the source repository) and is clearly way faster than NFS.
>> "clearly"s like that make me nervous. to an IB enthusiast, SDP may be
>> more aesthetically pleasing, but why do you think IPoIB should be noticeably
>> slower than SDP? lower cpu overhead, probably, but many people have no
>> problem running IP at wirespeed on IB/10GE-speed wires...
> As I understand it, one reason why SDP is faster than IPoIB is that the
> way IPoIB is currently spec'ed requires there be an extra copy relative
> to SDP.
that's what I meant by "cpu overhead". but the point is that current
CPUs have 10-20 GB/s of memory bandwidth hanging around, so it's not
necessarily much of a win to avoid a copy. even in olden days,
it was common to show some workloads where having the host do TCP
checksumming actually _benefited_ performance by populating the cache.
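a rough back-of-envelope for the copy cost, as a sketch only: it assumes one extra copy reads and writes every payload byte once (so 2x the payload rate in memory traffic), ignores cache effects entirely, and takes a 10GE-class wire at 1.25 GB/s as the payload rate. the function name and the wire rate are my assumptions, not measurements.

```python
def copy_bandwidth_fraction(payload_rate: float, memory_bandwidth: float) -> float:
    """Fraction of memory bandwidth consumed by one extra copy of the payload.

    Assumes the copy reads each byte once and writes it once, and that
    nothing is served from cache (a pessimistic simplification).
    """
    copy_traffic = 2 * payload_rate  # one read + one write per byte
    return copy_traffic / memory_bandwidth


WIRE_RATE = 1.25e9  # assumed payload rate: 10 Gb/s wire, in bytes/s

for mem_bw in (10e9, 20e9):  # the 10-20 GB/s range quoted above
    frac = copy_bandwidth_fraction(WIRE_RATE, mem_bw)
    print(f"{mem_bw / 1e9:.0f} GB/s memory: extra copy uses {frac:.1%}")
```

so even at full 10GE wirespeed the extra copy costs a minority of the memory bandwidth under these assumptions, and proportionally less at lower data rates.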
> It is also specced with a smaller MTU, which makes a fair
> difference. I believe there is movement afoot to change the spec to
> allow for a larger MTU, but I'm not an IB expert and don't follow it
MTU is another one of those things that got a rep for importance,
but which really only matters in certain circumstances. a bigger MTU
reduces the per-packet overhead. squinting at the table in question,
it appears to show ~300 MB/s on a single node. with 8k packets, that's
~40k pps, vs ~5k pps for a 64k MTU. seems like a big win, right? well,
except why assume each packet requires an interrupt?
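the packet-rate arithmetic above can be sketched as follows. the 300 MB/s figure is the one read off the table; the coalescing factor of 8 packets per interrupt is purely hypothetical, picked to illustrate the point, and the function names are mine.

```python
def packets_per_second(throughput_bytes_per_s: float, mtu_bytes: int) -> float:
    """Packet rate needed to carry a given throughput at a given packet size."""
    return throughput_bytes_per_s / mtu_bytes


def interrupts_per_second(pps: float, packets_per_interrupt: int) -> float:
    """Interrupt rate if several packets are handled per interrupt."""
    return pps / packets_per_interrupt


THROUGHPUT = 300e6  # ~300 MB/s, the single-node figure read off the table

for mtu in (8 * 1024, 64 * 1024):
    pps = packets_per_second(THROUGHPUT, mtu)
    print(f"{mtu // 1024:>2}k packets: ~{pps / 1e3:.1f}k pps")

# hypothetical coalescing factor: 8 packets handled per interrupt
small_mtu_pps = packets_per_second(THROUGHPUT, 8 * 1024)
irq_rate = interrupts_per_second(small_mtu_pps, 8)
print(f"8k packets, 8 per interrupt: ~{irq_rate / 1e3:.1f}k interrupts/s")
```

since 64k/8k = 8, handling 8 small packets per interrupt gives the 8k case the same interrupt rate as the 64k case with one interrupt per packet.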
reducing the overhead, whether through fewer copies or bigger MTUs
is certainly a good thing. these days, neither is necessarily essential
unless you're really, really pushing the limits. there are only a few
people in the universe (such as CERN, or perhaps the big telescopes)
who genuinely have those kinds of data rates. we're a pretty typical
supercomputing center, I think, and see only quite short bursts into
the GB/s range (aggregate, quadrics+lustre).
I'm genuinely curious: do you (anyone) have applications which sustain
many GB/s of either IPC or IO?
regards, mark hahn.