[Beowulf] Re: GlusterFS 1.2-BENKI (GNU Cluster File System) - Announcement (Mark Hahn)

Harshavardhana harsha at zresearch.com
Tue Feb 13 00:56:36 PST 2007


Hi Mark,
>>>> On IB - nfs works only with IPoIB, whereas glusterfs does SDP (and
>>>> ib-verbs,
>>>> from the source repository) and is clearly way faster than NFS.
>>>
>>> "clearly"s like that make me nervous.  to an IB enthusiast, SDP may be
>>> more aesthetically pleasing, but why do you think IPoIB should be
>>> noticeably
>>> slower than SDP?  lower cpu overhead, probably, but many people have no
>>> problem running IP at wirespeed on IB/10GE-speed wires...
>>
>> As I understand it, one reason why SDP is faster than IPoIB is that the
>> way IPoIB is currently spec'ed requires there be an extra copy relative
>> to SDP.
>
> that's what I meant by "cpu overhead".  but the point is that current
> CPUs have 10-20 GB/s of memory bandwidth hanging around, so it's not
> necessarily much of a win to avoid a copy.  even in olden days,
> it was common to show some workloads where hosts doing TCP checksumming
> actually _benefited_ performance by populating the cache.
>

"CPU Overhead"  it's a nice word to use in many of the cases. I am not
able to understand what are you trying to prove with IPoIB v/s SDP.  SDP
is better as seen with latency issues and these will help for many of the
Engg Applications in Aviation, Energy and Health Care Research
departments.
Imagine a 1 million Cell contact problems on LS-DYNA for STRESS analysis
needs a Higher Network I/O and Disk Speed. As they require days to
complete, with a small latency improvement even can bring up a larger gain
when the Jobs run for days.  Yes there are bottlenecks to the application
too comes into picture as the LS-DYNA doesn't scale well after running
24CPUS. But in a very big environment with 500odd machines. With 1000 of
users submitting their jobs helps a lot writing and communicating onto a
single shared directory through the master server's.
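
For anyone curious how an application actually opts into SDP: on OFED
stacks it is usually done either transparently, by preloading the
libsdp shim (LD_PRELOAD=libsdp.so) under an unmodified binary, or
explicitly in code. A minimal sketch of the explicit route, assuming an
OFED installation that registers SDP as address family 27 (the host and
port below are placeholders):

/* sdp_connect.c - minimal sketch of an explicit SDP connection.
 * Assumes an OFED stack that registers SDP as address family 27;
 * the constant is OFED-specific and not in standard headers. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

#ifndef AF_INET_SDP
#define AF_INET_SDP 27  /* OFED convention; verify against your sdp.h */
#endif

int main(void)
{
    /* only the address family differs from a normal TCP client */
    int fd = socket(AF_INET_SDP, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket(AF_INET_SDP)"); /* fails if sdp.ko isn't loaded */
        return 1;
    }

    struct sockaddr_in srv;
    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET_SDP;  /* some stacks want AF_INET here */
    srv.sin_port   = htons(7000);  /* placeholder port */
    inet_pton(AF_INET, "192.168.1.10", &srv.sin_addr); /* placeholder */

    if (connect(fd, (struct sockaddr *)&srv, sizeof(srv)) < 0) {
        perror("connect");
        close(fd);
        return 1;
    }
    /* read()/write() now bypass the kernel TCP/IP stack via SDP */
    close(fd);
    return 0;
}

The same code with AF_INET gives a plain TCP (IPoIB) connection, which
makes for an easy apples-to-apples latency comparison.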

>> It is also specced with a smaller MTU, which makes a fair
>> difference.  I believe there is movement afoot to change the spec to
>> allow for a larger MTU, but I'm not an IB expert and don't follow it
>> religiously.
>
> MTU is another one of those things that got a rep for importance,
> but which is really only true in certain circumstances.  bigger MTU
> reduces the per-packet overhead.  by squinting at the table in question,
> it appears to show ~300 MB/s on a single node.  with 8k packets, that's
> ~40K pps, vs ~5k pps for 64k MTU.  seems like a big win, right?  well,
> except why assume each packet requires an interrupt?
>
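For what it is worth, the packet-rate arithmetic quoted above is easy
to reproduce; a throwaway sketch using the ~300 MB/s single-node figure
from the table:

/* pps.c - back-of-the-envelope packet rates for a fixed throughput.
 * Reproduces the ~40K pps (8 KB) vs ~5K pps (64 KB) figures above. */
#include <stdio.h>

int main(void)
{
    const double bytes_per_sec = 300e6; /* ~300 MB/s, from the table */
    const int mtus[] = { 2048, 8192, 65536 }; /* 2 KB ~ classic IPoIB */

    for (size_t i = 0; i < sizeof mtus / sizeof *mtus; i++)
        printf("MTU %6d bytes -> %7.0f packets/s\n",
               mtus[i], bytes_per_sec / mtus[i]);
    return 0;
}
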
> reducing the overhead, whether through fewer copies or bigger MTUs
> is certainly a good thing.  these days, neither is necessarily essential
> unless you're really, really pushing the limits.  there are only a few
> people in the universe (such as CERN, or perhaps the big telescopes)
> who genuinely have those kinds of data rates.  we're a pretty typical
> supercomputing center, I think, and see only quite short bursts into
> the GB/s range (aggregate, quadrics+lustre).
>
> I'm genuinely curious: do you (anyone) have applications which sustain
> many GB/s either IPC or IO?
>
> regards, mark hahn.
>
Regarding NFS: it is widely used by companies around the world for
their clusters. Name one and I can show you their data centers running
SGI NFS servers with a filer per 100 nodes, e.g. such renowned names as
Intel, GE Global Research, Texas Instruments, Analog Devices, and many
more.

Benchmarking against NFS was meant to give the industry an idea of the
benefits of a parallel filesystem over their present environments. As
for Lustre, yes, we are preparing a benchmark against it as a follow-up
to the present NFS benchmark.

GlusterFS is trying to show that performance scales as nodes are added,
and it also has the advantage of living in userspace, which keeps the
software out of kernel policies; maintaining such code inside the
kernel has proved cumbersome in many cases.
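
Since "userspace" may sound abstract: the GlusterFS client runs on top
of FUSE, i.e. as an ordinary process answering VFS callbacks from the
kernel. To illustrate the mechanism only (this is a generic FUSE toy,
not GlusterFS code), a minimal read-only filesystem looks like this:

/* hellofs.c - toy FUSE filesystem: one read-only file, /hello.
 * Generic illustration of a userspace filesystem; not GlusterFS code.
 * Build: gcc hellofs.c -o hellofs `pkg-config fuse --cflags --libs` */
#define FUSE_USE_VERSION 26
#include <fuse.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>

static const char *hello_str  = "Hello from userspace\n";
static const char *hello_path = "/hello";

static int fs_getattr(const char *path, struct stat *st)
{
    memset(st, 0, sizeof(*st));
    if (strcmp(path, "/") == 0) {
        st->st_mode  = S_IFDIR | 0755;
        st->st_nlink = 2;
    } else if (strcmp(path, hello_path) == 0) {
        st->st_mode  = S_IFREG | 0444;
        st->st_nlink = 1;
        st->st_size  = strlen(hello_str);
    } else
        return -ENOENT;
    return 0;
}

static int fs_readdir(const char *path, void *buf, fuse_fill_dir_t fill,
                      off_t off, struct fuse_file_info *fi)
{
    if (strcmp(path, "/") != 0)
        return -ENOENT;
    fill(buf, ".", NULL, 0);
    fill(buf, "..", NULL, 0);
    fill(buf, hello_path + 1, NULL, 0);  /* skip leading '/' */
    return 0;
}

static int fs_open(const char *path, struct fuse_file_info *fi)
{
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    if ((fi->flags & O_ACCMODE) != O_RDONLY)
        return -EACCES;
    return 0;
}

static int fs_read(const char *path, char *buf, size_t size, off_t off,
                   struct fuse_file_info *fi)
{
    size_t len = strlen(hello_str);
    if (off >= (off_t)len)
        return 0;
    if (off + size > len)
        size = len - off;
    memcpy(buf, hello_str + off, size);
    return (int)size;
}

static struct fuse_operations fs_ops = {
    .getattr = fs_getattr,
    .readdir = fs_readdir,
    .open    = fs_open,
    .read    = fs_read,
};

int main(int argc, char *argv[])
{
    /* every stat/open/read on the mountpoint becomes a callback here */
    return fuse_main(argc, argv, &fs_ops, NULL);
}

Every stat/open/read on the mountpoint becomes a callback into such a
process, and a crash or upgrade of the process never takes the kernel
down with it.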

The present benchmark is meant to give people a first look at GlusterFS
and a starting point for working with it.

Regards & Thanks.
--
Harshavardhana

"For a successful technology,
 reality must take precedence over public relations,
 for nature cannot be fooled."



