[Beowulf] Network Filesystems performance

Michael Will mwill at penguincomputing.com
Thu Aug 23 13:43:10 PDT 2007

I tested several NFS server configurations with a 19 node cluster. The
first piece of advice is to stay away from Red Hat for file servers, since
they have some bursty I/O bugs and don't support XFS. We use SLES10 for
file servers. That is probably the only real advice you can generalize
here, so take the following with a grain of salt:

The following benchmark series was not conducted as marketing material
and was not optimized for fastest possible throughput, but rather to
highlight the impact of using SLES10 vs. RHEL4 as well as vanilla fibre
card + trunked onboard GigE vs. Montillo for otherwise identical default
configurations.

Note that we used sync NFS, not async. The test was streaming read/write
of large files (twice as much as the RAM in the client) with dd,
blocksize 1M (broken down by the OS to a smaller size). Each NFS server
was dual fibre attached to a Xyratex F5402E with two 6 drive RAID5
volumes (7.2k-rpm SATA drives), using LVM2 to stripe across the two LUNs.
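
For anyone who wants to repeat this kind of measurement without dd, here
is a minimal Python sketch of a streaming write/read test with 1 MiB
blocks. It is not the procedure used for the numbers in this post; the
function names and the fsync at the end of the write are my own
assumptions. As with dd, the file should be much larger than client RAM
(the post used twice the RAM) or the read figure mostly measures the page
cache.

```python
import os
import time

BLOCK_SIZE = 1024 * 1024  # 1 MiB, matching the dd blocksize mentioned above


def stream_write(path, total_bytes, block_size=BLOCK_SIZE):
    """Write total_bytes sequentially to path; return throughput in MB/s."""
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # assumption: count the time to reach the server
    elapsed = max(time.perf_counter() - start, 1e-9)
    return written / elapsed / 1e6


def stream_read(path, block_size=BLOCK_SIZE):
    """Read path sequentially in block_size chunks; return throughput in MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total / elapsed / 1e6
```

Point path at a file on the NFS mount and pass a total_bytes of about
twice the client's RAM to approximate the test described above.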

If write speed and scalability to a large number of nodes are important,
and your I/O patterns happen to match what I tested with dd (large file
streaming read/write), then the results might tell you that investing in
a Montillo RapidFile NFS offloading engine pays off. If read speed is
your only concern, you can do better without.

1. NFS server with SLES10, QLA2462, XFS, gige 4x 1G port trunked
- Single client to NFS server is about 40MB/s write, 80MB/s read
- 19 compute nodes in parallel: 25MB/s write aggregate and 109MB/s read
- Single dd within the NFS server directly to the fibre attached
filesystem: 148MB/s write, 177MB/s read

2. NFS server with SLES10+Montillo RapidFile NFS offloading engine, XFS,
2x 1G port trunked (mode0)
- Single client to NFS server 85 MB/s write, 95MB/s read
- 19 compute nodes in parallel: 43MB/s write aggregate, 90 MB/s read
- Single dd within the NFS server directly to the fibre attached
filesystem: 140MB/s write, 220MB/s read

3. NFS server with RHEL4+Montillo+ext3 otherwise as above:
- Single client to NFS server 24MB/s write, 54MB/s read
- 19 compute nodes in parallel: 29MB/s write aggregate, 69MB/s read
- Single dd within the NFS server directly to the fibre attached
filesystem: 78MB/s write, 84MB/s read
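
To make the gap between NFS and local throughput easier to see, the
numbers above can be put side by side as a percentage of the local dd
result. This is only arithmetic on the figures already listed; the
dictionary layout and the function name are made up for illustration.

```python
# (write MB/s, read MB/s) for each configuration, copied from the list above.
RESULTS = {
    1: {"single": (40, 80), "parallel19": (25, 109), "local": (148, 177)},
    2: {"single": (85, 95), "parallel19": (43, 90), "local": (140, 220)},
    3: {"single": (24, 54), "parallel19": (29, 69), "local": (78, 84)},
}


def percent_of_local(config, case):
    """Return (write %, read %) of NFS throughput relative to local dd."""
    nfs_write, nfs_read = RESULTS[config][case]
    local_write, local_read = RESULTS[config]["local"]
    return (round(100 * nfs_write / local_write),
            round(100 * nfs_read / local_read))


for config in sorted(RESULTS):
    for case in ("single", "parallel19"):
        write_pct, read_pct = percent_of_local(config, case)
        print(f"config {config} {case}: write {write_pct}% of local, "
              f"read {read_pct}% of local")
```

Note that configuration 3 looks relatively good by this measure only
because its local throughput is so much lower to begin with.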

It's kind of frustrating that the NFS network bandwidth is so far below
what we see locally on the fibre attached filesystem. It can only
partially be explained by the clients' dual onboard NICs having one odd
and one even MAC address each, which means using eth0 on all compute
nodes results in hashing onto the same server-trunked GigE port...
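
The MAC parity effect can be illustrated with a small sketch. It assumes
a simple layer-2 hash of the form (source MAC XOR destination MAC) modulo
the number of trunk ports, which is a common policy but not necessarily
the one used by the switch or bonding mode here. All MAC addresses below
are hypothetical, chosen so that every client's eth0 has an even address.

```python
def mac_to_int(mac):
    """Convert a colon-separated MAC address string to an integer."""
    return int(mac.replace(":", ""), 16)


def trunk_port(src_mac, dst_mac, n_ports):
    """Assumed layer-2 hash: (src XOR dst) modulo the number of trunk ports."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % n_ports


# Hypothetical addresses: 19 clients whose eth0 MACs are all even.
SERVER_MAC = "02:00:00:00:01:01"
CLIENT_ETH0 = [f"02:00:00:00:00:{0x10 + 2 * i:02x}" for i in range(19)]


def ports_used(n_ports):
    """Return the sorted trunk ports that the 19 eth0 clients hash onto."""
    return sorted({trunk_port(mac, SERVER_MAC, n_ports) for mac in CLIENT_ETH0})


print("2-port trunk, ports used:", ports_used(2))
print("4-port trunk, ports used:", ports_used(4))
```

Under this assumed hash, clients with same-parity MACs all land on one
port of a 2-port trunk and on at most two ports of a 4-port trunk, so
part of the trunk sits idle no matter how many clients are active.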


-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org]
On Behalf Of Glen Dosey
Sent: Thursday, August 23, 2007 10:12 AM
To: Beowulf
Subject: Re: [Beowulf] Network Filesystems performance

Perhaps I should just ask the simple question. Does anyone on-list
obtain greater than 40 MB/s performance from their networked filesystems
(when the file is not already cached in the server's memory)?

(Yes it's a loaded question because if you answer affirmatively, then I
know who to interrogate with further questions :)
