wsb at paralleldata.com
Fri Oct 5 11:55:34 PDT 2001
Tim Carlson wrote:
> On Fri, 5 Oct 2001, W Bauske wrote:
> > Tim Carlson wrote:
> > >
> > > I am willing to be enlightened as to how my test is flawed. I'll run
> > > different tests if asked. Is my test too trivial?
> > >
> > Add some other net traffic to the NIS server. Basically giving NIS
> > the whole bandwidth of your network is unrealistic, at least for things
> > I do. For example, if your NIS server happens to also be an NFS file
> > server, also start a couple of reads/writes to the NFS file systems and
> > repeat your experiment and see how that affects your delays.
> Then I should rename this thread.. "Re: NFS?".. or "we all know NFS sucks"
> I think it is hard to design a good experiment in this case. If I
> simultaneously create 120Meg files on all six nodes.. effectively taking
> all the network traffic :)
> foreach node (`echo compute-0-0 .... compute-0-5`)
>   rsh $node dd if=/dev/zero bs=1024k count=120 of=/some/file/in/nfs &
> end
> and run my previous test of executing 100 rsh's and ls'ing a directory
> that requires a couple of NIS lookups, then my experiment time doubles but
> finishes before any of the NFS writes have completed on the compute
> nodes. The NFS writes complete in about 125 seconds so I end up with
> around 50Mb/s (7200Mb/125s) of NFS traffic on my 100Mb/s network. Somebody
> will point out that 10Mb \neq 1MB. There is some slop in this calculation.
> I think the above is a poorly designed experiment. Can you really learn
> anything? All it really says is that if NFS is working as hard as it can,
> you can still punch through a bunch of rsh's and do some NIS things.
OK. Load the network with netpipe or netperf. Actually, to be clear, I use
NIS to handle local host names and that is what causes my NIS problem.
User IDs are not the problem on my systems. NIS would take down my parallel
jobs every day if I didn't rsync files.
> FYI, syslogd is taking more CPU time on the master node
> logging all of the rsh/pam data than either ypserv or nfsd.
CPU time isn't the problem. Network bandwidth/delay is.
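
For anyone who wants to check the NFS throughput figure quoted above, here is a rough sketch of the arithmetic. It assumes that dd's "bs=1024k" means 1024*1024 bytes per block and that "Mb" means 10^6 bits; neither is spelled out in the thread. The constant and function names are made up for illustration. The "10 bits per byte" line reproduces what appears to be the shortcut behind the 7200Mb figure.

```python
# Rough check of the NFS throughput estimate quoted in the thread.
# Assumptions (not stated in the original post):
#   - dd's "bs=1024k" is 1024 * 1024 bytes per block
#   - "Mb" means 10**6 bits
# All names below are illustrative only.

NODES = 6                  # compute-0-0 through compute-0-5
BLOCK_BYTES = 1024 * 1024  # bs=1024k
BLOCKS_PER_NODE = 120      # count=120
ELAPSED_SECONDS = 125      # reported time for the NFS writes to finish


def throughput_mbit_per_s(total_bytes, seconds, bits_per_byte=8):
    """Average throughput in megabits per second (1 Mb = 10**6 bits)."""
    return total_bytes * bits_per_byte / seconds / 1e6


total_bytes = NODES * BLOCK_BYTES * BLOCKS_PER_NODE

# Payload only, 8 bits per byte.
exact = throughput_mbit_per_s(total_bytes, ELAPSED_SECONDS)

# The post's apparent shortcut: 720 MB treated as 7200 Mb (10 bits per byte).
shortcut = 7200 / ELAPSED_SECONDS

print(f"8 bits/byte:  {exact:.1f} Mb/s")
print(f"10 bits/byte: {shortcut:.1f} Mb/s")
```

Under these assumptions the two estimates come out at about 48 Mb/s and 57.6 Mb/s, which brackets the "around 50Mb/s" quoted above and shows how much slop the 10Mb-per-MB shortcut adds.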