[Beowulf] MPI and Redhat9 NFS slow down
Kumaran Rajaram kums at mpi.mpi-softtech.com
Tue Aug 24 07:30:41 PDT 2004
Jack,

If your application does not use MPI-IO to read from or write to NFS,
you may disable the 'noac' mount option. The 'noac' mount option
prevents the client from caching file attributes. This means that every
file operation on the client that requires file attribute information
results in a GETATTR operation to retrieve the file's attributes from
the server.

You can also try different values for the rsize and wsize mount options,
and try enabling jumbo frames on the GigE switch and setting the MTU of
eth0 to match.

You can try diagnosing the NFS performance with the 'nfsstat' utility.
You can also ask on 'nfs at lists.sourceforge.net'; they might give you
some valuable suggestions.

-Kums
__
Kumaran Rajaram
Verari Systems, Inc.
Phone: 205-314-3471 x208

On Mon, 23 Aug 2004, Jack Chen wrote:

> Hi all,
>
> I'm not sure if this is the right place to post this question. If it
> is not, please tell me where the best place to get help on this is,
> thanks.
>
> We recently built an 8-node PC Linux cluster running RedHat 9 (kernel:
> 2.4.20-8smp #1 SMP). We use this system to run EPA's CMAQ
> photochemical grid model. I have installed the latest MPICH 1.2.6
> with Portland Group Compiler (5.2-1) using ssh. Everything worked
> fine with the MPI example programs (cpi, pi3p, etc.) and 'make testing'.
> However, when I try to run any program that writes output to other
> NFS-mounted drives, I get a very long delay. I'm not sure where the
> problem is. I know the NFS automount is working fine because if I
> start the job with just one processor (mpirun -np 1), I don't
> experience the slowdown.
>
> For example:
> If I start the job on the master node using 4 processors (mpirun -np 4)
> and write to the master node (master2 0),
> PIxxx file:
> master2 0 /master2/home/chenj/CMAQ_v4.3/Run/cctm/CCTM_e2a
> node103 1 /master2/home/chenj/CMAQ_v4.3/Run/cctm/CCTM_e2a
> node103 1 /master2/home/chenj/CMAQ_v4.3/Run/cctm/CCTM_e2a
> node104 1 /master2/home/chenj/CMAQ_v4.3/Run/cctm/CCTM_e2a
>
> the run takes 168 sec.
>
> If I start the same job but write the output to any NFS-mounted
> drive other than the master node, the job is extremely slow. In
> this case the same job took 10962 sec.
>
> I have tried mounting the drive with different parameters (rw,soft
> and rw,hard,bg,intr,noac) and increased the number of nfsd daemons
> from 8 to 16 on the NFS server, but nothing changed.
>
> If you have any idea what is going on, please help!
>
> Any help or suggestions are greatly appreciated.
>
> Jack
>
> Jack Chen
> Laboratory for Atmospheric Research
> Dept. of Civil & Environmental Engineering
> Washington State University
> Pullman, WA 99164-2910
> 509.335.5738
> 509.335.7632 (FAX)
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>
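As a footnote to the 'noac', rsize and wsize suggestions in the reply, here is a minimal sketch of how one might check which NFS mounts on a node currently carry 'noac' and what transfer sizes they use. It assumes the kernel reports these options in /proc/mounts-style text, which may not hold on every kernel version. The function names, the SAMPLE text, and the export and mount-point names in it are hypothetical illustrations, not taken from Jack's actual configuration.

```python
# Sketch: scan /proc/mounts-style text for NFS entries and report whether
# 'noac' is set and which rsize/wsize values are in effect.
# SAMPLE is made-up illustration data, not output from a real cluster.

SAMPLE = """\
/dev/sda1 / ext3 rw 0 0
master2:/home /master2/home nfs rw,hard,bg,intr,noac,rsize=8192,wsize=8192 0 0
"""


def parse_options(option_field):
    """Turn 'rw,noac,rsize=8192' into {'rw': '', 'noac': '', 'rsize': '8192'}."""
    options = {}
    for item in option_field.split(","):
        key, _, value = item.partition("=")
        options[key] = value
    return options


def parse_nfs_mounts(mounts_text):
    """Return one dict per NFS mount found in /proc/mounts-style text."""
    mounts = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = parse_options(fields[3])
        mounts.append({
            "export": fields[0],
            "mount_point": fields[1],
            "noac": "noac" in options,
            "rsize": int(options["rsize"]) if options.get("rsize") else None,
            "wsize": int(options["wsize"]) if options.get("wsize") else None,
        })
    return mounts


def report(mounts_text):
    """Print a one-line summary for each NFS mount."""
    for m in parse_nfs_mounts(mounts_text):
        flag = "noac SET (attribute caching off)" if m["noac"] else "noac not set"
        print(f'{m["mount_point"]}: {flag}, rsize={m["rsize"]}, wsize={m["wsize"]}')


report(SAMPLE)
```

On a real node one would pass the contents of /proc/mounts instead of SAMPLE, for example `report(open("/proc/mounts").read())`, and run it on each compute node, since mount options can differ from node to node.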
