[Beowulf] MPI and Redhat9 NFS slow down
David Mathog mathog at mendel.bio.caltech.edu
Tue Aug 24 08:51:05 PDT 2004
- Previous message: [Beowulf] MPI and Redhat9 NFS slow down
- Next message: [Beowulf] Regarding Beowulfery
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Jack Chen chimou at mail.wsu.edu wrote:

>If I start the same job but write the output to any other nfs mounted
>drives besides the master node, the job will be extremely slow. In
>this case the same job took 10962 sec.

There could be a lot of different things going on here. Off the top of my head:

1. If the NFS is served from a slave node AND that slave node is running your code as well as managing the disks, AND your code sucks up 99.9% of the CPU time, the code itself could be competing with the NFS daemon. Try it again with the "other node" that serves the NFS disks NOT running your code locally.

2. Are the disks on the master node and "other node" the same? If you have SCSI 320 disks on the master and el cheapo ATA on the slave you might see an effect like this. Ditto if the disk buffers are radically different in size. Ditto if the system memory available is much larger on one than the other (because of the built-in disk caching in Linux).

3. Have you verified that the bandwidth you can achieve from "other node" -> "master node" == "other node 1" -> "other node 2"?

4. I have often observed code that writes a lot of small messages to the output file at a very high rate. This sort of code tends to overwhelm anything marginal in a network/NFS configuration. If it's possible, try reconfiguring the program to write output to the local disk (/tmp/filename) and then, when done, copy the completed output files in one operation from that disk to the final location. Note, this works best if the nodes complete asynchronously. If they all finish simultaneously then you'll want to copy the output files sequentially (more or less; you may be able to do 2 or 3 at once without penalty). Otherwise it will be less efficient, as the NFS server disk heads jump all over the place trying to write 8 files at once.

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
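
The suggestion in point 4 (write to local disk, then copy once) could look roughly like the Python sketch below. It is only an illustration, not code from the original job: the function name `write_via_local_scratch`, its parameters, and the use of the system temp directory as local scratch are all assumptions. It does not serialize the final copies across nodes; that would have to be arranged separately (for example by the job script).

```python
import os
import shutil
import tempfile


def write_via_local_scratch(final_path, records, scratch_dir=None):
    """Write many small records to local disk, then copy the file once.

    Hypothetical helper: `final_path` would be on the NFS mount,
    `scratch_dir` on a node-local disk (defaults to the system temp dir).
    """
    scratch_dir = scratch_dir or tempfile.gettempdir()  # usually /tmp
    fd, local_path = tempfile.mkstemp(dir=scratch_dir, suffix=".out")
    try:
        # All the small, frequent writes hit the local disk only.
        with os.fdopen(fd, "w") as out:
            for record in records:
                out.write(record)
        # One bulk transfer to the final (e.g. NFS) location.
        shutil.copyfile(local_path, final_path)
    finally:
        # Clean up the scratch copy whether or not the transfer worked.
        os.remove(local_path)
    return final_path
```

Each node would call this with its own destination file, passing whatever iterable produces its output lines. If all nodes tend to finish at the same moment, the copies would still need to be staggered as described in point 4.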
