[Beowulf] NFS cache vs. local reading
YXU11 at PARTNERS.ORG
Fri Sep 15 06:51:55 PDT 2006
I maintain a cluster that uses NFS and LSF. One user needs to run a large
number of jobs on only a few nodes. Each of his jobs reads a great deal of data
from the home directory, which is shared and mounted on every compute node.
The data files are often the same from job to job, but they change after every
100 jobs finish. If every job on the compute node(s) reads straight from the
home directory, it goes through NFS and the network to get the file. That seems
like a lot of wasted effort. So I suggested using a script (with a job array and
LSB_JOBINDEX) to decide whether to copy the data to local disk: the first job
of every 100 jobs copies it, and the remaining 99 jobs just read from the
local disk.
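
A minimal sketch of the staging logic described above, written in Python rather
than as an LSF shell script. It assumes the array indices start at 1 (for
example a job array submitted as name[1-1000]) and that LSB_JOBINDEX is set in
the job's environment. The function names, the batch-numbered local file name,
and the fallback copy when the local file is missing are illustrative
assumptions, not part of the original suggestion.

```python
import os
import shutil
import tempfile

BATCH_SIZE = 100  # from the post: the data changes after every 100 jobs


def job_index_from_env():
    """Read the array index that LSF exports; default to 1 if unset (assumption)."""
    return int(os.environ.get("LSB_JOBINDEX", "1"))


def batch_of(job_index, batch_size=BATCH_SIZE):
    """Return (batch number, whether this job is the first of its batch).

    Assumes array indices start at 1.
    """
    zero_based = job_index - 1
    return zero_based // batch_size, zero_based % batch_size == 0


def stage(shared_path, local_dir, job_index, batch_size=BATCH_SIZE):
    """Return a local path to the data, copying from the NFS path if needed."""
    batch, is_first = batch_of(job_index, batch_size)
    name = "batch%d_%s" % (batch, os.path.basename(shared_path))
    local_path = os.path.join(local_dir, name)

    # Copy when this is the first job of the batch, or when no local copy
    # exists yet (for example, the job landed on a different node).
    if is_first or not os.path.exists(local_path):
        os.makedirs(local_dir, exist_ok=True)
        fd, tmp_path = tempfile.mkstemp(dir=local_dir)
        os.close(fd)
        shutil.copyfile(shared_path, tmp_path)
        # Rename into place so other jobs never read a half-copied file.
        os.replace(tmp_path, local_path)
    return local_path
```

A job would call something like stage(shared_file, scratch_dir,
job_index_from_env()) and then open the returned path, where shared_file and
scratch_dir are hypothetical placeholders for the NFS file and a node-local
directory. The missing-file fallback is there because jobs in an array may
start before the first job has finished copying, or may run on another node.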
My question is: since NFS also has a cache, how much will this approach
actually improve performance? If I were NFS and I were smart enough, I should
be able to tell that I am reading the same file over and over again. In that
case, will the NFS cache size matter?
Can somebody comment?