[Beowulf] automount on high ports

Perry E. Metzger perry at piermont.com
Wed Jul 2 04:50:48 PDT 2008


Tim Cutts <tjrc at sanger.ac.uk> writes:
> On 2 Jul 2008, at 8:26 am, Carsten Aulbert wrote:
>
>> OK, we have 1342 nodes which act as servers as well as clients. Every
>> node exports a single local directory and all other nodes can mount
>> this.
>>
>> What we do now to optimize the available bandwidth and IOs is spread
>> millions of files according to a hash algorithm to all nodes (multiple
>> copies as well) and then run a few 1000 jobs opening one file from one
>> box then one file from the other box and so on. With a short autofs
>> timeout that ought to work. Typically it is possible that a single
>> process opens about 10-15 files per second, i.e. making 10-15 mounts
>> per
>> second. With 4 parallel process per node that's 40-60 mounts/second.
>> With a timeout of 5 seconds we should roughly have 200-300 concurrent
>> mounts (on average, no idea abut the variance).
>
> Please tell me you're not serious!  The overheads of just performing
> the NFS mounts are going to kill you, never mind all the network
> traffic going all over the place.
>
> Since you've distributed the files to the local disks of the nodes,
> surely the right way to perform this work is to schedule the
> computations so that each node works on the data on its own local
> disk, and doesn't have to talk networked storage at all?  Or don't you
> know in advance which files a particular job is going to need?

Perhaps it makes sense given their job load. Perhaps it doesn't.

If they need access to far more storage than a single node can hold,
it might make sense. If individual nodes need lots of I/O but only on
a very rare basis, so the disk bandwidth would be unused on most
nodes most of the time if they were doing everything locally, perhaps
it might make sense. I'll agree that it isn't an obviously good
solution to most workloads, but we don't really know what their
workload is like so we can't say that this is a bad move ab initio.


Perry



More information about the Beowulf mailing list