Jakob Oestergaard jakob at unthought.net
Thu Jan 23 23:57:39 PST 2003


On Wed, Jan 22, 2003 at 10:52:31AM -0500, Karl Bellve wrote:
> 
> I am running into a little problem with multiple writes to a single 
> file via NFS.

OK, first of all, that sounds like a bad idea.

Why not have each node write its own file, and run a "cat node.* >
bigfile" afterwards?

Quadratisch, praktisch, gut (square, practical, good)  ;)
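
To make the one-file-per-node idea concrete, here is a minimal sketch in C. The function name write_node_result(), the directory argument, and the zero-padded "node.NNNN" naming are my assumptions for illustration, not anything from the original application; the padding is only there so that the shell glob "node.*" expands in numeric order.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all of buf to fd, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Store one node's result in its own file "<dir>/node.<id>".
 * Hypothetical helper: the name and the %04d padding are assumptions. */
int write_node_result(const char *dir, int node_id,
                      const void *buf, size_t len)
{
    char path[4096];
    int n = snprintf(path, sizeof path, "%s/node.%04d", dir, node_id);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;

    /* Each node is the only writer of its file, so no lock is needed. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    int rc = write_all(fd, buf, len);
    if (rc == 0 && fsync(fd) != 0)   /* flush the data out of the client cache */
        rc = -1;
    if (close(fd) != 0)              /* check close() too; errors can surface here */
        rc = -1;
    return rc;
}
```

Once every node has finished, "cat node.* > bigfile" on one machine assembles the pieces in node order.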

> An application is spawned on a number of nodes. When they are done, they 
> all write to a specific, but non-overlapping area of the NFS mounted 
> file.

If the parts are non-overlapping, I assume that the offset and data
length of each node's write are fixed - correct?

> I use fcntl(fd, F_SETLKW, &lck) to lock the file, or wait until it 
> can lock the file for writing. fcntl() is capable of locking across NFS. 

If I was correct above - why do you need to lock the file?

An lseek() + write() (or a single pwrite()) should do the trick as I see
it - but maybe there's something I don't see  :)
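
A rough sketch of the lock-free variant, using pwrite(), which does the seek and the write in one call. It assumes what I guessed above: every node writes a record of the same fixed length, at offset node_id * record_len. The function name write_record_at() and that offset rule are mine, not taken from the original program.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write one node's fixed-size record into the shared file without locking.
 * Assumption: records are record_len bytes each and never overlap.
 * Returns 0 on success, -1 on error. */
int write_record_at(const char *path, int node_id,
                    const void *buf, size_t record_len)
{
    /* No O_TRUNC: other nodes may already have written their parts. */
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    off_t off = (off_t)node_id * (off_t)record_len;
    size_t left = record_len;
    int rc = 0;

    while (left > 0) {
        /* pwrite() takes the offset explicitly and does not move the
         * file position, so there is no separate seek step. */
        ssize_t n = pwrite(fd, p, left, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            rc = -1;
            break;
        }
        p += n;
        off += n;
        left -= (size_t)n;
    }

    if (rc == 0 && fsync(fd) != 0)   /* flush before reporting success */
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

The order in which the nodes finish does not matter here, since each record lands at its own offset.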

> However, some nodes fail to write their result to the file. It isn't the 
> same nodes every time. I am not seeing any write errors. I tend to think 
> it is an NFS caching issue. All writes get flushed before releasing the 
> lock via fsync() and close().
> 
> The fileserver is a Redhat 8.0 system. I upgraded to the latest kernel 
> offered for RH8.0. That didn't fix the problem. I compiled a new kernel 
> (2.4.20) and that didn't fix the problem. The nodes are Alphas running 
> RH6.2.
> 
> I am thinking about alternate means of locking but fcntl() should do the 
> trick.

I completely agree with you that locking should work - and you have
already received many good suggestions from fellow 'wolfers on how to
test/check/improve the locking on your systems.
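
For anyone who wants a small test case for the locking path, the sequence described in the quoted text (blocking lock, write, fsync, unlock) might look roughly like this. This is my own sketch, not Karl's code: the name locked_write() is made up, and I lock only the byte range being written, whereas the original may lock the whole file.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Lock [off, off+len), write buf there, flush, then unlock.
 * Returns 0 on success, -1 on error. */
int locked_write(int fd, off_t off, const void *buf, size_t len)
{
    if (len == 0)
        return 0;            /* l_len == 0 would mean "lock to end of file" */

    struct flock lck;
    memset(&lck, 0, sizeof lck);
    lck.l_type = F_WRLCK;    /* exclusive (write) lock */
    lck.l_whence = SEEK_SET;
    lck.l_start = off;
    lck.l_len = (off_t)len;

    /* F_SETLKW blocks until the lock is granted; retry if a signal
     * interrupts the wait. */
    while (fcntl(fd, F_SETLKW, &lck) == -1) {
        if (errno != EINTR)
            return -1;
    }

    const char *p = buf;
    off_t pos = off;
    size_t left = len;
    int rc = 0;
    while (left > 0) {
        ssize_t n = pwrite(fd, p, left, pos);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            rc = -1;
            break;
        }
        p += n;
        pos += n;
        left -= (size_t)n;
    }

    /* Flush while the lock is still held, then release it. */
    if (rc == 0 && fsync(fd) != 0)
        rc = -1;
    lck.l_type = F_UNLCK;
    if (fcntl(fd, F_SETLK, &lck) == -1)
        rc = -1;
    return rc;
}
```

Checking every return value (including fsync() and the unlock) is the main point; a node that "fails to write" without any visible error may simply be a node whose error went unchecked.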

What I'm curious about is whether you need locking at all.  While it
should of course work, avoiding it would solve the problem completely.

-- 
................................................................
:   jakob at unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:
