[Beowulf] network filesystem
Robert Latham robl at mcs.anl.gov
Tue Mar 6 10:44:10 PST 2007
On Tue, Mar 06, 2007 at 11:09:18AM -0500, Mark Hahn wrote:
> >I would contend that writing to different sections of a file *must* be
> >supported by any file system deployed on a cluster. How else would
> >you get good performance from MPI-IO?
>
> who uses MPI-IO? straight question - I don't believe any of our 1500 users
> do.

Excellent question. Direct users? Probably not very many. We do find that straight-up MPI-IO isn't a good fit for a lot of scientific applications. The convenience factor you mentioned is indeed important. MPI-IO thinks of data as a "stream of bytes", while applications think in terms of "multidimensional typed data" (a slice of upper atmosphere). Libraries like Parallel-HDF5 and Parallel-NetCDF bridge the gap and provide a convenient, familiar API. The app is still using MPI-IO, just not directly.

> NFS certainly does as well. you just have to know the constraints.
> are you saying you can never get pathological or incorrect results from
> parallel operations on the same file on any of those FS's?

You observe correctly that file systems offer a set of rules on what to expect from I/O patterns. These consistency semantics are not set in stone: MPI-IO consistency semantics are more relaxed than POSIX, yet generally sufficient for parallel scientific applications. We would consider it a serious bug in PVFS if simultaneous non-overlapping writes corrupted data. If the only file system I had access to was NFS, I'd do one file per process as well.

> starting with the question: "do you have a good reason to be writing in
> parallel to the same file?". I'm not saying the answer is never yes.
>
> I guess I tend to value portability by obscurity-avoidance. not if it makes
> life utter hell, of course, but...

One file per processor falls down on systems like BGL (where even a small run is 1024 processes, and 128k is not unheard of). One file per process also robs the higher layers of the I/O software stack of an opportunity to optimize access patterns.
All processes reading a column out of a row-major array is noncontiguous (and generally slow) in file-per-processor, but can be contiguous in single-file after applying data shipping or two-phase collective buffering optimizations. Jeff touched on the data management issues of file-per-processor.

If file-per-processor really is the most portable and convenient way to work on data, well, I can't argue with that. On NFS, that's probably the only way to get correct results. The single-file approach, however, has significant benefits on the modern parallel file systems available today.

As I hope you could tell, this kind of discussion is a lot of fun for me. Thanks!

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
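
The point in the message above about reading a column out of a row-major array can be illustrated with plain offset arithmetic. The sketch below is not MPI-IO and not how any particular MPI-IO implementation works internally; the function names, the 4x4 array shape, and the 8-byte item size are all assumptions chosen for illustration. It only shows that each process's column request is a set of scattered small reads, while the union of all processes' requests covers one contiguous byte range, which is the property a collective (two-phase) optimization can exploit.

```python
def column_offsets(n_rows, n_cols, col, itemsize):
    """Byte offsets of one column of a row-major array stored in a file."""
    return [(row * n_cols + col) * itemsize for row in range(n_rows)]


def contiguous_runs(offsets, itemsize):
    """Merge item offsets into (start, length) runs of adjacent bytes."""
    runs = []
    for off in sorted(offsets):
        if runs and runs[-1][0] + runs[-1][1] == off:
            # This item starts where the previous run ends: extend the run.
            runs[-1] = (runs[-1][0], runs[-1][1] + itemsize)
        else:
            runs.append((off, itemsize))
    return runs


# Hypothetical layout: a 4x4 array of 8-byte values, one process per column.
n_rows, n_cols, itemsize = 4, 4, 8
per_process = [column_offsets(n_rows, n_cols, c, itemsize) for c in range(n_cols)]

# Seen independently, each process needs one small read per row.
independent = [contiguous_runs(offs, itemsize) for offs in per_process]

# Seen collectively, the combined request is a single contiguous range.
combined = contiguous_runs([o for offs in per_process for o in offs], itemsize)

print("runs per process:", [len(r) for r in independent])
print("combined runs:", combined)
```

With these assumed sizes, each process on its own issues four separate 8-byte accesses, whereas the combined request is a single 128-byte range that a few processes could read in large blocks and then redistribute.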
