Leandro Tavares Carneiro
leandro at ep.petrobras.com.br
Thu Oct 31 08:55:45 PST 2002
Thanks for your reply. I will comment in the text below.
Robert G. Brown wrote:
> You have at least a limited ability to tune filesystem block sizes and
> other parameters for good high bandwidth performance at the kernel
> level, before fortran or anything else sees it. Read:
> man hdparm
> man tune2fs
> man mke2fs
The filesystem is already tuned. On the SGI machine, we are testing block
sizes bigger than 4k, such as 8k and 16k, and the performance improves a little.
> I think you'll want to create your partitions with -T largefile4, 4K
> blocksize (the maximum, which I recall is also the kernel page size
> which is probably WHY it is the maximum) which allocates one inode for
> every 4 MB of disk.
> There are also wsize,rsizes that can be adjusted in nfs/fstab that
> adjust how data is buffered and written back to NFS mounts.
All the filesystems used by this application are local.
> If you want to improve and tune any system with this sort of thing, be
> sure to really read all about it and proceed with extreme caution -- a
> lot of things you might try will destroy your filesystem(s) or totally
> wreck performance. The default tuning is probably a lot more efficient
> than you think, and the advantage of monkeying with it probably minimal.
> Is fortran really all that inefficient at reading in blocks from disk?
> Could you do something like build a big ramdisk and preload the files
> into it and read them from there? Disk itself is always "slow" in terms
> of random access latency and kernel overhead on a stat and open, but a
> fast scsi or ide disk can deliver large datafiles to an application at
> 10's of MB/sec which isn't totally horrible. Except compared to memory,
> of course.
We can't use ramdisks because the amount of data that needs to be loaded is
huge: about 100Gb of information is needed to process the data. These are
tables with a lot of information; they are loaded on demand, and which tables
are loaded depends on the data being processed.
Right now we are trying to run this on an SGI machine, but this application
will also run on our clusters, with a different form of parallelism.
The development team is looking for a way to write a routine in C that reads
with different block sizes, like "dd" does, but they are Fortran specialists,
and they are looking for help...
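As a starting point for that kind of routine, here is a minimal sketch in C,
assuming a POSIX system with open(), read() and close(). The function name
read_in_blocks and its return convention are hypothetical, chosen only for
illustration. It reads a file from start to end in chunks of a caller-chosen
size, roughly what "dd bs=..." does on the input side, and returns the number
of bytes read so that different block sizes can be timed against each other.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: read the file at `path` sequentially, asking the
 * kernel for `block_size` bytes per read() call.
 * Returns the total number of bytes read, or -1 on any error.
 */
long long read_in_blocks(const char *path, size_t block_size)
{
    long long total = 0;
    ssize_t n;
    char *buf;
    int fd;

    if (block_size == 0)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    buf = malloc(block_size);
    if (buf == NULL) {
        close(fd);
        return -1;
    }

    /* read() may return fewer bytes than requested; 0 means end of file. */
    while ((n = read(fd, buf, block_size)) > 0) {
        /* A real routine would hand buf[0..n-1] to the application here. */
        total += n;
    }

    free(buf);
    close(fd);
    return (n < 0) ? -1 : total;
}
```

Calling read_in_blocks(path, 4096), then with 8192 and 16384, and timing each
call gives a rough comparison of block sizes. Calling it from Fortran would
need a small wrapper, and the symbol naming and string-passing conventions
for that wrapper depend on the Fortran compiler in use.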
Leandro Tavares Carneiro
Analista de Suporte