[Beowulf] Re: file IO benchmark - scalable storage
Ole W. Saastad ole at scali.com
Thu Nov 24 05:09:51 PST 2005
- Previous message: RS: RS: [Beowulf] Sempron compile optimization
- Next message: FW: [Beowulf] file IO benchmark
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Temporary storage during a run can be placed in different locations. The most common choice is local disks within the compute node. Configuring two or more disks in a compute node as RAID 0 (failover is not an issue for scratch areas) to get maximum bandwidth may yield somewhat over 100 MBytes/sec, in some cases more, depending on the number of disks and the disk controller within the node. Cheap compute nodes do not have high-performance RAID controllers.

On the other hand, you might share a pool of fast scratch space by using a parallel file system (LUSTRE and IBRIX are just two examples) with a cluster of file servers. This will yield close to wire speed for Gigabit Ethernet (approx. 115 MBytes/sec) per server, or 450-500 MBytes/sec if you have a parallel file system with InfiniBand support.

The file server cluster must be capable of delivering the needed performance. This is another issue, but as these file systems do scale, it is just a question of more hardware (up to a limit).

If you have a large cluster running several instances of an application that uses scratch space only now and then, chances are that their IO phases do not overlap in time, and all bandwidth to and from the file server cluster can be used by a single application instance. 450 MBytes/sec per server is more than is normally seen for local disk storage. A file server cluster of 32 file servers running a parallel file system can serve 32 nodes at 450 MBytes/sec each. In a 128-node cluster, the other 96 compute nodes are in a part of the application where they do no file access, and hence leave all bandwidth to the currently IO-intensive activity.

The argument of course breaks down if all 128 nodes do IO to the scratch area at the same time. Then local disks are the only truly scalable solution, but it is expensive to equip all compute nodes with several striped disks just to get scratch area performance.

-- 
Ole W. Saastad, Dr.Scient.
Manager Cluster Expert Center
dir. +47 22 62 89 68
fax. +47 22 62 89 51
mob. +47 93 05 74 87
ole at scali.com
Scali - www.scali.com
Scaling the Linux datacenter
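
The 32-server example in the message can be checked with a little arithmetic. The sketch below is only a rough model, not a benchmark: it assumes the file servers share their aggregate bandwidth evenly among the nodes doing IO, that a single client cannot exceed the per-server rate of 450 MBytes/sec, and that a local RAID 0 scratch area delivers a flat 100 MBytes/sec. The function name and the even-sharing model are illustrative assumptions, not anything measured or stated in the post.

```python
# Rough model of per-node scratch bandwidth (illustrative assumptions only).

LOCAL_RAID0_MB_S = 100.0  # assumed local RAID 0 rate ("somewhat over 100")


def shared_bw_per_node(active_nodes, servers=32, per_server_mb_s=450.0):
    """Per-node bandwidth from the file server cluster, in MBytes/sec.

    Assumes ideal, even sharing of the aggregate server bandwidth and a
    client link that tops out at the per-server rate.
    """
    if active_nodes <= 0:
        raise ValueError("active_nodes must be positive")
    aggregate = servers * per_server_mb_s
    return min(per_server_mb_s, aggregate / active_nodes)


if __name__ == "__main__":
    for nodes in (32, 64, 128, 256):
        shared = shared_bw_per_node(nodes)
        print(f"{nodes:4d} nodes doing IO: "
              f"shared {shared:6.1f} MBytes/sec per node, "
              f"local {LOCAL_RAID0_MB_S:6.1f} MBytes/sec per node")
```

Under these idealized assumptions the per-node share drops from 450 MBytes/sec with 32 active nodes to about 112 MBytes/sec when all 128 nodes do IO at once, roughly the level of a local RAID 0 scratch area, and it keeps falling as more nodes join, whereas local disks add bandwidth with every node.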
