[Beowulf] Input Sought: "Basic" Lustre FS deployment on
timchipman at myrealbox.com
Wed Mar 28 12:30:30 PDT 2007
I've trawled the archives and read via Google for a few days now, but have not found much clarity yet, hence a query to the list. If merited/of use, I can summarize the replies back to the list once done.
I'm looking to begin deployment soon of a ~50-node (dual-socket, dual-core Opteron) cluster with gigabit Ethernet interconnect, intended to run a fairly specific CPU-intensive MPI model which scales "absurdly well". Based on benchmarks already done on other Linux clusters, I/O performance is far and away not the bottleneck.
Originally, for simplicity, I had planned to use ROCKS with NFS as the filesystem for shared storage. (There is one dedicated "storage node" with 2 x RAID5 bricks attached, to be exported to all nodes in the cluster.)
I've been reading for the past ~month, and from what I've seen, the Lustre filesystem over gigabit Ethernet (even with this kind of trivial topology, with all metadata and storage data hosted on the same node) should give significantly better performance than NFS running on the same hardware and topology.
Some digging in this list's archive turned up a bit of debate on that point (i.e., that Lustre performance would only exceed NFS given lots of large, streaming-intensive I/O access, and would otherwise be worse).
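A rough sketch of how the two access patterns in that debate could be compared on a test mount, timing one large streaming write against many small-file creates. This is plain Python using only the standard library; the function names, file names and default sizes are placeholders chosen for illustration, not part of any Lustre or NFS tool, and a single-client run like this says nothing about behaviour with ~50 nodes hitting the server at once.

```python
import os
import sys
import tempfile
import time


def time_streaming_write(directory, total_bytes, block_size=1 << 20):
    """Write one large file sequentially and return the elapsed seconds."""
    path = os.path.join(directory, "stream.bin")
    block = b"\0" * block_size
    start = time.perf_counter()
    with open(path, "wb") as f:
        written = 0
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to storage
    elapsed = time.perf_counter() - start
    os.remove(path)
    return elapsed


def time_small_files(directory, file_count, file_size=4096):
    """Create, write and delete many small files; return elapsed seconds."""
    payload = b"\0" * file_size
    paths = []
    start = time.perf_counter()
    for i in range(file_count):
        path = os.path.join(directory, f"small_{i:06d}.bin")
        with open(path, "wb") as f:
            f.write(payload)
        paths.append(path)
    for path in paths:
        os.remove(path)
    return time.perf_counter() - start


def compare(directory, total_bytes=64 << 20, file_count=2000):
    """Run both patterns in `directory` and return the timings as a dict."""
    return {
        "streaming_seconds": time_streaming_write(directory, total_bytes),
        "small_files_seconds": time_small_files(directory, file_count),
    }


if __name__ == "__main__":
    # Pass the mount point to test as the first argument; with no argument
    # a temporary local directory is used, which only checks the script.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.mkdtemp()
    for name, seconds in compare(target).items():
        print(f"{name}: {seconds:.3f}")
```

Running the same script once against the NFS mount and once against the Lustre mount, from the same client, would give a first like-for-like comparison of the two patterns.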
If I can get any "real world" comments on this (ideally from folks with similar "straightforward" Lustre deployments), the feedback will be tremendously appreciated. (I'm not really looking for failover/redundancy, nor for storage distributed across many nodes and exported across the whole cluster.)
I'm still poring over the Lustre install guide to get a better handle on "subtle details" such as how much storage is needed for metadata (the ratio of metadata footprint to OSS storage capacity), etc., but I'm sure I'll get there soon enough after playing with a test setup for a week or so.
Many thanks for taking the time to read this far..