[Beowulf] HPC and SAN

Mark Hahn hahn at physics.mcmaster.ca
Sat Dec 18 09:45:51 PST 2004


>     Is there any thing like Beowulf cluster and SAN.

sure, but why?  SAN is just short for "breathtakingly expensive 
fibrechannel storage nonsense".  but if you've got the money,
there's no reason you couldn't do it.  put a FC HBA on each node
and plug them all into some godawful FC switch that your storage
targets are also plugged into.  there's the rub: even a small beowulf
cluster these days is, say, 64 nodes, and to be at all interesting,
bandwidth-wise, you'll need approximately 64 storage targets. oops!
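
to put rough numbers on that, here's a back-of-envelope sketch.  the 
per-node and per-target rates are made-up figures for illustration, not
measurements, and targets_needed is just a hypothetical helper name:

```python
import math

def targets_needed(nodes, per_node_mb_s, per_target_mb_s):
    """Number of storage targets needed so that the targets together
    can sustain the aggregate bandwidth all nodes ask for at once."""
    aggregate_mb_s = nodes * per_node_mb_s
    return math.ceil(aggregate_mb_s / per_target_mb_s)

# assumed figures: every node wants 100 MB/s, every target sustains 100 MB/s
print(targets_needed(64, 100, 100))
```

with node and target rates roughly matched, the count of targets tracks
the count of nodes; faster targets (or less demanding nodes) shrink it
proportionally.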

what's the hang-up on SAN?  just that you've bought the marketing 
crap about how SAN manageability is the only way to go?  I find that 
the manageability/virtualization jabber comes from "enterprise" folk,
who really have no clue about HPC.  for instance, I basically never 
want to partition anything - as big storage chunks as possible means
better sharing of resources.  and I don't change the chunks either,
I add more bigger/faster chunks.  (at least in the funding environment 
here, where money comes in large chunks at multi-year intervals.)

> I would like to have all the data in the Beowulf cluster to be in SAN
> also. Pls excuse if in case you find my question silly. 

it's like asking whether you can do webserving from beowulf.  sure you can,
and it might even make sense in some niche.  but beowulf is mostly about 
message-passing HPC.  as such, it often has serious IO issues, but SAN
solves a different problem (how to take a slice of a FC volume from 
engineering because the accounting DB needs more space.)

that said, the current HPC trend of using fast cluster interconnects
along with filesystems like lustre/pvfs could be considered a SAN approach.
technically, I'd say it's between SAN and NAS, since the protocol has 
some block-like (SAN) properties, and some file-level (NAS) ones...

regards, mark hahn.



