diskless clients? beowulf-newbie seeks advice

Fri Jun 22 11:35:16 PDT 2001

I personally try to advocate diskless clients whenever I get a chance. 
There are several reasons for this:
1.	Administration is much easier when the nodes hold no local data, OS,
or anything else that a simple power cycle won't fix.  If a problem
persists through a power cycle, you know it's hardware (I guess it could
be BIOS setup, but I'd consider that hardware).
2.	You save money on hard disks, but have to spend a little more on
NICs.  What this does mean, though, is that you can put a more advanced
storage system on the server (or use network attached storage).  Putting
SCSI RAID or similar on the server will increase performance and/or
reliability (I think Gigabit Ethernet is faster than most IDE drives,
and I know Myrinet is, so total system performance should improve).
3.	In theory, your programs shouldn't be writing to the local disks
anyway.  Doing so will slow your computations way down.  The same goes
for swapping to disk.
4.	You only have one copy of the data, so you save on total storage
costs (assuming cost/byte is fixed) and you don't have to worry about
inconsistencies between nodes (there are other ways to keep the hard
drives consistent, but they are a bit of a pain).  Also, as a sysadmin
it is nice to have only one place to look if there are problems.
5.	You can call it intuition if you like (although it's based on facts),
but I really think this is the way clusters are going.  This is the way
big systems like the Cray T3E work.  It's a lot simpler for
programmers (it took me a while to explain how to write to local disks
vs. server disks).  It's also just a lot more elegant, which may sound
like a cop-out, but most good solutions are elegant.
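
The bandwidth comparison in point 2 can be checked with some
back-of-the-envelope arithmetic.  The sketch below is only that: it
converts nominal Ethernet line rates to MB/s and compares them with an
assumed sustained rate for a local IDE drive.  The 30 MB/s disk figure
and the helper names are hypothetical placeholders, not measurements;
substitute numbers from your own hardware.  Myrinet is left out because
I don't want to guess at its rate here.  The per-node share assumes the
worst case where every node reads through a single server link at the
same time.

```python
# Back-of-the-envelope comparison of network line rates with a local disk.
# Every figure marked "assumed" is an illustrative placeholder, not a measurement.

FAST_ETHERNET_MBIT = 100.0       # nominal line rate of 100BaseT
GIGABIT_ETHERNET_MBIT = 1000.0   # nominal line rate of Gigabit Ethernet
ASSUMED_IDE_MBYTE_PER_S = 30.0   # assumed sustained rate of one IDE drive


def mbit_to_mbyte_per_s(mbit_per_s: float) -> float:
    """Convert a nominal line rate in Mbit/s to MB/s (8 bits per byte)."""
    return mbit_per_s / 8.0


def per_node_share(server_link_mbit: float, active_nodes: int) -> float:
    """MB/s each node gets if all nodes pull data through one server link at once."""
    if active_nodes < 1:
        raise ValueError("active_nodes must be at least 1")
    return mbit_to_mbyte_per_s(server_link_mbit) / active_nodes


if __name__ == "__main__":
    nodes = 16  # cluster size mentioned in the quoted question
    print("100BaseT line rate:   %.1f MB/s" % mbit_to_mbyte_per_s(FAST_ETHERNET_MBIT))
    print("Gigabit line rate:    %.1f MB/s" % mbit_to_mbyte_per_s(GIGABIT_ETHERNET_MBIT))
    print("Assumed IDE drive:    %.1f MB/s" % ASSUMED_IDE_MBYTE_PER_S)
    print("Per-node share (%d):  %.2f MB/s"
          % (nodes, per_node_share(GIGABIT_ETHERNET_MBIT, nodes)))
```

Real throughput will be lower than the line rate because of protocol
and NFS overhead, so treat the output as an upper bound on what the
network can deliver.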

Having said all that, it is important to note that diskless nodes are
not for everyone.  In fact, our cluster is not diskless, and we aren't
looking at getting diskless nodes any time soon (give me a few years).
Right now diskless doesn't meet our needs: we need the local disks, and
our disked cluster is working fine.

Hope this helps a little.
Jared

Brian LaMere wrote:
> 
> why does every guide around talk about diskless clients?  I mean...disks are
> stinkin cheap nowadays...
> 
> I have ~$150,000 to make a test cluster (with WAY more if the test cluster
> shows worth) but the boss-man wants to go with nodes which aren't exactly
> "commodity" in my book.  dual p3-1000 with 1.25Gb ram, 15krpm 18Gb drives.
> The things cost $8k+ each...tried to explain that 148 $1k machines would way
> out perform 16 $8k machines, but...oh well.  These boxes take up 1u, which
> seems to be their main selling point (HP's lp1000r).  Fortunately, these
> boxes are down to $6.5k now in cost (dropped a bit since we bought them a
> couple months back), but still...
> 
> on to my point.  Getting PVM to see everyone as one happy little family was
> easy enough.  Got the network guys to isolate the little guys, so that only
> the worldly node could see them, since I wasn't happy with opening up
> everything and simply putting a little all:all in hosts.deny, and having
> that be all the security I had.  But every guide that I've found has been
> all about diskless nodes for a beowulf.  And this isn't really a beowulf
> with just pvm (and soon lam-mpi and mpich), right?  I personally thought
> that the network nfs/tftp traffic would be horrible if they were all
> diskless clients...
> 
> so the real question:  I can put gig-e cards in the boxes instead of hard
> drives...right now they just have 2 100bt enet connections.  I'm only using
> one of the enet ports at the moment, too.  Would I be better with no disks,
> and gig-e instead?  Some of the concerns I have here: though we're only
> starting with a hundred gigs or such of data, we'll be at multi-terabyte
> within a year.  To be throwing around data that large, while nfs'ing the OS
> filesystems (on the clients) just seems like a lot for the boxes to do.  Am
> I looking at it wrong?  Also, for cost reasons we may be doing our data
> storage on something as tacky as network attached storage; we were looking
> at some NetApp boxes, but went with some EMC boxes instead.  Note I'm not
> talking about a symmetrix box or something (I already have one of those
> housing my oracle data), but instead a EMC product called an "ip4700."  Not
> all that impressed with it.
> 
> Just a little genetics research firm, needing some serious horsepower to
> start running big hammer and blast jobs.  The data we have now is just the
> bare minimum we need to get by, but if we had things like a working beowulf
> the scientists upstairs would start generating much more data, since they'd
> be able to use it.  They hired me on as the unix guy here knowing I don't know
> squat about beowulfs, but that I really want to learn :)  Got "how to build
> a beowulf" <grin> and I've read the manuals for pvm, mpich, lam-mpi, etc,
> and several other beowulf how-to guides.  All are about diskless.  Is
> diskless better?  Is it just better because it's cheaper?  Are there other
> reasons it's better?  Would having gig-ethernet in the boxes instead of hard
> drives be far better performance-wise?
> 
> Brian LaMere
> Diversa
> 

-- 
Jared Hodge
Institute for Advanced Technology
The University of Texas at Austin
3925 W. Braker Lane, Suite 400
Austin, Texas 78759

Phone: 512-232-4460
Fax: 512-471-9096
Email: Jared_Hodge at iat.utexas.edu