diskless clients? beowulf-newbie seeks advice
Pedro Díaz Jiménez
pdiaz88 at terra.es
Fri Jun 22 16:01:40 PDT 2001
On Friday 22 June 2001 18:56, Brian LaMere wrote:
> couple of follow-up questions....
> I personally try to advocate diskless clients whenever I get a chance.
> There are several reasons for this:
> 1. Administration is much easier if you don't have local data or OS or
> anything that won't be fixed by a simple power cycle. If a problem
> persists through that, you know it's hardware (I guess it could be BIOS
> setup, but I'd consider that hardware).
I agree. OTOH, the cluster setup is more complicated, IMHO.
> this is easy enough to automate though...a single shell script that rcp's
> the changed files to all the nodes (for configuration files). Then I was
> thinking of nfs exporting any applications that are needed...simply install
> them into /usr2, and share out /usr2...wouldn't that handle that?
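The push script described above could be sketched roughly as follows. Python is used here only for illustration (the post suggests a shell script). The node names, the file list, and the `build_push_commands` helper are all hypothetical. The sketch only builds and prints the `rcp` command lines as a dry run; actually executing them is left out.

```python
import shlex


def build_push_commands(nodes, files):
    """Return one rcp argument list per (node, file) pair.

    Each file is copied to the same absolute path on the target node.
    """
    commands = []
    for node in nodes:
        for path in files:
            commands.append(["rcp", path, f"{node}:{path}"])
    return commands


if __name__ == "__main__":
    # Hypothetical host names and changed files, for illustration only.
    nodes = [f"node{i:02d}" for i in range(1, 4)]
    files = ["/etc/hosts", "/etc/fstab"]
    for cmd in build_push_commands(nodes, files):
        # Dry run: print the command instead of executing it.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Applications exported over NFS from /usr2 would not need this step at all, since every node sees the server's copy.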
> 2. You do save money on hard disks, but have to spend a little more on
> NICs. What this does mean though, is that you can put a more advanced
> storage system on the server (or network attached storage). Putting
> SCSI Raid or such on the server will increase performance and / or
> reliability (I think Gig E is faster than most IDE drives and I know
> Myrinet is, so you should improve total system performance).
I partially agree. Your thoughts on memory management? Swap over the network
would greatly decrease performance, so in my opinion you are limited to
> well, we do use 15k rpm scsi drives, but yeah....ya want to do as few reads
> as possible even then. Just seems with a potential of there being 100 of
> these nodes, I very well may have to boot one every few days...and that
> would be lots of traffic. At that point, is it advisable to have a worldly
> node set aside for doing the boot-up? Hell, I could have everything
> bootstrap off their 100bt ports, then actually work off a gig-ethernet
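A rough way to size the boot traffic mentioned above is plain arithmetic. The sketch below assumes a hypothetical 50 MB boot payload per node (the thread gives no figure), a single shared server link, and ideal line rate with no protocol overhead, so real numbers will be worse. `transfer_seconds` is an illustrative helper, not part of any tool.

```python
def transfer_seconds(size_mb, link_mbit_per_s, clients=1):
    """Ideal time for `clients` nodes to each pull `size_mb` megabytes
    through one shared server link of `link_mbit_per_s` megabits/second."""
    total_megabits = size_mb * 8 * clients
    return total_megabits / link_mbit_per_s


if __name__ == "__main__":
    image_mb = 50  # hypothetical per-node boot payload
    cases = [
        ("1 node, 100bt", 100, 1),
        ("100 nodes, 100bt", 100, 100),
        ("100 nodes, gig-e", 1000, 100),
    ]
    for label, link, clients in cases:
        secs = transfer_seconds(image_mb, link, clients)
        print(f"{label}: {secs:.0f} s")
```

Under those assumptions a single node rebooting every few days costs only seconds of link time. The case worth sizing for is every node booting at once, for example after a power cut.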
> 3. In theory, you shouldn't be writing to the local disks in a program
> anyway. This will slow your computations waaaay down. Same thing for
> swapping to disk.
> 4. You only have one copy of data, so not only do you save in total
> storage costs (assuming cost/byte is fixed), but you also only have one
> copy of the data so you don't have to worry about inconsistencies
> between nodes (there are other ways to get the hard drives consistent,
> but they are a bit of a pain). Also, as a sys-admin it is nice to only
> have one place to look if there are problems.
> the data would obviously be shared out, yeah. With a possibility of a
> multi-terabyte database that the cluster is querying in a year, there's no
> way I'd put a terabyte on each node..hehe. I mean, IBM can say all they
> want about low-end drives being 400gigs in a year or so, but... point
> being, whether they are diskless or not they'll be pulling in all their
> data locally. It's just the OS that would be local, and perhaps any
> applications I'd want on each node.
> 5. You can call it intuition if you like (although it's based on
> but I really think this is the way clusters are going. This is the way
> that big systems like the Cray T3E work. It's a lot simpler for
> programmers (It took me a while to explain how to write to local disks
> vs. server disks). It's also just a lot more elegant, which may sound
> like a cop out, but most good solutions are elegant.
IMHO Crays are a different beast. Each micro has access to the main memory via
a high-speed bus. That's not my situation (100Mbps), and probably not the
situation of most beowulfs.
> Elegance is certainly not a cop-out in my book. The more elegant something
> is, the better it works...this is almost always true.
I agree. Making the compute nodes diskless keeps them focused on pure
computation, rather than being just some boxen tied to a network.
> Having said all that, it is important to note that diskless nodes are
> not for everyone. In fact our cluster is not diskless, and we aren't
> looking at getting diskless nodes any time soon (give me a few years).
> Right now it doesn't meet our needs since we need the local disk and we
> have our disked cluster working fine.
> Hope this helps a little.
Just my $0.02
> Any tips are helpful. I'm just sittin here trying to decide which would be
> better for -our- particular application. When is it better for there to be
> disked-clients? Is diskless pretty much something I should obviously do
> considering the fact that the cluster will be querying a huge database
> Brian LaMere
> Brian LaMere wrote:
> > why does every guide around talk about diskless clients? I mean...disks are
> > stinkin cheap nowadays...
> > I have ~$150,000 to make a test cluster (with WAY more if the test
> > cluster shows worth) but the boss-man wants to go with nodes which aren't
> > exactly "commodity" in my book. dual p3-1000 with 1.25GB ram, 15krpm
> > 18GB drives. The things cost $8k+ each...tried to explain that 148 $1k
> > machines would outperform 16 $8k machines, but...oh well. These boxes
> > take up 1u,
> > which seems to be their main selling point (HP's lp1000r). Fortunately,
> > these boxes are down to $6.5k now in cost (dropped a bit since we bought
> > them a couple months back), but still...
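The node counts in the cost comparison above can be checked with a quick calculation. This sketch only divides the stated budget by the per-node prices quoted in the post. The helper name is hypothetical, and the sketch says nothing about relative performance. The post's own figures (16 and 148) are a little lower than the raw division gives, presumably because some of the budget goes to other things.

```python
def nodes_within_budget(budget, price_per_node):
    """Whole number of nodes that fit within the budget."""
    return budget // price_per_node


if __name__ == "__main__":
    budget = 150_000  # the "~$150,000" from the post
    options = [
        ("1U dual-CPU box at $8k", 8_000),
        ("same box at $6.5k", 6_500),
        ("commodity box at $1k", 1_000),
    ]
    for label, price in options:
        print(f"{label}: {nodes_within_budget(budget, price)} nodes")
```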
> > on to my point. Getting PVM to see everyone as one happy little family was
> > easy enough. Got the network guys to isolate the little guys, so that only
> > the worldly node could see them, since I wasn't happy with opening up
> > everything and simply putting a little all:all in hosts.deny, and having
> > that be all the security I had. But every guide that I've found has been
> > all about diskless nodes for a beowulf. And this isn't really a beowulf
> > with just pvm (and soon lam-mpi and mpich), right? I personally thought
> > that the network nfs/tftp traffic would be horrible if they were all
> > diskless clients...
> > so the real question: I can put gig-e cards in the boxes instead of hard
> > drives...right now they just have 2 100bt enet connections. I'm only using
> > one of the enet ports at the moment, too. Would I be better with no drives
> > and gig-e instead? Some of the concerns I have here: though we're only
> > starting with a hundred gigs or such of data, we'll be at multi-terabyte
> > within a year. To be throwing around data that large, while nfs'ing the
> > filesystems (on the clients) just seems like a lot for the boxes to do.
> > Am I looking at it wrong? Also, for cost reasons we may be doing our data
> > storage on something as tacky as network attached storage; we were
> > looking at some NetApp boxes, but went with some EMC boxes instead. Note
> > I'm not talking about a symmetrix box or something (I already have one of
> > those housing my oracle data), but instead an EMC product called an
> > "ip4700." I'm not
> > all that impressed with it.
> > Just a little genetics research firm, needing some serious horsepower to
> > start running big hmmer and blast jobs. The data we have now is just
> > the bare minimum we need to get by, but if we had things like a working
> > the scientists upstairs would start making, since they'd be able to use
> > much more data. They hired me on as the unix guy here knowing I don't know
> > squat about beowulfs, but that I really want to learn :) Got "how to build
> > a beowulf" <grin> and I've read the manuals for pvm, mpich, lam-mpi, etc,
> > and several other beowulf how-to guides. All are about diskless. Is
> > diskless better? Is it just better because it's cheaper? Are there other
> > reasons it's better? Would having gig-ethernet in the boxes instead of
> > drives be far better performance-wise?
> > Brian LaMere
> > Diversa