diskless clients? beowulf-newbie seeks advice
Pedro Díaz Jiménez pdiaz88 at terra.es
Fri Jun 22 16:01:40 PDT 2001
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

On Friday 22 June 2001 18:56, Brian LaMere wrote:
> couple of follow-up questions....
>
> ********************
> I personally try to advocate diskless clients whenever I get a chance.
> There are several reasons for this:
> 1. Administration is much easier if you don't have local data or OS or
> anything that won't be fixed by a simple power cycle. If a problem
> persists through that, you know it's hardware (I guess it could be BIOS
> setup, but I'd consider that hardware).

I agree. OTOH, the cluster setup is more complicated, IMHO.

> ********************
> this is easy enough to automate though...a single shell script that rcp's
> the changed files to all the nodes (for configuration files). Then I was
> thinking of nfs exporting any applications that are needed...simply install
> them into /usr2, and share out /usr2...wouldn't that handle that?
>
> ********************
> 2. You do save money on hard disks, but have to spend a little more on
> NICs. What this does mean though, is that you can put a more advanced
> storage system on the server (or network attached storage). Putting
> SCSI Raid or such on the server will increase performance and / or
> reliability (I think Gig E is faster than most IDE drives and I know
> Myrinet is, so you should improve total system performance).
> ********************

I partially agree. What are your thoughts on memory management? Swapping
over the network would greatly decrease performance, so in my opinion you
are limited to computations that do not consume much memory.

> well, we do use 15k rpm scsi drives, but yeah....ya want to do as few reads
> as possible even then. Just seems with a potential of there being 100 of
> these nodes, I very well may have to boot one every few days...and that
> would be lots of traffic. At that point, is it advisable to have a worldly
> node set aside for doing the boot-up?
> Hell, I could have everything
> bootstrap off their 100bt ports, then actually work off a gig-ethernet
> port...hmmm...
>
> **************************
> 3. In theory, you shouldn't be writing to the local disks in a program
> anyway. This will slow your computations waaaay down. Same thing for
> swapping to disk.
> 4. You only have one copy of data, so not only do you save in total
> storage costs (assuming cost/byte is fixed), but you also only have one
> copy of the data so you don't have to worry about inconsistencies
> between nodes (there are other ways to get the hard drives consistent,
> but they are a bit of a pain). Also, as a sys-admin it is nice to only
> have one place to look if there are problems.
> *************************
> the data would obviously be shared out, yeah. With a possibility of a
> multi-terabyte database that the cluster is querying in a year, there's no
> way I'd put a terabyte on each node..hehe. I mean, IBM can say all they
> want about low-end drives being 400gigs in a year or so, but... point
> being, whether they are diskless or not they'll be pulling in all their
> data locally. It's just the OS that would be local, and perhaps any
> applications I'd want on each node.
>
> **************************
> 5. You can call it intuition if you like (although it's based on facts),
> but I really think this is the way clusters are going. This is the way
> that big systems like the Cray T3E work. It's a lot simpler for
> programmers (It took me a while to explain how to write to local disks
> vs. server disks). It's also just a lot more elegant, which may sound
> like a cop out, but most good solutions are elegant.

IMHO, Crays are a different beast. Each processor has access to the main
memory via a high-speed bus. That's not my situation (100 Mbps) and
probably not the situation of most Beowulfs.

> *************************
> Elegancy is certainly not a cop-out in my book.
> The more elegant something
> is, the better it works...this is almost always true.
>

I agree. Having diskless compute nodes makes them purely computational
nodes, not just some boxen tied to a network.

> **********************************
> Having said all that, it is important to note that diskless nodes are
> not for everyone. In fact our cluster is not diskless, and we aren't
> looking at getting diskless nodes any time soon (give me a few years).
> Right now it doesn't meet our needs since we need the local disk and we
> have our disked cluster working fine.
>
> Hope this helps a little.
> Jared
> *************************************

Just my $0.02

Cheers
Pedro

>
> Any tips are helpful. I'm just sittin here trying to decide which would be
> better for -our- particular application. When is it better for there to be
> disked-clients? Is diskless pretty much something I should obviously do
> considering the fact that the cluster will be querying a huge database
> anyway?
>
> Brian LaMere
> Diversa
>
> Brian LaMere wrote:
> > why does every guide around talk about diskless clients? I mean...disks
> > are stinkin cheap nowadays...
> >
> > I have ~$150,000 to make a test cluster (with WAY more if the test
> > cluster shows worth) but the boss-man wants to go with nodes which aren't
> > exactly "commodity" in my book. dual p3-1000 with 1.25Gb ram, 15krpm
> > 18Gb drives. The things cost $8k+ each...tried to explain that 148 $1k
> > machines would way out perform 16 $8k machines, but...oh well. These
> > boxes take up 1u, which seems to be their main selling point (HP's
> > lp1000r). Fortunately, these boxes are down to $6.5k now in cost
> > (dropped a bit since we bought them a couple months back), but still...
> >
> > on to my point. Getting PVM to see everyone as one happy little family
> > was easy enough.
> > Got the network guys to isolate the little guys, so that only the
> > worldly node could see them, since I wasn't happy with opening up
> > everything and simply putting a little all:all in hosts.deny, and having
> > that be all the security I had. But every guide that I've found has been
> > all about diskless nodes for a beowulf. And this isn't really a beowulf
> > with just pvm (and soon lam-mpi and mpich), right? I personally thought
> > that the network nfs/tftp traffic would be horrible if they were all
> > diskless clients...
> >
> > so the real question: I can put gig-e cards in the boxes instead of hard
> > drives...right now they just have 2 100bt enet connections. I'm only
> > using one of the enet ports at the moment, too. Would I be better with
> > no disks, and gig-e instead? Some of the concerns I have here: though
> > we're only starting with a hundred gigs or such of data, we'll be at
> > multi-terabyte within a year. To be throwing around data that large,
> > while nfs'ing the OS filesystems (on the clients) just seems like a lot
> > for the boxes to do. Am I looking at it wrong? Also, for cost reasons we
> > may be doing our data storage on something as tacky as network attached
> > storage; we were looking at some NetApp boxes, but went with some EMC
> > boxes instead. Note I'm not talking about a symmetrix box or something
> > (I already have one of those housing my oracle data), but instead a EMC
> > product called an "ip4700." Not all that impressed with it.
> >
> > Just a little genetics research firm, needing some serious horsepower to
> > start running big hammer and blast jobs. The data we have now is just
> > the bare minimum we need to get by, but if we had things like a working
> > beowulf the scientists upstairs would start making, since they'd be able
> > to use it, much more data.
> > They hired me on as the unix guy here knowing I don't know squat about
> > beowulfs, but that I really want to learn :) Got "how to build a
> > beowulf" <grin> and I've read the manuals for pvm, mpich, lam-mpi, etc,
> > and several other beowulf how-to guides. All are about diskless. Is
> > diskless better? Is it just better because its cheaper? Are there other
> > reasons its better? Would having gig-ethernet in the boxes instead of
> > hard drives be far better performance-wise?
> >
> > Brian LaMere
> > Diversa
> >
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit
> > http://www.beowulf.org/mailman/listinfo/beowulf

- --

- -----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

mQGiBDqcGZsRBADFIahNPLk8suMlS39m8RqatLgX4dO7PU2F5p1oVvkyB7PaLQCv
FREWwfrjGpxAjRnxyZ4TdaFi1oCP495t5R2CdjPZu0EfjsEqosdLXkjDsKl2n4Wo
Afb6BaHMJS5PADEI0QfpZOkB8OruAZja/oGmn5rThyjgCxWHUuK1ArmeGwCg7+9a
owg9wP1RohePHJSDB9d2HYMD/i7z1X4ev+K90LumgJwSWlScJ7MEip5rw4wqGOkK
lF/C2nTYsoX5CVEn/pu7hROL/BWIYtBgkNDaEjsVsyb+4KjQXcZUW5l3ADipWYx2
r9s4sFfeZ9nfhDcG0aNYRcCNkYSZ/WxUkXS8UjVEAEhkFu1BA+6UZmeq3pKtJZTR
+HqKA/9zRmgTon36zt2qe9eiR6DyY0EpGEI0iY+KYX6GC/wxizeHBw0FW1eOEoxF
GjtxdBv/U9vi7Vgav6aY+pr4la5q6jVabe03Y8yGDFeL8jM+lqww1rzpABiGrF+W
qge65zCUjL3jJE5+5yi+KcRyllb1OA7uXQTtsRw+TGq9Dvaaz7QwUGVkcm8gRGlh
eiBKaW1lbmV6IChCLk8uRi5ILikgPHBkaWF6ODhAdGVycmEuZXM+iFYEExECABYF
AjqcGZsECwoEAwMVAwIDFgIBAheAAAoJEJ7ud33hGMZRj20An2Ce4S/vBTuZDxnL
WFBrJRnc3UdaAKDnIPNRbz7r4dh9AuBcpbCE1pQ/SLkBDQQ6nBmqEAQAr7O07Dws
5zAbQvm1hwGthXKCHtIIuWCPdX/XkNG6ZxV/cXgs4LI4oAg3GhttD2JIEk2SoVXE
FOf/wIddIDz70/9mIZavMvpR31LxBFSJk0Up3caOvThM90wMttRi7tg7cf04rrMM
Phy8T5bOIW/q5SMwZffbJXD7bA0/jDLdQ6MAAwYD/1emSwNTzOOmMCZadoEBpKIE
HA35P2/m/SsCI+pQ/OKXKPvvrQKTQqRCcDa5aq31oSiT9M5WQ96BlIGKHRPWGpvm
0822V7M9RF2mYZPIfgKfTSvZpYHzjz+RM7PvBBiBc9l95vy70Sh7SywIF86H80Ag
D0dUIDtGlrSANhXjx4EJiEYEGBECAAYFAjqcGaoACgkQnu53feEYxlHdVACgjVhU
Y8CKf6MYZgQOR9eIDNvTX0AAn3dwbW1HLxEF5OQKJIsngl0BUlYK
=d4S3
- -----END PGP PUBLIC KEY BLOCK-----
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE7M85Unu53feEYxlERApXBAJ928Fa0axtRjA6qq1aOH2FCqyqs5wCcCUKU
trEC8zxe8TqY5qnqWQzoJ3I=
=Fqj6
-----END PGP SIGNATURE-----
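
A note on the config-push idea quoted in the message (a single script that rcp's the changed files to all the nodes): the loop could look roughly like the sketch below. It is only an illustration in Python, not something posted in the thread. The node names (node01 and so on), the file list, and the helper names build_push_commands and push are all made up for the example, and it assumes rcp is installed and rsh trust is already set up between the head node and the compute nodes.

```python
import subprocess


def build_push_commands(nodes, files, copy_cmd="rcp"):
    """Return one copy command (as an argv list) per node/file pair."""
    commands = []
    for node in nodes:
        for path in files:
            # rcp syntax: rcp <local-file> <host>:<remote-path>
            commands.append([copy_cmd, path, f"{node}:{path}"])
    return commands


def push(commands, dry_run=True):
    """Run each command; with dry_run=True only print it.

    Returns the list of commands that exited with a non-zero status.
    """
    failed = []
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
            continue
        if subprocess.run(cmd).returncode != 0:
            failed.append(cmd)
    return failed


if __name__ == "__main__":
    # Hypothetical node names and file list, for illustration only.
    nodes = [f"node{i:02d}" for i in range(1, 17)]
    files = ["/etc/hosts", "/etc/resolv.conf"]
    push(build_push_commands(nodes, files))  # dry run: prints the commands
```

Keeping the command construction separate from the execution makes it easy to check what would be copied before touching any node; calling push(..., dry_run=False) is what would actually run rcp.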
