[Beowulf] Should I go for diskless or not?
Joe Landman landman at scalableinformatics.com
Fri May 15 07:00:14 PDT 2009
Douglas Eadline wrote:
> You will note that I used sufficient wiggle words "usually" and
> "generally" because in my experience it always depends.
> And of course my comments are from my personal
> experience. I have found that diskless allows for
> the entire cluster to be "re provisioned" without
> having to re-image disks. Reboots are quicker
> (for the hardware I use) and since I use ram
> disk approach (Warewulf/Perceus) I find that things
> are a bit faster, also the diskless image has
> minimal services running vs a disk-full distribution image.
> (I fully understand the good admin can trim a
> disk-full distribution)
>
> There are plenty of arguments either way. Back in 2006
> I did a mini-poll on node disk space usage:
>
> http://www.clustermonkey.net//component/option,com_poll/task,results/id,18/
>
> So "in general, it varies". YMMV

Let me weigh in (late of course) with some thoughts.

With the advent of buzz words^H^H^H^H^H^H^H^H^H^H^Htechnologies like the cloud(TM)(c)(SM)*, I have a sense that stateful (disk based) installs are a thing of the past. Some folks will want them for their local cluster, but when you are provisioning N nodes, you aren't going to want to deal with the vagaries of doing a stateful install to "random" hardware.

For non-virtualized hosts, what you want is to either boot diskless in a trivial-to-configure manner, or, and we are seeing and working on this more and more, boot from iSCSI targets with the node OS implemented. The latter is a very nice way to do non-virtualized OS boots. You don't need to be able to PXE boot from iSCSI, though this is an option.

The advantage of diskless is you don't need to containerize the OS in an "image" (which is little more than a 'dd if=/dev/zero of=disk.image bs=1M count=8k ; losetup /dev/loop0 disk.image ; mkfs... /dev/loop0 ; mount /dev/loop0 /mnt/disk ; #copy tree to disk'). The disadvantage of diskless is some complexity in the boot/shutdown process.
This has been (mostly) managed well by a number of distributions, so it is generally (these days) fairly easy to deploy diskless systems.

Swap is an issue. A somewhat hard to solve issue ... we'd recommend actually turning off swap (and swappiness in the kernel) for diskless. Or put a USB drive in each machine and swap on that, though, honestly, that is as reliable as swapping over the network. I.e., don't do it.

The Perceus hybrid approach works well, using a small local ramdisk, and an NFS mount for the important stuff. No installation needed, just boot it, and up you go. Same issues on swap.

Another way to look at local drives is as a small swap space. So if you can get drives in $40/unit increments, these are cheap and fairly reliable swap targets. Better than USB. But you shouldn't swap in a cluster.

The advantage of iSCSI is that it is a block device that you attach over the network, and it works pretty well. You can (easily) programmatically adjust it, clone it, modify it, etc. Pull it down to a different system (offline), make changes, upload the different image, and try it on a few nodes. Basically iSCSI is sort of like diskless in a way, though it allows greater ranges of changes without creating whole new diskless trees, and without jumping through the host issues with some diskless schemes. It's not a VM, but as with diskless, VMs can boot from it, which makes development fast/efficient. It gives you a VM-able, easily transportable container for your stuff.

Of course, it goes without saying that for either diskless or iSCSI, you need a really good and fast storage infrastructure ... :)

More to the point, the stateful installations have their place, but it doesn't appear to be in "cloud" scenarios. Stateful is fine for local clusters, or automagically assembled larger clusters. I would be mindful of the latter though, as we have had a number of customers with problems directly attributable to unresolved issues in some cluster distribution's installer code.
If your installer has bugs, or does something wrong, somewhere (like, I dunno, disabling the IPMI?), you can wind up with N unbootable nodes. As N gets north of 8, this gets less funny.

Other issues we have run into with stateful clusters have included the cluster distribution getting in the way of what you want to do. Most cluster distributions come with an overarching philosophy, and the "one true way" to do things. Invariably you will find a need to do something "an other way", and can run head-first into the philosophy (and occasionally ... um ... animated [yeah, that's the ticket] ... discussions with the disciples of that philosophy) as you try to figure out what you need to do. Some of these philosophies are benign, some are aggressive.

The stateful case with specific distributions (RHEL and the like) doesn't do so well with new hardware. We have had customers call us up to complain that their shiny new nodes seem to crash regularly with the (ancient) kernels in RHEL. Sure enough, the new-thingamabob on the motherboard is not well supported in RHEL, so you either have to a) toss your cluster and only buy from the HAL, or b) get a newer kernel.

It gets more ... exciting ... and not in a good way, when that new-thingamabob impedes installation. We've seen this happen ... too many times to count. The solution is to slip in a new kernel and initrd to fix the problem. Slip-streaming a new kernel into a RHEL distribution is an ... um ... challenge ... to say the least. In diskless, this is trivial. In iSCSI, it is the same as diskless (in both cases, the boot kernel and boot initrd are outside of the trees).

For stateful installs, for customers who don't care which cluster distribution is used and let us use our Tiburon system, we PXE boot from our kernel (2.6.28.7 based) and associated initrd. Finishing scripts fix up any new kernel bits later on. It's harder to do this with some other cluster systems. Easy with Perceus/Warewulf.
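
One rough way to spot the new-thingamabob problem before committing to a distro kernel: boot a test node with the candidate kernel and list the PCI devices that `lspci -k` reports without a bound driver. This is only a sketch, not something Tiburon or Perceus is known to do, and the function name `unbound_devices` is made up for illustration. Some devices (host bridges, for example) legitimately have no driver, so treat the output as a list of things to look at, not a verdict.

```shell
# Hypothetical helper: reads `lspci -k` output on stdin and prints the
# header line of every device that has no "Kernel driver in use:" entry.
# Usage on a node booted with the candidate kernel:
#   lspci -k | unbound_devices
unbound_devices() {
    awk '
        # A line that does not start with whitespace begins a new device.
        /^[^ \t]/ {
            if (dev != "" && !bound) print dev
            dev = $0
            bound = 0
            next
        }
        # Indented detail line showing that a driver claimed the device.
        /Kernel driver in use:/ { bound = 1 }
        # Flush the last device in the listing.
        END { if (dev != "" && !bound) print dev }
    '
}
```

If the NIC or storage controller you boot from shows up in that list, the kernel is not going to work for a stateful install on that hardware without a newer kernel and initrd.
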

Basically, where I am getting to is, if you really want to go stateful, make sure your kernel works with your hardware before you make a decision about distro, etc. This is less important with diskless/iSCSI. If you go stateless, solve the swap problem with small cheap local drives.

* we are drinking the koolaid, and starting work with a partner on this stuff. I do see some merit in it (virtualized or direct), though it isn't for everyone in HPC. The folks for whom it will work well have a specific usage profile.

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
