[Beowulf] stateless compute nodes
deadline at eadline.org
Thu May 28 05:10:58 PDT 2015
> I need to configure IB, Slurm, MPI, and NFS, and am most likely running
> CentOS. Would you say that using Warewulf makes configuration of these apps
> significantly more complicated?
You are not alone: all Warewulf (WW) users run the same stuff, and the
project guys all run clusters, so they know what is needed. There
are ways to configure and manage IB, IPMI, and add packages to your
That said, there is an effort to document WW a bit more (I'm helping out).
Finally, the Limulus personal clusters are based on WW and have
all the standard stuff running on them, take a look:
Some background: I have a kickstart that builds the Limulus head node. It
does some local configuration and installs everything as RPMs -- even
the node images. Once the head node is built, a script boots and registers
the nodes.
I will be making this publicly available real soon -- once I get done
with the Hadoop book thing (and BTW, I am working on doing
the same with a full Hortonworks Hadoop install, and for those
who don't know who Hortonworks is, think Red Hat of the Hadoop world).
>> On May 27, 2015, at 9:56 PM, Joe Landman
>> <landman at scalableinformatics.com> wrote:
>> On 05/27/2015 09:22 PM, Trevor Gale wrote:
>>> Hello all,
>>> I was wondering how stateless nodes fare with very memory intensive
>>> applications. Does it simply require you to have a large amount of RAM
>>> to house your file system and program data? Or are there other
>> Warewulf has been out the longest of the stateless distributions. We had
>> rolled our own a while before using it, and kept adding capability to
>> It's generally not hard to pare down a stateless node to a few hundred MB
>> (or less!). Handle applications via NFS, and strip your stateless
>> system down to the bare minimum you need. In fairly short order, you
>> should be able to PXE boot a kernel with a bare minimal initramfs, and
>> have it launch Docker and Docker-like containers. This is the concept
>> behind CoreOS, and many distributions are looking to move to this model.
>> We use a makefile to drive creation of our stateless systems (everything
>> including the kitchen sink, and our entire stack), which hovers around
>> 4GB total. Our original stateless systems were around 400MB or so, but
>> I wanted a full development, IB, PFS, and MPI environment (not to
>> mention other things). I could easily make some of this stateful, but
>> our application requires resiliency that can't exist in a stateful model
>> (what if the OS drives or the entire controller suddenly went away, or the
>> boot/management network was partitioned with an OS on NFS?).
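
A rough sketch of the RAM budget behind the original question, assuming the
whole node image stays resident in RAM. The 400 MB and 4 GB image sizes come
from the post above; the 64 GiB node, the 1 GiB kernel/daemon overhead, and
the function name are hypothetical values chosen only for illustration.

```python
def ram_left_for_apps(total_ram_gib, image_gib, os_overhead_gib=1.0):
    """Return the GiB of RAM left for applications on a RAM-booted node.

    Assumes the full image is held in memory; os_overhead_gib is a
    hypothetical allowance for the kernel and system daemons.
    """
    left = total_ram_gib - image_gib - os_overhead_gib
    if left < 0:
        raise ValueError("image plus overhead exceeds installed RAM")
    return left


if __name__ == "__main__":
    # Image sizes from the post; the 64 GiB node is an assumed example.
    for label, image_gib in [("minimal image (~400 MB)", 0.4),
                             ("full stack (~4 GB)", 4.0)]:
        left = ram_left_for_apps(64.0, image_gib)
        print(f"{label}: about {left:.1f} GiB left of 64 GiB")
```

On a node of that size, even the 4 GB image costs only a few percent of
memory; the trade-off matters more on small-memory nodes.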
>> This is one of our Unison units right now:
>> root@usn-01:~# df -h
>> Filesystem      Size  Used Avail Use% Mounted on
>> rootfs          8.0G  3.9G  4.2G  49% /
>> udev             10M     0   10M   0% /dev
>> tmpfs           1.0M     0  1.0M   0% /data
>> /dev/sda        8.8T  113G  8.7T   2% /data/1
>> /dev/sdb        8.8T  201G  8.6T   3% /data/2
>> /dev/sdc        8.8T   63G  8.7T   1% /data/3
>> /dev/sdd        8.8T  138G  8.6T   2% /data/4
>> fhgfs_nodev      70T  1.1T   69T   2% /mnt/unison2
>> with the "local" mounts being controlled by a distributed database.
>> Think of it as a distributed cluster wide /etc/fstab. More relevant for
>> a storage cluster/cloud than a compute cluster, but easily usable in
>> this regard.
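
The "cluster wide /etc/fstab" idea can be sketched as a lookup from hostname
to mount entries. This is only an illustration, not the mechanism described
above: a plain dict stands in for the distributed database, the "xfs"
filesystem type is an assumption (the post does not say), and the names
Mount, CLUSTER_MOUNTS, and fstab_lines are hypothetical.

```python
from typing import NamedTuple


class Mount(NamedTuple):
    device: str
    mountpoint: str
    fstype: str
    options: str = "defaults"


# Stand-in for the distributed database: hostname -> list of mounts.
# Devices and mount points mirror the df output; "xfs" is assumed.
CLUSTER_MOUNTS = {
    "usn-01": [
        Mount("/dev/sda", "/data/1", "xfs"),
        Mount("/dev/sdb", "/data/2", "xfs"),
    ],
}


def fstab_lines(hostname, table=CLUSTER_MOUNTS):
    """Render fstab-style lines for one host; unknown hosts get none."""
    return [
        f"{m.device} {m.mountpoint} {m.fstype} {m.options} 0 0"
        for m in table.get(hostname, [])
    ]


if __name__ == "__main__":
    print("\n".join(fstab_lines("usn-01")))
```

Keeping the table keyed by hostname means a stateless node needs nothing
local beyond its own name to learn what it should mount.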
>> We handle all the rest of the configuration post-boot. A little
>> infrastructure work (bringing up interfaces), and then configuration
>> work (driven by scripts and data pulled from a central repository, which
>> is also distributable).
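
The ordering described above (infrastructure first, then configuration) can
be sketched as a small sort over tagged steps. The phase names follow the
post; the step names and the order_steps helper are hypothetical, and a real
setup would pull the step list from the central repository rather than
hard-code it.

```python
# Lower number runs earlier: interfaces must be up before configuration
# scripts can reach the central repository.
PHASE_ORDER = {"infrastructure": 0, "configuration": 1}


def order_steps(steps):
    """Return step names with infrastructure steps before configuration.

    steps is a list of (phase, name) tuples. sorted() is stable, so steps
    within the same phase keep the order they were listed in.
    """
    return [name for _phase, name in
            sorted(steps, key=lambda step: PHASE_ORDER[step[0]])]


if __name__ == "__main__":
    # Hypothetical step names for illustration only.
    plan = [
        ("configuration", "write-slurm-conf"),
        ("infrastructure", "bring-up-ib0"),
        ("configuration", "mount-nfs-apps"),
    ]
    for name in order_steps(plan):
        print(name)
```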
>> There are some oddities, not the least of which is that most
>> distributions are decidedly not built for this. But if you get them to
>> a point where they
>> think they have a /dev/root and they mount it, life generally gets much
>> easier rather quickly.
>> One of the other cool aspects of our mechanism is that we can pivot to a
>> hybrid or NFS after fully booting. And if the NFS pivot fails, we can
>> fall back to our ramboot without a reboot. It's a thing of beauty ...
>> truly ...
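
The pivot-with-fallback behavior can be sketched as a try/except around the
pivot attempt. This is a minimal illustration of the control flow only, not
the actual implementation: boot_root and the pivot callables are
hypothetical, and a real pivot would involve mounting the NFS export and
switching the root filesystem.

```python
def boot_root(pivot_to_nfs):
    """Attempt the NFS pivot; stay on the RAM root if it fails.

    pivot_to_nfs is any callable that raises OSError on failure.
    Returns ("nfs", None) on success or ("ram", reason) on fallback,
    so no reboot is needed in either case.
    """
    try:
        pivot_to_nfs()
    except OSError as exc:
        return ("ram", str(exc))
    return ("nfs", None)


if __name__ == "__main__":
    def unreachable_server():
        # Simulated failure: the NFS server cannot be reached.
        raise OSError("NFS server unreachable")

    print(boot_root(unreachable_server))
    print(boot_root(lambda: None))
```

The point of the design is that the RAM root is already running, so a failed
pivot leaves the node in a known good state instead of a half-booted one.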
>> FWIW: we use a Debian base (and Ubuntu on occasion) these days, though
>> we've used CentOS and RHEL in the past, before they became harder to
>> distribute. Generally speaking we can boot anything (and I really mean
>> *anything*: Any Linux, *BSD, Solaris, DOS, Windows, ... ) and control
>> them in a similar manner (well, not DOS and Windows ... they are ...
>> different ... but it is doable).
>> Warewulf has similar capabilities and is designed to be a cluster
>> specific tool. I think there are a few others (OneSIS, etc.) that come
>> to mind that can do roughly similar things. Maybe even xCAT 2 ... not
>> sure, haven't looked at it in years.
>> Joseph Landman, Ph.D
>> Founder and CEO
>> Scalable Informatics, Inc.
>> e: landman at scalableinformatics.com
>> w: http://scalableinformatics.com
>> t: @scalableinfo
>> p: +1 734 786 8423 x121
>> c: +1 734 612 4615
>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit