[Beowulf] stateless compute nodes

Trevor Gale trevor at snowhaven.com
Wed May 27 19:29:30 PDT 2015


I need to configure IB, Slurm, MPI, and NFS, and am most likely running CentOS. Would you say that using Warewulf makes configuring these significantly more complicated?

Thanks,
Trevor

> On May 27, 2015, at 9:56 PM, Joe Landman <landman at scalableinformatics.com> wrote:
> 
> 
> 
> On 05/27/2015 09:22 PM, Trevor Gale wrote:
>> Hello all,
>> 
>> I was wondering how stateless nodes fare with very memory-intensive applications. Do they simply require you to have a large amount of RAM to house your file system and program data, or are there other limitations?
> 
> Warewulf has been out the longest of the stateless distributions. We had rolled our own a while before using it, and kept adding capability to ours.
> 
> It's generally not hard to pare a stateless node down to a few hundred MB (or less!). Applications are handled via NFS, and you strip your stateless system down to the bare minimum you need. In fairly short order, you should be able to PXE boot a kernel with a bare-minimal initramfs and have it launch Docker and Docker-like containers. This is the concept behind CoreOS, and many distributions are looking to move to this model.
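> 
> A minimal sketch of what that PXE entry can look like (file names and paths are illustrative, not any specific config):
> 
>   # pxelinux.cfg/default -- boot a kernel plus a bare-minimal initramfs
>   DEFAULT stateless
>   LABEL stateless
>     KERNEL vmlinuz
>     APPEND initrd=initramfs.img console=tty0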
> 
> We use a makefile to drive creation of our stateless systems (everything including the kitchen sink, plus our entire stack), which hovers around 4GB total. Our original stateless systems were around 400MB or so, but I wanted a full development, IB, PFS, and MPI environment (not to mention other things). I could easily make some of this stateful, but our application requires resiliency that can't exist in a stateful model (what if the OS drives, or the entire controller, suddenly went away, or the boot/management network was partitioned while the OS was on NFS?).
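> 
> The makefile essentially drives steps like these (a hypothetical sketch; tools and paths are illustrative, with debootstrap fitting a Debian base):
> 
>   # build a minimal root tree, then pack it as a compressed initramfs
>   debootstrap --variant=minbase stable rootfs http://deb.debian.org/debian
>   ( cd rootfs && find . | cpio -o -H newc ) | gzip -9 > stateless.cpio.gz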
> 
> This is one of our Unison units right now:
> 
> root at usn-01:~# df -h
> Filesystem      Size  Used Avail Use% Mounted on
> rootfs          8.0G  3.9G  4.2G  49% /
> udev             10M     0   10M   0% /dev
> ...
> tmpfs           1.0M     0  1.0M   0% /data
> /dev/sda        8.8T  113G  8.7T   2% /data/1
> /dev/sdb        8.8T  201G  8.6T   3% /data/2
> /dev/sdc        8.8T   63G  8.7T   1% /data/3
> /dev/sdd        8.8T  138G  8.6T   2% /data/4
> fhgfs_nodev      70T  1.1T   69T   2% /mnt/unison2
> 
> with the "local" mounts being controlled by a distributed database.   Think of it as a distributed cluster wide /etc/fstab. More relevant for a storage cluster/cloud than a compute cluster, but easily usable in this regard.
> 
> We handle all the rest of the configuration post-boot: a little infrastructure work (bringing up interfaces), followed by configuration work (driven by scripts and data pulled from a central repository, which is itself distributable).
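> 
> In spirit the post-boot path looks something like this (hypothetical names; the real thing is data-driven rather than hard-coded):
> 
>   #!/bin/sh
>   # bring up the management interface, then pull and run
>   # node-specific configuration from the central repository
>   ip link set eth0 up && dhclient eth0
>   git clone --depth 1 http://mgmt/nodeconf.git /run/nodeconf
>   sh /run/nodeconf/$(hostname -s)/setup.sh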
> 
> There are some oddities, not the least of which is that most distributions are decidedly not built for this. But if you get them to a point where they think they have a /dev/root and they mount it, life generally gets much easier rather quickly.
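> 
> A bare-bones illustration of that trick from inside an initramfs (image name and layout are illustrative, and it assumes ramdisk support in the kernel):
> 
>   #!/bin/sh
>   # load the root image into a ramdisk, alias it as the /dev/root the
>   # distribution expects, mount it, and switch over
>   dd if=/rootfs.img of=/dev/ram0
>   mknod /dev/root b 1 0            # block device 1,0 == /dev/ram0
>   mkdir -p /newroot
>   mount /dev/root /newroot
>   exec switch_root /newroot /sbin/init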
> 
> One of the other cool aspects of our mechanism is that we can pivot to a hybrid or NFS root after fully booting. And if the NFS pivot fails, we can fall back to our ramboot without a reboot. It's a thing of beauty ... truly ...
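> 
> Very roughly, the fallback logic (server and export path are placeholders, and the actual pivot is more involved than shown):
> 
>   #!/bin/sh
>   # try the NFS root; if it is unreachable, keep running from RAM
>   mkdir -p /mnt/nfsroot
>   if mount -t nfs -o ro mgmt:/export/root /mnt/nfsroot 2>/dev/null; then
>       echo "NFS root reachable -- pivoting"
>       # bind mounts and pivot_root would follow here
>   else
>       echo "NFS unreachable -- staying on ramboot, no reboot needed"
>   fi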
> 
> FWIW: we use a Debian base (and Ubuntu on occasion) these days, though we've used CentOS and RHEL in the past, before it became harder to distribute. Generally speaking we can boot anything (and I really mean *anything*: any Linux, *BSD, Solaris, DOS, Windows, ...) and control them in a similar manner (well, not DOS and Windows ... they are ... different ... but it is doable).
> 
> Warewulf has similar capabilities and is designed to be a cluster-specific tool. A few others come to mind (oneSIS, etc.) that can do roughly similar things. Maybe even xCAT2 ... not sure, haven't looked at it in years.
> 
> 
> -- 
> Joseph Landman, Ph.D
> Founder and CEO
> Scalable Informatics, Inc.
> e: landman at scalableinformatics.com
> w: http://scalableinformatics.com
> t: @scalableinfo
> p: +1 734 786 8423 x121
> c: +1 734 612 4615
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


