[Beowulf] Fault tolerance & scaling up clusters (was Re: Bright Cluster Manager)

Michael Di Domenico mdidomenico4 at gmail.com
Thu May 17 05:32:33 PDT 2018

On Thu, May 17, 2018 at 8:18 AM, Christopher Samuel <chris at csamuel.org> wrote:
> The compute nodes boot a RHEL7 kernel with custom initrd, that
> includes the necessary OPA and Lustre kernel modules & config
> to get the networking working and access the Lustre filesystem,
> the kernel then pivots its root filesystem from the initrd to
> the master copy on Lustre via overlayfs2 to ensure the compute
> node sees it as read/write but without the possibility of it
> modifying the master (as the master is read-only in overlayfs2).
> Does that help?

it does.  the overlayfs part is the interesting bit.  i'll have to
read up on that.
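
for anyone else reading up on it: a rough sketch of what that overlay
mount might look like, written as a small Python helper that only
builds the command line and does not run it.  this is not Chris's
actual initrd code.  the paths and the function name
overlay_mount_argv are made up for illustration, and it assumes the
writable layer lives on node-local storage such as a tmpfs, which the
quoted description does not state.  the kernel filesystem type is
"overlay", and the lowerdir/upperdir/workdir options are the standard
ones from the kernel overlayfs documentation.

```python
import shlex


def overlay_mount_argv(lower, upper, work, target):
    """Build the argv for an overlayfs mount (does not execute it).

    lower  : read-only master image; overlayfs never writes to it
    upper  : writable layer that receives every change the node makes
    work   : scratch directory, must be on the same filesystem as upper
    target : mount point where the merged read/write view appears
    """
    options = f"lowerdir={lower},upperdir={upper},workdir={work}"
    return ["mount", "-t", "overlay", "overlay", "-o", options, target]


# Hypothetical paths, chosen only for illustration:
# the master image on Lustre, the writable layer on a node-local tmpfs.
argv = overlay_mount_argv(
    lower="/mnt/lustre/images/compute",
    upper="/run/overlay/upper",
    work="/run/overlay/work",
    target="/newroot",
)

# Print the command an initrd script would run before pivoting root.
print(shlex.join(argv))
```

writes on the node land in the upper directory and files from the
master are only ever read, which is why the node sees a read/write
root while the shared image stays untouched.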
