[Beowulf] more automatic building
bill at cse.ucdavis.edu
Thu Sep 29 22:04:00 PDT 2016
On 09/28/2016 07:34 AM, Mikhail Kuzminsky wrote:
> I worked always w/very small HPC clusters and built them manually
> (each server).
Manual installs aren't too bad up to 4 nodes or so.
> But what is reasonable to do for clusters containing some tens or
> hundred of nodes ?
We use cobbler for DHCP, bootp, DNS, and PXE boot. It's nice to have a
single database for MAC address, IP address, hostname, etc. We have a
profile per OS, then use cobbler to netboot CentOS- or Debian-family
OSs and drive part of the installation.
The installation installs puppet, which handles:
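As a sketch, registering a node in that single database looks something like the following. The node name, MAC, addresses, and profile name are made-up examples, not taken from the post; these would be run on the cobbler server.

```shell
# Hypothetical node registration; one cobbler profile per OS as above.
cobbler system add --name=node01 --profile=centos7-x86_64 \
    --interface=eth0 --mac=52:54:00:12:34:56 \
    --ip-address=10.0.0.101 --hostname=node01.cluster
cobbler sync   # regenerate the DHCP/DNS/PXE configuration
```

After `cobbler sync`, a PXE boot of that MAC lands in the kickstart/preseed for the assigned profile.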
* Which users can login to which hardware
* Distribution of ssh keys
* Installation of packages, services, monitoring, etc.
* Tweaking initd/systemd scripts, pam, ulimit, etc.
* Managing autofs and friends.
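A minimal puppet manifest fragment covering a couple of the items above might look like this; the node regex, package/service names, and user are assumptions for illustration only.

```puppet
# Hypothetical manifest fragment -- names are illustrative, not from the post.
node /^node\d+$/ {
  # Install and run the slurm compute daemon (package name is distro-dependent)
  package { 'slurm-slurmd':
    ensure => installed,
  }
  service { 'slurmd':
    ensure  => running,
    enable  => true,
    require => Package['slurm-slurmd'],
  }
  # Distribute an ssh public key to a user account
  ssh_authorized_key { 'alice@workstation':
    ensure => present,
    user   => 'alice',
    type   => 'ssh-ed25519',
    key    => 'AAAA...',   # public key material elided
  }
}
```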
> But it looks that ROCKS don't support modern interconnects, and there
> may be problems
> w/OSCAR versions for support of systemd-based distributives like CentOS
> 7. For next year -
> is it reasonable to wait new OSCAR version or something else ?
The hard part of supporting the HPC side of things is the users/apps; I
consider installation and configuration of the hardware fairly minor.
Personally I'd just pick an OS that best suits your user/application
needs. PXE boot + cobbler with whatever Linux OS is really not a big deal.
With the above it's easy to write a small script to shut down a node,
turn on netboot, power on the node (assuming IPMI works), install on
boot, reboot into the OS, NFS mount, run the slurmd daemon, and be back in
service.
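That small script might look roughly like this. The node name, the `<node>-ipmi` hostname convention, and the use of slurm's `scontrol` to drain the node first are all assumptions; it defaults to a dry run that only prints the commands.

```shell
#!/bin/bash
# Hypothetical node-reinstall helper sketching the steps above.
set -euo pipefail

node="node01"     # example node name
dry_run=1         # set to 0 to actually execute the commands

run() {
  if [ "$dry_run" -eq 1 ]; then echo "WOULD RUN: $*"; else "$@"; fi
}

# 1. Drain the node in slurm so no new jobs land on it
run scontrol update nodename="$node" state=drain reason=reinstall
# 2. Tell cobbler to PXE-install this system on its next boot
run cobbler system edit --name="$node" --netboot-enabled=true
# 3. Force PXE on next boot and power-cycle via IPMI
run ipmitool -H "${node}-ipmi" -U admin chassis bootdev pxe
run ipmitool -H "${node}-ipmi" -U admin power cycle
# The installer then runs, the node reboots into the OS, puppet
# configures it, NFS mounts come up, slurmd starts, and the node
# can be resumed in slurm.
```

Keeping the IPMI hostnames derivable from the node name (as assumed here) is what makes this kind of loop trivial to script across tens or hundreds of nodes.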