[Beowulf] Contents of Compute Nodes Images vs. Login Node Images

Prentice Bisbal pbisbal at pppl.gov
Tue Oct 23 11:47:19 PDT 2018


> Management sometimes reads these and says “why do we keep getting these requests, just install everything.”

Sounds like you need to sit down with management and talk to them, too. 
In general, "just install everything" is not a good idea.

> Our node images are in RAM, generally, so putting a bunch of extra stuff in them for no good reason does have an impact, although it’s less than it was when installed memory was lower, and it also all needs to be transferred on a cold start.

All the more reason to restrict what software is installed on the 
compute nodes. The Blue Gene/L system I maintained when I was at RU 
(like all Blue Genes) used a minimal OS that was resident in RAM only. 
As a result, all applications had to be cross-compiled on the login node 
and linked statically, since the CNK (Compute Node Kernel, the name of 
the compute node OS) didn't include all the dynamic libraries (I also 
think the OS didn't even provide a dynamic linker!). For the Blue 
Gene/Q, they did start supporting dynamically linked executables, but I 
don't know what changed in the OS to allow that.
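
To make the static-vs-dynamic distinction concrete, here is a rough sketch 
(not anything from the Blue Gene toolchain) of how one might check whether 
an executable expects a dynamic linker on the node. It looks for a 
PT_INTERP program header in the ELF file. The function names 
needs_dynamic_linker and check_file are made up for this illustration, and 
the check is only an approximation: a static-PIE binary, for example, has 
no PT_INTERP either.

```python
import struct

PT_INTERP = 3  # ELF program header type that names the dynamic linker


def needs_dynamic_linker(data: bytes) -> bool:
    """Return True if the ELF image in `data` has a PT_INTERP header.

    Hypothetical helper for illustration; it reads only the ELF header
    and the first field (p_type) of each program header.
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little, 2 = big
    if is64:
        e_phoff = struct.unpack_from(endian + "Q", data, 0x20)[0]
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x36)
    else:
        e_phoff = struct.unpack_from(endian + "I", data, 0x1C)[0]
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x2A)
    for i in range(e_phnum):
        # p_type is the first 4 bytes of a program header in both classes
        offset = e_phoff + i * e_phentsize
        p_type = struct.unpack_from(endian + "I", data, offset)[0]
        if p_type == PT_INTERP:
            return True
    return False


def check_file(path: str) -> bool:
    """Convenience wrapper (also hypothetical) that reads a file from disk."""
    with open(path, "rb") as f:
        return needs_dynamic_linker(f.read())
```

On a node image with no dynamic linker, anything for which check_file() 
returns True would fail to start, which is why everything had to be linked 
statically on the login node.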

If you're going with a diskless OS like that, I think you need to be 
very sparing in what you include in your image. If management wants you 
to 'install everything' on the compute nodes, I think you'll need to 
switch to a disk-based OS to keep your sanity.

Prentice

On 10/23/2018 02:35 PM, Ryan Novosielski wrote:
>> On Oct 23, 2018, at 2:10 PM, Greg Lindahl <lindahl at pbm.com> wrote:
>>
>> On Tue, Oct 23, 2018 at 05:48:00PM +0000, Ryan Novosielski wrote:
>>
>>> We’re getting some complaints that there’s not enough stuff in the
>>> compute node images, and that we should just boot compute nodes to
>>> the login node image
>> It's probably worth your while sitting down with your users and
>> learning how they want to use the tool, instead of telling them.
> In general, good advice; not totally applicable here. An example would be that we’re frequently asked to install things by users that are already installed, just a different way (by modules, or whatever else) than the user is used to or that the steps in their documentation said (e.g., “Can you please run the following: yum install whatever - I tried but I don’t have root access.”). Management sometimes reads these and says “why do we keep getting these requests, just install everything.” We also have another group that works more closely with users; they have identified cases where “it works on the login node” and often extrapolate the solution without relaying the problem (or understanding the architecture). We’ll get to the bottom of that, but I wanted to know more generally what sites are doing.
>
> Our node images are in RAM, generally, so putting a bunch of extra stuff in them for no good reason does have an impact, although it’s less than it was when installed memory was lower, and it also all needs to be transferred on a cold start.
>
> --
> ____
> || \\UTGERS,  	 |---------------------------*O*---------------------------
> ||_// the State	 |         Ryan Novosielski - novosirj at rutgers.edu
> || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
> ||  \\    of NJ	 | Office of Advanced Research Computing - MSB C630, Newark
>       `'
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
