[Beowulf] Which distro for the cluster?

Joe Landman landman at scalableinformatics.com
Sun Jan 7 07:06:17 PST 2007



Andrew M.A. Cater wrote:
> On Wed, Jan 03, 2007 at 09:51:44AM -0500, Robert G. Brown wrote:
>> On Wed, 3 Jan 2007, Leif Nixon wrote:
>>
>>> "Robert G. Brown" <rgb at phy.duke.edu> writes:
>>>
>>   b) If an attacker has compromised a user account on one of these
>> workstations, IMO the security battle is already largely lost.  They

s/largely/completely/g

At least for this user, if they have single-factor passwordless login
set up between the workstation and the cluster.

Of course, if they are using a malware-ridden, keylogger-hosting
machine, they have ... uh ... somewhat worse things to deal with than
just their accounts on the cluster being open to attack.

The solution to this is simple: never let this happen.  Which means
don't use a system which is significantly vulnerable to malware or
keylogger insertion.  It is left as an exercise for the reader to
figure out which platforms are more vulnerable.

>> have a choice of things to attack or further evil they can try to wreak.
>> Attacking the cluster is one of them, and as discussed if the cluster is
>> doing real parallel code it is likely to be quite vulnerable regardless
>> of whether or not its software is up to date because network security is
>> more or less orthogonal to fine-grained code network performance.
>>
> 
> Amen, brother :)
> 
>> BTW, the cluster's servers were not (and I would not advise that servers
>> ever be) running the old distro -- we use a castle keep security model
>> where servers have extremely limited access, are the most tightly
>> monitored, and are kept aggressively up to date on a fully supported
>> distro like Centos.  The idea is to give humans more time to detect
>> intruders that have successfully compromised an account at the
>> workstation LAN level and squash them like the nasty little dung beetles
>> that they are.

Yup.  Even better is never letting the users log in to admin machines
at all.  Provide machines for them to log into, and to submit and run
jobs from; just not the admin nodes.

[...]

>> In general, though, it is very good advice to stay with an updated OS.

... on threat-facing systems, yes, I agree.

For what I call production cycle shops, those places which have to churn
out processing 24x7x365, you want as little "upgrading" as possible, and
whatever you do upgrade has to be tested and functional with everything
else.  Ask your favorite CIO if they would consider upgrading their most
critical systems nightly.

It all boils down to a cost-benefit analysis, a CBA (as everything
does).  Upgrading carries risk, no matter who does it and how carefully
things are packaged.  The CBA equation should look something like this:

	value_of_upgrade = positive_benefits_of_upgrade -
			   potential_risks_of_upgrade

And if the value_of_upgrade is not strongly positive, you probably
should not do it if you are supplying a service to a user base.  Sure,
you can do it on your own personal cluster, and I appreciate that
people on this list do exactly that for their systems.  Regardless,
when looking at an upgrade you need to be of a (somewhat paranoid)
mindset about the potential for loss of time/data/...
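
In code form, that decision rule might look like the trivial sketch
below.  Purely illustrative: the categories and weights are made up,
so substitute your own estimates.

    # Hypothetical sketch of the cost-benefit calculation above.
    def value_of_upgrade(benefits, risks):
        # positive_benefits_of_upgrade - potential_risks_of_upgrade
        return sum(benefits.values()) - sum(risks.values())

    benefits = {"security_fixes": 5.0, "new_hardware_support": 2.0}
    risks    = {"regression_downtime": 4.0, "retest_effort": 2.5}

    v = value_of_upgrade(benefits, risks)
    # Upgrade a production service only when the value is strongly positive.
    print("%s (value = %.1f)" % ("upgrade" if v > 1.0 else "hold off", v))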

A (not so great) example would be someone packaging up a recent 2.6.19
kernel with that oh-so-nice ext3-VM interaction which gave us
corrupted files.  It hit mmap-based files from what I could see.  All
you need is one end user with a corner case that happens to tickle the
trigger, and whammo: you are now spending time fixing their problem
(which requires downgrading/upgrading).

You have a perfectly valid reason to upgrade threat-facing nodes.  Keep
them as minimal and as up-to-date as possible.  For non-threat-facing
nodes, this makes far less sense.  If you are doing single-factor
authentication and have enabled passwordless access within the cluster
(ssh keys, certificates, or ssh-agent based), then once a machine that
holds these credentials has been compromised, the game is over.
Multi-factor authentication for launching cluster runs is still a
challenge, as queuing systems may schedule jobs to start at 3am local
time, and no one wants to wait around for a job to start just to enter
additional factors.
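
As an aside, it is easy to get a rough picture of your exposure here.
The sketch below is purely my own illustration (not a vetted tool, and
it assumes home directories live under /home): it flags authorized_keys
entries carrying no from= or command= restrictions, i.e. keys whose
theft grants full passwordless access.

    # Hypothetical audit sketch: list ssh authorized_keys entries that
    # carry no from= or command= restrictions.  /home layout is assumed.
    import glob

    for path in glob.glob("/home/*/.ssh/authorized_keys"):
        try:
            entries = open(path).read().splitlines()
        except IOError:
            continue
        for n, entry in enumerate(entries, 1):
            entry = entry.strip()
            if not entry or entry.startswith("#"):
                continue
            if "from=" not in entry and "command=" not in entry:
                print("%s:%d carries an unrestricted key" % (path, n))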

You want to test any upgrade, and upgrade only what needs upgrading.
Just like other aspects of security 101, threat-facing nodes need to be
running as little (important) stuff as possible, and need as limited
access as you can give them.  Upgrades can and do carry their own bugs
and security holes, and you really don't want to be chasing those as well.
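
One way to keep "upgrade only what needs upgrading" honest is to diff
each node against a tested package manifest before touching anything.
A rough sketch follows; the manifest path and its one-pair-per-line
format are my own invention, not anything standard.

    # Hypothetical sketch: compare installed RPMs against a tested
    # "golden" manifest so an upgrade touches only vetted packages.
    import subprocess

    def installed_packages():
        out = subprocess.Popen(
            ["rpm", "-qa", "--qf", "%{NAME} %{VERSION}-%{RELEASE}\n"],
            stdout=subprocess.PIPE,
            universal_newlines=True).communicate()[0]
        return dict(line.split() for line in out.splitlines() if line)

    def load_manifest(path):
        # assumed format: one "name version-release" pair per line
        return dict(line.split() for line in open(path) if line.strip())

    golden = load_manifest("/etc/cluster/golden-manifest")  # assumed path
    for name, version in sorted(installed_packages().items()):
        tested = golden.get(name)
        if tested is not None and tested != version:
            print("drift: %s is %s, tested build is %s"
                  % (name, version, tested))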

>> My real point was that WITH yum and a bit of prototyping once every
>> 12-24 months, it is really pretty easy to ride the FC wave on MANY
>> clusters, where the tradeoff is better support for new hardware and more
>> advanced/newer libraries against any library issues that one may or may
>> not encounter depending on just what the cluster is doing.  Freezing FC
>> (or anything else) long past its support boundary is obviously less
>> desirable.  However, it is also often unnecessary.
>>
> 
> Fedora Legacy just closed its doors - if you take a couple of months 
> to get your Uebercluster up and running, you're 1/3 of the way through 
> your FC cycle :( It doesn't square. Fedora looks set to lose its way 
> again for Fedora 7 as they merge Fedora Core and Extras and grow to 

Hmmm.  Fedora is the testing framework for RHEL.  We know this.  I
like FC6; it looks to be a fine test distro, has lots of nice things
in it, and works on lots of hardware.  If I were building a cluster on
it, I would not upgrade the compute nodes.  Once they are set, unless
there is a good reason to upgrade (newer packages that do not add
needed or missing features are not a valid reason IMO), I would leave
the compute nodes alone.  Probably the head node as well.  The login
nodes are a different story: upgrade them (security patches) as
quickly as possible.
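
Or, stated as a (hypothetical) policy table:

    # Hypothetical upgrade policy by node class, per the reasoning above.
    policy = {
        "compute": "freeze once working; upgrade only for needed features",
        "head":    "freeze once working; same rule as the compute nodes",
        "login":   "threat-facing; apply security patches as fast as possible",
    }
    for node_class in ("compute", "head", "login"):
        print("%-7s : %s" % (node_class, policy[node_class]))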

> n-000 packages again - the fast upgrade cycle, lack of maintainers and 
> lack of structure do not bode well. They're apparently moving to a 13 month 
> upgrade cycle - so your Fedora odd releases could well be three years apart. 
> The answer is to take a stable distribution, install the minimum and work 
> with it OR build your own custom infrastructure as far as I can see. 
> Neither Red Hat nor Novell are cluster-aware in any detail - they'll 
> support their install and base programs but don't have the depth of 
> expertise to go further :(

Both are happy to sell licenses to the unwary.  At the end of the day,
if you are going to build a RHEL cluster, use CentOS/Scientific Linux
unless you absolutely wish to pay Red Hat for security patches.  With
SuSE, use OpenSuSE.  If you are going to settle on Fedora, pick a
release, and remember that it will be out of support in a year, which
shouldn't matter to the compute/head nodes once they are up.

>> On clusters that add new hardware, usually bleeding edge, every four to
>> six months as research groups hit grant year boundaries and buy their
>> next bolus of nodes, FC really does make sense as Centos probably won't
>> "work" on those nodes in some important way and you'll be stuck
>> backporting kernels or worse on top of your key libraries e.g. the GSL.
>> Just upgrade FC regularly across the cluster, probably on an "every
>> other release" schedule like the one we use.
>>
> 
> Chances are that anything Red Hat Enterprise based just won't work. New 
> hardware is always hard. 

Heh.  Try to point this out to a purchasing agent on an RFP which
demands a) newest possible hardware and b) RHEL 4 support.  You get to
pick one or the other, not both.  Which one do you want?  Hint: "b" is
far less valuable.

The other (not-so-funny) aspect of this is when we deliver new hardware
with an OS load that supports the newer hardware, and someone wants to
pull it back to the "corporate standard".  In doing so, they give up
stability, performance, and often file system support.  Or, in the case
of our JackRabbit unit, when we deliver 30TB in a 5U system and get the
"ext3 is almost as good as xfs" line.  Uh.... er.... no.  Those who
really insist upon this must only want 16TB units with no possibility
of ever growing beyond that (we have a design cooked up to show how to
do 1 PB in 4 racks as a single file system, or better, an HA 1 PB in 9
racks as a single file system).  16TB is great for some folks, but it
is a fundamental ext3 limit.  You need the untried-in-the-real-world
ext4 to break that limit, or xfs or jfs.
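
For reference, the 16TB figure falls straight out of ext3's 32-bit
block addressing at the usual 4 KiB block size; the arithmetic:

    # Why ext3 tops out at 16TB: 32-bit block numbers, 4 KiB blocks.
    block_size   = 4096        # bytes; the largest common ext3 block size
    max_blocks   = 2 ** 32     # block addresses are 32-bit
    max_fs_bytes = block_size * max_blocks
    print("ext3 max filesystem size: %d TiB" % (max_fs_bytes // 2 ** 40))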


-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615


