[Beowulf] Definition of HPC
landman at scalableinformatics.com
Wed Apr 17 09:56:53 PDT 2013
On 04/17/2013 12:34 PM, Max R. Dechantsreiter wrote:
>>> As to your "Issue #2:"
>>> "Owned compute" has some advantages over "rented compute." In general, the
>>> control one has over one's owned resources enables applications to run with
>>> greater performance. Some optimizations just demand root access!
>> Although I hear those who have responded to this, this is particularly true
>> in my case as a systems researcher. Not only is my research impossible to
>> optimize without root access, it's impossible to perform whatsoever. Because
>> of that I am constantly at odds with my IT dept at PSU. Hence the NAS server
>> and small beo-cluster in my home...
> I have had similar experiences - academic IT departments are the worst!
Without naming names ... several years ago we set up a cluster, running
a particular cluster distribution, that was compromised via an errant
graduate student running Windows on a compromised laptop. The attackers
couldn't break into the cluster directly, so they installed a key logger
and caught him typing the root password. The rest is, shall we say, history.
We implored them to never ever do what they did. They chose to ignore
us, as "research couldn't get done without root".
Well, that attack knocked this *entire university* off the interwebs
for a few hours.
We caught heat because they ignored our advice. So we set up a system
that was simply not compromisable. If you never type a password, you
have zero probability of ever capturing a password to log in with. And
if no ports are ever publicly exposed, it's extraordinarily hard to break
a port service. You can DDoS it, but there are simple countermeasures
that can be implemented to black-hole the low end of that range. At the
higher end, you start overloading each node up the chain and you can't
handle that without support from your network provider.
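As a rough illustration of the "never type a password" idea: the sketch
below assumes the cluster is reached over OpenSSH (not stated above) and
checks whether an sshd_config text disables password logins. The function
names are made up for this example. It only handles whitespace-separated
"Keyword value" lines, and it stops at the first Match block rather than
modelling conditional overrides.

```python
def effective_settings(config_text):
    """Map lowercased keyword -> first value seen (sshd uses the first value)."""
    settings = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # not a simple "Keyword value" line
        keyword, value = parts[0].lower(), parts[1].strip()
        if keyword == "match":
            break  # assumption: conditional Match blocks are not modelled
        settings.setdefault(keyword, value)
    return settings


def password_login_disabled(config_text):
    """True only if PasswordAuthentication is explicitly set to 'no'."""
    settings = effective_settings(config_text)
    # sshd treats PasswordAuthentication as 'yes' when it is not set.
    return settings.get("passwordauthentication", "yes").lower() == "no"
```

Feed it the contents of the config file; a False result means a typed
password could still be captured by a key logger on the client.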
So, I am sorry ... if you *require* root to perform your work on a
regular basis, chances are, you are one misstep from misfortune, and it's
quite likely to be self-inflicted.
This said, the most amazing thing about this whole episode was, after
reporting this, and following the forensic clues, and reporting them to
the cluster mailing list ... those in charge of the mailing list took
great personal offence at the writeup and reporting ... and banned me
from the list. I was more saddened than annoyed, as what I found and
reported on would likely have helped others prevent attacks. No skin
off my nose, we took this as a signal to work much harder on Tiburon,
which is now quite good.
But back to the running with scissors down broken staircases, in the
dark, with low coefficients of friction on the stable steps, and many
missing or unstable steps ... that is running as root. Make sure you
have good, recent backups, and you test that your backups are recent,
and correct, before you go break something important. And if you rely
upon external support, make darned sure they have a clue.
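A minimal sketch of "test that your backups are recent, and correct",
assuming a single file whose backup is a plain copy and whose source has
not changed since the copy was made. The function names and the one-day
age limit are illustrative choices, not anything from a particular
backup tool.

```python
import hashlib
import os
import time


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_backup(source_path, backup_path, max_age_seconds=24 * 3600, now=None):
    """Return a list of problems; an empty list means both checks passed."""
    if not os.path.exists(backup_path):
        return ["backup is missing"]
    problems = []
    now = time.time() if now is None else now
    # Recent: the backup file was written within the allowed window.
    if now - os.path.getmtime(backup_path) > max_age_seconds:
        problems.append("backup is stale")
    # Correct: the backup's contents match the source byte for byte.
    if sha256_of(source_path) != sha256_of(backup_path):
        problems.append("backup differs from source")
    return problems
```

Run it before the risky change, and do not proceed unless the returned
list is empty.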
Explaining to investors, customers, management, granting agencies how
your own management failures resulted in massive data/information
lossage is not ... well ... a pleasant thing. Usually results in a few
Running as root? Yeah, it's that bad. Just say no.
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman at scalableinformatics.com
web : http://scalableinformatics.com
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615