[Beowulf] While the knives are out... Wulf Keepers


SIM DOG steve_heaton at iinet.net.au
Wed Aug 9 17:47:00 PDT 2006


Mark started it so while we're asking loaded questions... =)

I recently visited a large educational institution (that shall remain
nameless) that hosts an excellent, world class, science research team.
They also have a reasonably large Beowulf environment (over 100 dual nodes).

Now maybe it was just the people I was talking to (management) but I
get the distinct impression that they treat their 'Wulf as an
'appliance'. It came as a great disappointment :/

Reliability is their prime (only?) concern. Researchers are expected to
address any performance issues with their code. Well, yeah, OK... with
>their code< but what about the underlying infrastructure? Who keeps the
Wulf 'tool' nice and sharp? For this code, does anyone *understand* how
to check the Wulf behavior and see if it's helping or hindering?

I know from personal experience that one piece of code they run would be
expected to be CPU bound... but the interconnect plays a bigger part
than *I* expected (and I've spent a fair bit of time under the bonnet...
with still more things to explore). What about using a second NIC to
offload the admin traffic? How deeply have you looked into your compiler
switches? Looked into an FNN topology? These and other questions...

Obviously reality bites. Staff cost money. So does extra hardware. You
don't want to have a sick, flakey, Wulf that can't keep the customers
happy. However, knowing the size of their research staff and what
they're likely to get paid, I'd have thought at least one person with the
skills to keep the Wulf *near* the edge would be a good thing?

In fairness, maybe I got the wrong impression. Maybe I was just talking
to the wrong people. It was just the overall feel I got. The Wulf was
big and seemed fast and they were happy. I was just disappointed that
they didn't seem prepared to put a tad of TLC into their Wulf. Wouldn't
dropping a run from four days to three be worth investigating? [Sigh]

Is this typical of educational institutions? Am I missing some
consideration that would explain the apparent apathy? I'm prepared to
accept I'm just being naive.

However... If you're in Australia and run a Wulf that you like to see run
nice and sharp, drop me a line... I'm looking for a job and you could be
my kinda place! :)

Stevo


