[Beowulf] A couple of interesting comments
gerry.creager at tamu.edu
Fri Jun 6 08:39:47 PDT 2008
We recently purchased a set of hardware for a cluster from a hardware
vendor. We've encountered a couple of interesting issues with bringing
the thing up that I'd like to get group comments on. Note that the RFP
and negotiations specified this system was for a cluster installation,
so there would be no misunderstanding...
1. We specified "No OS" in the purchase so that we could install CentOS
as our base. We got a set of systems with a stub OS, and an EULA for
the diagnostics embedded on the disk. After clicking thru the EULA, it
tells us we have no OS on the disk, but does not fail over to PXE.
2. BIOS had a couple of interesting defaults, including warn on
keyboard error (Keyboard? Not intentionally. This is a compute node,
and should never require a keyboard. Ever.) We also find the BIOS is
set to boot from hard disk THEN PXE. But because of item 1 above, we can
never fail over to PXE unless we hook up a keyboard and monitor and hit
F12 to drop to PXE.
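To make the interaction between items 1 and 2 concrete, here is a minimal sketch of the boot-order logic as I understand it. This is a toy model and not the vendor's firmware: the `BootDevice` class and `first_boot_target` function are hypothetical names, and the only assumption is that the BIOS hands control to the first device in its list that presents something bootable.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BootDevice:
    name: str
    bootable: bool  # True if the firmware finds something it can hand control to


def first_boot_target(boot_order: List[BootDevice]) -> Optional[str]:
    """Return the name of the first bootable device in the order, or None."""
    for device in boot_order:
        if device.bootable:
            return device.name
    return None


# Disk carrying the vendor's stub OS: it boots, so the firmware never moves on.
stub_disk = BootDevice("hard disk (stub OS)", bootable=True)
# Truly bare disk: nothing to boot, so the firmware falls through to the next entry.
bare_disk = BootDevice("hard disk (bare)", bootable=False)
pxe = BootDevice("PXE", bootable=True)

print(first_boot_target([stub_disk, pxe]))  # what we received
print(first_boot_target([bare_disk, pxe]))  # what "No OS" was meant to give us
print(first_boot_target([pxe, stub_disk]))  # PXE first in the BIOS order
```

Under that assumption, either a genuinely bare disk or a PXE-first boot order would get a node to PXE without a keyboard; the combination we were shipped does neither.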
In discussions with our sales rep, I'm told that we'd have had to pay
extra to get a real bare hard disk, and that, for a fee, they'd have
been willing to custom-configure the BIOS. OK, with the BIOS this isn't
too unreasonable: They have a standard BIOS for all systems and if you
want something special, paying for it is the norm... But, still, this is
a CLUSTER installation we were quoted, not a desktop.
Also, I'm now told that "almost every customer" orders the vendor's cluster
configuration service at several kilobucks per rack. Since the team I'm
working with has some degree of experience in configuring and installing
hardware and software on computational clusters, now measured in at
least 10 separate cluster installations, this seemed like an unnecessary
expense. However, we're finding vendor gotchas that are annoying at the
least, and sometimes cause significant work-around time/effort.
Finally, our sales guy yesterday was somewhat baffled as to why we'd
ordered without OS, and further why we were using Linux over Windows for
HPC. I'm not trying to revive the recent rant-fest about Windows HPC
capabilities, but can anyone cite real HPC applications generally run on
significant Windows clusters (I'll accept Cornell's work, although I remain
personally convinced that the bulk of their Windows HPC work has been
dedicated to maintaining grant funding rather than doing real work)?
No, I won't identify the vendor.
Gerry Creager -- gerry.creager at tamu.edu
Texas Mesonet -- AATLT, Texas A&M University
Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983
Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843