[Beowulf] A couple of interesting comments
bill at cse.ucdavis.edu
Fri Jun 6 10:00:29 PDT 2008
Gerry Creager wrote:
> 1. We specified "No OS" in the purchase so that we could install CentOS
> as our base. We got a set of systems with a stub OS, and an EULA for
> the diagnostics embedded on the disk. After clicking thru the EULA, it
> tells us we have no OS on the disk, but does not fail to PXE.
If you want to avoid hooking up a KVM to each node and rebooting it once or
twice, I'd suggest putting "Nodes must PXE boot by default" in your
specifications.
> 2. BIOS had a couple of interesting defaults, including warn on
> keyboard error (Keyboard? Not intentionally. This is a compute node,
> and should never require a keyboard. Ever.) We also find the BIOS is
> set to boot from hard disk THEN PXE. But due to item 1, above, we never
> can fail over to PXE unless we load up a keyboard and monitor, and hit
> F12 to drop to PXE.
Very strange standard for a server, let alone a cluster node.
> In discussions with our sales rep, I'm told that we'd have had to pay
> extra to get a real bare hard disk, and that, for a fee, they'd have
> been willing to custom-configure the BIOS. OK, with the BIOS this isn't
> too unreasonable: They have a standard BIOS for all systems and if you
> want something special, paying for it's the norm... But, still, this is
> a CLUSTER installation we were quoted, not a desktop.
This whole thing sounds strangely like the vendor has already been picked.
Certainly changing any default in the pipeline can cost money; even deleting a
floppy, CD/DVD, etc. can cost money if the machine ships to the integration
center with it installed. That said, when someone charges an unreasonable
amount for said customizations, they lose the bid and someone else wins.
> Also, I'm now told that "almost every customer" ordered their cluster
> configuration service at several kilobucks per rack. Since the team I'm
Not sure of the relevance here. Sounds like the upsell and padding that
sales folks love; it is their job to sell equipment, preferably high margin at
that. Seems way high for a BIOS reset, less so if it includes a cabling
harness for power, console, rails premounted, and network. Again if it's a
> working with has some degree of experience in configuring and installing
> hardware and software on computational clusters, now measured in at
> least 10 separate cluster installations, this seemed like an unnecessary
> expense. However, we're finding vendor gotchas that are annoying at the
> least, and sometimes cause significant work-around time/effort.
Well, there are two choices: either deal with the gotchas, or make them part of
the specifications. All vendors have their differences, defaults, and cost
structures. Do you want a cluster that could conceivably allow users to
start submitting jobs within a day? Or do you want to play BIOS games, with
testing and integration that might take a week or two? Every time I order a
cluster (well over 10 now) I get vendor queries of the "Sounds like X might
mean you need Y which costs $Z" kind. I'm always very clear: it's in the spec,
and not meeting the spec will mean the bid isn't considered. Definitely seems
like some high margin items end up included... without the margin.
> Finally, our sales guy yesterday was somewhat baffled as to why we'd
> ordered without OS, and further why we were using Linux over Windows for
Heh, some sales folks seem to think they have a right to exert pressure on
cluster design; not sure why you're even entertaining that one. If you want to
be particularly friendly, I'd just point at top500.org and note that Linux is
the standard and not the exception for Beowulf clusters.
> No, I won't identify the vendor.
How about the number of letters in their name ;-). In general I find that the
big vendors build in large profits (i.e. negotiating down to 50% of list price
is not unusual), and their preferred cluster defaults often mean higher
costs instead of lower, despite the typically higher volume purchases and
identical compute nodes that don't need a DVD, don't need an OS, don't
(typically) need a redundant power supply, etc. The smaller cluster-specific
shops default (usually) to mostly reasonable cluster configurations,
and seem to default to smaller margins. In my experience, writing a spec that
welcomes both ends up with the best deals. Even something trivial like
specifying 14 or 15 disks in an array (often the max for an external array)
instead of 16 (common for direct attached) can be the difference that allows
a competitive bid from a big vendor. Sometimes Intel or AMD intercedes to get
a design win and sometimes a big vendor decides to get more competitive.
Of course these specifications directly affect costs and lead to endless
discussions on this list. KVM over IP? Serial console? Any console access
at all? IPMI or just switched PDUs? But in my experience a requirement like
"must boot from PXE" is not a big deal, and not worth several kilobucks.
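
To illustrate how small the PXE change can be on nodes that do have IPMI, here
is a minimal sketch that asks each node's BMC to make PXE the persistent boot
device. It assumes ipmitool is installed and the BMCs are already reachable on
the network with known credentials, which may well not be true of machines
fresh from a vendor. The host names, user name, password file path, and both
helper function names are hypothetical.

```python
import subprocess


def pxe_boot_command(host, user, password_file):
    """Build an ipmitool command that makes PXE the persistent boot device."""
    return [
        "ipmitool", "-I", "lanplus",
        "-H", host, "-U", user, "-f", password_file,
        "chassis", "bootdev", "pxe", "options=persistent",
    ]


def set_pxe_boot(hosts, user, password_file, dry_run=True):
    """Run (or, in dry-run mode, just print) the command for every host.

    Returns a dict mapping host -> ipmitool exit code, or None in dry-run mode.
    """
    results = {}
    for host in hosts:
        cmd = pxe_boot_command(host, user, password_file)
        if dry_run:
            print(" ".join(cmd))
            results[host] = None
        else:
            results[host] = subprocess.run(cmd, check=False).returncode
    return results


if __name__ == "__main__":
    # Hypothetical BMC host names for a four-node test.
    nodes = [f"node{i:02d}-bmc" for i in range(1, 5)]
    set_pxe_boot(nodes, "admin", "/root/.ipmi_pass")
```

With dry_run left at its default the script only prints the commands, so they
can be reviewed before anything is sent to a BMC.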