[Beowulf] Power draw of cluster nodes under heavy load
prentice.bisbal at rutgers.edu
Mon Jul 28 11:53:05 PDT 2014
On 07/28/2014 02:13 PM, Mark Hahn wrote:
>> Are any of you monitoring the power draw on your clusters? If so, can
>> any of you provide me with some statistics on your power draw under
>> heavy load?
> good question; it's something that deserves more attention and coverage.
> ATM, I can only provide one non-answer:
> this is active mixed-user load (45 unrelated users, approximately 85%
> CPU utilization due to memory scheduling and job layout constraints).
> this is an older cluster, HP dual-socket E5440 (2.833G) whose IPMI
> happens to return nice power measures.
Thanks. That image is more helpful than you think - I didn't even think
of using IPMI to report power consumption. Using that, I could run HPL
on some nodes here and get measurements.
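For anyone wanting to try the same thing, here is a minimal sketch of reading per-node power over IPMI. It assumes the BMC supports the DCMI power-reading command and that `ipmitool dcmi power reading` prints an "Instantaneous power reading: N Watts" line; older BMCs may only expose power through vendor-specific sensors instead. The host, user, and password arguments are placeholders, not values from this thread.

```python
import re
import subprocess

# Matches the "Instantaneous power reading:   220 Watts" line printed by
# `ipmitool dcmi power reading` on BMCs that implement DCMI (assumption).
_POWER_RE = re.compile(r"Instantaneous power reading:\s*(\d+)\s*Watts")


def parse_power_watts(text):
    """Return the instantaneous power in watts, or None if no reading is found."""
    match = _POWER_RE.search(text)
    return int(match.group(1)) if match else None


def read_node_power(host, user, password):
    """Query one node's BMC over the LAN. All three arguments are placeholders."""
    result = subprocess.run(
        ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password,
         "dcmi", "power", "reading"],
        capture_output=True, text=True, check=True,
    )
    return parse_power_watts(result.stdout)
```

Polling `read_node_power` every few seconds during an HPL run and keeping the maximum would give a rough per-node worst case, subject to however accurate the BMC's own sensor is.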
>> Ideally, I'm looking for the power load for a worst-case scenario,
>> such as running HPL, on a per-rack basis.
> I don't understand the "per-rack" part - aren't you interested in
Ideally, per-node is even better, but I figured most measurements would
be at the PDU or circuit level, with one or two PDUs/Circuits per rack.
I figured this is the granularity most people are measuring at, which is
why I asked that way.
>> I have some numbers from a friend who lurks on this list, but the
>> more data points I have, the better I can justify my power
>> requirements for a new cluster purchase I'm working on.
> my experience is that vendors are useless in this regard: they always
> quote the PSU max rating, and then often don't even use the number
> (ie, put all the low-dissipation stuff like networking together, etc.)
> has anyone tried to rate the accuracy of vendor power calculators?
> at least a few years ago, they were absurdly inflated.
This is why I'm asking for actual, measured numbers. I read a whitepaper
by APC or Raritan that said that if you go with the nameplate on a PDU,
you can oversize your power requirements by a factor of 2. For HPC, I
imagine it wouldn't be that extreme, since cluster nodes tend to be at
100% load more of the time and therefore use more power. One vendor said they
assume 60% - 90% of nameplate ratings when estimating power needs, which
is still a pretty broad range.
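To show how wide that 60% - 90% range gets at rack scale, here is a small sketch of the arithmetic. The 750 W nameplate and 40 nodes per rack in the usage line are made-up illustration values, not measurements from anyone on this list; only the 60 and 90 percent figures come from the vendor's rule of thumb above.

```python
def power_budget_watts(nameplate_watts, node_count, low_pct=60, high_pct=90):
    """Return (low, high, nameplate_total) estimates in watts.

    low_pct and high_pct are the fractions of nameplate assumed to be
    drawn under load, expressed as whole percentages.
    """
    nameplate_total = nameplate_watts * node_count
    low = nameplate_total * low_pct / 100
    high = nameplate_total * high_pct / 100
    return low, high, nameplate_total


# Hypothetical rack: 40 nodes, each with a 750 W nameplate rating.
low, high, total = power_budget_watts(750, 40)
print(f"estimate: {low:.0f} W to {high:.0f} W (nameplate total {total} W)")
```

With those hypothetical inputs, the estimate spans 18 kW to 27 kW against a 30 kW nameplate total. That 9 kW gap per rack is why measured numbers matter more than a percentage rule.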
> regards, mark hahn.