[Beowulf] Problems with Dell M620 and CPU power throttling

Fri Aug 30 08:10:07 PDT 2013

Of course we have done system tuning: limits, ARP table entries, block 
size changes.  tuned actually helped by setting a number of sysctl and 
other /sys parameters, and performance has improved as a result.  But 
we already saw these problems before making any of these changes, when 
the system was installed with a vanilla RH 6.1.  Incremental changes 
helped performance but never resolved this power capping issue.

We have not instrumented individual CPUs with temperature probes.  When 
we look at temperatures from both the chassis and ipmitool, we see no 
drastic peaks.  Maybe we are getting a 60C peak that we don't detect 
and that is the cause.  But I doubt it.
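
One way to check for a short peak that coarse polling misses would be 
to sample the kernel's per-core sensors at a high rate and keep the 
maximum.  This is only a minimal sketch: the sysfs glob pattern is an 
assumption (the coretemp files live in different places depending on 
kernel version), and the function names are hypothetical.

```python
import glob
import time

# Assumed location of the sensor files; adjust the pattern for your kernel.
DEFAULT_PATTERN = "/sys/class/hwmon/hwmon*/temp*_input"

def read_core_temps(pattern=DEFAULT_PATTERN):
    """Return {path: degrees C} for every readable sensor file."""
    temps = {}
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                # hwmon temp*_input files report millidegrees Celsius
                temps[path] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # skip sensors that cannot be read right now
    return temps

def track_peaks(samples, interval_s=0.1, pattern=DEFAULT_PATTERN):
    """Sample repeatedly and return the highest reading seen per sensor."""
    peaks = {}
    for _ in range(samples):
        for path, temp in read_core_temps(pattern).items():
            peaks[path] = max(temp, peaks.get(path, temp))
        time.sleep(interval_s)
    return peaks
```

Running track_peaks(600, 0.1) alongside a job would cover about a 
minute at roughly 10 samples per second.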

Yes, a power cycle cools down the cores.  Maybe that is the reset we 
see.  Prior to a power cycle we can look at the BMC temps, again using 
ipmitool or the chassis GUI, and see that temps are well below 30C.  We 
also see that power consumption is around 80W.  That tells me that the 
system is cool enough.  Should I not believe those values?  I have no 
reason to doubt them from past experience.

Input air is about 22C.  For our data center, you'd have a better chance 
of getting this adjusted to 15C than I would!  As for fans, the nodes 
have none of their own; fans are controlled at the chassis level.

For heat sink thermal grease problems, I'd expect this to be visible 
using ipmitool, but maybe that is not where the temperatures are 
being measured.  I don't know about that issue.  I'd expect a bad 
thermal grease job to show up on a single socket and not on both 
sockets.  It seems odd that every node exhibiting this problem would 
have both sockets having the same issue.

Again, the magnitude of the problem is about 5-10% of nodes at any 
time.  With 600 compute nodes, that is 30 to 60 nodes showing similar 
problems.  And in most cases, a power cycle cures this issue.

Since we have not seen a node go "bad" while idle, this does point to 
something overheating perhaps.  The tools I know about for watching 
temperature are not sufficient to show me this, though.
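
If the capping is a CPU thermal event, the kernel's throttle counters 
might show it even when the BMC temperatures look fine.  A rough 
sketch, assuming the counters are exposed under the thermal_throttle 
directory in sysfs (the exact file names vary by kernel, so the glob 
pattern is an assumption, and the function names are hypothetical):

```python
import glob

# Assumed location of the counters; older kernels expose a single "count"
# file, newer ones "core_throttle_count" and "package_throttle_count".
DEFAULT_PATTERN = "/sys/devices/system/cpu/cpu*/thermal_throttle/*count"

def read_throttle_counts(pattern=DEFAULT_PATTERN):
    """Return {path: count} for every readable throttle counter file."""
    counts = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                counts[path] = int(f.read().strip())
        except (OSError, ValueError):
            continue  # counter missing or unreadable on this CPU
    return counts

def throttle_deltas(before, after):
    """Return only the counters that increased between two snapshots."""
    return {
        path: after[path] - before.get(path, 0)
        for path in after
        if after[path] > before.get(path, 0)
    }
```

Taking one snapshot before a job and one after a node goes "bad", then 
passing both to throttle_deltas, would show whether any core logged a 
throttle event in between.  An empty result would not rule out capping 
imposed from outside the CPU.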

Thanks,
Bill

On 08/30/2013 10:44 AM, Mark Hahn wrote:
>> We run the RH 6.x release and are up to date with kernel/OS patches.
>
> have you done any /sys tuning?
>
>> non-redundant.  tuned is set for performance.  Turbo mode is
>
> what knobs does tuned fiddle with?  I would probably turn off all
> auto-tuning and go strictly manual until the issue is understood.
>
>> on/hyperthreading is off/performance mode is set in BIOS.
>
> I found a "x86_energy_perf_policy.c" (author Len Brown) which I run on
> my laptop to set powersave mode.  it sets
> MSR_IA32_ENERGY_PERF_BIAS. it says that the hardware default is
> performance, but I wouldn't be surprised if "normal" is set by bios.
>
>> A reboot does not change this problem.  But a power cycle returns the
>> compute node to normal again.  Again, we do not know what triggers this
>
> unfortunately, a power cycle will also cool down the system,
> so I don't see how it can be dissociated from heating.
>
>> event.  We are not overheating the nodes.
>
> how do you know?
>
>> But while applications are
>> running, something triggers an event where this power capping takes
>> effect.
>
> it might be interesting to examine the cpu-heatsink contact.  what you're
> describing could be explained by poor thermal HS contact (or poor HS flow).
>
> or do you mean that you sample die temps at high resolution and know
> that you're never hitting, say, 60C?
>
> in some machines (albeit less often servers), acpi provides some knobs
> which are made visible under /sys and seem to permit some control of
> thermal mode (fan threshold or scaling).
>
>> If anyone has a clue, or better yet, solved the issue, we'd love to hear
>> the solution!
>
> what's your intake air temp?  I would try giving it cold (say, 15C) air.
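
The energy/performance bias register mentioned in the quoted message 
can be read directly to see what the BIOS left in it.  This is a sketch 
under assumptions: it needs root and the msr kernel module loaded so 
that /dev/cpu/N/msr exists, the register address is 0x1B0 
(IA32_ENERGY_PERF_BIAS), and the labels follow the values 
x86_energy_perf_policy uses (0 performance, 6 normal, 15 powersave).  
The function names are hypothetical.

```python
import os
import struct

MSR_IA32_ENERGY_PERF_BIAS = 0x1B0  # register address (assumed per Intel SDM)

def decode_epb(raw):
    """Return (hint, label); only the low 4 bits of the MSR hold the hint."""
    hint = raw & 0xF
    labels = {0: "performance", 6: "normal", 15: "powersave"}
    return hint, labels.get(hint, "intermediate")

def read_epb(cpu=0, dev_template="/dev/cpu/{}/msr"):
    """Read the bias hint for one CPU through the msr character device."""
    fd = os.open(dev_template.format(cpu), os.O_RDONLY)
    try:
        # The msr device returns 8 bytes at the offset equal to the MSR address.
        data = os.pread(fd, 8, MSR_IA32_ENERGY_PERF_BIAS)
    finally:
        os.close(fd)
    return decode_epb(struct.unpack("<Q", data)[0])
```

Comparing read_epb() on a healthy node and on a capped node would show 
whether this setting differs between them.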