Awesome!!!<div><br></div><div>With my job title, most folks think I'm essentially technically neutered these days.  </div><div><br></div><div>Good to see there is still some life in this old dog :-)<br><br>Best,</div><div><br></div><div>J. </div><div><br>On Thursday, July 9, 2015, mathog &lt;<a href="mailto:mathog@caltech.edu">mathog@caltech.edu</a>&gt; wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 09-Jul-2015 11:54, James Cuff wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<a href="http://blog.jcuff.net/2015/04/of-huge-pages-and-huge-performance-hits.html" target="_blank">http://blog.jcuff.net/2015/04/of-huge-pages-and-huge-performance-hits.html</a><br>

</blockquote>

<br>

Well, that seems to be it, but not quite with the same symptoms you observed.  khugepaged never showed up, and "perf top" never revealed _spin_lock_irqsave.  Instead this is what "perf top" shows in my tests:<br>

<br>

(hugepage=always, when migration/# process observed)<br>

 89.97%  [kernel]       [k] compaction_alloc<br>

  1.21%  [kernel]       [k] compact_zone<br>

  1.18%  [kernel]       [k] get_pageblock_flags_group<br>

  0.75%  [kernel]       [k] __reset_isolation_suitable<br>

  0.57%  [kernel]       [k] clear_page_c_e<br>

<br>

(hugepage=always, when events/# process observed)<br>

 85.97%  [kernel]       [k] compaction_alloc<br>

  0.84%  [kernel]       [k] compact_zone<br>

  0.65%  [kernel]       [k] get_pageblock_flags_group<br>

  0.64%  perf           [.] 0x000000000005cff7<br>

<br>

(hugepage=never)<br>

 29.86%  [kernel]       [k] clear_page_c_e<br>

 21.88%  [kernel]       [k] copy_user_generic_string<br>

 12.46%  [kernel]       [k] __alloc_pages_nodemask<br>

  5.70%  [kernel]       [k] page_fault<br>

<br>

This is good, because "perf top" shows that the underlying issue is compaction_alloc and compact_zone in both cases, even though what top shows is migration/# in one case and, when locked to a CPU, events/#.<br>

<br>

Switching hugepage always->never seems to make things work right away.  Switching hugepage never->always seems to take a while to break.  To get it to start failing, many of the big files involved must be copied to /dev/null again, even though they were presumably already in file cache.<br>

<br>

Searched for "compaction_alloc" and "compact_zone" and found a suggestion here<br>

<br>

<a href="https://structureddata.github.io/2012/06/18/linux-6-transparent-huge-pages-and-hadoop-workloads/" target="_blank">https://structureddata.github.io/2012/06/18/linux-6-transparent-huge-pages-and-hadoop-workloads/</a><br>

<br>

to do:<br>

<br>

echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag<br>

<br>

(transparent_hugepage is a link to redhat_transparent_hugepage).<br>

Reenabled hugepage and reproduced the painfully slow IO, then set defrag to "never" and the IO was fast again, even though hugepage was still enabled.<br>

<br>

So on my machine the problem seems to be with hugepage defrag specifically.  Disabling just that is sufficient to resolve the issue; it isn't necessary to take out all of hugepage.  Will let it run that way for a while and see if anything else shows up.<br>
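<br>
For anyone wanting to confirm which hugepage settings are active before and after the change above, here is a minimal sketch. It assumes the sysfs files report the active choice in square brackets (for example "always madvise [never]"). The helper names thp_active and thp_report are made up for illustration, and the plain transparent_hugepage path is only checked as a fallback to the redhat_ path named above.<br>

```shell
#!/bin/sh
# Print the active (bracketed) choice from a THP sysfs value line,
# e.g. "always madvise [never]" -> "never".
# Assumption: the kernel marks the active choice with square brackets.
thp_active() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Report the active "enabled" and "defrag" settings from the first
# hugepage directory that exists. Prints nothing for unreadable files.
thp_report() {
    for dir in /sys/kernel/mm/redhat_transparent_hugepage \
               /sys/kernel/mm/transparent_hugepage; do
        [ -d "$dir" ] || continue
        for knob in enabled defrag; do
            [ -r "$dir/$knob" ] || continue
            echo "$dir/$knob: $(thp_active "$(cat "$dir/$knob")")"
        done
        return 0
    done
    echo "no transparent_hugepage directory found" >&2
    return 1
}
```

Calling thp_report after sourcing this should list one line per setting; on a machine configured as described above, the defrag line would be expected to end in "never" while the enabled line still reads "always".<br>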

<br>

For future reference:<br>

<br>

CentOS release 6.6 (Final)<br>

kernel 2.6.32-504.23.4.el6.x86_64<br>

Dell Inc. PowerEdge T620/03GCPM, BIOS 2.2.2 01/16/2014<br>

48 Intel Xeon CPU E5-2695 v2 @ 2.40GHz  (in /proc/cpuinfo)<br>

RAM 529231456 kB (in /proc/meminfo)<br>

<br>

Thanks all!<br>

<br>

David Mathog<br>

<a>mathog@caltech.edu</a><br>

Manager, Sequence Analysis Facility, Biology Division, Caltech<br>

</blockquote></div><br><br>-- <br>(Via iPhone)<br>