[Beowulf] big read triggers migration and slow memory IO?
prentice.bisbal at rutgers.edu
Fri Jul 10 07:59:23 PDT 2015
Every dog has its day! ;)
On 07/09/2015 05:59 PM, James Cuff wrote:
> With my job title most folks think I'm essentially technically
> neutered these days.
> Good to see there is still some life in this old dog :-)
> On Thursday, July 9, 2015, mathog <mathog at caltech.edu> wrote:
> On 09-Jul-2015 11:54, James Cuff wrote:
> Well, that seems to be it, but not quite with the same symptoms
> you observed. khugepaged never showed up, and "perf top" never
> revealed _spin_lock_irqsave. Instead this is what "perf top"
> shows in my tests:
> (hugepage=always, when migration/# process observed)
> 89.97% [kernel] [k] compaction_alloc
> 1.21% [kernel] [k] compact_zone
> 1.18% [kernel] [k] get_pageblock_flags_group
> 0.75% [kernel] [k] __reset_isolation_suitable
> 0.57% [kernel] [k] clear_page_c_e
> (hugepage=always, when events/# process observed)
> 85.97% [kernel] [k] compaction_alloc
> 0.84% [kernel] [k] compact_zone
> 0.65% [kernel] [k] get_pageblock_flags_group
> 0.64% perf [.] 0x000000000005cff7
>
> 29.86% [kernel] [k] clear_page_c_e
> 21.88% [kernel] [k] copy_user_generic_string
> 12.46% [kernel] [k] __alloc_pages_nodemask
> 5.70% [kernel] [k] page_fault
> This is good, because "perf top" shows that the underlying issue
> is compaction_alloc and compact_zone even though what top shows
> is in one case migration/# and when locked to a cpu, events/#.
> Switching hugepage always->never seems to make things work right
> away. Switching hugepage never->always seems to take a while to
> break. In order to get it to start failing many of the big files
> involved must be copied to /dev/null again, even though they were
> presumably already in file cache.
> Searched for "compaction_alloc" and "compact_zone" and found a
> suggestion here
> to do:
> echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
> (transparent_hugepage is a link to redhat_transparent_hugepage).
> Reenabled hugepage and reproduced the painfully slow IO, set
> defrag to "never" and the IO was fast again, even though hugepage
> was still enabled.
> So on my machine the problem seems to be with hugepage defrag
> specifically. Disabling just that is sufficient to resolve the
> issue, it isn't necessary to take out all of hugepage. Will let
> it run that way for a while and see if anything else shows up.
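To confirm which hugepage settings are actually in effect before and after a change like the one above, something along these lines may help. It is only a sketch: the helper name thp_active and the THP_DIR variable are made up for illustration, and it assumes the kernel marks the active choice in each sysfs file with square brackets (for example "always madvise [never]"). On the CentOS 6 box described here the directory would be /sys/kernel/mm/redhat_transparent_hugepage.

```shell
#!/bin/sh
# Hypothetical helper (not part of any distribution): print the active
# value of a transparent hugepage sysfs knob. Assumes the active choice
# is the one shown in square brackets, e.g. "always madvise [never]".
thp_active() {
    # $1: path to a knob file such as .../transparent_hugepage/defrag
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# THP_DIR is an assumption; override it for RHEL/CentOS 6, where the
# directory is /sys/kernel/mm/redhat_transparent_hugepage.
THP_DIR="${THP_DIR:-/sys/kernel/mm/transparent_hugepage}"

# Report the active setting of each knob that exists on this machine.
for knob in enabled defrag; do
    if [ -r "$THP_DIR/$knob" ]; then
        echo "$knob: $(thp_active "$THP_DIR/$knob")"
    fi
done
```

Reading the knobs needs no special privileges; writing to them, as in the echo command quoted above, needs root, and the change does not persist across a reboot unless it is reapplied at boot time.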
> For future reference:
> CentOS release 6.6 (Final)
> kernel 2.6.32-504.23.4.el6.x86_64
> Dell Inc. PowerEdge T620/03GCPM, BIOS 2.2.2 01/16/2014
> 48 Intel Xeon CPU E5-2695 v2 @ 2.40GHz (in /proc/cpuinfo)
> RAM 529231456 kB (in /proc/meminfo)
> Thanks all!
> David Mathog
> mathog at caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech
> (Via iPhone)
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf