[Beowulf] LSI Megaraid stalls system on very high IO?
mathog at caltech.edu
Wed May 13 15:57:01 PDT 2015
On 31-Jul-2014 10:55, mathog wrote:
> On 31-Jul-2014 10:36, Joe Landman wrote:
>> On 7/31/14, 12:37 PM, mathog wrote:
>>> Any pointers on why a system might appear to "stall" on very high IO
>>> through an LSI megaraid adapter? (dm_raid45, on RHEL 5.10.)
>> What IO scheduler are you using?
>> cat /sys/block/sd*/queue/scheduler
> % cat /sys/block/sd*/queue/scheduler
I just ran into the same thing on another similar (but larger) Dell. A
job was run that spun off 20 subprocesses, each of which tried to read a
different 11Gb file. The way this works is that each process opened its
file, determined its size with seeks, allocated a big block of memory to
hold the whole thing, and then started reading the data into that block
sequentially. top showed almost no CPU time on these processes. A
minute or two in, this locked up tight for about 20 minutes.
Interestingly, when the machine once again started answering keystrokes
"top" showed each of these processes with 22 Gb of virtual (see below).
This machine has all of its disks packed into two volumes controlled by
the megaraid adapter: one is a Raid and the other is just a small disk
partition for swap. Neither volume appears to have any scheduler.
There is also sda, which seems to be the same external disk that gave me
conniptions a while back in another thread, moved over to this machine
(for no apparent reason). That disk does have a scheduler, which is cfq.
How is it possible to mount a volume, Raid or no, with no scheduler???
The fstab entry doesn't provide any clues:
/dev/mapper/vg_sitar-lv_root / ext4 defaults 1 1
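For what it's worth, this seems to be expected: device-mapper volumes (dm_raid45 sits on top of dm-*) do not run an elevator of their own, and the I/O scheduler belongs to the underlying sd* devices. A quick way to see all of them at once (paths are illustrative; the guard keeps it quiet on machines that lack a given file):

```shell
# List the scheduler (if any) for every block device; dm-* volumes
# typically show "none" or have no scheduler file at all, while the
# physical sd* devices underneath show e.g. "noop deadline [cfq]".
for q in /sys/block/*/queue/scheduler; do
    [ -e "$q" ] && echo "$q: $(cat "$q")"
done
# To switch the elevator on a physical device (as root), e.g.:
# echo deadline > /sys/block/sdb/queue/scheduler
```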
The script that locked things up was run again with a limit of 15
subprocesses. This did not lock up (yet) and it shows 30-100% CPU on
each. The CPU usage is jumping around in some complex manner. The
strangest thing is that as they read in data,
top shows a virt of 11G and a res of something less than that. When the
process gets to the end of the input file, that is, Res hits 11G, it
closes the input file, opens the output file (which is the same in this
case) and then calls qsort. Literally, it is just:
qsort((void *)buffer, len_file/gbl_reclen, gbl_reclen,
When that happens, virtual jumps from 11G to 22G (instantaneously) and
CPU usage goes to 100%. I can see why the CPU usage would be at 100%
while qsort ran, but cannot imagine why virtual should double, unless it
somehow associates file cache for the output file with the subprocess.
Note that it did not do that while the same file was being read.
Also, can somebody explain the point of having 4Gb of swap on a 529Gb
RAM machine. It is a CentOS box; perhaps that is just the way that OS
sets things up?
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech