Scyld: memory leak or just caching?

Mike Weller weller at zyvex.com
Wed Jun 27 11:46:45 PDT 2001


Hello,

I'm running Scyld with a recompiled kernel to enable bigmem support.
Each of my nodes has 1.5G of RAM.  The kernel was compiled with the 1G
and bigmem options.  When the system comes up, roughly 100M is in use,
which is reasonable.
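
(As a sanity check, something like the following should confirm that the
bigmem kernel really sees all 1.5G on every node.  Just a sketch; the
exact /proc/meminfo field names depend on the kernel version.)

root at beowulf /root # bpsh -a grep MemTotal /proc/meminfo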

root at beowulf /root # bpsh -a free
             total       used       free     shared    buffers     cached
Mem:       1572388     102528    1469860          0      50432      16668
-/+ buffers/cache:      35428    1536960
Swap:      2097136          0    2097136
<snip...>
Mem:       1572388     102152    1470236          0      50384      16664
Mem:       1572388     102436    1469952          0      50384      16664
Mem:       1572388     102436    1469952          0      50384      16664
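
The "-/+ buffers/cache" line is the one to watch: it subtracts buffers
and cache from "used", so it shows what applications are actually
holding.  A rough one-liner to pull that out across all nodes:

root at beowulf /root # bpsh -a free | awk '/buffers\/cache/ {print "app used:", $3, "really free:", $4}'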

If I take the master node down, or disconnect the network to force the
slaves to reboot, the slave drives get fsck'd, which results in 959M
being used right from the start.  I can monitor with "beostatus" while
the slaves are being fsck'd, and memory use is definitely growing as
the fsck progresses.  It never seems to go beyond 960M, which leads me
to think the kernel is using it.

root at beowulf /root # bpsh -a free
             total       used       free     shared    buffers     cached
Mem:       1572388     981856     590532          0     852032      18244
-/+ buffers/cache:     111580    1460808
Swap:      2097136          4    2097132
...
Mem:       1572388     981352     591036          0     852844      16904
Mem:       1572388     981308     591080          0     852832      16904
Mem:       1572388     982096     590292          0     852728      17584
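
(Instead of beostatus, a crude loop like this also shows the growth
while the fsck runs -- just a sketch:)

root at beowulf /root # while true; do bpsh -a free | grep '^Mem:'; echo ----; sleep 5; done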


Will this memory be freed as I need it?  Is this a memory leak, or is
the kernel just caching an enormous amount of disk I/O?
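
One test I may try (a rough sketch -- it assumes perl is available on
the slave image, which it may not be on a minimal Scyld node): grab a
big chunk of anonymous memory on one node and see whether "buffers"
shrinks to make room:

root at beowulf /root # bpsh 0 free | grep '^Mem:'
root at beowulf /root # bpsh 0 perl -e '$x = "x" x (768 * 1024 * 1024)'
root at beowulf /root # bpsh 0 free | grep '^Mem:'

If it's only cache, the second free should show buffers way down and
"used" back near the 100M baseline once perl exits; a real leak would
not give the memory back.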

Nothing under the SZ column accounts for where the memory went (unless
the kernel itself is holding it).  The PRI and NI values for
mdrecoveryd are out of whack.  They look like this without the
fsck'ing as well, so I'm sure that's not related.
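
For what it's worth, those ps values look like 32-bit overflow (the NI
of 2147483647 is exactly 2^31 - 1).  The raw scheduler fields can be
read straight out of /proc to see whether it's the kernel or just ps's
formatting -- a sketch, using PID 6 since that's mdrecoveryd in the
listing below:

root at beowulf /root # bpsh 0 cat /proc/6/stat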

Any comments?

root at beowulf /root # bpsh 0 ps -leaf
  F S UID        PID  PPID  C PRI  NI ADDR    SZ  WCHAN STIME TTY          TIME CMD
100 S root         1     0  1  60   0    -   140 1196a0 16:21 ?        00:00:03 init
040 S root         2     1  0  60   0    -     0 12bc88 16:21 ?        00:00:00 [kflushd]
040 S root         3     1  0  62   0    -     0 12bcec 16:21 ?        00:00:00 [kupdate]
040 S root         4     1  0  60   0    -     0 1213f1 16:21 ?        00:00:00 [kpiod]
040 S root         5     1  0  60   0    -     0 12466a 16:21 ?        00:00:00 [kswapd]
040 S root         6     1  0 -2147483589 2147483647 - 0 18906c 16:21 ? 00:00:00 [mdrecoveryd]
140 S root        15     1  0  77   0    -   143 132831 16:21 ?        00:00:00 init
040 S root        16     1  0  61   0    -   140 132831 16:21 ?        00:00:00 init
040 S root        17    15  0  60   0    -   144 132831 16:21 ?        00:00:00 init
040 S root        29     1  0  60   0    -     0 1d34d7 16:21 ?        00:00:00 [scsi_eh_0]
040 S root        55     1  0  60   0    -     0 93dd4a 16:22 ?        00:00:00 [rpciod]
140 R root        96    15  0  78   0    -   618      - Jul10 ?        00:00:00 ps -leaf


-- 
Michael J. Weller, M.Sc.               office: (972) 235-7881 x.242
weller at zyvex.com                         cell: (214) 616-6340
Zyvex Corp., 1321 N Plano           facsimile: (972) 235-7882    
Richardson, TX 75081                      icq: 6180540




