[Beowulf] cluster profiling

Christopher Samuel samuel at unimelb.edu.au
Tue Nov 2 17:49:59 PDT 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 03/11/10 08:45, tomislav_maric at gmx.com wrote:

> that will use the Ganglia-python interface and try to
> give me an insight into the way machine is burdened
> during runs

Depending on how old your kernel is, the "perf" utility
(found in the tools/perf directory in your kernel sources,
or packaged in Ubuntu as part of the linux-tools package
or linux-tools-2.6 in Debian Squeeze) may well give you
some interesting stats.

As an example, here is an overview of the stats for a "find -ls"
over the current kernel git tree:

$ perf stat find . -ls > /dev/null

 Performance counter stats for 'find . -ls':

     372.415331  task-clock-msecs         #      0.923 CPUs
            158  context-switches         #      0.000 M/sec
              2  CPU-migrations           #      0.000 M/sec
            395  page-faults              #      0.001 M/sec
      648855865  cycles                   #   1742.291 M/sec
      698863597  instructions             #      1.077 IPC
       14321645  cache-references         #     38.456 M/sec
         379109  cache-misses             #      1.018 M/sec

    0.403454703  seconds time elapsed
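
The right-hand column in that output is derived from the raw counters,
and it can be handy to recompute it yourself when feeding the numbers
into something else. Below is a minimal Python sketch, assuming the
text layout shown above (newer perf versions format things differently).
The names parse_perf_stat and derived_metrics are made up for this
illustration and are not part of perf or Ganglia.

```python
import re

# Counter lines copied from the "perf stat find . -ls" run above.
SAMPLE = """\
     372.415331  task-clock-msecs         #      0.923 CPUs
            158  context-switches         #      0.000 M/sec
              2  CPU-migrations           #      0.000 M/sec
            395  page-faults              #      0.001 M/sec
      648855865  cycles                   #   1742.291 M/sec
      698863597  instructions             #      1.077 IPC
       14321645  cache-references         #     38.456 M/sec
         379109  cache-misses             #      1.018 M/sec

    0.403454703  seconds time elapsed
"""

# Matches "<number>  <name>" at the start of a line (assumed layout).
COUNTER_RE = re.compile(r"^\s*([0-9][0-9.]*)\s+(\S+)")


def parse_perf_stat(text):
    """Return ({event_name: value}, elapsed_seconds) from perf stat text."""
    counters = {}
    elapsed = None
    for line in text.splitlines():
        m = COUNTER_RE.match(line)
        if not m:
            continue  # header, blank line, or anything else we don't know
        value, name = float(m.group(1)), m.group(2)
        if name == "seconds":  # the "seconds time elapsed" line
            elapsed = value
        else:
            counters[name] = value
    return counters, elapsed


def derived_metrics(counters, elapsed):
    """Recompute a few of the figures perf prints after the '#'."""
    task_ms = counters["task-clock-msecs"]
    return {
        # instructions retired per cycle
        "ipc": counters["instructions"] / counters["cycles"],
        # CPU time used divided by wall-clock time
        "cpus_utilized": task_ms / (elapsed * 1000.0),
        # cycles per microsecond of CPU time, i.e. millions per second
        "cycles_M_per_sec": counters["cycles"] / (task_ms * 1000.0),
        # not printed by perf here, but often what you actually want
        "cache_miss_ratio": counters["cache-misses"] / counters["cache-references"],
    }


if __name__ == "__main__":
    counters, elapsed = parse_perf_stat(SAMPLE)
    for name, value in derived_metrics(counters, elapsed).items():
        print(f"{name:>18}: {value:.4f}")
```

Note that in this output the "M/sec" figures are relative to the
task-clock time rather than to the elapsed wall-clock time, which is
why the sketch divides cycles by task-clock-msecs.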


You can use the "perf list" command to get a list of all
the events (including kernel tracepoints) you can monitor,
and then select them individually with the "-e" option to
the "stat" command.
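
If you end up driving this from a script, one way to assemble the
command line is sketched below in Python. This is only an illustration:
build_perf_stat_cmd is a hypothetical helper, and the event names are
examples of the kind "perf list" prints, so check what your own kernel
offers. Building an argument list (rather than a shell string) means a
glob such as sched:* reaches perf untouched by the shell.

```python
import shlex


def build_perf_stat_cmd(events, command):
    """Return an argv list: perf stat -e EV1 -e EV2 ... COMMAND ARGS."""
    argv = ["perf", "stat"]
    for event in events:
        argv += ["-e", event]  # one -e per selected event
    argv += list(command)
    return argv


def as_shell_string(argv):
    """Render argv for pasting into a shell, quoting where needed."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    argv = build_perf_stat_cmd(
        ["context-switches", "cache-misses", "sched:*"],  # example events
        ["find", ".", "-ls"],
    )
    print(as_shell_string(argv))
    # To actually run it (requires perf to be installed):
    #   import subprocess; subprocess.run(argv, check=False)
```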

Here is perf monitoring CPU migrations, L1 dcache load misses
and the kernel scheduler stats of that well-known HPC
program "top". ;-)

$ perf stat -e migrations -e L1-dcache-load-misses -e 'sched:*' top

[...]

 Performance counter stats for 'top':

              0  CPU-migrations           #      0.000 M/sec
        1038307  L1-dcache-load-misses    #      0.000 M/sec
              0  sched:sched_kthread_stop #      0.000 M/sec
              0  sched:sched_kthread_stop_ret #      0.000 M/sec
              0  sched:sched_wait_task    #      0.000 M/sec
             98  sched:sched_wakeup       #      0.000 M/sec
              0  sched:sched_wakeup_new   #      0.000 M/sec
             61  sched:sched_switch       #      0.000 M/sec
              0  sched:sched_migrate_task #      0.000 M/sec
              0  sched:sched_process_free #      0.000 M/sec
              1  sched:sched_process_exit #      0.000 M/sec
              0  sched:sched_process_wait #      0.000 M/sec
              0  sched:sched_process_fork #      0.000 M/sec
             15  sched:sched_signal_send  #      0.000 M/sec
             49  sched:sched_stat_wait    #      0.000 M/sec
            174  sched:sched_stat_runtime #      0.000 M/sec
             67  sched:sched_stat_sleep   #      0.000 M/sec
              0  sched:sched_stat_iowait  #      0.000 M/sec

   29.452075124  seconds time elapsed
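
The "M/sec" column is all zeros in this run, so it tells you little
about these low-rate events. One rough alternative is to divide the
raw counts by the elapsed wall-clock time. The short Python sketch
below does that for a few of the counts above; the choice of
wall-clock time as the denominator is my assumption, not what perf
itself does, and events_per_second is a made-up name.

```python
# Values copied from the 'top' run above.
ELAPSED_S = 29.452075124
COUNTS = {
    "sched:sched_wakeup": 98,
    "sched:sched_switch": 61,
    "sched:sched_migrate_task": 0,
    "sched:sched_signal_send": 15,
    "sched:sched_stat_runtime": 174,
}


def events_per_second(counts, elapsed_s):
    """Return {event: count / elapsed_s}, skipping events that never fired."""
    return {name: n / elapsed_s for name, n in counts.items() if n > 0}


if __name__ == "__main__":
    rates = events_per_second(COUNTS, ELAPSED_S)
    # Print the busiest events first.
    for name, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
        print(f"{name:<28} {rate:8.2f} /sec")
```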

With root access you can even do "perf top" to see what's
going on under the hood.

You can also use "perf record -g $COMMAND" to record the profiling
information for $COMMAND to perf.data along with call graph information
so you can display a detailed tree view of what was going on via the
"perf report" command.

Quite a neat little tool I've got to say!

cheers,
Chris
- -- 
 Christopher Samuel - Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computational Initiative
 Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
         http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkzQsbcACgkQO2KABBYQAh+6JACaAx7p0zARcGGO4busVv7AbqHL
tCcAnA4Z6HOs1LTbucprnyBJFxF6glo+
=D2wX
-----END PGP SIGNATURE-----


