[Beowulf] When is compute-node load-average "high" in the HPC context? Setting correct thresholds on a warning script.


Reuti reuti at staff.uni-marburg.de
Wed Sep 1 01:47:29 PDT 2010


On 01.09.2010 at 09:34, Christopher Samuel wrote:

> On 01/09/10 01:58, Reuti wrote:
> 
>> With recent kernels also (kernel) processes in D state
>> count as running.
> 
> I wouldn't say recent, that goes back as far as I can
> remember.
> 
> For instance I've seen RHEL3 (2.4.x - sort of) NFS servers
> with load averages in the 80's where they were run with a lot
> of nfsd's that were blocked waiting for I/O due to ext3.

My impression was always (since OGE has a similar load_threshold setting) that its purpose is to limit the number of jobs on a big SMP machine that you oversubscribe intentionally, because not all parallel jobs really use all their CPU power over their whole lifetime (such a machine may even have been operated without any NFS). Allowing, e.g., 72 slots for jobs on a 60-core machine might then get the most out of it, with a load near 100%.
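To make the threshold idea concrete, here is a minimal sketch of a per-core load check as a warning script might do it. The function names and the default ratio of 1.2 (72 slots / 60 cores, taken from the example above) are assumptions for illustration, not OGE's actual load_threshold implementation.

```python
import os


def per_core_load(load_avg, cores):
    """Normalize a load average by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_avg / cores


def load_status(load_avg, cores, warn_ratio=1.2):
    """Return "high" if the per-core load exceeds warn_ratio, else "ok".

    warn_ratio=1.2 is a hypothetical default matching 72 slots on 60 cores.
    """
    return "high" if per_core_load(load_avg, cores) > warn_ratio else "ok"


if __name__ == "__main__":
    # os.getloadavg() returns the 1-, 5- and 15-minute load averages (Unix only).
    one_min, _, _ = os.getloadavg()
    cores = os.cpu_count() or 1
    print(f"load={one_min:.2f} cores={cores} status={load_status(one_min, cores)}")
```

With such a check, a fully subscribed 72-slot configuration on 60 cores sits exactly at the threshold and is not flagged, while anything above it is.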

Well, now that newer CPUs have 12 cores and are assembled into 24- or 48-core machines, such a setting would become useful again. Maybe the load sensor should take only the scheduled jobs' load into account.
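One rough way to picture a sensor that looks only at scheduled jobs is sketched below. It assumes the scheduler can supply the PIDs of its job processes, and it approximates their load contribution as the number of those processes currently in state R (running) or D (uninterruptible sleep), read from Linux's /proc/<pid>/stat. The helper names are hypothetical, and threads are ignored for brevity.

```python
from pathlib import Path


def state_from_stat(stat_text):
    """Extract the process state letter from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so split at the last ')' before reading the state field.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]


def instantaneous_job_load(stat_texts):
    """Count processes that are runnable (R) or in uninterruptible sleep (D)."""
    return sum(1 for text in stat_texts if state_from_stat(text) in ("R", "D"))


def read_stats(pids):
    """Read /proc/<pid>/stat for each PID, skipping processes that have exited."""
    texts = []
    for pid in pids:
        try:
            texts.append(Path(f"/proc/{pid}/stat").read_text())
        except OSError:
            continue
    return texts
```

This gives a single snapshot rather than a decaying average, so a real sensor would need to sample repeatedly and smooth the result.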

-- Reuti


> cheers!
> Chris
> - -- 
> Christopher Samuel - Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computational Initiative
> Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
>         http://www.vlsci.unimelb.edu.au/
> 




