[Beowulf] hpl size problems
Mark Hahn hahn at physics.mcmaster.ca
Mon Sep 26 12:20:00 PDT 2005
- Previous message: [Beowulf] dual-core benefits?
- Next message: [Beowulf] hpl size problems
> Warewulf by default creates the virtual node file system to be extremely
> minimal yet fully functional and tuned for the job at hand (which exists
> in a hybrid RAM/NFS file system).

but HPL does very little IO and runs few commands.

> The nodes are lightweight in both file
> system and process load (context switching and cache management can be
> expensive especially on a non-NUMA SMP systems with lots of cache). The
> more daemons and extra processes that are running, the higher the
> process load and context switching that must occur.

it's hard to guess since we don't know what you were running before.
the only way I can imagine this (random procs) mattering is if you were
running a full desktop install before, and had some polling daemons
running (magicdev, artsd, etc).

on my favorite cluster, I use the obvious kind of initrd+tmpfs+NFS
and don't run any extra daemons. on a randomly chosen node running
two MPI workers (out of 64 in the job), "vmstat 10" looks like this:

[hahn at node1 hahn]$ ssh node70 vmstat 10
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b    swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 2  0 2106876  58240  11504 1309148    0    0     0     0 1035    54 99  1  0  0
 2  0 2106876  58312  11504 1309148    0    0     0     0 1037    59 99  1  0  0
 2  0 2106876  58312  11504 1309148    0    0     0     0 1034    55 99  1  0  0
 2  0 2106876  58312  11504 1309148    0    0     0     0 1033    56 99  1  0  0
 2  0 2106876  58312  11504 1309148    0    0     0     0 1034    44 99  1  0  0
 2  0 2106876  58312  11504 1309148    0    0     0     0 1031    41 99  1  0  0
 2  0 2106876  58312  11504 1309148    0    0     0     0 1033    39 99  1  0  0

I haven't updated the kernel to a lower HZ yet, but will soon.
I assert without the faintest wisp of proof that 50 cs/sec is
inconsequential. the gigabit on these nodes is certainly not sterile
either - plenty of NFS traffic, even some NTP broadcasts. actually,
I just tcpdumped it a bit, and the basal net rate is an arp, 4ish NFS
access/getattr calls every 60 seconds.
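For anyone who wants to check the "50 cs/sec" figure against the vmstat samples above, here is a minimal sketch that averages a named column over the sample rows. The function name mean_vmstat_column and the SAMPLE string are hypothetical, introduced only for illustration; the numbers are copied from the output above. It treats every row alike, although the first row vmstat prints is normally an average since boot.

```python
def mean_vmstat_column(text, column):
    """Average one named column over the numeric sample rows of vmstat output.

    Hypothetical helper for illustration; it assumes the classic layout with
    one header row of column names followed by all-numeric sample rows.
    """
    header = None
    values = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The column-name row contains the requested name and is not numeric.
        if column in fields and not fields[0].isdigit():
            header = fields
            continue
        # Sample rows have one integer per header column.
        if header and len(fields) == len(header) and all(f.isdigit() for f in fields):
            values.append(int(fields[header.index(column)]))
    if not values:
        raise ValueError("no sample rows found for column %r" % column)
    return sum(values) / len(values)


# Sample rows copied from the vmstat output quoted in this message.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b    swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 2  0 2106876  58240  11504 1309148    0    0     0     0 1035    54 99  1  0  0
 2  0 2106876  58312  11504 1309148    0    0     0     0 1037    59 99  1  0  0
 2  0 2106876  58312  11504 1309148    0    0     0     0 1034    55 99  1  0  0
 2  0 2106876  58312  11504 1309148    0    0     0     0 1033    56 99  1  0  0
 2  0 2106876  58312  11504 1309148    0    0     0     0 1034    44 99  1  0  0
 2  0 2106876  58312  11504 1309148    0    0     0     0 1031    41 99  1  0  0
 2  0 2106876  58312  11504 1309148    0    0     0     0 1033    39 99  1  0  0
"""

if __name__ == "__main__":
    print(round(mean_vmstat_column(SAMPLE, "cs"), 1))  # 49.7 context switches/sec
    print(round(mean_vmstat_column(SAMPLE, "in"), 1))  # 1033.9 interrupts/sec
```

The same function could be fed live output (for example, text captured from "ssh node70 vmstat 10") to compare a node before and after stripping daemons.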
> It reminds me of chapter 1 of sysadmin 101: Only install what you *need*

sure, but that's not inherent to your system, and unless you had some
pretty godawful stuff installed before, it's hard to see that explanation...

> If someone else also has thoughts as to what would have caused the
> speedup, I would be very interested.

a full-fledged desktop load doesn't cause *that* much extraneous load -
yes, there are interrupts and the like, but you have to remember that
modern machines have massive memory bandwidth, big, associative caches,
and such stuff doesn't matter much.

especially for HPL - it's not exactly tightly-coupled, is it? if it were
(ie, MANY global collectives per second), then I could easily buy the
explanation that removal of random daemons would help a lot. after all,
this has been known for a long time (though generally only on very large
clusters).

> > > hours) running on Centos-3.5 and saw a pretty amazing speedup of the
> > > scientific code (*over* 30% faster runtimes) then with the previous
> > > RedHat/Rocks build. Warewulf also makes the cluster rather trivial to
> >
> > such a speedup is indeed impressive; what changed?
>
> Actually, we used the same kernel (recompiled from RHEL), and exactly the
> same compilers, mpi and IB (literally the same RPMS). The only thing
> that changed was the cluster management paradigm. The tests were done
> back to back with no hardware changes.

afaik, recompiling a distro kernel does generally not get you the same
binary as what the distro distributes ;)

regards, mark hahn.
