question about Intel P4 versus Alpha's
Don Holmgren djholm at fnal.gov
Fri Jan 10 10:51:24 PST 2003
On Wed, Jan 08, 2003 at 09:49:26AM -0600, Henderson, TL Todd wrote:
> I have an app, that scaled out to the 30ish XP1000's we had very
> nicely, basically linearly. If you looked at top, it was almost
> always in the 95-100%cpu utilization range. However, now we have a
> cluster of P4's and when I look at top, it is more like 70-80% cpu
> usage. This is using the same number of cpu's, the same switch,
> same code, and same problem for the code. The jobs are still
> completing in about 30-40% less time, so we are getting a speed up.
> My guesstimate was that this was an indication of the memory
> bandwidth and speed to memory. I know the XP1000's have a nice
> memory subsystem. Was/is it that much better than the 533/2.4 ghz
> P4's?

Assuming that hyperthreading is off, I believe that if your processes
are only showing 70-80% utilization then your P4's are definitely not
memory bound. My experience is that memory bound processes show 99%+
utilization - when the processor stalls, say for TLB or cache line
loads, your process is "billed" for that whole time. When "streams"
runs, for example, it uses up all available memory bandwidth and shows
99%+ cpu utilization. Likewise, if your processor is stalling on FPU
access, I'd guess utilization would show 99%+ as well.

FWIW, the memory subsystem on 533/2.x GHz P4's is quite a bit better
than an XP1000, at least as far as the streams benchmark shows. The
"COPY" number posted at the streams website for Compaq_XP1000 is
900 MB/sec. I just ran on a 533/2.26 P4 (PC1066 RDRAM) and measured
2024 MB/sec.

Perhaps your XP1000 cluster was close to reaching the limits of your
network, and now with the faster P4's your jobs are I/O bound. If
running a simple cycle-eating process at a low priority on your nodes
at the same time as your job (to reach 100% cpu utilization) doesn't
affect throughput, then the communications are suspect.

Don Holmgren
Fermilab
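
The cycle-eating test suggested in the reply could be tried with something like the sketch below. It is only an illustration, not the script used at Fermilab: the function names `lower_priority` and `burn_cycles` are made up for this example, and it assumes a Unix-like node where `os.nice` is available.

```python
import os
import time


def lower_priority(increment=19):
    """Raise this process's nice value so the real job always wins the CPU.

    Returns the new nice value, or None on platforms without os.nice.
    (Helper name is hypothetical, chosen for this sketch.)
    """
    if hasattr(os, "nice"):
        return os.nice(increment)
    return None


def burn_cycles(seconds):
    """Spin in a tight loop for roughly `seconds`; return the iteration count.

    The loop touches almost no memory and does no I/O, so it only soaks up
    CPU time that the real job leaves idle.
    """
    deadline = time.monotonic() + seconds
    count = 0
    while time.monotonic() < deadline:
        count += 1
    return count
```

To use it, start one copy per CPU on each node alongside the job, calling `lower_priority()` once and then `burn_cycles(n)` for the expected run time in seconds. If top then shows about 100% total utilization and the job's wall-clock time is unchanged, the idle 20-30% was time spent waiting, most likely on the network, rather than on memory.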
More information about the Beowulf mailing list
