Using hyperthreading on 2 Proc Xeon cluster nodes
Bill Broadley bill at math.ucdavis.edu
Sat Jun 8 16:46:12 PDT 2002
> BTW, the 2 virtual processors share the same FPU, so not interesting for
> HPC.

In the case of the P4 I'd agree. In general, though, even with a shared FPU I could see hyperthreading being very useful. Keep in mind that even a single flop per cycle is often a big improvement over real-world performance. If thread A blocks on a cache miss and thread B can get some work done in the meantime, without the expense of a context switch, that can be a big win.

But alas, for whatever reason the P4 doesn't have enough resources to get much advantage from 2-way SMT. At least not on any code I've found, but I'm still looking.

Someone posted an article saying that Rambus was necessary for the advantage. I'm not sure if any of the 2-bank DDR P4s can actually have 2 separate outstanding requests at the same time. Rambus, I believe, does support multiple misses.

So as usual the first Intel implementation isn't that exciting, but I expect better from the next iteration. Currently the common case seems to be that 4 processes on 2 hyperthreaded CPUs run slower than 2.

--
Bill Broadley
Mathematics/Institute of Theoretical Dynamics
UC Davis
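
For anyone who wants to try the 2-versus-4 process comparison on their own nodes, here is a rough sketch. It is not the code behind the numbers above. The names flop_kernel, split_work and time_workers are made up for illustration, and the kernel is an interpreted floating-point loop that mostly stays in cache, so it will not exercise the cache-miss case described in the message. Substitute your real application kernel before drawing conclusions.

```python
import time
from multiprocessing import Pool


def flop_kernel(n):
    """CPU-bound floating-point loop; a stand-in for a real HPC kernel."""
    x = 0.0
    for i in range(n):
        x += i * 0.5
    return x


def split_work(total_iters, num_procs):
    """Divide total_iters into num_procs chunks whose sizes sum to total_iters."""
    base, extra = divmod(total_iters, num_procs)
    return [base + (1 if i < extra else 0) for i in range(num_procs)]


def time_workers(num_procs, total_iters=4_000_000):
    """Run the same total work on num_procs processes; return wall time in seconds."""
    chunks = split_work(total_iters, num_procs)
    start = time.perf_counter()
    with Pool(num_procs) as pool:
        pool.map(flop_kernel, chunks)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Same total work each time; only the process count changes.
    for procs in (2, 4):
        print(f"{procs} processes: {time_workers(procs):.2f} s")
```

The total work is held constant, so if the 4-process run takes longer than the 2-process run on a dual-CPU hyperthreaded node, that matches the common case described above. Process startup cost is included in the timing, so use enough iterations that it is negligible.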
