[Beowulf] Picking a processor
Mark Hahn hahn at physics.mcmaster.ca
Mon Dec 13 11:38:38 PST 2004
> I'm currently designing a beowulf style parallel processor and am trying to
> decide which processor to use for the nodes. My project requires my final
> design for the parallel processor to be able to provide a sustained throughput
> of 0.25 TFlops.

by what measure?

> My research tells me that in general the flop rate scales up linearly.

well, there are factors which can cause sublinearity.

> My trouble is that I'm having trouble finding estimates for the flop rates
> of the processors I'm looking at.

www.top500.org

basically, top500 is a ranking of the fastest 500 computers in the world, when running a benchmark (HPL) which is FP-intensive. it's a real code, but not a very real real code ;)

the critical numbers are Rpeak and Rmax:

    Rpeak = ncpus * clock * flops-per-cycle

it's the peak theoretical aggregate flops of the machine/cluster. interestingly, you can get a pretty decent approximation of Rmax (the actual HPL score) using:

    Rmax ~= Rpeak * interconnect-efficiency

with:

    interconnect    rmax/rpeak
    quadrics        .75
    myrinet         .7
    infiniband      .7
    gigabit         .6

this is not too surprising - it would be strange if gigabit were not less efficient, and quadrics is pretty much the premium interconnect (unless you count numaflex/etc).

there are undoubtedly other factors which might be conflated here - for instance, I'd expect HPL scaling to depend on memory-size-per-cpu as well as memory-bandwidth-per-cpu. and for a slower interconnect, you can probably get higher efficiency by maximizing on-node work (minimizing interconnect dependency.)

needless to say, real and useful apps are probably going to achieve lower useful flops than HPL. note also that HPL strongly rewards chips which have fused multiply-add, which can be entirely irrelevant to real codes...

regards, mark hahn.
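
A minimal sketch of the arithmetic in the message above, for sizing a cluster against the 0.25 TFlops target. It uses only the peak formula (ncpus * clock * flops-per-cycle) and the interconnect efficiency table from the post. The function names are made up for illustration, and the 2.0 GHz clock and 2 flops per cycle in the usage example are hypothetical values, not figures for any particular processor. The result is a rough HPL estimate only, not a prediction for real applications.

```python
import math

# Ratio of HPL score to theoretical peak per interconnect,
# copied from the table in the message above.
INTERCONNECT_EFFICIENCY = {
    "quadrics": 0.75,
    "myrinet": 0.70,
    "infiniband": 0.70,
    "gigabit": 0.60,
}


def peak_flops(ncpus, clock_hz, flops_per_cycle):
    """Theoretical aggregate peak: ncpus * clock * flops-per-cycle."""
    return ncpus * clock_hz * flops_per_cycle


def estimated_hpl_flops(ncpus, clock_hz, flops_per_cycle, interconnect):
    """Approximate HPL score: theoretical peak scaled by interconnect efficiency."""
    efficiency = INTERCONNECT_EFFICIENCY[interconnect]
    return peak_flops(ncpus, clock_hz, flops_per_cycle) * efficiency


def cpus_needed(target_flops, clock_hz, flops_per_cycle, interconnect):
    """Smallest CPU count whose estimated HPL score reaches target_flops."""
    per_cpu = estimated_hpl_flops(1, clock_hz, flops_per_cycle, interconnect)
    return math.ceil(target_flops / per_cpu)


if __name__ == "__main__":
    # Hypothetical node CPU: 2.0 GHz, 2 flops per cycle (assumed values).
    target = 0.25e12  # 0.25 TFlops, the target from the original question
    for name in INTERCONNECT_EFFICIENCY:
        print(name, cpus_needed(target, 2.0e9, 2, name))
```

With those assumed CPU figures, gigabit needs 105 CPUs and quadrics 84 to reach an estimated 0.25 TFlops on HPL. Real codes would likely need more, as the message points out.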
