[Beowulf] Multidimensional FFTs
Greg Lindahl lindahl at pathscale.com
Tue Feb 28 18:34:50 PST 2006
On Tue, Feb 28, 2006 at 01:26:51PM -0500, Bill Rankin wrote:

> There is a research group here at Duke doing some application
> development and they are looking at implementing their codes in a
> cluster environment. The main problem is that 95% of their
> processing time is taken up by medium to large sized 3D FFTs (minimum
> 64 elements on an edge, 256k total elements).

That's a fairly small FFT on a parallel cluster. How many cpus do they imagine using?

Perhaps the easiest thing to do is to whip up some code and invite people to benchmark it. The G-PTRANS and G-FFTE elements of HPC Challenge are relevant, but not many folks have submitted numbers.

Let's see: for 64**3, and 64 cpus with a 1D decomposition, there are 64**2 words per cpu, and a naive Alltoall will split those into 64 messages of 64 words each, 63 of which go to other nodes. Then the message length is 1024 bytes (double precision complex).

I would disagree with Stu's recommendations at this size due to the short message length, but I don't know if 2D would be a better decomposition at this size. FFTW version 2's MPI routines only do 1D decomposition.

-- greg
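
The message-size arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name and parameters are made up for illustration, and it assumes an n**3 grid split evenly into slabs (1D decomposition), one "word" per double precision complex element (16 bytes), and a naive Alltoall in which each cpu sends an equal share of its slab to every cpu.

```python
def alltoall_message_size(n, ncpus, bytes_per_word=16):
    """Estimate per-message size for a naive Alltoall on an n**3 FFT.

    Hypothetical helper, not part of FFTW or HPC Challenge.
    Assumes a 1D (slab) decomposition and that n**3 divides evenly
    by ncpus**2; bytes_per_word=16 is double precision complex.
    Returns (words per cpu, words per message, bytes per message).
    """
    total_words = n ** 3
    if total_words % (ncpus * ncpus) != 0:
        raise ValueError("n**3 must be divisible by ncpus**2")
    words_per_cpu = total_words // ncpus          # local slab size
    words_per_message = words_per_cpu // ncpus    # share sent to each cpu
    return words_per_cpu, words_per_message, words_per_message * bytes_per_word


if __name__ == "__main__":
    per_cpu, per_msg, nbytes = alltoall_message_size(64, 64)
    print(per_cpu, per_msg, nbytes)  # 4096 64 1024
```

For 64**3 on 64 cpus this gives 64**2 = 4096 words per cpu and 64-word (1024-byte) messages, matching the figures in the message. Under the same assumptions, a 128**3 transform on 64 cpus would give 8192-byte messages.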
