[Beowulf] Multidimensional FFTs
Konstantin Kudin konstantin_kudin at yahoo.com
Wed Mar 1 10:23:26 PST 2006
> So I was wondering what the current "state of the art" is in
> clustered 3D FFTs? I've googled around a bit, but most of the
> results seem a little dated. If someone could point me to any recent
> papers or studies, I would be grateful.

You can find some reasonably recent FFT-related material at this link (read the preprint on FFT strategies):

http://pages.unibas.ch/comphys/comphys/SOFTWARE/

Anyway, the "alltoall" can be a real killer. If you want to use lots of CPUs with really small packets, go for something like Parastation MPI ( http://www.parastation.com/ ). This MPI package is MPICH-based and cuts down latencies for small packets by about 30% (really!). And the best part is that it is free for academics.

For large packets, things get trickier. On a dual Opteron cluster around here, for instance, there is a significant "choking" effect whose cause is unknown. Using skampi 4.1, one gets what is shown below for 64 kB packets (this is with the bleeding-edge version of Open-MPI, 1.1 pre-alpha). The Open-MPI developers promise to pay specific attention to the "alltoall" function, so things might become quite good at some point.
[ncpu ms std] (choking at 15 cpus)
#/*@insyncol_MPI_Alltoall-nodes-long-SM.ski*/
     2     275.1      1.6  8     275.1      1.6  8
     3    1890.2     31.3  8    1890.2     31.3  8
     4    3467.1     85.0  8    3467.1     85.0  8
     5    5843.9     66.3  8    5843.9     66.3  8
     6    8720.9    110.6  8    8720.9    110.6  8
     7    9598.8     99.6  7    9598.8     99.6  7
     8   11757.9    256.4  6   11757.9    256.4  6
     9   13428.2    166.4  8   13428.2    166.4  8
    10   14623.4    176.2  8   14623.4    176.2  8
    11   16689.4    171.9  4   16689.4    171.9  4
    12   18941.4    502.9  5   18941.4    502.9  5
    13   20105.2     99.0  8   20105.2     99.0  8
    14   22731.1    155.0  2   22731.1    155.0  2
    15  123939.7  49248.4  8  123939.7  49248.4  8
    16  142048.0  43888.8  8  142048.0  43888.8  8

If "alltoall" is not used, but rather a bunch of isend+irecv, the choking effect shows up way earlier:

[ncpu ms std] (choking at 6 cpus)
#/*@insyncol_MPI_Alltoall_Isend_Irecv-nodes-long-SM.ski*/
     2     247.4      0.8  8     247.4      0.8  8
     3    1861.8     10.1  8    1861.8     10.1  8
     4    3158.4     24.5  8    3158.4     24.5  8
     5    4270.0     75.0  2    4270.0     75.0  2
     6  225351.5  12504.5  2  225351.5  12504.5  2
     7  228399.5  14770.5  2  228399.5  14770.5  2
     8  247087.5  14448.4  2  247087.5  14448.4  2
     9  243806.7   3878.9  8  243806.7   3878.9  8
    10  248353.0   6640.9  2  248353.0   6640.9  2
    11  267541.5   5210.1  8  267541.5   5210.1  8
    12  286600.1   1665.1  2  286600.1   1665.1  2
    13  277546.5   4208.1  8  277546.5   4208.1  8
    14  364208.9  98276.9  2  364208.9  98276.9  2
    15  392139.0 101163.9  2  392139.0 101163.9  2
    16  367182.1  97711.0  2  367182.1  97711.0  2

Konstantin
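
A possible way to pick the "choking" point out of timings like the ones above automatically, sketched in Python. This is not part of skampi or of the original measurements: the function name `find_choke`, the linear-extrapolation heuristic, and the threshold `factor=2.0` are all assumptions made for illustration. The times are the mean values from the second column of the two tables.

```python
# Mean times copied from the two tables in the post (64 kB packets),
# keyed by CPU count; units are as reported there.
alltoall = {
    2: 275.1, 3: 1890.2, 4: 3467.1, 5: 5843.9, 6: 8720.9, 7: 9598.8,
    8: 11757.9, 9: 13428.2, 10: 14623.4, 11: 16689.4, 12: 18941.4,
    13: 20105.2, 14: 22731.1, 15: 123939.7, 16: 142048.0,
}
isend_irecv = {
    2: 247.4, 3: 1861.8, 4: 3158.4, 5: 4270.0, 6: 225351.5, 7: 228399.5,
    8: 247087.5, 9: 243806.7, 10: 248353.0, 11: 267541.5, 12: 286600.1,
    13: 277546.5, 14: 364208.9, 15: 392139.0, 16: 367182.1,
}


def find_choke(times, factor=2.0):
    """Return the first CPU count whose time exceeds `factor` times a
    linear extrapolation from the two preceding points, or None.

    Hypothetical heuristic: it assumes evenly spaced CPU counts and
    roughly linear growth before the choke. `factor` is arbitrary.
    """
    counts = sorted(times)
    for prev2, prev1, cur in zip(counts, counts[1:], counts[2:]):
        # Extend the line through the two previous measurements.
        predicted = times[prev1] + (times[prev1] - times[prev2])
        if times[cur] > factor * predicted:
            return cur
    return None


print("MPI_Alltoall chokes at", find_choke(alltoall), "cpus")    # 15
print("isend+irecv chokes at", find_choke(isend_irecv), "cpus")  # 6
```

A plain ratio between consecutive rows would not work here, since the step from 2 to 3 cpus is itself a jump of roughly 7x in both tables; comparing against the local trend avoids flagging that step.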
