[Beowulf] OpenMP on AMD dual core processors
Nathan Moore ntmoore at gmail.com
Fri Nov 21 07:38:29 PST 2008
Thanks a ton for the worked out example! I had a similar problem with gfortran, and it only appeared with large array sizes (bigger than 4000x4000 as I recall). "ulimit" was no help, I assume there's a memory constraint built in somewhere.

(as an aside, I once ran into a similar problem with perl - the release on linux would only allow 200MB array sizes, but the version available on a sun machine would allow GB of array sizes)

On Fri, Nov 21, 2008 at 6:36 AM, Bill Broadley <bill at cse.ucdavis.edu> wrote:

> Fortran isn't one of my better languages, but I did manage to tweak your code
> into something that I believe works the same and is openMP friendly.
>
> I put a copy at:
> http://cse.ucdavis.edu/bill/OMPdemo.f
>
> When I used the pathscale compiler on your code it said:
> "told.f", line 27: Warning: Referenced scalar variable OLD_V is SHARED by default
> "told.f", line 29: Warning: Referenced scalar variable DV is SHARED by default
> "told.f", line 31: Warning: Referenced scalar variable CONVERGED is SHARED by default
>
> I rewrote your code to get rid of those, I didn't know some of the constants
> you mentioned dy and Ly. So I just wrote my own initialization. I skipped
> the boundary conditions by just restricting the start and end of the loops.
>
> Your code seemed to be interpolating between the current iteration (i-1 and
> j-1) and the last iteration (i+1 and j+1). Not sure if that was intentional
> or not. In any case I just processed the array v into v2, then if it didn't
> converge I processed the v2 array back into v. To make each loop independent
> I made converge a 1D array which stored the sum of that row's error. Then
> after each array was processed I walked the 1-d array to see if we had
> converged. I exit when all pixels are below the convergence value.
>
> It scales rather well on a dual socket barcelona (amd quad core), my version
> iterates a 1000x1000 array with a range of values from 0-200 over 1214
> iterations to within a convergence of 0.02.
>
> CPUs   time    Scaling
> ===========================
> 1      54.51
> 2      27.75   1.96 faster
> 4      14.14   3.85 faster
> 8       7.75   7.03 faster
>
> Hopefully my code is doing what you intended.
>
> Alas, with gfortran (4.3.1 or 4.3.2), I get a segmentation fault as soon as I
> run. Same if I compile with -g and run it under the debugger. I'm probably
> doing something stupid.

--
- - - - - - - - - - - - - - - - - - - - -
Nathan Moore
Assistant Professor, Physics
Winona State University
AIM: nmoorewsu
- - - - - - - - - - - - - - - - - - - - -
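For readers who cannot reach OMPdemo.f, the two-array scheme Bill describes can be sketched roughly as follows. This is not his code: it is written in C rather than Fortran, the four-neighbour average is an assumed update rule (the original stencil is not shown in the thread), and N, MAX_ITERS, row_t, sweep, relax and run_demo are hypothetical names. Each row records its largest change rather than a sum, which is also an assumption, chosen so that "all pixels below the convergence value" can be checked directly. The grids are heap-allocated, which sidesteps stack-size limits for large arrays.

```c
#include <assert.h>
#include <stdlib.h>

#define N 32             /* grid edge; hypothetical (the post used 1000x1000) */
#define TOL 0.02         /* convergence value quoted in the post */
#define MAX_ITERS 100000 /* hypothetical safety cap */

typedef double row_t[N];

/* One sweep: read src, write dst, record the largest change seen in each row.
   The loop bounds skip the boundary rows and columns, as in the post. */
static void sweep(row_t *src, row_t *dst, double *row_err)
{
    /* Iterations are independent: row i writes only dst[i][*] and row_err[i],
       so no scalar is shared between threads. */
    #pragma omp parallel for
    for (int i = 1; i < N - 1; i++) {
        double worst = 0.0;
        for (int j = 1; j < N - 1; j++) {
            /* Assumed stencil: average of the four neighbours. */
            dst[i][j] = 0.25 * (src[i - 1][j] + src[i + 1][j]
                              + src[i][j - 1] + src[i][j + 1]);
            double change = dst[i][j] - src[i][j];
            if (change < 0.0) change = -change;
            if (change > worst) worst = change;
        }
        row_err[i] = worst;
    }
}

/* Serial walk over the per-row errors after each sweep. */
static int all_rows_converged(const double *row_err)
{
    for (int i = 1; i < N - 1; i++)
        if (row_err[i] >= TOL) return 0;
    return 1;
}

/* Alternate v -> v2 and v2 -> v until converged.
   Returns the number of sweeps, or -1 if MAX_ITERS is reached. */
static int relax(row_t *v, row_t *v2, double *row_err)
{
    for (int iter = 1; iter <= MAX_ITERS; iter++) {
        if (iter % 2) sweep(v, v2, row_err);
        else          sweep(v2, v, row_err);
        if (all_rows_converged(row_err)) return iter;
    }
    return -1;
}

/* Hypothetical driver: top edge held at 200, everything else starts at 0. */
int run_demo(double *center)
{
    row_t *v = calloc(N, sizeof *v);
    row_t *v2 = calloc(N, sizeof *v2);
    double *row_err = calloc(N, sizeof *row_err);
    assert(v && v2 && row_err);

    /* Both grids need the boundary values, since each serves as source in turn. */
    for (int j = 0; j < N; j++) v[0][j] = v2[0][j] = 200.0;

    int iters = relax(v, v2, row_err);
    row_t *result = (iters % 2) ? v2 : v; /* an odd sweep count ends in v2 */
    *center = result[N / 2][N / 2];

    free(v);
    free(v2);
    free(row_err);
    return iters;
}
```

Calling run_demo(&c) from a main() returns the sweep count and stores the mid-grid value in c. Built without OpenMP support the pragma is ignored and the loop runs serially; with gcc, -fopenmp enables it.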
