[Beowulf] Q: AMD Opteron (Barcelona) 2356 vs Intel Xeon 5460
Sangamesh B forum.san at gmail.com
Wed Sep 17 23:53:55 PDT 2008
Hi Bill,
I'm sorry. I composed the mail with proper formatting, but it is not
displaying the way I laid it out.
To clarify: I tested three compilers on the AMD machine only. On the Intel
machine I used only Intel ifort.
Also, some runs have two results: one from the OUTPUT file and one from the
time command. Not all runs have both, because I forgot to record the time
command output for some of them.
I hope this helps,
Thanks,
Sangamesh
On Thu, Sep 18, 2008 at 11:59 AM, Bill Broadley <bill at cse.ucdavis.edu> wrote:
>
>
> I tried to understand your post, but failed. Can you post a link,
> publish a Google spreadsheet or format it differently?
>
> You tried 3 compilers on both machines? Which times are for which
> CPU/compiler combos? I tried to match up the columns and rows, but sometimes
> there were 3 columns, and sometimes 4. None of them line up nicely under
> the CPU or compiler headings.
>
> I (and many other folks) read email as ASCII text, so a table should
> look like:
>
> Serial run:
>                  Compiler A   Compiler B   Compiler C
> =====================================================
> Intel 2.3 GHz    30           29           31
> AMD 2.3 GHz      28           32           32
>
> Note that I used spaces and not tabs, so it appears clear to everyone
> regardless of their mail client, ASCII/text, HTML, tab settings, etc.
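
As a sketch of the advice above, a few lines of Python can pad every column with spaces so that a table lines up in any fixed-width mail client. The function name format_table and the column width of 14 are illustrative choices, not anything from the thread, and the numbers are the placeholder values from the example table, not measurements.

```python
def format_table(headers, rows, width=14):
    """Return a plain-text table padded with spaces (never tabs)."""
    lines = ["".join(f"{h:<{width}}" for h in headers)]
    lines.append("=" * (width * len(headers)))
    for row in rows:
        lines.append("".join(f"{str(cell):<{width}}" for cell in row))
    # Trailing spaces are harmless but pointless in email, so strip them.
    return "\n".join(line.rstrip() for line in lines)


# Placeholder values copied from the example table above.
headers = ["", "Compiler A", "Compiler B", "Compiler C"]
rows = [
    ["Intel 2.3 GHz", 30, 29, 31],
    ["AMD 2.3 GHz", 28, 32, 32],
]

table = format_table(headers, rows)
print(table)
```

Labels longer than the chosen width would run into the next column, so the width has to be at least one character larger than the longest cell.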
>
> I've been testing these machines quite a bit lately and have been quite
> impressed with the Barcelona memory system, for instance:
>
> http://cse.ucdavis.edu/bill/fat-node-numa3.png
>
>
> Sangamesh B wrote:
>
>> The scientific application used is DL-Poly 2.17.
>>
>> Tested with Pathscale and Intel compilers on AMD Opteron quad core. The
>> time figures below were taken from the DL-Poly OUTPUT file. I also used
>> the time command. Here are the results:
>>
>>
>>                  AMD 2.3 GHz (32 GB RAM)                       Intel 2.33 GHz (32 GB RAM)
>>                  GNU gfortran   Pathscale     Intel 10 ifort   Intel 10 ifort
>>
>> 1. Serial
>>    OUTPUT file   147.719 sec    158.158 sec   135.729 sec      73.952 sec
>>    Time command  2m27.791s, 2m38.268s, 1m13.972s
>>
>> 2. Parallel, 4 cores
>>    OUTPUT file   39.798 sec     44.717 sec    36.962 sec       32.317 sec
>>    Time command  0m41.527s, 0m46.571s, 0m36.218s
>>
>> 3. Parallel, 8 cores
>>    OUTPUT file   26.880 sec     33.746 sec    27.979 sec       30.371 sec
>>    Time command  0m30.171s
>>
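
To make the scaling in the OUTPUT file timings above easier to compare, the sketch below computes speedup (serial time divided by parallel time) and parallel efficiency (speedup divided by core count) for each CPU/compiler pair. The timings are copied from the post. The helper names and the parse_time_command function for strings such as 2m27.791s are illustrative assumptions, not part of DL-Poly or of anything used in the thread.

```python
import re

# OUTPUT-file times in seconds, keyed by core count, copied from the post.
timings = {
    "AMD / gfortran": {1: 147.719, 4: 39.798, 8: 26.880},
    "AMD / Pathscale": {1: 158.158, 4: 44.717, 8: 33.746},
    "AMD / ifort": {1: 135.729, 4: 36.962, 8: 27.979},
    "Intel / ifort": {1: 73.952, 4: 32.317, 8: 30.371},
}


def parse_time_command(text):
    """Convert a shell `time` value such as '2m27.791s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def speedup(times, cores):
    """Serial time divided by the time on `cores` cores."""
    return times[1] / times[cores]


def efficiency(times, cores):
    """Speedup per core; 1.0 would be perfect scaling."""
    return speedup(times, cores) / cores


for name, times in timings.items():
    for cores in (4, 8):
        print(f"{name:<16} {cores} cores  "
              f"speedup {speedup(times, cores):5.2f}  "
              f"efficiency {efficiency(times, cores):6.1%}")
```

With these numbers the AMD runs reach a speedup of roughly 3.5 to 3.7 on 4 cores, while the Intel run, which is much faster serially, reaches roughly 2.3.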
>>
>> The optimization flags used:
>>
>> Intel ifort 10: -O3 -axW -funroll-loops (I don't remember the exact
>>                 flag; something similar to loop unrolling)
>> Pathscale:      -O3 -OPT:Ofast -ffast-math -fno-math-errno
>> GNU gfortran:   -O3 -ffast-math -funroll-all-loops -ftree-vectorize
>>
>>
>> I'll try GNU time for further runs: http://directory.fsf.org/project/time/
>>
>> Thanks,
>> Sangamesh
>>
>>
>> On Thu, Sep 18, 2008 at 6:07 AM, Vincent Diepeveen <diep at xs4all.nl> wrote:
>>
>>> How does all this change when you use a PGO-optimized executable on both
>>> sides?
>>>
>>> Vincent
>>>
>>>
>>> On Sep 18, 2008, at 2:34 AM, Eric Thibodeau wrote:
>>>
>>>> Vincent Diepeveen wrote:
>>>>
>>>>> Nah,
>>>>>
>>>>> I guess he's referring to it sometimes using single-precision floating
>>>>> point instead of double precision to get something done, and to it
>>>>> sometimes keeping values in registers.
>>>>>
>>>>> That isn't necessarily a problem, but if I remember correctly the
>>>>> floating-point state could get wiped out when switching to SSE2.
>>>>>
>>>>> Sometimes you lose your FPU register set in that case.
>>>>>
>>>>> The main problem is that there are so many dangerous optimizations
>>>>> possible to speed up test sets, because floating point is inherently
>>>>> slow to do in hardware.
>>>>>
>>>>> Yet in general, the last generations of Intel compilers have improved
>>>>> a lot in that respect.
>>>>
>>>> Well, running the same code, here is the result discrepancy I got:
>>>>
>>>> FLOPS:
>>>> my code has to do: 7,975,847,125,000 (~8Tflops) ...takes 15 minutes on
>>>> 8*2core Opteron with 32 Gigs-o-RAM (thank you OpenMP ;)
>>>>
>>>> The running times (ran it a _few_ times...but not the statistical
>>>> minimum of 30):
>>>> ICC -> runtime == 689.249 ; summed error == 1651.78
>>>> GCC -> runtime == 1134.404 ; summed error == 0.883501
>>>>
>>>> Compiler Flags:
>>>> icc -xW -openmp -O3 vqOpenMP.c -o vqOpenMP
>>>> gcc -lm -fopenmp -O3 -march=native vqOpenMP.c -o vqOpenMP_GCC
>>>>
>>>> No trickery, no smoke and mirrors ;) Just a _huge_ kick ASS k-Means
>>>> parallelized with OpenMP (thank gawd, otherwise it takes hours to run)
>>>> and a rather big database of 1.4 Gigs.
>>>>
>>>> ... So this is what I meant by floating point errors. Yes, the runtime
>>>> was almost halved by ICC (and this is on an *opteron* based system, Tyan
>>>> VX50). I wasn't actually looking at running time but at precision skew,
>>>> and that's where I fell off my chair.
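
One mechanism that can produce this kind of skew is that floating-point addition is not associative, so a compiler that is allowed to reorder a summation can change the result. The sketch below only illustrates that effect in Python. It is not the vqOpenMP code from the post, it does not reproduce the reported numbers, and the input values are contrived for the demonstration.

```python
import math


def left_to_right_sum(values):
    """Add the values strictly in the order given."""
    total = 0.0
    for v in values:
        total += v
    return total


# Contrived input: two huge terms that cancel, plus two small terms.
values = [1e16, 1.0, -1e16, 1.0]

in_order = left_to_right_sum(values)           # small terms partly absorbed
reordered = left_to_right_sum(sorted(values))  # same numbers, another order
reference = math.fsum(values)                  # correctly rounded sum

print("in order :", in_order)
print("reordered:", reordered)
print("reference:", reference)
```

The same four numbers give three different totals depending on the order and method of summation, which is why comparing against a build in strict IEEE mode, as suggested further down the thread, is a reasonable first check.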
>>>>
>>>> For the ones itching for a little more specs:
>>>>
>>>> eric@einstein ~ $ icc -V
>>>> Intel(R) C Compiler for applications running on Intel(R) 64, Version 10.1
>>>> Build 20080602
>>>> Copyright (C) 1985-2008 Intel Corporation. All rights reserved.
>>>> FOR NON-COMMERCIAL USE ONLY
>>>>
>>>> eric@einstein ~ $ gcc -v
>>>> Using built-in specs.
>>>> Target: x86_64-pc-linux-gnu
>>>> Configured with:
>>>> /dev/shm/portage/sys-devel/gcc-4.3.1-r1/work/gcc-4.3.1/configure
>>>> --prefix=/usr --bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/4.3.1
>>>> --includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.3.1/include
>>>> --datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.3.1
>>>> --mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.3.1/man
>>>> --infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.3.1/info
>>>>
>>>> --with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.3.1/include/g++-v4
>>>> --host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --disable-altivec
>>>> --enable-nls --without-included-gettext --with-system-zlib
>>>> --disable-checking --disable-werror --enable-secureplt --enable-multilib
>>>> --enable-libmudflap --disable-libssp --enable-cld --disable-libgcj
>>>> --enable-languages=c,c++,treelang,fortran --enable-shared
>>>> --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu
>>>> --with-bugurl=http://bugs.gentoo.org/ --with-pkgversion='Gentoo 4.3.1-r1 p1.1'
>>>> Thread model: posix
>>>> gcc version 4.3.1 (Gentoo 4.3.1-r1 p1.1)
>>>>
>>>> Vincent
>>>>>
>>>>> On Sep 17, 2008, at 10:25 PM, Greg Lindahl wrote:
>>>>>
>>>>>> On Wed, Sep 17, 2008 at 03:43:36PM -0400, Eric Thibodeau wrote:
>>>>>>
>>>>>>> Also, note that I've had issues with icc generating really fast but
>>>>>>> inaccurate code (fp model is not IEEE *by default*, I am sure
>>>>>>> _everyone_ knows this and I am stating the obvious here).
>>>>>>
>>>>>> All modern, high-performance compilers default that way. It's certainly
>>>>>> the case that sometimes it goes more horribly wrong than necessary, but
>>>>>> I wouldn't ding icc for this default. Compare results with IEEE mode.
>>>>>>
>>>>>> -- greg
>>>>>>
>>>> _______________________________________________
>>> Beowulf mailing list, Beowulf at beowulf.org
>>> To change your subscription (digest mode or unsubscribe) visit
>>> http://www.beowulf.org/mailman/listinfo/beowulf
>>>
>>>
>>
>>
>
>
