[Beowulf] Woodcrest Memory bandwidth

Joe Landman landman at scalableinformatics.com
Mon Aug 14 13:02:40 PDT 2006


Mark Hahn wrote:

> kinda sucks, doesn't it?  here's what I get for a not-new dual-275 with 
> 8x1G PC3200 (I think):
> 
> Function      Rate (MB/s)   RMS time     Min time     Max time
> Copy:        5714.6837       0.0840       0.0840       0.0841
> Scale:       5821.0766       0.0825       0.0825       0.0826
> Add:         6437.8226       0.1119       0.1118       0.1120
> Triad:       6414.2079       0.1123       0.1123       0.1124
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf

Now that I am back from "da Yoo Pee" I can post some of my numbers. 
Here is our dual-core Opteron 275.

4-threads

Function     Rate (MB/s)  Avg time   Min time  Max time
Copy:       9999.3465      0.0356      0.0320      0.0360
Scale:      8888.4147      0.0360      0.0360      0.0360
Add:        9230.2533      0.0542      0.0520      0.0560
Triad:      9230.2321      0.0538      0.0520      0.0560

1-thread

Function     Rate (MB/s)  Avg time   Min time  Max time
Copy:       4705.6130      0.0711      0.0680      0.0720
Scale:      4705.6130      0.0702      0.0680      0.0720
Add:        4615.1161      0.1067      0.1040      0.1080
Triad:      4444.1975      0.1080      0.1080      0.1080

using a PathScale-compiled binary.  I see slightly higher numbers using 
PGI 6.1-2-compiled binaries for single threads; I am not sure why.  The 
6.1-5/6-compiled binaries are worse :(

Function     Rate (MB/s)  RMS time   Min time  Max time
Copy:       6666.9379      0.0454      0.0300      0.0500
Scale:      4000.0610      0.0567      0.0500      0.0600
Add:        4285.7330      0.0758      0.0700      0.0800
Triad:      4285.7330      0.0747      0.0700      0.0900

The same binary on the 2.66 GHz Woodcrest:

Function     Rate (MB/s)  RMS time   Min time  Max time
Copy:       5000.1240      0.0427      0.0400      0.0500
Scale:      5000.1240      0.0452      0.0400      0.0500
Add:        5000.0445      0.0685      0.0600      0.0800
Triad:      5000.0445      0.0712      0.0600      0.0800

Intel 9.1-compiled version (64-bit)

1-thread

Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:        4447.6829       0.1440       0.1439       0.1445
Scale:       4613.8072       0.1388       0.1387       0.1390
Add:         4256.9431       0.2256       0.2255       0.2259
Triad:       4187.6605       0.2294       0.2292       0.2302


2-threads

Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:        7288.3813       0.0882       0.0878       0.0893
Scale:       7186.2381       0.0891       0.0891       0.0893
Add:         7085.0852       0.1357       0.1355       0.1365
Triad:       6916.0273       0.1389       0.1388       0.1392


3-threads

Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:        6589.2489       0.0989       0.0971       0.1001
Scale:       6528.4171       0.0988       0.0980       0.0997
Add:         6535.0076       0.1488       0.1469       0.1504
Triad:       6563.9202       0.1486       0.1463       0.1496


4-threads

Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:        6645.4125       0.0965       0.0963       0.0976
Scale:       6994.6233       0.0916       0.0915       0.0917
Add:         6373.0207       0.1508       0.1506       0.1509
Triad:       6710.7522       0.1432       0.1431       0.1433

I may have been Bill's 10 GB/s source, and that may have been a mixup on 
my part.

FWIW, the PathScale-compiled binaries on this machine give

Function     Rate (MB/s)  Avg time   Min time  Max time
Copy:       7272.4071      0.0453      0.0440      0.0480
Scale:      7272.2298      0.0462      0.0440      0.0480
Add:        5999.6258      0.0827      0.0800      0.0840
Triad:      5999.6302      0.0831      0.0800      0.0840

and the PGI-compiled ones give

Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:        6608.0161       0.0970       0.0969       0.0977
Scale:       4592.3298       0.1395       0.1394       0.1397
Add:         4259.8885       0.2262       0.2254       0.2269
Triad:       4244.0478       0.2269       0.2262       0.2273

They may be slightly different versions of the original source (notice 
the labels on the columns), but the core measurements are the same.


On the Opteron 275, we have two memory nodes, each with multiple banks 
per node.

landman at dualcore:~/stream> numactl --show
policy: default
preferred node: current
physcpubind: 0 1 2 3
cpubind:
nodebind:
membind: 0 1

landman at dualcore:~/stream> numactl --hardware
available: 2 nodes (0-1)
node 0 size: 2015 MB
node 0 free: 1276 MB
node 1 size: 4025 MB
node 1 free: 2416 MB
node distances:
node   0   1
   0:  10  20
   1:  20  10


On the Woodcrest, it looks like a single memory node.

landman at woody:~> numactl --show
policy: default
preferred node: current
physcpubind: 0 1 2 3
cpubind:
nodebind:
membind: 0


landman at woody:~> numactl --hardware
available: 1 nodes (0-0)
node 0 size: 4017 MB
node 0 free: 2649 MB
node distances:
node   0
   0:  10


I have it on good authority that with the other chipset (we have a 
Blackford here), we should see higher numbers, though not exceeding the 
Opteron 275.

When I have time, I will investigate this more and write about it on my 
blog.  FWIW, I am not seeing a clear performance picture emerging.  I 
have heard speculation/rumor from others, but I prefer measurement, and 
my measurements, while consistent, are not exposing a clear and 
meaningful picture where I can say "yes, it's faster" or "no, it isn't".

What I can say is that Woodcrest is interesting.  It just may be 
overhyped by a "compliant" media.



-- 

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615



