[Beowulf] Performance characterising a HPC application

Thu Mar 22 23:27:39 PDT 2007

Gilad,

Gilad Shainer wrote:
>> -----Original Message-----
>> People doing their homework are still buying more 2G than 10G 
>> today, because of better price/performance for their codes 
>> (and thin cables).

> People doing their homework test their applications and decide

That's what I have always said.

> For that purpose. In your case, maybe your customers prefer your 2G
> over your 10G, but I am really not sure if most of the HPC users
> are buying more 2G rather than other faster I/O solutions..... 

My feedback is limited to Myrinet 2G vs Myrinet 10G, in cases where
customers actually run their codes and find the performance gain either
null or too small to be worth the price difference (this criterion is of
course very subjective). I don't know if they test on "faster" solutions
as well.

> All the real applications performance that I saw show that IB 10 
> and 20Gb/s provide much higher performance results comparing to 

I think you mean IB 8 Gb/s and 16 Gb/s, since using the signal rate
instead of the data rate is not only confusing, it is wrong and nobody
else does it.

Furthermore, what you really mean is 8 Gb/s and 13.7 Gb/s since this is
the maximum throughput of a PCI Express 8x link (*).
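
To make the signal rate vs data rate distinction concrete, here is a
minimal sketch of the conversion. It assumes 8b/10b line coding (8 data
bits carried in every 10 bits on the wire), which is what IB SDR/DDR
use; the function name and the link labels are mine, purely for
illustration.

```python
def data_rate_gbps(signal_rate_gbps, payload_bits=8, coded_bits=10):
    """Usable data rate of a link that uses 8b/10b line coding.

    Every 10 bits on the wire carry 8 bits of data, so the data rate
    is 80% of the signal rate.
    """
    return signal_rate_gbps * payload_bits / coded_bits


if __name__ == "__main__":
    # Illustrative 4x link labels; the signal rates are the figures
    # usually quoted in marketing material.
    for name, signal in [("IB SDR 4x", 10.0), ("IB DDR 4x", 20.0)]:
        data = data_rate_gbps(signal)
        print(f"{name}: {signal:g} Gb/s signal -> {data:g} Gb/s data")
```

This only removes the line-coding overhead. The host bus limit is a
separate effect that comes on top of it.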

> your 2G, and clearly better price/performance. This is a good 
> indication that applications require more bandwidth than 2G.

It depends on the application and also on who does the benchmarking. The
most common "marketing mistake" is to look at GM numbers, not MX. In
latency-bound codes, Myrinet 2G with MX does outperform Mellanox IB,
even DDR. My own measurements on real applications show that MX-2G
sometimes beats Mellanox IB DDR on medium messages, typically when the
registration cache is ineffective (malloc hooks unusable or limited
IOMMU) or when the code tries to overlap communication with computation.

Similarly, on many applications I have checked, Qlogic IB SDR has better 
performance than Mellanox IB DDR, despite having a smaller pipe (and 
despite Mellanox claiming the contrary).

There are a lot of external factors as well. An application that is not
bandwidth-bound can become so if, for example, the number of cores
increases. So different host configurations yield different results.

Price/performance also depends on the price, and the price depends on 
the market, the volume, the vendor relationship, the competitive 
environment, etc. You seem to assume a high price for Myrinet 2G, but 
that may not be a safe assumption.

In conclusion, I will repeat myself: I believe that bigger pipes do not
always have better price/performance, nor even simply better
performance; it depends very much on the application. The most used HPC
interconnect in the world today is still Gigabit Ethernet, and it has
the best price/performance ratio for a lot of codes.

Patrick

(*) For the curious, the maximum efficiency of PCI Express x8 is 86%,
best scenario. Read DMA completions are 128 bytes max on today's
PCIe chipsets (default is 64 bytes), with a 20-byte header (4 bytes
for DLL, 12 bytes for the 3DW TLP header, 4 bytes of LCRC). That's a
20-byte header for a 128-byte payload, i.e. 128/148 = 0.86. The link
data rate is 16 Gb/s, so 16*0.86 = 13.7 Gb/s after protocol. With ECRC
or on Intel chipsets, there are 4 more bytes, so the max Read throughput
becomes 13.5 Gb/s. The real limit depends on the chipset and can be much
lower than that.
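
The arithmetic in this footnote can be reproduced with a short sketch.
The byte counts are the ones given above; the function and parameter
names are hypothetical, and the result is only the protocol upper
bound, not what a given chipset will actually deliver.

```python
LINK_DATA_RATE_GBPS = 16.0  # PCI Express x8 data rate quoted in the footnote


def read_efficiency(payload=128, dll=4, tlp_header=12, lcrc=4, extra=0):
    """Fraction of link bytes that are payload in a Read completion.

    payload    -- completion payload size in bytes (128 max, 64 default)
    dll        -- data link layer bytes per packet
    tlp_header -- 3DW TLP header, in bytes
    lcrc       -- link CRC, in bytes
    extra      -- additional per-packet bytes (e.g. the 4 bytes of ECRC)
    """
    overhead = dll + tlp_header + lcrc + extra
    return payload / (payload + overhead)


if __name__ == "__main__":
    cases = [
        ("128-byte completions", {}),
        ("128-byte completions + 4 extra bytes", {"extra": 4}),
        ("64-byte completions", {"payload": 64}),
    ]
    for label, kwargs in cases:
        eff = read_efficiency(**kwargs)
        rate = LINK_DATA_RATE_GBPS * eff
        print(f"{label}: efficiency {eff:.3f}, max Read {rate:.2f} Gb/s")
```

Without rounding the efficiency to 0.86 first, the first case comes out
slightly higher (about 13.8 Gb/s); the difference is well inside the
chipset-dependent margin mentioned above.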

-- 
Patrick Geoffray
Myricom, Inc.
http://www.myri.com