<html><head><style type='text/css'>p { margin: 0; }</style></head><body><div style='font-family: Arial; font-size: 12pt; color: #000000'><div><br></div>On Monday, March 15, 2010 1:27:23 PM GMT Patrick Geoffray wrote: <div><br><div>>I meant to respond to this, but got busy. You don't consider the protocol</div><div>>efficiency, and this is a major issue on PCIe.</div><div><br></div><div>Yes, I forgot that there is more to the protocol than the 8B/10B encoding,</div><div>but I am glad to get your input to improve the table (late or otherwise).</div><div><br>>First of all, I would change the labels "Raw" and "Effective" to <br>>"Signal" and "Raw". Then, I would add a third column "Effective" which <br>>considers the protocol overhead. The protocol overhead is the amount of </div><div><br></div><div>I think adding another column for protocol inefficiency makes</div><div>some sense. I am not sure I know enough to choose the right protocol performance</div><div>loss multipliers or what the common case values would be (as opposed</div><div>to best and worst case). It would be good to add Ethernet to the mix</div><div>(1Gb, 10Gb, and 40Gb) as well. Sounds like the 76% multiplier is </div><div>reasonable for PCI-E (with a "your mileage may vary" footnote). The table</div><div>cannot perfectly reflect every contributing variable without getting very large. </div><div>Perhaps you could provide a table with the Ethernet numbers, and I will do</div><div>some more research to make estimates for IB? Then I will get a draft to Doug</div><div>at Cluster Monkey. One more iteration only ... to improve things, but avoid</div><div>a "protocol holy war" ... ;-) ... </div><div><br></div><div>>raw bandwidth that is not used for useful payload. On PCIe, on the Read <br>>side, the data comes in small packets with a 20 Bytes header (could be <br>>24 with optional ECRC) for a 64, 128 or 256 Bytes payload. 
Most PCIe <br>>chipsets only support 64 Bytes Read Completions MTU, and even the ones <br>>that support larger sizes would still use a majority of 64 Bytes <br>>completions because it maps well to the transaction size on the memory <br>>bus (HT, QPI). With 64 Bytes Read Completions, the PCIe efficiency is <br>>64/84 = 76%, so 32 Gb/s becomes 24 Gb/s, which corresponds to the hero <br>>number quoted by MVAPICH for example (3 GB/s unidirectional). <br>>Bidirectional efficiency is a bit worse because PCIe Acks take some raw <br>>bandwidth too. They are coalesced but the pipeline is not very deep, so <br>>you end up with roughly 20+20 Gb/s bidirectional.</div><div><br></div><div>Thanks for the clear and detailed description.</div><div><br>>There is a similar protocol efficiency at the IB or Ethernet level, but <br>>the MTU is large enough that it's much smaller compared to PCIe.</div><div><br></div><div>Would you estimate less than 1%, 2%, 4%, ...?<br><br>>Now, all of this does not matter because Marketers will keep using <br>>useless Signal rates. They will even have the balls to (try to) rewrite <br>>history about packet rate benchmarks...<br><br></div><div>I am hoping the table increases the number of fully informed decisions on</div><div>these questions.</div><div><br></div><div>rbw<br>_______________________________________________<br>Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing<br>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf<br></div></div></div></body></html>