[Beowulf] onload vs offload
deadline at eadline.org
Sun Sep 25 17:50:32 PDT 2016
I just wrote a white paper about this for InsideHPC and Mellanox.
I did not do any benchmarking and relied on data
from Mellanox (the focus was the co-design concept).
You can download the white paper (you will probably pay with your email address).
Or the various sections of the paper are printed openly on the
InsideHPC site. Just search "Eadline" on InsideHPC and it will
give a list of all the articles.
Two things I find interesting are 1) the Mellanox offload
into the fabric, and 2) the trends in processor clock rate.
Faster clock rates are better for on-loading, but
as more cores are crammed onto processors, clock rates
are actually dropping; if you stay with low core counts
you can get faster cores, of course.
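The appeal of offload is that the collective can progress in the fabric while the host CPU does useful work. A minimal sketch of that communication/computation overlap pattern, using a Python thread as a hypothetical stand-in for a nonblocking collective like MPI_Iallreduce (this is NOT real RDMA or fabric offload, just the overlap idea):

```python
# Sketch of communication/computation overlap, the benefit that
# fabric offload provides without burning host CPU cycles.
# The thread below is a hypothetical stand-in for a nonblocking
# collective (e.g. MPI_Iallreduce); names here are illustrative only.
import threading
import time

def fake_allreduce(buf, result):
    # Pretend the reduction progresses "in the fabric" while we compute.
    time.sleep(0.05)            # stand-in for network latency
    result.append(sum(buf))     # stand-in for the arriving reduction

buf = list(range(1000))
result = []

# Post the "collective" (like MPI_Iallreduce), then keep computing.
req = threading.Thread(target=fake_allreduce, args=(buf, result))
req.start()

local = sum(x * x for x in buf)  # useful work overlapped with "comm"

req.join()                       # like MPI_Wait
print(result[0], local)
```

With on-loading, the work inside fake_allreduce would instead consume host cores, which matters more as per-core clock rates drop.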
> I was reviewing some rather fetid marketing collateral
> about this topic, and finding mostly stuff from 2010ish.
> A lot has changed since then: onboard PCIe, CPU speed,
> inter-socket bus, NUMA sensitivity of the kernel, lots
> more cores, mem BW, presumably smarter applications, etc.
> Does anyone have comments on recent generations of onload
> vs offload interconnect performance? Please don't respond
> unless it's recent and fully quantified (HW config, how
> measured, etc).
> I'd also be interested to hear from MPI/app people about how useful
> offload really is (how often can real apps leverage RDMA ops,
> or the simple sorts of collectives that are offloadable?)
> As keeper of probably the oldest living Quadrics system, I appreciate
> the appeal of offload. OTOH, there's no question that onloading puts
> a lot of performance potential into the CPU-designer's hands...
> thanks, mark hahn.
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing