[Beowulf] Checkpointing using flash
eugen at leitl.org
Fri Sep 21 12:13:09 PDT 2012
On Fri, Sep 21, 2012 at 01:09:41PM -0400, Ellis H. Wilson III wrote:
> On 09/21/12 12:58, Lux, Jim (337C) wrote:
> > Yes.. If that's the frequency of checkpoints. I was thinking more like 1
> > checkpoint per second or 10 seconds.
> While I suppose checkpoints that frequent might exist somewhere in the
> wild, I've never heard of checkpoints at that short an interval. These huge
> cluster checkpoints are near to the entire memories, so even today we're
> talking near to 64 or 128 GB of RAM per node. In ten years we're
Exascale will likely be ARM-like SoCs with stacked memories, including
nonvolatile ones (phase change, spintronics, whatever). At >100 GByte/s
memory bandwidth you can snapshot at ~Hz rates without much of a penalty.
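
A back-of-envelope sketch of that arithmetic, in Python. The 100 GByte/s
figure and the 64 GB, 128 GB and 1 TB node sizes come from this thread; the
function name and the assumption that a snapshot is purely bandwidth-bound
(no contention, no overlap with compute, decimal units) are illustrative only.

```python
def snapshot_seconds(mem_gbytes: float, bandwidth_gbytes_per_s: float) -> float:
    """Time to copy a node's memory once, assuming the copy is bandwidth-bound."""
    if bandwidth_gbytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return mem_gbytes / bandwidth_gbytes_per_s


# Assumed memory bandwidth in GByte/s (the ">100 GByte/s" lower bound).
BANDWIDTH = 100.0

# Per-node state sizes in GB: 1 MB, 64 GB, 128 GB, 1 TB.
for mem in (0.001, 64.0, 128.0, 1000.0):
    t = snapshot_seconds(mem, BANDWIDTH)
    print(f"{mem:>8} GB/node: {t:.6f} s per snapshot")
```

Under these assumptions 64 to 128 GB per node lands at roughly one snapshot
per second, 1 TB per node at roughly one per ten seconds, and MB-sized state
at microseconds.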
> talking what, near to if not above a TB of RAM per node? Moreover, they
I'd rather have MB/node or less.
> all tend to write their checkpoint at the same time and the SSDs aren't
> on the compute nodes -- they're on some intermediate I/O storage nodes
The forthcoming ARM SoCs typically have an mSATA SSD at each node.
> (akin to BlueGene's intermediate layer). So we're talking about huge
> cluster-wide dumps of data to the flash intermediate layer, which then
> takes some hours to dump that data down to the more persistent HDDs.
> This takes at the very least many minutes, and in the normal case hours.
> I would not be surprised if the best they could do at exascale was one
Exascale won't look like today's clusters. Can't look like today's
> checkpoint a day. Again, I don't think these are used as the front-line
> of defense against failures. That would really suck :D.