[Beowulf] Checkpointing using flash
john.hearns at mclaren.com
Tue Oct 2 01:38:09 PDT 2012
Regarding fault tolerance, this sounds interesting.
I haven't had a chance to do more than glance at the web page, though (9:30 UK time and I need my coffee).
"A Perspective on Exploiting Heterogeneous Fault-Tolerant Parallelism for HPC clusters and Supercomputers"
Unique Digital Inc. Conference Center
Unique Digital, Inc.
10595 Westoffice Drive
Houston, TX 77042
REGISTRATION IS FREE - Seating is limited
Sponsored by MBA Sciences, Inc. (http://www.mbasciences.com)
With the growth in the size of data, graphs, and scientific computing workloads, the ability to rapidly prototype and deploy robust parallel solutions by leveraging multiple servers/nodes, each containing multiple devices like multicore chips and GPUs, is a key enabler for raising productivity and accelerating discovery. In this talk, we will review a novel programming runtime environment, Emerald, that augments the OpenMPI infrastructure to better express the management and execution of intra-node heterogeneous parallelism.