[Beowulf] precise synchronization of system clocks
Lux, James P
james.p.lux at jpl.nasa.gov
Mon Sep 29 15:14:55 PDT 2008
> -----Original Message-----
> From: beowulf-bounces at beowulf.org
> [mailto:beowulf-bounces at beowulf.org] On Behalf Of Lombard, David N
> Sent: Monday, September 29, 2008 2:21 PM
> To: Prentice Bisbal
> Cc: Beowulf Mailing List
> Subject: Re: [Beowulf] precise synchronization of system clocks
> On Mon, Sep 29, 2008 at 01:10:49PM -0700, Prentice Bisbal wrote:
> > In the previous thread I instigated about running services in cluster
> > nodes, there was some mention of precisely synchronizing the system
> > clocks, and this issue is also mentioned in this paper:
> > "The Case of the Missing Supercomputer Performance: Achieving Optimal
> > Performance on the 8,192 Processors of ASCI Q" (Petrini, Kerbyson and
> > Pakin) http://hpc.pnl.gov/people/fabrizio/papers/sc03_noise.pdf
> > I've also read a few other papers on the topic, and it seems you need
> > to sync the system clocks to ~1 us. On top of that, I imagine you also
> > need to sync the activities of each system so they all stop to do the
> > same system-level tasks at the same time.
> The IEEE-1588 "Precision Time Protocol" can provide such
> levels of global clock synchronization.
1588 is truly one of the more useful things to come out in recent years. I just wish there were more hardware that supported it. The alternatives are all quite clunky: IRIG time codes + 1pps, or Ethernet (NTP) + 1pps, or something similar.
There is great value in using the cable you've already got hooked up to carry both the sync and the data needed to interpret it.
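For anyone who hasn't looked at how 1588 gets there, the core arithmetic is small. What follows is only a rough sketch of the delay request-response calculation, assuming a symmetric path delay. The function name and the example numbers are made up for illustration and don't come from any particular PTP implementation.

```python
# Sketch of the IEEE 1588 delay request-response arithmetic.
# ptp_offset_and_delay and the sample timestamps are hypothetical,
# chosen only to illustrate the calculation.

def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate slave clock offset and one-way path delay.

    t1: master sends Sync            (read on the master clock)
    t2: slave receives Sync          (read on the slave clock)
    t3: slave sends Delay_Req        (read on the slave clock)
    t4: master receives Delay_Req    (read on the master clock)

    Assumes the path delay is the same in both directions; any
    asymmetry shows up directly as an error in the offset estimate.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2   # slave minus master
    delay = ((t2 - t1) + (t4 - t3)) / 2    # one-way path delay
    return offset, delay


# Made-up example in nanoseconds: the slave clock runs 1500 ns ahead
# of the master, and the one-way cable delay is 400 ns.
t1 = 1_000_000                 # master clock
t2 = t1 + 400 + 1500           # slave clock: delay plus offset
t3 = t2 + 10_000               # slave clock: some time later
t4 = (t3 - 1500) + 400         # master clock: remove offset, add delay

offset_ns, delay_ns = ptp_offset_and_delay(t1, t2, t3, t4)
print(offset_ns, delay_ns)
```

The slave then steers its clock by the estimated offset. How good the result is depends mostly on how close to the wire the four timestamps are taken, which is why hardware support matters so much.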
As a shameless plug for my own work: SpaceWire (which is unlikely to be used in HPC, even if it did originate with Transputers) has time codes and sync built in, in part because I kept pushing the folks doing the ECSS spec and giving them good practical examples of it being useful (http://spacewire.esa.int/content/Standard/Standard.php). It's good to a microsecond or so, depending mostly on the raw information rate.
James Lux, P.E.
Task Manager, SOMD Software Defined Radios
Flight Communications Systems Section
Jet Propulsion Laboratory
4800 Oak Grove Drive, Mail Stop 161-213
Pasadena, CA, 91109