[Beowulf] precise synchronization of system clocks
Robert G. Brown
rgb at phy.duke.edu
Tue Sep 30 08:37:12 PDT 2008
On Tue, 30 Sep 2008, Lux, James P wrote:
This is a very nice response, and I think you're on a very good track.
IIRC from discussion a few years ago, GPS can yield what, microsecond or
better timing (if used to adjust drift and resync all clocks)? In
principle sub-microsecond, since a microsecond corresponds to a
light-travel distance of order 300 meters and GPS can get you within 30
meters.
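As a rough check on that arithmetic, here is a back-of-envelope sketch
in Python. It only converts a distance into a light-travel time; the
function name is made up for illustration, and it says nothing about
receiver noise, cabling, or how a real GPS timing receiver behaves.

```python
# Back-of-envelope: convert a position uncertainty into a timing
# uncertainty using the speed of light. Names here are illustrative only.
C_M_PER_S = 299_792_458.0  # speed of light in vacuum, m/s


def light_travel_time_ns(distance_m: float) -> float:
    """Time in nanoseconds for light to cover distance_m meters."""
    return distance_m / C_M_PER_S * 1e9


if __name__ == "__main__":
    for d in (300.0, 30.0):
        print(f"{d:6.1f} m  ->  {light_travel_time_ns(d):7.1f} ns")
```

So 300 meters is about a microsecond, and 30 meters is roughly a tenth
of that, which is where the "in principle sub-microsecond" comes from.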
> The GPS synchronization problem is actually substantially easier. The
> propagation delay from satellite to receiver is varying in a very
> predictable manner (in fact, the nav solution solves for it); the signal is
> specifically designed for accurate timing (i.e. a PN code generated from a
> Cs clock is a darn good way to transmit timing and frequency information).
> Keeping it beowulf'y, if you want fine grained synchronization so that you
> don't lose performance when doing barriers, you're probably going to need
> some sort of common clock. The typical microprocessor crystal just isn't
> good enough. Actually, though, when talking about this sort of sync, aren't
> we getting close to SIMD sort of processing? Is a "cluster of commodity
> computers" actually a "good" way to be doing this sort of thing?
There is a natural synchronization driven by task advancement and
barriers already. The problem, I think, is in getting "everything else"
to be at least moderately synchronous, since it is the noise from this
that degrades the otherwise synchronous task, is it not? If one could
convince the kernel to "start" all of its housekeeping task timeslices
within (say) 1 usec worst case across all nodes, you would effectively
parallelize and synchronize this noise. I don't think it would be
necessary to shoot for true SIMD-like advancement, clock by clock --
only to drive systems into a (nearly) identical state so that one CPU is
only rarely waiting on another because of a random out-of-step timeslice
being allocated to something outside of the task. Organization OF the
task is up to the parallel programmer, I would think, but it is
generally done under the assumption that the loss of occasional
timeslices doesn't much matter, only it does.
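To make the "parallelize and synchronize this noise" idea concrete, here
is a toy Monte Carlo sketch in Python. Everything in it is an assumption
for illustration: the function name, the node count, the 100 usec of
work per barrier step, the 10 usec housekeeping slice, and the 5% chance
that a node loses a slice in any given step. The model is deliberately
crude. Each barrier step costs the work time plus the worst delay on any
node. In the unsynchronized case every node draws its own hit; in the
synchronized case all nodes share one draw, so they lose the same slice
at the same time.

```python
import random


def total_time_us(nodes, steps, work_us, slice_us, p_hit,
                  synchronized, seed=0):
    """Toy model of a barrier-synchronized job with housekeeping noise.

    All parameters are hypothetical. A step finishes when the slowest
    node reaches the barrier, so its cost is work_us plus the largest
    housekeeping delay suffered by any node during that step.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(steps):
        if synchronized:
            # One shared draw: every node loses the slice together.
            hit = rng.random() < p_hit
            delays = [slice_us if hit else 0.0] * nodes
        else:
            # Independent draws: any single unlucky node stalls everyone.
            delays = [slice_us if rng.random() < p_hit else 0.0
                      for _ in range(nodes)]
        total += work_us + max(delays)
    return total


if __name__ == "__main__":
    args = dict(nodes=64, steps=10_000, work_us=100.0,
                slice_us=10.0, p_hit=0.05)
    ideal = args["steps"] * args["work_us"]
    for label, sync in (("unsynchronized", False), ("synchronized", True)):
        t = total_time_us(synchronized=sync, **args)
        print(f"{label:15s} overhead: {100.0 * (t - ideal) / ideal:.2f}%")
```

In this model the chance that a step is delayed is p_hit when the noise
is synchronized, but 1 - (1 - p_hit)**nodes when it is not, which for 64
nodes at 5% is already close to one. That is the sense in which
occasional lost timeslices matter far more than the per-node numbers
suggest.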
However, I would think it would require a lot of work to get the
kernel(s) to respect a usec-synchronized clock, assuming that one could
constrain the hardware so that it didn't generate too much random (e.g.
interrupt) noise on its own.
Robert G. Brown Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977