[Beowulf] Go-playing machines
Robert G. Brown
rgb at phy.duke.edu
Wed Jun 25 07:10:31 PDT 2008
On Wed, 25 Jun 2008, Vincent Diepeveen wrote:
> Chess has indeed more than 10^40 positions. Latest mathematical calculation
> as published in ICGA journal came to 10^43.
> There is 2 points:
> a) playing the best move doesn't require searching the entire search
> so who knows maybe searching 10^18 nodes with a huge hashtable and
> good evaluation function,
> we might already play the best move.
:-) That's true and funny at the same time. "Only" 10^18 positions, at
"only" a nanosecond each takes "only" a gigasecond, which since a year
is \pi x 10^7 seconds IIRC, is "only" around \pi x 10 Gigasearch-years.
It is a difficult problem, in other words. It's what makes it fun, I
suppose. And equally obviously, human brains don't work anything like
this way on it. To me it is more fun to speculate on how the human
brain does it WITHOUT anything like the capability to process
giga-giga-trees. It clearly prunes things down to some much more finite
order extremely quickly by discarding nearly all of this space from the
beginning, and YET the space that remains is often the good one to the
best that even Big Blue can parse out otherwise.
> b) To get to 10^43 calculation power, we do not need a processor that
> exists out of 10^43 particles. All we need is
> a processor that is hosting 10^43 internal movements, which is a lot easier
> to get acomplished, than building that
> cpu of 10^43 in size.
I'm not sure what you mean by that. All silicon processing has the
usual serial and parallel components and accomplishes work (per
processor) with a gigahertz scale serial barrier, leaving it many, many
orders of magnitude short even in parallel. Yes, neural network
processing in the brain has way fewer (maybe 10^11) neurons, but each
neuron has thousands of synaptic connections to other neurons and the
network formed has staggering complexity. Big enough that the chess
problem could actually fit in its entirety inside with room enough for
the go problem left over. And still be nearly "empty", and work
in an extremely parallel fashion.
There aren't a whole lot of candidate architectures for a parallel
processing system that can do this sort of thing, but I agree that this
is "all" we need. It's just not terribly easy to accomplish, because
the problem is hard (and fun).
One encounters numbers that scale like this in physics (stat mech) quite
often, actually. If you work out the states in e.g. Monte Carlo
sampling of a very simple 100x100x100 simply cubic Ising ferromagnet
lattice (where each site contains a spin that can hold one of two
states) the number of states in the system is 2^(1,000,000). This is "a
big number" at least at first glance.
If one then traces through "moves" through the system from any given
starting state through all the various intermediary states to all the
possible final states, one creates a space with e.g. [2^1000000]^N
configurational trajectories for N steps. Ooo, maybe THAT is a big
number, at least if N is reasonably large (say, a few thousand).
Out of all of those states, a MUCH MUCH SMALLER set of states represents
the only ones that are "likely" to be seen if the system is in
equilibrium at any given temperature. How to find this needle in a very
"large" (and yet tiny -- we're talking a mere million atoms when
physical systems big enough to see contain, um, a lot more than this and
have continuous spins instead of discrete ones) haystack?
Importance sampling Monte Carlo lets you discard nearly all moves in a
Markov process that takes you from any starting configuration to the
"relevant" part of phase space rather quickly (considering) and then
samples that space effectively. It rejects far, far, far more states
than there are particles in the Universe with a metaphorical "sniff" as
being absurdly unlikely and homes right in on the likely suspects.
I think that this is a much better model, or metaphor, for how humans
think than search trees. Humans really suck at search trees, with their
fundamentally serial attention focus and conscious processing speed
measured in tens of transactions per second on a good day. We excel at
"intuition", at the instantaneous parallel pruning of enormous spaces to
the relevant part.
> In technical areas that people like to hear about, it is impossible as a
> single person to fight unpaid all the desinformation.
> The biggest problem there is professors who seek publicity, or cover up for
> total nonsense thesis of their students.
> I tried to have some sort of discussion with Jonathan Schaeffer about a
> student of him quoting MTD is ok in his thesis.
> What i skipped in fact is the total nonsense about parallel speedup in game
> tree searching from his student idea being:
> make the most inefficient searcher on planet earth, then parallellize it and
> of course because even random move orderings
> would improve your program then, also the parallellism gets a decent speedup
> as a result of all that inefficiency;
> the classical approach.
> Also MiT is notorious bad in game tree search.
> I will give yet 1 more example.
> I remember they claimed in articles a great speedup (50% or so even) in game
> tree search at supercomputers.
> Now my chessprogram has a lot of chessknowledge and also is doing all sorts
> of mobility, which requires expensive
> scans. That slows down the nodes per second that my chessprogram Diep can
> The search frame from MiT slows down the searchspeed of their chessengine by
> a staggering factor 40.
> At the same SGI Origin3800 hardware in fact my slow Diep engine gets MORE
> nodes per second, and a lot more,
> than MiT's cilkchess chess engine.
> Not because single core mine is any faster. In fact single core cilkchess
> without cilkframe used to be factor 20.
> Additionally there is nowhere anything statistical significant done there.
> All speedups of supercomputers are never carrying
> a table with what is its speedup with 95% sureness.
> Doing publications just to show how it should be done and disprove and prove
> a number of algorithms (especially disprove nonsense),
> is already quite a tough challenge and i can be busy for years there, as
> there is hundreds of such publications, just from the past few
> years to disprove and show the better alternative.
> It's a technical science, there is not many who really are expert there,
> progress goes *real* fast.
> In the end what matters for engines is not how strong it plays against you,
> it matters how strong you play against competitors :)
> On Jun 25, 2008, at 7:14 AM, Robert G. Brown wrote:
>> On Wed, 25 Jun 2008, Vincent Diepeveen wrote:
>>>> one player's position -- or one of those distant pieces placed early in
>>>> the game -- causes their entire effort to "unravel" and turn into a
>>>> disaster. That's almost twice the number of plys in an entire chess
>>>> game, and is still only the first third or so of the game.
>>> See above for correct calculation.
>>>> If you or your friend disagree with this, well, feel free to edit the
>>>> wikipedia article(s) with examples that contradict it, but the
>>>> mathematics and difficulty of pruning the Go trees suggest that it
>>> See above for disproof of that.
>> Very educational and interesting. But I was also serious -- if you can
>> (and it sounds like you can) you should consider editing the wikipedia
>> I'm not COMPLETELY convinced -- I'll be a lot more convinced when I can
>> get a GPL Go engine that will play a decent game on my laptop...;-)
>> Robert G. Brown Phone(cell): 1-919-280-8443
>> Duke University Physics Dept, Box 90305
>> Durham, N.C. 27708-0305
>> Web: http://www.phy.duke.edu/~rgb
>> Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
>> Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977
Robert G. Brown Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977
More information about the Beowulf