[Beowulf] help on island model
Robert G. Brown
rgb at phy.duke.edu
Sun Mar 12 05:06:16 PST 2006
On Sat, 11 Mar 2006, Jeremy Baker wrote:
>> In spite of it being in "unfit" individuals only,
>> if you lose those individuals you lose any possibility of ever getting
>> that gene right!
> An artificial Founder's Effect.
> Vermont's sugar maple trees are migrating north towards Canada. In the new
> territory, the ideal genotype for the region's soil and weather factors
> may never evolve c/o natural selection, if during the transition phase
> those trees containing the ideal genes for Canadan soil were not yet
> generally dispersed over the entire matrix's meta population.
> I have read that the discovery of more protein iterations than does exist
> genes may encourage humans to rethink the Central Dogma theory of one gene
> one protein. Advantageous phenotype behavior may not always result as a
> causual relationship with a genetic change as drastic as a new gene locus,
> if spliseosomes have epigenetically caused the beneficial behavior.
> I guess the idea is that nature is sometimes recursive and that the island
> model can be used to help explain gene behavior? If someone where studying
> genes, I would think their model would be constrained to their specie of
> study, and the behavior between paramecium and trees probably would have
> radical chromosome differences.
A couple of comments.
First, remember that a GA isn't quite the same thing as Darwinian
evolution as realized in nature. Genes (or base pair triplets) as
templates for protein generation, chromosomes as blueprints for lifeform
assembly out of gene-encoded proteins, and the feedback of e.g.
radiation, oxidative processes, metabolic processes, the environment may
well contain all sorts of improvements on the "basic" GAs used in
computer science, that are generally the simplest sorts of
shuffling/selection mechanisms. Gender, for example, is present in
nearly all complex life forms, including plants, where there is little
or no biological dimorphism or different "roles" for the sexes to
explain the advantage. Clearly there is an evolutionary advantage of
some sort to having two sexes at the algorithmic level (but not three,
four, six, at least that "nature" has discovered) but I'm probably one
of a very few people that has even experimented with algorithmic sexual
differentiation in GAs (published attempts so far having yielded null
results for improved efficiency) and I don't publish; this is all work I
do prospecting for gold as it were.
So I don't think that island model explains gene behavior in nature at
all -- I think it is what I said it was -- a nature-inspired attempt to
beat a brutal and entirely practical limitation on GAs used within
numerical optimization engines.
Second, along the same lines, paramecia and trees are infinitely complex
as far as modern GAs are concerned. I have some damned good GA code, GA
code with a bunch of bells and whistles I've "invented" and added to it
to try to deal with the premature convergence, miss the optimal valley
problem. It's good enough to "solve" a bunch of highly nonlinear
problems of fairly low dimensionality -- extremely complex landscapes --
by e.g. optimizing the weights of a neural network. It also does
measureably, significantly "better" at finding solutions to problems
with very high projective dimensionality on the inputs -- as many as
hundreds of variables -- than standard (non-enhanced) GAs.
This is roughly the level of problem complexity managed by an RNA virus
(order of 10-10^2 genes, ~10 kilobases if you think that way). Except
that the biological mechanisms of transcription into amino acids bound
into proteins and so on contain a LOT of incidental complexity in the
HIGHLY nonlinear problem space viruses inhabit. However, if there is a
"good" model for computational GAs in nature it would be viruses, as
viruses, like computer programs, rely on an externally provided
"computer" (protein factories in cells that are NOT encoded into the
virus itself) to transform the codons into a structure that can be
subjected to selection.
Alas, at this level of complexity viruses do not generally engage in
"sex" (crossover) and viruses, like so many biological systems, are more
concerned with remaining STABLE on a slowly varying niche optimum than
they are wildly sampling the bulk of their genetic variability to seek
newer better optima in other niches. So even here the metaphor is
strained -- many GAs are like sexy viruses risking everything on trying
to "crack" the optimum of some particular artificially selected niche.
To summarize, GAs, like neural networks, are METAPHORS of biological
processes transmogrified to human algorithmic purposes in information
processing, and the Island model is a nature-inspired attempt to extend
the metaphor to address the specific problem of trying to shuffle a
"perfect hand" out of a finite deck of cards (where you are forced to
systematically discard cards from each round of the game, some of which
might be needed by that perfect hand) by permitting the game to be
played at several tables at once, each with their own deck, and
periodically replicating players with the best hands so far between the
tables to "replenish" prematurely depleted codons, one hopes.
With a few tweaks, a really good metaphor, actually -- for many problems
the solution space has an unknown but potentially large degeneracy (like
a royal flush in any one of FOUR suits having equal, optimal, weight).
It does no good to send a player with the best hand so far (in a suit
that HAPPENS to be clubs) to a table where the best hand so far HAPPENS
to be spades, especially if you have waited until all the good spades
have been eliminated at the club table so it isn't that likely that one
of the "extra" cards in the best club hand is the ace of spaces.
I personally prefer a "tribal" model to an Island model per se -- where
a tribal model is in systematically closer contact than an Island model
and where special games are played to permit tribes to conserve
different codons pools while still remaining in sufficiently close
contact to be optimizing the same "species" -- or "suit".
Either way the problem remains -- stability/convergence and exploration
(exploration of a VERY VERY LARGE SPACE -- 100 BINARY variables
represents 2^100 individual cells, about 10^30 possible states -- with
100 N-ary discrete or continuous ordinal or non-ordinal variables far
worse) are fundamentally at odds in the optimization process, and if you
make a model good at one it is bad at the other. It is very difficult
indeed to make a general purpose GA that is good at both, and some very
real but (still, as far as I can tell from the literature) undiscovered
scaling laws in the no-free-lunch space that this represents. I have
some ideas on how to build the "ideal" GA for solving at least certain
classes of problems and perhaps one day soon I will have the time to
explore them within MY code base... my base of real C code:-)
All that I can say is that anybody who has actually played with GAs
and/or studied biogenetics could never, ever, doubt the generalized
theory of evolution -- reproduction (with any mechanism at all of
variability) mixed with selection is a very, very powerful optimization
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
More information about the Beowulf