Why Evolution Is Not a Good Paradigm For Program Induction; A Critique of Genetic Programming John Woodward and Ruibin Bai My interest We would not attempt to try to write computer programs without the constructs of 1. Reusable functions (in GP terminology ADFs), 2. Iteration (loops or repeat) 3. Memory (e.g. read-write arrays)
So why dont we allow GP this ability Where are the papers on evolving Turing Complete Programs? This suggests it is hard!!! OUTLINE 1 Revisit natural evolution 1. genetic code A T C G bases in DNA 2. Crossover primary search operator 3. re-evaluation aim of evolution? 4. limits of natural evolution what evolution cannot do easily.
5. Why does evolution seem so successful? Because it solved self-imposed problems OUTLINE 2 Re-examine genetic programming. 1. Non-biological description (mathematical) 2. Crossover unsuitable? 3. number of loops in evolved programs (very few!) 4. manipulating syntax (what about semantics) 5. stochastic search for deterministic problems
GENETIC CODE 1 4 bases (A T C G) along DNA In groups of 3 (called codons)
code for 21 amino acids (+ STOP). This is the minimum number. If only 2 bases per codon = 4*4=16 3 bases (64) and 4 bases (256) GENETIC CODE 2 The codons do not randomly map to the amino acids! They are clustered together, often the last base of the 3 in a codon is redundant. This reduces the chances of a copying error.
In fact we have a perfect code. But what if an error were to occur? GENETIC CODE 3 Amino acids have different properties acidic, hydrophobic However different amino acids are clustered together So even if a wrong amino acid was coded for the impact on the property of resulting protein is low.
Genetic code is good at the job it does. BIOLOGICAL CROSSOVER In biology, like genes are exchanged for like-genes in the crossover process. Template Skin-hair-eye color example. Parent1 brown-black-black Parent2 white-blonde-blue Child1 brown-brown-blue (possible) Child2 blue-white-blonde (highly improbably) Crossover is good for producing novel combinations of
traits in a species. This sub-space is still large RE-EVALUATION Magnets appear to repel/attract. Balls appear to roll to bottom of valley It appears the aim of evolution is to produce more like individuals. The more individuals in the current generation, the more likely the species will survive. If the number hits zero the species is extinct (and
therefore very unlikely to reappear). If everything become extinct, evolution has failed LIMITS OF NATURAL EVOLUTION Some bacteria reproduce faster than it takes to copy their DNA! Nice solution Wheels are simple from an engineering perspective, but hard for evolution. Bulldogs are artificially selected to have larger heads, which is not naturally selected, with the result bulldogs are born by cesarean
section (humans have a fontella ). WHY EVOLUTION SEEMS SUCCESSFUL Evolution has undoubtedly produced a vast array of interesting, simple/complex, creative solutions to some demanding problems. Evolution appears so successful as it is often solving selfimposed problems regarding survival. Basic problem: resource Advanced problem: competition Evolution is providing the solutions to the problems it is posing.
A biological solution to a biological problem. End of part 1 Natural evolution Start of part 2 Genetic programming.
NON-BIOLOGICAL DESCRIPTION Most crossover operators conserve the amount of genetic material, remaining faithful to biology. XO: P X P -> P X P and is just a binary operatory. Labeling thing influences the way we think about things (mathematical terminology is largely unbiased). Calling it crossover makes us think we should conserve the size of the programs. Could even be n-ary operator! Thinking biologically constrains us!!!
I have even seen post-doc using ATCG for a problem where are binary representation was perfectly okay. NON-BIOLOGICAL DESCRIPTION NON-BIOLOGICAL DESCRIPTION biological population individual or o-spring mutation operator crossover operator
selection gene chromosome genotype phenotype fitness allele mathematical multi-set of programs program unary operator
binary operator n-ary function instruction ADF program function objective value ? GP - CROSSOVER
Example Evolving a Word processor (wp) Template Font-hotkeys-input method WP1 arial-windows-dasher
WP2 courier-unix-voice Child1 courier-windows-voice (viable variation) Child2 voice-arial-unix (unviable variation) The purpose of crossover is to safely search the sub-space of viable combinations -off-spring. It is not the position of the code, but the context! NUMBER OF LOOPS - EVOLVED PROGRAMS NUMBER OF LOOPS - EVOLVED PROGRAMS loops-problem-reference
1 multiplication [10] 1 squares, cubes, factorial, Fibonacci [11] 1 even parity [12] 1 sorting, proper subtraction [6] 1 language recognition [7] ? HIV data-set [8] 2 multiplication [9] 2 maze navigation, function regression [14] 2 evolving data structures [27, 28] MANIPULATING SYNTAX
How can random changes in syntax bring about meaningful changes in semantics? We are ignoring the mapping between programs and functions. Given two operators and two function sets most GP researchers would not know which combination would be better. This interplay is what defines the landscape and ultimately the success of GP. THOUGHT EXPERIMENT
As a programmer you understand the semantics of the function set e.g. {+, -, *, /}. Imagine you were semantically blind and were only dealing with arbitrarily labeled functions {jiggle, wiggle, boggle, diddle}. In your favorite GP algorithm, the crossover component is replaced by you the semantically blind programmer. STOCASTIC SEARCH FOR NON-STOCASTIC PROBLEMS
Many of the problems tackled in the literature (for Turing Complete GP) are not stochastic E.g. sorting, multiplication, even parity. These are not even noise problems. Similar to situation with artificial neural networks and support vector machines. Would you trust a method that gave you a different answer each time? SUMMARY
Biology 1. genetic code 2. Crossover primary search operator 3. limits of natural evolution. 4. evolution is deceptively successful. Genetic Programming 5. Non-biological description (mathematical) 6. Crossover unsuitable? 7. number of loops in evolved programs (very few!)