
Monday, April 11, 2011

Symbolic regression results

An important role of this blog is to act as a journal of my experiments. From now on I will document my experiments in enough detail to allow others to compare results, and so that I can compare results myself at a later date. So here we go...

I managed to dig out some notes from my early Charlie tests. At first I experimented with the following setup:

Function nodes:                    Add, Subtract, Multiply and Divide
Terminal nodes:                    0, 1, 2 and x
Crossover probability:             90%
Reproduction probability:          9%
Mutation probability:              1%
Mutation depth:                    2
Max tree size (number of nodes):   unlimited
Max tree depth:                    unlimited

Below is a table of the experiments I ran with the above configuration. The function I was trying to evolve is on the left; population size and maximum generations are also listed, along with some comments on the results:

Function              Population Size   Generations   Result
7.2x^3 + x^2 - 3x     10000             50            No perfect solution and struggling.
7x^3 + x^2 - 3x       10000             50            Perfect solution in 16 generations
7x + x^2 - 3x         10000             50            Perfect solution in 1 generation
7.2x + x^2 - 3x       10000             50            No perfect solution
7.2x                  10000             50            No perfect solution. Best fitness score: ~442
7.2                   10000             50            No perfect solution
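For context on what those fitness scores mean: in symbolic regression, a candidate expression is scored against the target function over a set of sample points. The exact sampling scheme and error measure Charlie uses are not stated here, so the following is only a sketch under assumed choices (sum of absolute errors over evenly spaced points):

```python
# Sketch of a symbolic-regression fitness measure: sum of absolute errors
# between candidate and target over sample points. The sampling range and
# error metric are assumptions for illustration, not Charlie's exact setup.
def fitness(candidate, target, points=None):
    if points is None:
        points = [i / 2.0 for i in range(-10, 11)]  # -5.0 .. 5.0 in 0.5 steps
    return sum(abs(candidate(x) - target(x)) for x in points)

target = lambda x: 7 * x + x**2 - 3 * x   # one of the targets in the table
perfect = lambda x: x**2 + 4 * x          # algebraically identical expression
assert fitness(perfect, target) < 1e-9    # a "perfect solution" scores ~0
```

Under this convention a perfect solution has fitness 0, and scores like ~442 mean the best individual was still far from the target across the sample points.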

I was surprised that the system struggled to generate a subtree equal or close to 7.2. It evolved integers just fine but struggled with fractions. I therefore introduced two further terminal nodes, the constants 4 and 8, in the hope of guiding the evolution toward the right ballpark. I repeated the last experiment above, but still got no perfect or even approximate solution:

Function              Population Size   Generations   Result
7.2                   10000             50            No perfect solution

I got desperate and decided to introduce ephemeral random constants, which were on my to-do list anyway. I also removed the constants 4 and 8 to shrink the search space again. This resulted in the following figures:

Function              Population Size   Generations   Result
7.2                   10000             50            No perfect solution, but very close (~0.0001)
7.2x^3 + x^2 - 3x     10000             50            No perfect solution but no longer struggling.
7x^3 + x^2 - 3x       10000             50            No perfect solution.
7x + x^2 - 3x         10000             50            Perfect solution in 1 generation
7.2x + x^2 - 3x       10000             50            No perfect solution
7.2x                  10000             50            No perfect solution. Best fitness score: ~13
7.2x                  100000            43            No perfect solution but very close
7.2x                  10000             100           No perfect solution. Best fitness score: ~12
sqrt(x)               10000             50            No perfect solution. Best score: ~34
7x + x^2 - 3x + x^3   10000             50            Perfect solution in 6 generations.
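The ephemeral random constants introduced here are terminals that draw a random value once, at node-creation time, and then keep it for the node's lifetime. A minimal sketch of the idea (the value range is my assumption, not necessarily Charlie's) looks like this:

```python
import random

# Ephemeral random constant (ERC): the value is drawn once when the terminal
# node is created, then frozen. The range [-10, 10] is an assumption.
class ERC:
    def __init__(self, rng=random):
        self.value = rng.uniform(-10.0, 10.0)  # fixed at creation time

    def evaluate(self, x):
        return self.value  # independent of the input variable

random.seed(42)
a = ERC()
assert a.evaluate(0.0) == a.evaluate(5.0)  # the value never changes
```

Because each new ERC node samples its own value, the population as a whole explores a continuum of constants, which is exactly what made fractional targets like 7.2 reachable.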

The results were better, but by no means amazing. I then came across a subtle bug that meant some arguments were not forwarded to sub-trees; after fixing it, I got the following results:

Function              Population Size   Generations   Result
7.2                   10000             50            No perfect solution, but very close (~0.00001)
7.2x^3 + x^2 - 3x     10000             50            No perfect solution but no longer struggling.
7x^3 + x^2 - 3x       10000             50            Perfect solution in 12 generations.
7x + x^2 - 3x         10000             50            Perfect solution in 1 generation
7.2x + x^2 - 3x       10000             50            No perfect solution, but very close (~0.0001)
7.2x                  10000             50            No perfect solution, but very close: ~10e-6.
7.2x                  100000            31            No perfect solution but very close: ~10e-6.
sqrt(x)               10000             50            No perfect solution. Best score: ~0.036
7x + x^2 - 3x + x^3   10000             50            Perfect solution in 3 generations.
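The argument-forwarding bug described above is the kind that arises when tree evaluation fails to pass the input variable down to child nodes, so deeper terminals never see the real x. A correct recursive evaluator (a sketch in my own terms, not Charlie's actual code) forwards x at every level:

```python
# Recursive tree evaluation for a GP expression tree. The key point: the
# input x must be forwarded into every subtree, or terminals deep in the
# tree evaluate against a stale or default value.
def evaluate(node, x):
    if node == "x":
        return x
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    l = evaluate(left, x)   # forwarding x here is the essence of the fix
    r = evaluate(right, x)
    if op == "add": return l + r
    if op == "sub": return l - r
    if op == "mul": return l * r
    return l / r if r != 0 else 1.0  # protected division, a common GP choice

# (7 * x) + (x * x), i.e. 7x + x^2
tree = ("add", ("mul", 7, "x"), ("mul", "x", "x"))
assert evaluate(tree, 2.0) == 18.0
```

A bug like this silently flattens large parts of the search space, which would explain why fixing it improved nearly every row in the table.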

That was more like it! :) It was time to try the Tic-tac-toe experiments with Charlie...

Tuesday, March 8, 2011

Introducing Charlie

A few years ago I decided that to fully understand the dynamics of genetic programming, I would have to implement my own framework. Beagle had proved incredibly useful, but I required full control of the entire software design, down to the minute details, so I wrote my own.

Named after a dog we had when I was a kid, and after a certain pioneer in the field of evolution, I called my new framework Charlie (if you were wondering why this blog was called CharlieAI, you now have your answer!).

I will spare you the implementation details and instead list the feature set I ended up with:

  • Tree-based genomes
  • Crossover, mutation and re-selection operators
  • Node and sub-tree mutation
  • Tournament-based selection
  • Ephemeral random constants
  • Population statistics
  • Lisp-style and graph in .gif file output (Graphviz)
  • Multi-threaded evaluation
  • Single and co-evolution
  • Hall-of-fame archives
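Of the features above, tournament-based selection is the easiest to sketch: draw k individuals from the population at random and keep the fittest. The tournament size and the lower-is-better fitness convention below are my assumptions, not necessarily Charlie's defaults:

```python
import random

# Tournament selection sketch: sample k candidates uniformly and return the
# one with the best (here: lowest) fitness. k=4 is an assumed tournament size.
def tournament_select(population, fitness, k=4, rng=random):
    contenders = rng.sample(population, k)
    return min(contenders, key=fitness)

random.seed(0)
pop = list(range(100))                                   # stand-in individuals
best = tournament_select(pop, fitness=lambda ind: ind)   # identity as fitness
assert best in pop
```

Tournament selection is popular in GP because the tournament size k gives a single, intuitive knob for selection pressure: larger tournaments favour the very fittest, smaller ones preserve diversity.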

This was all built over six months or so, but I only had a few hours of time to devote every week so the whole thing could easily have been done in something like two weeks of full-time work. I must admit, though, that it was nice to have ample thinking time between implementation steps, to ensure it all fitted well together and performed efficiently.

Next I will cover the results I got from running symbolic regression experiments with this framework. In the meantime, if you would like any more details on any of the above features or any aspects of the software, do let me know and I will attempt to cover it in a future post.