CharlieAI: March 2011

Tuesday, March 8, 2011

Introducing Charlie

A few years ago I decided that to fully understand the dynamics of genetic programming, I would have to implement my own framework. Beagle had proved incredibly useful, but I required full control of the entire software design, down to minute details, so I had to write my own.

Named after a dog we had when I was a kid, and after a certain pioneer in the field of evolution, I called my new framework Charlie (if you were wondering why this blog was called CharlieAI, you now have your answer!)

I will spare you the implementation details and instead list the feature set I ended up with:

Tree-based genomes
Cross-over, mutation and re-selection operators
Node and sub-tree mutation
Tournament-based selection
Ephemeral random constants
Population statistics
Lisp-style and graph in .gif file output (Graphviz)
Multi-threaded evaluation
Single and co-evolution
Hall-of-fame archives

This was all built over six months or so, but I only had a few hours of time to devote every week so the whole thing could easily have been done in something like two weeks of full-time work. I must admit, though, that it was nice to have ample thinking time between implementation steps, to ensure it all fitted well together and performed efficiently.

Next I will cover the results I got from running symbolic regression experiments with this framework. In the meantime, if you would like any more details on any of the above features or any aspects of the software, do let me know and I will attempt to cover it in a future post.

Thursday, March 3, 2011

Hall-of-fame archives

To solve the problem with strategy loops, I implemented a hall-of-fame archive - a collection of the best individual from each generation that led to a new "champion". Generations that did not beat the opposing population were ignored.

Fitness was then measured as an individual's performance against the opposing population's entire archive. Only if all members could be beaten, would a new champion emerge and get added to its population's archive. Thus, loops were made impossible and only definite progress was allowed.

One issue with this approach was the increasing CPU costs of evaluating individuals as the archives expanded, but this is the approach I still use today. It would be interesting to try only using the e.g. ten most recent, and therefore strongest, members of the archive as that would still make progress likely, although not guaranteed, at a much reduced CPU cost.

Using hall-of-fame archives resulted in much stronger Tic-tac-toe players than previously achieved. At first a new champion would be evolved every few generations, but after about a few dozen members in the archive the time between new champions increased, seemingly exponentially, and thus stagnated - even if I left the simulation running for weeks.

While the evolved programs were much stronger than any I had previously evolved, they were still not as strong as I had expected. If I played certain patterns they would block well and eventually win, but they were playing far from perfectly and were still pretty easy to beat.

I was under no illusion that my search space was not huge, but I had thought that for a game as simple as Tic-tac-toe it would not be unrealistic to evolve a perfect player, so that the game would always end in a draw or a win for the AI.

I thought there must be something I could do to improve things, but didn't know exactly what; it was back to the drawing board...

Wednesday, March 2, 2011

Coevolutionary Principles

Before I continue my project log, I wanted to point you at an excellent chapter that I came across today:

Coevolutionary Principles (PDF)

The chapter is an extensive overview of co-evolution and a summary of progress made in the last two decades.

It is by Popovici, E., Bucci, A., Wiegand, P. and De Jong E.D. and from Handbook of Natural Computing, Springer-Verlag, which will be out later this month.

Tuesday, March 1, 2011

Strategy loops

After close examination of the tactics employed by the best individuals from successive generations, I noticed that strategies that were the best in early generations got superseded, as you would expect, but then reappeared several generations later!

While this at first was somewhat puzzling there turned out to be a logical explanation: If program x is always beaten by program y, and y by program z, then that does not mean that x cannot beat z. In practice it was more like this:

program a from population A is randomly selected
program b from population B beats a
c from A beats b
d from B beats c
e from A beats d
f from B beats e
g from A beats f
...
m from A beats l
n, equivalent to f, from B beats m
repeats from step 7.

So this was why the simulation never produced strong players.

Next, I will describe how I solved this problem, but I'd be interested to hear from you if you have any suggestions.