CharlieAI: Co-evolution

Showing posts with label Co-evolution. Show all posts

Thursday, March 3, 2011

Hall-of-fame archives

To solve the problem with strategy loops, I implemented a hall-of-fame archive - a collection of the best individual from each generation that led to a new "champion". Generations that did not beat the opposing population were ignored.

Fitness was then measured as an individual's performance against the opposing population's entire archive. Only if all members could be beaten, would a new champion emerge and get added to its population's archive. Thus, loops were made impossible and only definite progress was allowed.

One issue with this approach was the increasing CPU costs of evaluating individuals as the archives expanded, but this is the approach I still use today. It would be interesting to try only using the e.g. ten most recent, and therefore strongest, members of the archive as that would still make progress likely, although not guaranteed, at a much reduced CPU cost.

Using hall-of-fame archives resulted in much stronger Tic-tac-toe players than previously achieved. At first a new champion would be evolved every few generations, but after about a few dozen members in the archive the time between new champions increased, seemingly exponentially, and thus stagnated - even if I left the simulation running for weeks.

While the evolved programs were much stronger than any I had previously evolved, they were still not as strong as I had expected. If I played certain patterns they would block well and eventually win, but they were playing far from perfectly and were still pretty easy to beat.

I was under no illusion that my search space was not huge, but I had thought that for a game as simple as Tic-tac-toe it would not be unrealistic to evolve a perfect player, so that the game would always end in a draw or a win for the AI.

I thought there must be something I could do to improve things, but didn't know exactly what; it was back to the drawing board...

Wednesday, March 2, 2011

Coevolutionary Principles

Before I continue my project log, I wanted to point you at an excellent chapter that I came across today:

Coevolutionary Principles (PDF)

The chapter is an extensive overview of co-evolution and a summary of progress made in the last two decades.

It is by Popovici, E., Bucci, A., Wiegand, P. and De Jong E.D. and from Handbook of Natural Computing, Springer-Verlag, which will be out later this month.

Tuesday, March 1, 2011

Strategy loops

After close examination of the tactics employed by the best individuals from successive generations, I noticed that strategies that were the best in early generations got superseded, as you would expect, but then reappeared several generations later!

While this at first was somewhat puzzling there turned out to be a logical explanation: If program x is always beaten by program y, and y by program z, then that does not mean that x cannot beat z. In practice it was more like this:

program a from population A is randomly selected
program b from population B beats a
c from A beats b
d from B beats c
e from A beats d
f from B beats e
g from A beats f
...
m from A beats l
n, equivalent to f, from B beats m
repeats from step 7.

So this was why the simulation never produced strong players.

Next, I will describe how I solved this problem, but I'd be interested to hear from you if you have any suggestions.

Sunday, February 27, 2011

Co-evolution

The main reason for introducing co-evolution was to provide gradual selection pressure. Instead of just evolving one population against a fixed fitness measure I instead setup two populations, A and B, each initialised with random individuals. I then picked a random individual from A and evolved B until it produced a player that could beat the individual from A. This individual, now the strongest from B, was used to evolve A. Once some player from A could beat it, I swapped again and so forth:

Select random individual from A, Best of A, to get things started.
Evolve B until it can beat Best of A and thus select Best of B.
Evolve A until it can beat Best of B and thus select Best of A.
Repeat from step 2.

This meant that small steps were continually taken to improve the AI and, in theory, I should have ended up with much stronger players than when evolving against Mr Random, as per my previous post on rule inference. That was not the case however - when playing against the best individuals it was evident that they were pretty poor, not much stronger than the ones evolved against Mr Random.

I tried much larger populations and leaving it running for several days on more powerful PCs but to no avail, so I decided to analyse the evolution process and the individual generations in more detail...