Friday, October 7, 2011

Initial Objective Scores

I completed the objective scoring function that I mentioned earlier and wanted to document some initial results. I did 10 evolution runs against each opponent type, each run continuing until an individual with perfect fitness was found, and then calculated the average objective score:

Random Move: 124, 180, 108, 132, 366, 160, 108, 72, 198, 108 => 155.6
Good Move: 310, 344, 882, 138, 273, 1172, 150, 518, 128, 246 => 416.1

The Random Move opponent simply picks a random valid move each turn.

Good Move picks a winning move whenever a single move will complete three in a row, and blocks the opponent whenever they could complete three in a row with their next move. Otherwise it picks a random valid move.
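For reference, the Good Move logic amounts to something like the following sketch (my own illustrative C++ with an assumed board representation of 0 = empty, 1 = the player to move, 2 = the opponent; not the actual simulator code):

    #include <array>
    #include <cstdlib>

    using Board = std::array<int, 9>;  // 0 = empty, 1 = me, 2 = opponent

    // The 8 winning lines: 3 rows, 3 columns, 2 diagonals.
    static const int LINES[8][3] = {
        {0,1,2},{3,4,5},{6,7,8},{0,3,6},{1,4,7},{2,5,8},{0,4,8},{2,4,6}
    };

    // Returns the empty square that would complete a line for 'player', or -1.
    int completingMove(const Board& b, int player) {
        for (const auto& line : LINES) {
            int count = 0, empty = -1;
            for (int sq : line) {
                if (b[sq] == player) ++count;
                else if (b[sq] == 0) empty = sq;
            }
            if (count == 2 && empty != -1) return empty;
        }
        return -1;
    }

    // Assumes the board has at least one empty square.
    int goodMove(const Board& b) {
        int m = completingMove(b, 1);           // win if a single move completes a line
        if (m == -1) m = completingMove(b, 2);  // otherwise block the opponent
        while (m == -1) {                       // otherwise pick a random valid move
            int r = std::rand() % 9;
            if (b[r] == 0) m = r;
        }
        return m;
    }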

The setup was as follows:

Function nodes: +, -, *, /, IfThenElse, =, >, <, And, Or, MakeMove
Terminal nodes: 0, 1, 2, Ephemeral Constant (0-10), SQ_00...SQ_22
Crossover probability: 90%
Reproduction probability: 9%
Mutation probability: 1%
Mutation depth: 2
Max tree size (number of nodes): 100
Max tree depth: 30

Of course, this was more a matter of teaching the AI how to play than letting it evolve from scratch, so the true test will be co-evolution. However, I need to do some more groundwork to get that up and running in my new environment...

Sunday, October 2, 2011

A Tic-tac-toe Yardstick

After getting a new laptop and installing Linux on it (after some experimentation I settled on Ubuntu), I now have a better system to continue experimenting on. I have been spending some time converting my existing code to make use of the new C++11 features and its new standard library. The GP system now relies on the Boost library only for timing functions and is thus much more portable.

So now the focus is on evolving something that successfully plays Tic-tac-toe. Something I have wanted to introduce for a long time is a yardstick - a way to objectively measure how good my Tic-tac-toe players are. You cannot really do good science without one - how else do you know whether you have improved the system, or by how much?

Fortunately, for a simple game like Tic-tac-toe it should be possible to make the yardstick something that systematically plays all possible game permutations against the evolved players. An upper bound for the number of permutations can be computed by observing that the first move can be any of 9 possible moves, the second any of the remaining 8, then 7, and so on, for a total of 9! = 362880 permutations. Since a winner is often determined before the board fills up, the actual number of games to play through will be a lot lower. The yardstick should therefore not be too CPU-intensive and can easily be applied at the end of each run to give it an objective score: the number of wins against the yardstick player, which tries out every possible permutation.
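A minimal sketch of such a yardstick, under the assumption of two helpers - winner() to detect a finished game and playerMove() to query the evolved (deterministic) program - might look like this:

    #include <array>

    using Board = std::array<int, 9>;  // 0 = empty, 1 = evolved player, 2 = yardstick

    int winner(const Board& b);      // assumed helper: 0 = game on, 1/2 = winner, 3 = draw
    int playerMove(const Board& b);  // assumed helper: asks the evolved program for a move

    // Recursively branches on every legal yardstick move; the evolved player
    // answers deterministically. Counts the finished games the evolved player wins.
    void enumerate(const Board& b, bool yardstickToMove, int& evolvedWins) {
        int w = winner(b);
        if (w != 0) { if (w == 1) ++evolvedWins; return; }
        if (yardstickToMove) {
            for (int sq = 0; sq < 9; ++sq) {
                if (b[sq] != 0) continue;
                Board next = b;
                next[sq] = 2;
                enumerate(next, false, evolvedWins);
            }
        } else {
            Board next = b;
            next[playerMove(b)] = 1;
            enumerate(next, true, evolvedWins);
        }
    }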

Friday, August 5, 2011

Fibonacci Sequence

I wanted to test my GP framework on a more challenging problem, so a few weeks ago I decided to use my newfound computing resources to try to evolve a program that can output the first 30 numbers of the Fibonacci sequence.

My initial setup was similar to the one used in the search for even numbers - only the fitness function differed. This time, however, it did not produce a perfect solution. Elaborate approximations evolved, but at no point did a simple iterative solution - one keeping track of the last two numbers output and adding them up - appear.
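For comparison, the kind of iterative solution I was hoping to see is trivial to write by hand in C++:

    #include <iostream>

    int main() {
        long a = 0, b = 1;              // the last two numbers output
        for (int i = 0; i < 30; ++i) {
            std::cout << b << '\n';     // output the next Fibonacci number
            long next = a + b;          // add up the last two...
            a = b;                      // ...and shift them along
            b = next;
        }
    }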

I thought there might be some subtle bug in my system preventing that kind of program from being constructed, so I tried manually creating one using the function nodes available to the simulation. That worked fine, so the probability of evolving an equivalent program was evidently just very low. That led me to experiment with other pseudo-random number generators (I ended up using Boost's Mersenne twister implementation). Still no perfect solution.

I had been using a fixed seed for my random number sequences, so the final thing I tried was changing the system to use the system clock as the seed in order to get variation between successive runs. I ran the final setup with large populations and hundreds of generations, ten times over. Still no perfect solution - just very good approximations.
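The change itself is a one-liner; with Boost's Mersenne twister it looks roughly like this (names differ slightly in my actual code):

    #include <boost/random/mersenne_twister.hpp>
    #include <ctime>

    // Seed the generator from the wall clock so successive runs differ.
    boost::mt19937 rng(static_cast<unsigned int>(std::time(0)));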

I finally concluded that the simulation is very likely to generate a good approximate solution early on and thus gets stuck in a local maximum: progressive improvements do not move the population in the direction of a perfect solution, and the distance to the perfect solution is so vast that arriving at it by chance is extremely improbable.

The experiment underlined how important it is to provide an environment with gradual selection pressure if one wants to succeed with genetic programming.

Sunday, May 29, 2011

Cloud computing to the rescue

I've been playing with the thought of getting a fast, reliable computer for my experiments for a while, and a couple of weeks ago I found the perfect solution: Amazon Elastic Compute Cloud. You launch computer instances on an as-needed basis, and within a few seconds they are up and running with whatever OS and software you have configured on them.

It is relatively cost-effective compared to buying and maintaining your own hardware. I only need computing power from time to time - but usually a lot of it at once - and this way I can launch as many cores as I need, whenever I need them! :)

Their Linux instances are the most powerful, so I've spent some time porting my GP code, as a few components were not as portable as I would have liked (I was using Windows-specific threads and timers). It is now all Boost-based, and today I got everything running, so I can't wait to try some more experiments on fast CPUs.

Saturday, May 7, 2011

Loops in Genetic Programming

Dealing with loops and other iterative constructs in GP is far from trivial. You have to deal with closure, and you have to avoid infinite or lengthy fitness evaluations. It only takes a few nested loops of a million iterations each in your individuals to grind everything (nearly) to a halt.

This is perhaps why loops have not seen much use in the GP field. "Experiments with Explicit For-loops in Genetic Programming" by Vic Ciesielski and Xiang Li provides good coverage of related work.

I ended up using a binary variant of protected for-loops, where the first subtree provides the number of iterations and the second subtree, the body of the loop, is then iterated. I did not restrict the number of loop nodes per program, but instead enforced an overall limit of 1000 iterations per program.

To give the programs something to work with, I also introduced variable nodes to read and write two variables, X and Y. X was always set to the current iteration of a loop.
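In evaluator terms, the protected loop behaves roughly like the sketch below (illustrative only; Node, State and evaluate() are stand-ins for the real framework types):

    #include <vector>

    struct State { int x, y; };  // assumed interpreter state holding X and Y
    struct Node {                // stand-in tree node
        std::vector<Node> children;
        const Node& child(int i) const { return children[i]; }
    };
    double evaluate(const Node& n, State& s, int& budget);  // assumed evaluator

    // Evaluates a binary Loop node: child 0 yields the iteration count and
    // child 1 is the body. 'budget' is shared across the whole program, so
    // nested loops cannot exceed the overall limit of 1000 iterations.
    double evalLoop(const Node& node, State& state, int& budget) {
        int n = static_cast<int>(evaluate(node.child(0), state, budget));
        double result = 0.0;
        for (int i = 0; i < n && budget > 0; ++i, --budget) {
            state.x = i;  // X is set to the current iteration
            result = evaluate(node.child(1), state, budget);
        }
        return result;
    }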

This allowed me to evolve a perfect solution to the "100 First Even Numbers" sequence in just 8 generations with a population of 10,000! :) The run parameters as well as the tree of the perfect solution can be seen below:


Function nodes: Add, Subtract, Multiply, Divide, SetX, SetY, OutputNumber, Loop
Terminal nodes: 0, 1, 2, Ephemeral Constant (0-10), GetX, GetY
Crossover probability: 90%
Reproduction probability: 9%
Mutation probability: 1%
Mutation depth: 2
Max tree size (number of nodes): 30
Max tree depth: 10
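To give a feel for what a solution over this node set has to compute, here is a hand-written equivalent in plain C++ (my own sketch, assuming SetY returns the value it stores - not the actual evolved tree):

    #include <iostream>

    // One possible tree shape:
    //   Loop(Multiply(10, 10),                  // 100 iterations from ephemeral constants
    //        OutputNumber(SetY(Add(GetY, 2))))  // body: bump Y by 2 and output it
    int main() {
        int y = 0;                           // Y starts at zero
        for (int x = 0; x < 10 * 10; ++x) {  // Loop node; X is the iteration counter
            y = y + 2;                       // SetY(Add(GetY, 2))
            std::cout << y << '\n';          // OutputNumber(GetY)
        }
    }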


Tuesday, May 3, 2011

Sequences

I wanted to do some experiments with simpler problems, so last night I experimented with evolving individuals that generate a sequence of numbers. Fitness is based on how many of these numbers match a predefined target sequence. Conceptually, sequences of numbers are a bit like board games, which in a way just boil down to sequences of moves...
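The fitness measure itself amounts to something like this sketch (illustrative C++, not the framework's actual code):

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Counts position-by-position matches between a program's output and
    // the target sequence; higher is fitter.
    int sequenceFitness(const std::vector<int>& output,
                        const std::vector<int>& target) {
        int hits = 0;
        std::size_t n = std::min(output.size(), target.size());
        for (std::size_t i = 0; i < n; ++i)
            if (output[i] == target[i]) ++hits;
        return hits;
    }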

Function nodes: Add, Subtract, Multiply, Divide, SetX
Terminal nodes: 0, 1, 2
Crossover probability: 90%
Reproduction probability: 9%
Mutation probability: 1%
Mutation depth: 2
Max tree size (number of nodes): 300
Max tree depth: 100

I experimented with the sequence of even numbers. The system struggled to evolve long sequences correctly, but produced perfect solutions for up to the first 6 even numbers (2, 4, 6, 8, 10, 12) after a few minutes of computation.

Note that the "index" was not provided to the population - the programs simply had to add numbers to a list, via a SetX function node, and the list was then examined in full afterwards.

At first I limited the tree depth to 30, which turned out to be too stringent. Expanding the limit to 100 was sufficient for up to the first 6 numbers.

After examining the trees containing perfect solutions, I'm starting to see why the trees need to be so deep. Even coding the program tree manually, I would not be able to create a much shallower tree with the function nodes provided; I would need a Loop node. So that is what I will try to add next.

But that got me thinking about the Tic-tac-toe experiment. I had avoided adding a Loop node in order not to make the search space too large. In my mind, the solution I expected to evolve was a giant if-then-else-if-else-if... block covering all the possible positions to block against. But the opponent has quite a few possible "near-wins": 3 horizontal, 3 vertical and 2 diagonal lines, each of which can hold two pieces in 3 different ways, giving 8 x 3 = 24 positions to block against. Given that the Sequence experiment struggled with anything beyond 6 numbers, it is not surprising that the Tic-tac-toe opponents couldn't cover most winning strategies.

So a Loop function node may actually help quite a lot. Alternatively, a FunctionList node may do the trick - i.e. something with high arity that calls a number of subtrees in turn, similar to calling functions in sequence in a procedural programming language (making the tree wide instead of deep). Automatically Defined Functions would also be interesting to add, to improve reuse of code.

But first things first, so I will try with Loops initially.

Monday, May 2, 2011

Tic-tac-toe with Charlie

I first attempted the Tic-tac-toe experiment, using Charlie, with the same setup as the symbolic regression experiment, plus a MakeMove function node and terminal nodes representing the status of each square on the board:

Function nodes: Add, Subtract, Multiply, Divide, MakeMove
Terminal nodes: 0, 1, 2, Ephemeral Constant (0-10), SQ_00, SQ_01, SQ_02, SQ_10, SQ_11, SQ_12, SQ_20, SQ_21, SQ_22
Crossover probability: 90%
Reproduction probability: 9%
Mutation probability: 1%
Mutation depth: 2
Max tree size (number of nodes): unlimited
Max tree depth: unlimited

That didn't produce very good results - as expected, since there were no conditional function nodes. So I introduced If-Then-Else, Equals, Greater-Than, Less-Than, And and Or operators and immediately got more promising results.

However, it was still very easy to beat the evolved opponents, so I went ahead and made various performance optimisations, including making the simulation multi-threaded. Still not very good results.

I then tried running the setup with huge populations, on a 64-bit OS with lots of RAM, for several weeks, but still the best opponents were mediocre and very easy for humans to beat. They frequently made big blunders, and you didn't have to be particularly careful to ensure a win against them.

All in all, very similar results to the initial Beagle experiment, so quite disappointing. That brings my journal up to the present. Yes, I'm stuck with this issue of the simulation stalling before strong players are evolved. I'm going to try to think of some new approach or tweak to the setup to improve things. Do let me know if you have any suggestions!