Fig. 2. The initial sensor setup, which is kept throughout the evolutionary
run for those runs where sensor parameters are not evolvable. Here, the car
is seen in close-up moving upward-leftward. At this particular position, the
front-right sensor returns a positive number very close to 0, as it detects a
wall near the limit of its range; the front-left sensor returns a number close
to 0.5, and the back sensor a slightly larger number. The front, left and right
sensors do not detect any walls at all and thus return 0.
range 200 pixels, as has three sensors pointing forward-
left, forward-right and backward respectively. The two other
sensors, which point left and right, have reach 100; this is
illustrated in figure 2.
B. Neural networks
The controllers in the experiments below are based on
neural networks. More precisely, we are using multilayer
perceptrons with three neuronal layers (two adaptive layers)
and tanh activation functions. A network has at least three
inputs: one fixed input with the value 1, one speed input
in the approximate range [0..3], and one input from the
waypoint sensor, in the range [-Π..Π]. In addition to this,
it might have any number of inputs from wall sensors, in
the range [0..1]. All networks have two outputs, which are
interpreted as driving commands for the car.
C. Evolutionary algorithm
The genome is an array of floating point numbers, of
variable or fixed length depending on the experimental setup.
Apart from information on the number of wall sensors and
hidden neurons, it encodes the orientation and range of the
wall sensors, and weights of the connections in the neural
network.
The evolutionary algorithm used is a kind of evolutionary
strategy, with μ = 50 and δ = 50. In other words, 50
genomes (the elite) are created at the start of evolution. At
each generation, one copy is made of each genome in the
elite, and all copies are mutated. After that, fitness value is
calculated for each genome, and the 50 best individuals of
all 100 form the new elite.
There are two mutation operators: Gaussian mutation
of all weight values, and Gaussian mutation of all sensor
parameters (angles and lengths), which might be turned on
or off. In both cases, the standard deviation of the Gaussian
distribution was set to 0.3.
Last but not least: the fitness function. The fitness of a
controller is calculated as the number of waypoints it has
Track |
10 |
50 |
100 |
200 |
Pr. |
^4 |
0.32 (0.07) |
0.54 (0.2) |
0.7 (0.38) |
0.81 (0.5) |
2 |
2 |
0.38 (0.24) |
0.49 (0.38) |
0.56 (0.36) |
0.71 (0.5) |
2 |
3 |
0.32 (0.09) |
0.97 (0.5) |
1.47 (0.63) |
1.98 (0.66) |
7 |
4 |
0.53 (0.17) |
1.3 (0.48) |
1.5 (0.54) |
2.33 (0.59) |
9 |
5 |
0.45 (0.08) |
0.95 (0.6) |
0.95 (0.58) |
1.65 (0.45) |
8 |
6 |
0.4 (0.08) |
0.68 (0.27) |
1.02 (0.74) |
1.29 (0.76) |
5 |
7 |
0.3 (0.07) |
0.35 (0.05) |
0.39 (0.09) |
0.46 (0.13) |
0 |
8_______ |
0.16 (0.02) |
0.19 (0.03) |
0.2 (0.01) |
0.2 (0.01) |
0____ |
TABLE I
The fitness of the best controller of various generations on
THE DIFFERENT TRACKS, AND NUMBER OF RUNS PRODUCING
proficient controllers. Fitness averaged over 10 separate
EVOLUTIONARY RUNS; STANDARD DEVIATION BETWEEN PARENTHESES.
passed, divided by the number of waypoints in the track,
plus an intermediate term representing how far it is on its way
to the next waypoint, calculated from the relative distances
between the car and the previous and next waypoint. A
fitness of 1.0 thus means having completed one full track
within the alloted time. Waypoints can only be passed in the
correct order, and a waypoint is counted as passed when the
centre of the car is within 30 pixels from the waypoint. In
the evolutionary experiments reported below, each car was
allowed 700 timesteps (enough to do two to three laps on
most tracks in the test set) and fitness was averaged over
three trials.
IV. Evolving track-SPECIFIC controllers
The first experiments consisted in evolving controllers for
the eight tracks separately, in order to the test the software
in general and to rank the difficulty of the tracks.
For each of the tracks, the evolutionary algorithm was run
10 times, each time starting from a population of “clean”
controllers, with all connection weights set to zero and sensor
parameters as explained above. Only weight mutation was
allowed. The evolutionary runs were for 200 generations
each.
A. Fixed sensor parameters
1) Evolving from scratch: The results are listed in table I,
which is read as follows: each row represents the results for
one particular track. The first column gives the mean of the
fitnesses of the best controller of each of the evolutionary
runs at generation 10, and the standard deviation of the
fitnesses of the same controllers. The next three columns
present the results of the same calculations at generations 50,
100 and 200, respectively. The “Pr” column gives the number
of proficient best controllers for each track. An evolutionary
run is deemed to have produced a proficient controller if
its best controller at generation 200 has a fitness (averaged,
as always, over three trials) of at least 1.5, meaning that it
completes at least one and a half lap within the allowed time.
For the first two tracks, proficient controllers were pro-
duced by the evolutionary process within 200 generations,
but only in two out of ten runs. This means that while it is
possible to evolve neural networks that can be relied on to