identifying the evolution of an agent over time is problematic. In the laboratory, an agent
could learn from her own experience but not from the experience of others. In fact, an agent
could not even observe, let alone copy, the strategy played by others.
The size of the memory set, K, is a measure of the level of sophistication of an agent since it
determines how many strategies an agent can simultaneously evaluate and remember. The
Psychology literature has pointed out that the working memory has severe limitations in the
quantity of information that it can store and process. According to these findings, the memory
limitation is not just imperfect recall from one round to the next, but rather an inability to
maintain an unlimited amount of information in memory during cognitive processing (Miller,
1956; Daily et al., 2001). The classic article by Miller (1956) stresses the “magic number
seven” as the typical number of units in people’s working memory. As the memory set size K
needs to be even, both 6 and 8 are viable options. We set K=6, which implies that decision-
makers have a hardwired limitation in processing information at 6 strategies at a time.
As each agent is endowed with a memory set, in the individual learning GA (multi-
population) there is an additional issue of how to choose a strategy to play out of the K
available. This task is performed by a stochastic operator that we will call choice rule. The
choice rule works in a very similar way as the reinforcement rule, i.e. as a one-time pairwise
tournament, where (1) two strategies, aikt and aiqt, are randomly drawn with replacement from
the memory set Ait and (2) the strategy with the highest score in the pair is chosen to be
played: a*it=argmax{π (aikt), π(aiqt)}. A pairwise tournament is different from deterministic
maximization, because the best strategy in the memory set is picked with a probability less
than one. The choice rule, however, is characterized by a probabilistic response that favors
high-score over low-score available strategies. In particular, the probability of choosing a
strategy is strictly increasing in its ranking within the memory set. The stochastic element in