- Items of various complexity levels are stored in different distributed auto-associative neural-network
memories with the same number N of neurons.
Thus, items of any complexity are encoded by sparse distributed stochastic patterns of binary
neurons in the neural fields of the same dimensionality N. It is convenient to represent activity patterns
in the neural fields as binary vectors, where 1s correspond to active neurons. Let us use bold-face
lowercase font for codevectors to distinguish them from the items they represent, which are denoted in italics.
The number of 1s in x is denoted |x|. We seek to make |x| ≈ M for x of various complexity. The
similarity of codevectors is determined by the number of 1s in their intersection or overlap, |x ∧ y|, where
∧ is the elementwise conjunction of x and y. The probability of 1s in x (or the density of 1s in x, or simply
the vector density) is p(x) = |x|/N.
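For illustration, the following Python sketch (not part of the original formulation; the parameter values are arbitrary examples) generates sparse binary codevectors with exactly M 1s and computes the overlap |x ∧ y| and the density p(x):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_codevector(N, M):
    """Sparse binary codevector: exactly M ones among N positions."""
    x = np.zeros(N, dtype=np.uint8)
    x[rng.choice(N, size=M, replace=False)] = 1
    return x

N, M = 10_000, 100                 # example values; M ~ sqrt(N) as in the text
x = random_codevector(N, M)
y = random_codevector(N, M)

overlap = int((x & y).sum())       # |x ∧ y|: 1s in the elementwise conjunction
density = x.sum() / N              # p(x) = |x|/N
print(overlap, density)            # expected overlap is M*M/N = 1; density 0.01
```

For two independent codevectors the expected overlap is M²/N (here, 1), so codevectors of unrelated items are nearly orthogonal.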
Encoding information by stochastic binary vectors with a small number of 1s allows a high
capacity of correlation-type neural-network memory (known as Willshaw memory or the Hopfield network)
to be reached using a Hebbian learning rule (Willshaw, Buneman, & Longuet-Higgins, 1969; Palm, 1980;
Lansner & Ekeberg, 1985; Frolov & Muraviev, 1987, 1988; Frolov, 1989; Amari, 1989; Tsodyks, 1989).
The codevectors in question may be exemplified by vectors with M = 100…1000 and N = 10 000…100 000.
Though the maximal storage capacity is reached at M ≈ log N (Willshaw, Buneman, &
Longuet-Higgins, 1969; Palm, 1980), we use M ≈ √N to get a network with a moderate number N of
neurons and N² connections while maintaining sufficient statistical stability of M (e.g., a standard deviation of M below
3%). Under this choice of the codevector density p = M/N ≈ 1/√N, the information capacity remains
high enough, and the number of stored items can exceed the number of neurons in the network
(Rachkovskij, 1990a, 1990b; Baidyk, Kussul, & Rachkovskij, 1990).
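As a toy illustration of such a memory (a minimal sketch, not the implementation referenced above; it follows the standard Willshaw scheme with clipped Hebbian weights, and all parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

N, M = 2_500, 50                       # M ~ sqrt(N), as suggested in the text
W = np.zeros((N, N), dtype=np.uint8)   # binary (clipped) Hebbian weight matrix

def rand_codevector():
    x = np.zeros(N, dtype=np.uint8)
    x[rng.choice(N, size=M, replace=False)] = 1
    return x

def store(x):
    """Clipped Hebbian learning: w_ij := w_ij OR (x_i AND x_j)."""
    idx = x.nonzero()[0]
    W[np.ix_(idx, idx)] = 1

def recall(cue):
    """One retrieval step: fire the neurons whose dendritic sum reaches |cue|."""
    s = W.astype(np.int32) @ cue.astype(np.int32)
    theta = int(cue.sum())
    return (s >= theta).astype(np.uint8)

items = [rand_codevector() for _ in range(200)]
for x in items:
    store(x)

y = recall(items[0])                            # noiseless cue
print(int((y & items[0]).sum()), int(y.sum()))  # expect 50 correct 1s, few spurious
```

Because the codes are sparse, the binary weight matrix stays far from saturation even after many items are stored, which is what makes the high capacity possible.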
Let us consider the problems arising in the APNN and other neural-network architectures with
distributed representations when composite items are constructed.
2.2. Density of composite codevectors
The number H of component items (constituents) comprising a composite item grows exponentially with
the nesting level, that is, as one goes to higher levels of the part-whole hierarchy. If S items of level l
constitute an item of the adjacent higher level (l+1), then for level (l+L) the number H becomes
H = S^L. (2.1)
For example, with S = 4 constituents per item, L = 3 composition levels already give H = 64.
The presence of several items comprising a composite item is encoded by the concurrent activation of
their patterns, that is, by superposition of their codevectors. For binary vectors, we will use superposition
by bitwise disjunction. Let us denote composite items by concatenation of symbols denoting their
component items, e.g., abc. The corresponding composite codevector (a_i ∨ b_i ∨ c_i, i = 1, …, N) will be denoted
a ∨ b ∨ c or simply abc.
Construction of composite items is accompanied by fast growth of the density p′ and the corresponding
number M′ of 1s in their codevectors. For H different superimposed codevectors of low density p:
p′_H = 1 − (1 − p)^H ≈ 1 − e^(−pH), (2.2)
M′_H ≈ p′_H N. (2.3)
Equations 2.2 and 2.3 take into account the "absorption" of coincident 1s, which prevents exponential
growth of their number with the composition level L. However, it is important that p′ ≫ p (see Figure 1)
and M′ ≫ M. Since the dimensionality N of codevectors representing items of various complexity is the
same, the size of the corresponding distributed auto-associative neural networks, where the codevectors are
stored and recalled, is also the same. Therefore, at M′ ≫ M ≈ √N (at the higher levels of the hierarchy), their
storage capacity in terms of the number of recallable codevectors decreases dramatically. To
maintain high storage capacity at each level, M′ should not substantially exceed M. However, due to the
requirement of statistical stability, the number of 1s in the code cannot be reduced significantly.
Moreover, the operation of a distributed auto-associative neural-network memory usually assumes the same
number of 1s in all codevectors. Thus it is necessary to keep the number of 1s in the codevectors of
complex items approximately equal to M. (However, some variation of M between distinct hierarchical
levels may be tolerable and even desirable.)
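To make the effect of Equations 2.2 and 2.3 concrete, the short sketch below (with illustrative parameter values, consistent with the Python examples above) superimposes H random codevectors by disjunction and compares the empirical density with the prediction of Equation 2.2:

```python
import numpy as np

rng = np.random.default_rng(2)

N, M, H = 10_000, 100, 50
p = M / N

# superimpose H independent M-sparse codevectors by bitwise disjunction
vecs = np.zeros((H, N), dtype=np.uint8)
for h in range(H):
    vecs[h, rng.choice(N, size=M, replace=False)] = 1
z = vecs.any(axis=0)                    # a ∨ b ∨ ... (H terms)

p_pred = 1 - (1 - p) ** H               # Equation 2.2
print(z.mean(), p_pred)                 # empirical vs. predicted density p'_H
print(int(z.sum()), round(p_pred * N))  # M'_H ≈ 3950 >> M = 100 (Equation 2.3)
```

Even with the absorption of coincident 1s, the composite codevector here is dozens of times denser than its constituents, which is precisely why M′ must be brought back to approximately M.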