therefore the high storage capacity of the distributed auto-associative memories in which these low-density
codevectors are stored can be maintained as well (see also section 2.1).
Hence the component codevectors are represented in the codevector of the composite item in a
reduced form, by a fraction of their 1s. The idea that items of higher hierarchical levels ("floors")
should contain their components in a reduced, compressed, coarse form is well accepted among those
concerned with diverse aspects of Artificial Intelligence research. The reduced representation of component
codevectors in the codevector of a composite item realized in the APNN may be relevant to the "coarsen
models" of Amosov (1968), the "reduced descriptions" of Hinton (1990), and the "conceptual chunking" of
Halford, Wilson, & Phillips (in press).
Reduced representation of component codevectors in the codevectors of composite items also
offers a solution to the superposition catastrophe. If the subset of 1s that the codevector of a
composite item takes from each component codevector depends on the composition of the component
items, then different subsets of 1s from each component codevector will be found in the codevectors of
different composite items. For example, non-identical subsets of a's 1s will be incorporated into the
codevectors of the items abc and acd. The component codevectors are therefore bound together
by the subsets of 1s delegated to the codevector of the composite item, and this hinders the occurrence
of false patterns and assemblies.
For the example from the Introduction, when both ac and cb are present, we will get the
following overall composite codevector: a_c ∨ c_a ∨ c_b ∨ b_c, where x_y stands for the subset of 1s of x that
becomes incorporated into the composite codevector given y as the other component. Therefore, if a_c ≠
a_b and b_c ≠ b_a, we do not observe the ghost pattern a_b ∨ b_a in the resultant codevector.
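To make the mechanism concrete, here is a minimal sketch in Python with NumPy (not from the original paper). It assumes one simple thinning variant in the spirit of this section: the superposition of components is conjoined with permuted copies of itself, so that which 1s of a component survive depends on the other components. The names (random_codevector, thin), the dimensionality, and the permutation-based rule are all illustrative assumptions; the paper's own procedures are specified in the sections below.

    import numpy as np

    rng = np.random.default_rng(0)
    N = 10000   # codevector dimensionality (an assumed value)
    M = 100     # number of 1s per component codevector (low density)

    def random_codevector():
        v = np.zeros(N, dtype=bool)
        v[rng.choice(N, size=M, replace=False)] = True
        return v

    # Illustrative thinning rule (an assumption, not the paper's exact
    # procedure): conjoin the superposition z with successively permuted
    # copies of itself until about M of its 1s survive. Which 1s of a
    # component survive depends on the whole superposition, i.e. on the
    # context of the other components.
    perm = rng.permutation(N)   # one fixed random permutation for all items

    def thin(z, target=M):
        out = np.zeros(N, dtype=bool)
        p = z
        while out.sum() < target:
            p = p[perm]     # apply the permutation once more
            out |= z & p    # keep the 1s of z supported by the permuted copy
        return out

    a, b, c = random_codevector(), random_codevector(), random_codevector()

    # Without thinning, the ghost a | b is fully contained in (a|c) | (c|b):
    plain = (a | c) | (c | b)
    print((plain & (a | b)).sum(), "of", (a | b).sum())   # all of them

    # With thinning, a's surviving subset differs between contexts c and b,
    # so the thinned "ghost" codevector is present only fractionally:
    thinned = thin(a | c) | thin(c | b)
    ghost = thin(a | b)
    print((thinned & ghost).sum(), "of", ghost.sum())     # roughly half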
In the example of Figure 2A false assemblies emerge; under reduced representation of items
(Figure 2B) they do not. Now interassembly connections are formed between different subsets of
active neurons that have relatively small intersection. Therefore the connectivity of the
assembly corresponding to the non-presented item abc is low.
That the codevector of a composite item contains subsets of 1s from the component
codevectors preserves the information on the presence of the component items in the composite item. That
the composition of each subset of 1s depends on the presence of the other component items preserves the
information on the combinations in which the component items occurred. That the codevector of a
composite item has approximately the same number of 1s as its component codevectors allows
combinations of such composite codevectors to be used for the construction of still more complex
codevectors of higher hierarchical levels.
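The density-preservation property can be checked directly by continuing the sketch above (same assumed N, M, random_codevector and thin): composites built from composites still carry roughly M ones, so the construction can be iterated over hierarchical levels.

    # Continuing the sketch above: density is preserved across levels.
    a, b, c, d = (random_codevector() for _ in range(4))
    ab = thin(a | b)        # first-level composite
    cd = thin(c | d)        # first-level composite
    abcd = thin(ab | cd)    # second-level composite built from composites
    print(a.sum(), ab.sum(), abcd.sum())   # all approximately M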
Thus an opportunity emerges to build up codevectors of items of various composition levels
that carry information not only on the presence of their components, but also on the structure of their
combinations. This makes it possible to estimate the similarity of complex structures without
unfolding them, simply as the overlap of their codevectors, which is considered by many authors to be a very
important property for AI systems (e.g. Kussul, 1992; Hinton, 1990; Plate, 1995, 1997).
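Continuing the same sketch, similarity of structures can be estimated as codevector overlap without unfolding the structures themselves: composites that share components overlap far above chance, while unrelated composites do not.

    # Continuing the sketch above: overlap as a similarity estimate.
    a, b, c, d, e, f, g = (random_codevector() for _ in range(7))
    abc = thin(a | b | c)
    abd = thin(a | b | d)   # shares components a and b with abc
    efg = thin(e | f | g)   # shares no components with abc
    print((abc & abd).sum())   # well above chance: similar structures
    print((abc & efg).sum())   # near chance level: dissimilar structures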
Originally the procedure that reduces the sets of coding 1s of each item from the group making
up a composite item was named "normalization" (Kussul, 1988; Kussul & Baidyk, 1990; Kussul,
1992). That name emphasized its property of maintaining the number of 1s in the codes of composite items
equal to that of the component items. However, in this paper we will call it "Context-Dependent Thinning"
(CDT) after its mechanism of action, which reduces the number of 1s taking into account the context of the
other items from the group.
3. Requirements on the Context-Dependent Thinning procedures
Let us summarize the requirements on the CDT procedures and on the characteristics of the codevectors
they produce. The procedures should process sparse binary codevectors. An important input case is a
superposition of component codevectors. The procedures should output the codevector of the composite
item, in which the component codevectors are bound and whose density is comparable to that of the
component codevectors. Let us call the resulting (output) codevector the "thinned" codevector. The
requirements may be expressed as follows.