codevectors down to the density of their component codevectors. (For permutive conjunctive thinning,
the density of component codevectors was chosen to give approximately M 1s in the result.)
7.1. Similarity of thinned codevectors
Let us find the overlap of the thinned codevector (abcde) with (abcde), (abcdf), (abcfg), (abfgh), (afghi),
and (fghij). Here () is used to denote any thinning procedure. The normalized measure of the overlap of x
with various y is defined as |x∧y|/|x|.
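The overlap measure can be written as a minimal Python sketch. The representation of a binary codevector as a set of the indices of its 1s, and the function name normalized_overlap, are assumptions made for illustration only.

```python
def normalized_overlap(x, y):
    """Return |x AND y| / |x| for codevectors stored as sets of 1-positions."""
    if not x:
        raise ValueError("x must contain at least one 1")
    # Set intersection plays the role of the bitwise conjunction x AND y.
    return len(x & y) / len(x)


# Toy vectors (hypothetical data): three shared 1s out of five in x.
x = {0, 3, 5, 8, 13}
y = {3, 5, 9, 13, 21}
print(normalized_overlap(x, y))  # 0.6
```

Note that the measure is normalized by the number of 1s in x only, so it is symmetric only when x and y have the same number of 1s.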
The experimental results are presented in Figure 7A, where the normalized overlap of thinned
composite codevectors is shown versus the normalized overlap of corresponding unthinned composite
codevectors. It can be seen that the overlap of thinned codes for various versions of the CDT procedure
is approximately equal to the square of the overlap of unthinned codes. For example, the similarity (overlap)
of abcde and abfgh is approximately 0.4 (two common components of five total), and the overlap of
their thinned codevectors is about 0.16.
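The arithmetic of this example can be repeated for all six composites of section 7.1 with a short Python sketch. It is an idealization: the unthinned overlap is taken as the fraction of shared components, ignoring random overlaps between independent component codevectors, and the squared value is only the approximate trend reported for Figure 7A, not a measured result. The names component_overlap and predicted are hypothetical.

```python
def component_overlap(x, y):
    """Fraction of the components of composite x that also occur in composite y."""
    return len(set(x) & set(y)) / len(set(x))


reference = "abcde"
others = ["abcde", "abcdf", "abcfg", "abfgh", "afghi", "fghij"]

predicted = {}
for name in others:
    s = component_overlap(reference, name)   # idealized unthinned overlap
    predicted[name] = s * s                  # approximate thinned overlap (square rule)
    print(name, round(s, 2), round(predicted[name], 2))
```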
7.2. Similarity of component codevector subsets included into thinned codevectors
Some experiments were conducted in order to investigate the similarity of subsets requirement (3.8). The
similarity of subsets of a component codevector incorporated into various thinned composite vectors was
obtained as follows. First, the intersections of various thinned five-component composite codevectors
with their component a were determined: u=a∧(abcde), v=a∧(abcdf), w=a∧(abcfg), x=a∧(abfgh),
y=a∧(afghi). Then, the normalized values of the overlap of intersections were obtained as |u∧v|/|u|,
|u∧w|/|u|, |u∧x|/|u|, |u∧y|/|u|.
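The measurement just described can be sketched in Python. This is an illustration under assumptions that are not stated in this section: the additive CDT is assumed to keep those 1s of the superimposed vector z that coincide with 1s of permuted copies of z, fixed cyclic shifts stand in for the permutations, and the sizes N and M, the random seed, and all helper names are hypothetical.

```python
import random

N, M = 100_000, 1_000                  # assumed dimensionality and 1s per component
rng = random.Random(0)
SHIFTS = rng.sample(range(1, N), 16)   # fixed cyclic shifts used as permutations (assumption)
comps = {c: set(rng.sample(range(N), M)) for c in "abcdefghi"}


def additive_cdt(z, target):
    """Assumed additive thinning: z AND (OR of shifted copies of z).

    Shifted copies are added until at least `target` 1s of z are kept.
    """
    kept = set()
    for s in SHIFTS:
        kept |= z & {(i + s) % N for i in z}
        if len(kept) >= target:
            break
    return kept


def thin(names):
    """Superimpose the named components and thin the result to about M 1s."""
    z = set().union(*(comps[c] for c in names))
    return additive_cdt(z, M)


a = comps["a"]
u = a & thin("abcde")                  # subset of a kept in the thinned abcde
results = {}
for other in ["abcdf", "abcfg", "abfgh", "afghi"]:
    subset = a & thin(other)           # subset of a kept in the other thinned composite
    results[other] = len(u & subset) / len(u)
    print(other, round(results[other], 2))
```

Because the same shifts are applied to every composite, the subset of a that survives depends on the other components present, which is what makes the overlap of the subsets reflect the similarity of the composites.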
Figure 7B shows how the similarity (overlap) of component codevector subsets incorporated into
two thinned composite codevectors varies versus the similarity of corresponding unthinned composite
codevectors. It can be seen that these dependencies are different for different thinning procedures: for
the CDTadd and the CDTadd-sl they are close to linear, but for the CDTsub and CDTsub-sl they are
polynomial. Which behavior is preferable depends on the application.
7.3. The influence of the depth of thinning
By the depth of thinning we mean the density of a thinned composite codevector. Previously, we
considered it equal to the density of component codevectors. Here, we vary the density of the thinned
codevectors. The experimental results presented in Figure 8 are useful for estimating the resulting
similarity of thinned codevectors in applications.
As in sections 7.1-7.2, composite codevectors of five components were used. Therefore,
approximately 5M 1s (actually closer to 4.9M because of random overlaps) were in the input
vector before thinning. We varied the number of 1s in the thinned codevectors from 4M to M/4. Only the
additive and the subtractive CDT procedures were investigated.
The similarity of thinned codevectors is shown in Figure 8A. For a shallow thinning, where the
resulting density is near the density of input composite codevector, the similarity degree of resulting
vectors is close to that of input codevectors (the curve is close to linear). For a deep thinning, where the
density of thinned codevectors is much less than the density of input codevectors, the similarity function
behaves like a power function, changing from linear through quadratic to cubic (for subtractive
thinning).
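The effect of the depth of thinning on the overlap of abcde and abfgh can be explored with a rough Python sketch. As an assumption, it uses an additive thinning of the form z ∧ (∨ of permuted copies of z), with fixed cyclic shifts in place of the permutations. The sizes N and M, the seed, the intermediate depths 2M, M and M/2, and all names are hypothetical, and the values it prints are not the data of Figure 8A.

```python
import random

N, M = 250_000, 500                    # assumed sizes, chosen so each shift adds few 1s
rng = random.Random(1)
SHIFTS = rng.sample(range(1, N), 400)  # fixed cyclic shifts used as permutations (assumption)
comps = {c: set(rng.sample(range(N), M)) for c in "abcdefgh"}


def additive_cdt(z, target):
    """Assumed additive thinning: keep 1s of z that meet shifted copies of z,
    adding shifted copies until at least `target` 1s are kept."""
    kept = set()
    for s in SHIFTS:
        kept |= z & {(i + s) % N for i in z}
        if len(kept) >= target:
            break
    return kept


def composite(names):
    """Superimpose (bitwise OR) the named component codevectors."""
    return set().union(*(comps[c] for c in names))


z1, z2 = composite("abcde"), composite("abfgh")
depths = [("4M", 4 * M), ("2M", 2 * M), ("M", M), ("M/2", M // 2), ("M/4", M // 4)]
overlap_by_depth = {}
for label, target in depths:
    t1, t2 = additive_cdt(z1, target), additive_cdt(z2, target)
    overlap_by_depth[label] = len(t1 & t2) / len(t1)
    print(label, len(t1), round(overlap_by_depth[label], 2))
```

Under these assumptions the overlap should decrease as the thinning gets deeper, moving away from the unthinned value of about 0.4. The deepest settings keep few 1s, so their estimates are noisy.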
The similarity of component subsets in the thinned codevector is shown in Figure 8B. For the
additive CDT procedure, the similarity function is linear, and its angle reaches approximately 45° for
“deep” thinning. For the subtractive CDT procedure, the function is similar to the additive one for the
“shallow” thinning, and becomes near-quadratic for the “deep” thinning.
8. Representation of structured expressions