period of cultural development.
Absent a detailed understanding of institutional stabiliza-
tion mechanisms, and hence lacking an engineering strategy
for the construction of ‘ethical’ machines, reliable massively
parallel computing technology may not yet be possible.
MATHEMATICAL APPENDIX
The Shannon-McMillan Theorem
According to the structure of the underlying language of
which a message is a particular expression, some messages are
more ‘meaningful’ than others, that is, are in accord with the
grammar and syntax of the language. The Shannon-McMillan
or Asymptotic Equipartition Theorem describes how messages
themselves are to be classified.
Suppose a long sequence of symbols is chosen, using the
output of the random variable X above, so that an output
sequence of length n, with the form
$$x^n = (\alpha_0, \alpha_1, \ldots, \alpha_{n-1})$$
has joint and conditional probabilities
$$P(X_0 = \alpha_0, X_1 = \alpha_1, \ldots, X_{n-1} = \alpha_{n-1})$$
$$P(X_n = \alpha_n \mid X_0 = \alpha_0, \ldots, X_{n-1} = \alpha_{n-1}). \tag{19}$$
Using these probabilities we may calculate the conditional
uncertainty
$$H(X_n \mid X_0, X_1, \ldots, X_{n-1}).$$
The uncertainty of the information source, H[X], is defined
as
$$H[X] \equiv \lim_{n \to \infty} H(X_n \mid X_0, X_1, \ldots, X_{n-1}). \tag{20}$$
In general
$$H(X_n \mid X_0, X_1, \ldots, X_{n-1}) \le H(X_n).$$
Only if the random variables Xj are all stochastically in-
dependent does equality hold. If there is a maximum n such
that, for all m > 0
$$H(X_{n+m} \mid X_0, \ldots, X_{n+m-1}) = H(X_n \mid X_0, \ldots, X_{n-1}),$$
then the source is said to be of order n. It is easy to show
that
$$H[X] = \lim_{n \to \infty} \frac{H(X_0, \ldots, X_n)}{n+1}.$$
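The two limits above can be checked numerically for a simple source. The sketch below uses a hypothetical two-state Markov chain (the transition matrix and stationary distribution are illustrative assumptions, not taken from the text). For this order-1 source the conditional uncertainty $H(X_n \mid X_0, \ldots, X_{n-1})$ is constant for all $n \ge 1$, while the per-symbol block entropy $H(X_0, \ldots, X_n)/(n+1)$ decreases toward the same limit $H[X]$:

```python
from math import log2
from itertools import product

# Hypothetical two-state stationary Markov source (illustrative choice;
# the limits above hold for any stationary source).
P = [[0.9, 0.1],   # transition probabilities out of state 0
     [0.2, 0.8]]   # transition probabilities out of state 1
pi = [2 / 3, 1 / 3]  # stationary distribution, solving pi P = pi

def h(probs):
    """Shannon entropy (bits) of a probability vector."""
    return -sum(p * log2(p) for p in probs if p > 0)

# For an order-1 source, H(X_n | X_0,...,X_{n-1}) = H(X_1 | X_0) for all
# n >= 1, so the limit in Eq. (20) is reached immediately:
rate = sum(pi[x] * h(P[x]) for x in range(2))  # H[X], bits per symbol

def block_entropy(length):
    """Joint entropy H(X_0,...,X_{length-1}) by enumerating sequences."""
    probs = []
    for seq in product(range(2), repeat=length):
        p = pi[seq[0]]
        for a, b in zip(seq, seq[1:]):
            p *= P[a][b]
        probs.append(p)
    return h(probs)

# H(X_0,...,X_n)/(n+1) decreases monotonically toward the entropy rate.
per_symbol = [block_entropy(n + 1) / (n + 1) for n in range(1, 11)]
print(f"entropy rate H[X] = {rate:.4f} bits/symbol")
print("block entropy per symbol:", [f"{v:.4f}" for v in per_symbol])
```

The gap between the per-symbol block entropy and the rate shrinks like $1/(n+1)$, since the excess comes entirely from the entropy of the initial state.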
In general the outputs of the Xj, j = 0, 1, ..., n are depen-
dent. That is, the output of the communication process at
step n depends on previous steps. Such serial correlation, in
fact, is the very structure which enables most of what is done
in this paper.
Here, however, the processes are all assumed stationary in
time: the serial correlations do not change as the process
unfolds.
A very broad class of such self-correlated, stationary
information sources, the so-called ergodic sources, for which
the long-run relative frequency of a sequence converges
stochastically to the probability assigned to it, has a
particularly interesting property:
It is possible, in the limit of large n, to divide all sequences
of outputs of an ergodic information source into two distinct
sets, S1 and S2 , having, respectively, very high and very
low probabilities of occurrence, with the source uncertainty
providing the splitting criterion. In particular, the Shannon-
McMillan Theorem states that, for a (long) sequence having
n (serially correlated) elements, the number of ‘meaningful’
sequences N(n), those belonging to set S1, will satisfy the
relation
$$\frac{\log[N(n)]}{n} \approx H[X]. \tag{21}$$
More formally,
$$\lim_{n \to \infty} \frac{\log[N(n)]}{n} = H[X] = \lim_{n \to \infty} H(X_n \mid X_0, \ldots, X_{n-1}) = \lim_{n \to \infty} \frac{H(X_0, \ldots, X_n)}{n+1}. \tag{22}$$
Using the internal structure of the information source thus
permits attention to be restricted to the high-probability
‘meaningful’ sequences of symbols.
The Rate Distortion Theorem