In the neural network community it is well known that supplementing the inputs to a
neural network model with higher-order combinations of the inputs increases the
capacity of the network in an information capacity sense (see Cover 1965) and its
ability to learn (see Giles and Maxwell 1987). The price to be paid, however, is a
combinatorial explosion of higher-order terms as the number of inputs to the network
increases. Although the error surface of product unit networks contains more local
minima than that of networks with standard transfer functions, the surface is locally
smooth.
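To make the combinatorial explosion concrete, the number of distinct product terms of a given order over n inputs (monomials with repetition allowed) is the binomial coefficient C(n + k − 1, k). A minimal sketch, with the function name chosen here for illustration:

```python
from math import comb

def n_higher_order_terms(n_inputs: int, order: int) -> int:
    """Count distinct product terms of exactly `order` inputs,
    allowing repeated inputs: C(n + k - 1, k)."""
    return comb(n_inputs + order - 1, order)

# The count grows rapidly with the number of inputs:
for n in (5, 10, 20):
    print(n, [n_higher_order_terms(n, k) for k in (2, 3, 4)])
```

With 20 inputs there are already 8,855 fourth-order terms, which is why enumerating such terms explicitly quickly becomes infeasible.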
The product units introduced by Durbin and Rumelhart (1989) attempt to exploit this
fact. Product unit networks have the advantage that, given an appropriate training
algorithm, the units can learn the higher-order terms that are required to approximate
a specific constrained spatial interaction function. This motivates the use of product
units rather than the standard summation unit neural framework for modelling singly
constrained interactions over space.
3.2 The Network Architecture
Product units compute the product of inputs, each raised to a variable power. They can
be used in a network in many ways, but the overhead required to raise an arbitrary base
to an arbitrary power makes it more likely that they will supplement rather than replace
summation units (Durbin and Rumelhart 1989).3 Thus, we use the term product unit
networks [or product networks] to refer to networks containing both product and
summation units.
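A single product unit can be sketched as follows. For positive inputs the product of powers can be rewritten as an exponential of a weighted sum of logarithms, which is the form usually used to derive gradients; the function name is illustrative:

```python
import math

def product_unit(x, w):
    """Output of one product unit: prod_i x_i ** w_i.
    For positive inputs this equals exp(sum_i w_i * ln x_i),
    i.e. a summation unit operating in log space.
    """
    assert all(xi > 0 for xi in x), "the log form assumes positive inputs"
    return math.exp(sum(wi * math.log(xi) for xi, wi in zip(x, w)))

# With integer exponents the unit reproduces an ordinary monomial,
# e.g. x1 * x2**2 for weights (1, 2):
y = product_unit([2.0, 3.0], [1.0, 2.0])  # ≈ 2 * 3**2 = 18
```

The log-space form makes clear why a product unit is as cheap to differentiate as a summation unit, at the cost of restrictions (or special handling) for non-positive inputs.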
Figure 1 illustrates the modular network architecture of the product unit neural network
that we propose to model the singly constrained case of spatial interactions. Modularity
is seen here as decomposition on the computational level. The network is composed of
two processing layers and two layers of network parameters. The first processing layer
is involved with the extraction of features from the input data. This layer is
implemented as a layer of J functionally independent modules with identical topologies.
Each module is a feedforward network with two inputs x_{2j-1} and x_{2j}, H hidden product
units [(j-1)H+1, ..., (j-1)H+h, ..., jH], denoted by the symbol ∏, and terminates