The Classical Unconstrained Neural Spatial Interaction Model
Hornik, Stinchcombe and White (1989) proved that single hidden layer feedforward
networks with ψ being the identity function and φ_h = φ (h = 1, ..., H) an arbitrary
sigmoid transfer function can approximate any measurable function to any degree of
accuracy (in appropriate metrics), given sufficiently many hidden units. Thus, the
neural spatial interaction models suggested by Fischer, Hlavackova-Schindler and
Reismann (1999):
\omega_L(\mathbf{x}, \mathbf{w}) = \sum_{h=0}^{H} \gamma_h \left[ 1 + \exp\left( -\sum_{n=0}^{3} \beta_{hn} \, x_n \right) \right]^{-1}    (7)
represent a rich and flexible family of spatial interaction function approximators.
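Read in this way, (7) is simply a weighted sum of logistic hidden-unit activations of the four inputs. A minimal sketch of the forward pass in Python is given below; the array shapes and the convention that x_0 = 1 serves as a bias input are illustrative assumptions, not prescriptions from the text.

    import numpy as np

    def omega_L(x, gamma, beta):
        """Sketch of the unconstrained neural spatial interaction model (7).

        x     : array of shape (4,)        -- inputs x_0..x_3 (x_0 = 1 as a bias term is an assumed convention)
        gamma : array of shape (H + 1,)    -- hidden-to-output weights gamma_0..gamma_H
        beta  : array of shape (H + 1, 4)  -- input-to-hidden weights beta_hn
        Returns the scalar model output.
        """
        # Net input of each hidden unit: sum_n beta_hn * x_n
        net = beta @ x
        # Logistic sigmoid transfer function phi applied to each hidden unit
        hidden = 1.0 / (1.0 + np.exp(-net))
        # Identity output transfer psi: weighted sum of hidden activations
        return gamma @ hidden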
Although it has become commonplace to view network models such as (7) as black
boxes, this view invites inappropriate applications that may fail not because such
network models do not work well, but because the issues involved are not well understood.
Failures in applications can often be attributed to inadequate learning (training),
an inadequate number of hidden units, or the presence of a stochastic rather than a
deterministic relation between input and target.
Least Squares Learning
If we view (7) as generating a family of approximators (as w ranges over W, say) to
some specific empirical spatial interaction phenomenon relating inputs x to some
response, y, then we need a way to pick the best approximation from this family. This is
the function of learning in the context of neural network modelling. The goodness of
the approximation can be evaluated using an error (penalty) function that measures how
well the model output \omega_L(\mathbf{x}, \mathbf{w}) matches the target output y corresponding to a given input x.
The penalty should be zero when target and model output match, and positive
otherwise. A leading case is the least squares error function. With this error (penalty)
function, learning must arrive at a weight vector w* that solves
\min_{\mathbf{w} \in W} E\left\{ \tfrac{1}{2} \left( y - \omega_L(\mathbf{x}, \mathbf{w}) \right)^2 \right\} = E\left\{ \tfrac{1}{2} \left( y - E(y \mid \mathbf{x}) \right)^2 \right\} + E\left\{ \tfrac{1}{2} \left[ E(y \mid \mathbf{x}) - \omega_L(\mathbf{x}, \mathbf{w}) \right]^2 \right\}    (8)
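The decomposition in (8) holds because the cross-product term vanishes (conditional on x, E[y - E(y | x)] = 0), and the first term on the right-hand side does not depend on w; least squares learning therefore drives \omega_L(\mathbf{x}, \mathbf{w}) toward the conditional mean E(y | x). In practice the expectation is replaced by an average over training pairs. A minimal gradient-descent sketch of this empirical least-squares learning follows; the learning rate, epoch count, hidden-layer size, initialisation and batch backpropagation are illustrative assumptions rather than the optimisation procedure prescribed in the text.

    import numpy as np

    def least_squares_learning(X, y, H=8, lr=0.01, epochs=1000, seed=0):
        """Illustrative gradient-descent sketch for least-squares learning of (7).

        Minimizes the empirical counterpart of (8), i.e. the mean of
        0.5 * (y - omega_L(x, w))^2 over the training pairs (x, y).
        X : array of shape (K, 4), y : array of shape (K,).
        H, lr, epochs and the initialisation scale are arbitrary illustrative choices.
        """
        rng = np.random.default_rng(seed)
        gamma = rng.normal(scale=0.1, size=H + 1)        # hidden-to-output weights
        beta = rng.normal(scale=0.1, size=(H + 1, 4))    # input-to-hidden weights

        for _ in range(epochs):
            net = X @ beta.T                             # (K, H+1) net inputs
            hidden = 1.0 / (1.0 + np.exp(-net))          # logistic hidden activations
            out = hidden @ gamma                         # model outputs omega_L(x, w)
            err = out - y                                # residuals

            # Gradients of the mean squared-error penalty w.r.t. gamma and beta
            grad_gamma = hidden.T @ err / len(y)
            grad_hidden = np.outer(err, gamma) * hidden * (1.0 - hidden)
            grad_beta = grad_hidden.T @ X / len(y)

            gamma -= lr * grad_gamma                     # gradient-descent update
            beta -= lr * grad_beta

        return gamma, beta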