Neural Network Modelling of Constrained Spatial Interaction Flows

Some descriptive statistics characterising M₁ , M₂ and M₃ are summarised in Table 1.
As can be seen from this table there are no large differences between the training,
validation and test sets. There are, nevertheless, differences, especially in t_ij , which will
present some challenge to the estimation procedure used.

5.3 Model Estimation and the Overfitting Problem

Deciding on an appropriate number, H, of product units and on the value of the
Alopex-parameter δ (the step size) is somewhat discretionary, involving the familiar
trade-off between speed and accuracy. The approach adopted for this evaluation was
stopped (cross-validation) training. The Alopex-parameters T and N were set to 1,000
and 10, respectively.

It is worth emphasising that the training process is sensitive to its starting point. Despite
recent progress in finding the most appropriate parameter initialisation that would help
Alopex to find near-optimal solutions, the most widely adopted approach still uses
random weight initialisation in order to reduce fluctuation in evaluation. Each
experiment employed to determine H and δ was repeated 60 times, the model being
initialised with a different set of random weights before each trial. Random numbers
were generated from [-0.3, 0.3] using the rand_uni function from Press et al. (1992).
The order of the input data presentation was kept constant for each run to eliminate its
effect on the result. The training process was stopped when κ = 40,000 consecutive
iterations were unsuccessful.
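
The stopping rule and the repeated random initialisation described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation used in the experiments. It assumes that an iteration counts as unsuccessful when it fails to improve the error on the validation set. The names stopped_training, random_initial_weights, step and validation_error are hypothetical. The step argument stands in for one Alopex update, which is not reproduced here, and Python's random module replaces the generator from Press et al. (1992).

```python
import random


def random_initial_weights(n_weights, rng, low=-0.3, high=0.3):
    """Draw initial weights uniformly from [low, high] (hypothetical helper)."""
    return [rng.uniform(low, high) for _ in range(n_weights)]


def stopped_training(step, validation_error, params, patience,
                     max_iters=1_000_000):
    """Apply `step` repeatedly and stop after `patience` consecutive
    iterations without a new best validation error.

    Assumption: "unsuccessful" means no improvement on the validation set.
    Returns (best_params, best_error, best_iteration).
    """
    best_params = list(params)
    best_error = validation_error(params)
    best_iter = 0
    unsuccessful = 0  # consecutive iterations without improvement
    for iteration in range(1, max_iters + 1):
        params = step(params)  # one update of the weights (e.g. Alopex)
        error = validation_error(params)
        if error < best_error:
            best_params, best_error, best_iter = list(params), error, iteration
            unsuccessful = 0
        else:
            unsuccessful += 1
            if unsuccessful >= patience:
                break
    return best_params, best_error, best_iter
```

Calling stopped_training with patience = 40,000 once for each of 60 differently seeded weight vectors from random_initial_weights would mirror the protocol described above. The iteration at which the best validation error was reached is returned so that it can serve as a measure of learning time.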

Extensive computational experiments with different combinations of H- and δ-values
were performed on a DEC Alpha 375 MHz. Table 2 summarises the results of the
most important ones. Training performance is measured in terms of ARV(M₁) and
validation performance in terms of ARV(M₂). The performance values represent the
mean of the 60 simulations; standard deviations are given in brackets. Since all
simulations have similar computational complexity, the number of iterations needed to
reach the minimal ARV(M₂)-value may be used as a measure of learning time. The results
in Table 2 indicate that the combination of H = 16 and δ = 0.0025 provides an
appropriate choice for our particular application.
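
How the entries of such a table can be computed is illustrated by the short Python sketch below. ARV is not defined in this section, so the sketch assumes the usual average relative variance, that is, the sum of squared prediction errors divided by the sum of squared deviations of the observations from their mean. The function names arv and summarise_runs are hypothetical, and the sample standard deviation (divisor n - 1) is an assumption, since the text does not state which variant was used.

```python
from statistics import mean, stdev


def arv(observed, predicted):
    """Average relative variance (assumed definition):
    sum of squared errors / sum of squared deviations from the mean.
    A value of 1.0 corresponds to always predicting the mean.
    """
    y_bar = mean(observed)
    squared_error = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    total_variation = sum((y - y_bar) ** 2 for y in observed)
    return squared_error / total_variation


def summarise_runs(arv_values):
    """Mean and sample standard deviation of ARV over repeated runs."""
    return mean(arv_values), stdev(arv_values)
```

Applying arv to the training set and to the validation set for each of the 60 runs, and then passing each list of values to summarise_runs, would give one mean and one bracketed standard deviation per (H, δ) combination.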
