Neural Network Modelling of Constrained Spatial Interaction Flows



Some descriptive statistics characterising M1, M2 and M3 are summarised in Table 1.
As can be seen from this table, there are no large differences between the training,
validation and test sets. There are, nevertheless, differences, especially in t_ij,
which will present some challenge to the estimation procedure used.

5.3 Model Estimation and the Overfitting Problem

Deciding on an appropriate number, H, of product units and on the value for the
Alopex parameter δ (the step size) is somewhat discretionary, involving the familiar
trade-off between speed and accuracy. The approach adopted for this evaluation was
stopped (cross-validation) training. The Alopex parameters T and N were set to 1,000
and 10, respectively.

It is worth emphasising that the training process is sensitive to its starting point. Despite
recent progress in finding the most appropriate parameter initialisation that would help
Alopex to find near optimal solutions, the most widely adopted approach still uses
random weight initialisation in order to reduce fluctuation in evaluation. Each
experiment employed to determine H and δ was repeated 60 times, the model being
initialised with a different set of random weights before each trial. Random numbers
were generated from [-0.3, 0.3] using the rand_uni function from Press et al. (1992).
The order of the input data presentation was kept constant for each run to eliminate its
effect on the result. The training process was stopped when κ = 40,000 consecutive
iterations were unsuccessful.
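
The stopping rule and the random initialisation described above can be sketched in Python as follows. This is a minimal illustration only, not the authors' implementation: the function names (init_weights, stopped_training, toy_step, toy_val_error) are hypothetical, an iteration is assumed to count as "unsuccessful" when it fails to improve the validation error, and a simple random perturbation stands in for the Alopex update, which is not reproduced here.

```python
import random


def init_weights(n, rng, low=-0.3, high=0.3):
    """Draw n starting weights uniformly from [low, high]."""
    return [rng.uniform(low, high) for _ in range(n)]


def stopped_training(train_step, val_error, params, patience, max_iter=1_000_000):
    """Train until `patience` consecutive iterations bring no improvement.

    Assumption: "unsuccessful" means the validation error did not fall
    below the best value seen so far.
    """
    best_params = list(params)
    best_err = val_error(params)
    unsuccessful = 0
    iteration = 0
    for iteration in range(1, max_iter + 1):
        params = train_step(params)
        err = val_error(params)
        if err < best_err:
            # Successful iteration: remember the parameters, reset the counter.
            best_err, best_params = err, list(params)
            unsuccessful = 0
        else:
            unsuccessful += 1
            if unsuccessful >= patience:
                break
    return best_params, best_err, iteration


# Toy stand-in for the real model: NOT Alopex, just a random +/- delta step.
rng = random.Random(0)
delta = 0.0025


def toy_step(params):
    return [w + rng.choice((-delta, delta)) for w in params]


def toy_val_error(params):
    # Hypothetical validation error with its minimum at w = 1 for every weight.
    return sum((w - 1.0) ** 2 for w in params)


start = init_weights(3, rng)
best, best_err, n_iter = stopped_training(toy_step, toy_val_error, start, patience=200)
```

In the experiments reported here the patience would correspond to κ = 40,000; the toy run uses 200 only to keep the example fast. Returning the best parameters rather than the final ones is a design choice of this sketch.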

Extensive computational experiments with different combinations of H- and δ-values
have been performed on a DEC Alpha 375 MHz. Table 2 summarises the results of the
most important ones. Training performance is measured in terms of ARV(M1) and
validation performance in terms of ARV(M2). The performance values represent the
mean of the 60 simulations; standard deviations are given in brackets. Since all
simulations have similar computational complexity, the number of iterations needed to
converge to the minimal ARV(M2)-value may be used as a measure of learning time. It is
easy to see that the combination of H = 16 and δ = 0.0025 provides an appropriate
choice for our particular application.
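
The way each Table 2 entry is formed (mean and standard deviation over repeated trials) can be illustrated with a short Python sketch. It assumes that ARV denotes the average relative variance, i.e. the sum of squared prediction errors divided by the sum of squared deviations of the observations from their mean; the definition given elsewhere in the paper takes precedence. The function names arv and summarise_trials are hypothetical, and the use of the sample standard deviation is likewise an assumption.

```python
import statistics


def arv(observed, predicted):
    """Average relative variance (assumed definition).

    Squared prediction error normalised by the spread of the observations:
    0 means a perfect fit, 1 means no better than predicting the mean.
    """
    mean_obs = statistics.fmean(observed)
    num = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return num / den


def summarise_trials(trial_arvs):
    """Mean and sample standard deviation of ARV over repeated trials."""
    return statistics.fmean(trial_arvs), statistics.stdev(trial_arvs)


# Illustrative numbers only, not results from the paper.
perfect = arv([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = arv([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
mean_arv, sd_arv = summarise_trials([0.2, 0.4])
```

With 60 trials per (H, δ) combination, summarise_trials would be applied once to the 60 ARV(M1) values and once to the 60 ARV(M2) values.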
