(see [18, 25]) that the following recurrence formula for the rewards holds:
$$\nu_i(n) = \sum_{j=1}^{N} p_{ij}\,[r_{ij} + \nu_j(n-1)]. \tag{1}$$
Another form of this formula is:
$$\nu_i(n) = \sum_{j=1}^{N} p_{ij} r_{ij} + \sum_{j=1}^{N} p_{ij}\,\nu_j(n-1), \tag{2}$$
where the first sum,
$$q_i = \sum_{j=1}^{N} p_{ij} r_{ij}, \tag{3}$$
is the expected one-step reward of the process in state $i$.
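As a small illustration of the one-step reward, the sketch below evaluates the sum of p_ij r_ij over j for each state i. The two-state matrices `P` and `R` and the helper name `one_step_rewards` are assumptions made for this example only; they are not taken from the paper.

```python
def one_step_rewards(P, R):
    """Return the list q with q[i] = sum_j P[i][j] * R[i][j]."""
    return [
        sum(p * r for p, r in zip(p_row, r_row))
        for p_row, r_row in zip(P, R)
    ]

# Hypothetical two-state data (illustrative values, not from the paper).
P = [[0.5, 0.5],    # transition probabilities p_ij, each row sums to 1
     [0.4, 0.6]]
R = [[9.0, 3.0],    # rewards r_ij earned on the transition i -> j
     [3.0, -7.0]]

q = one_step_rewards(P, R)
print(q)  # approximately [6.0, -3.0], up to floating-point rounding
```

Each entry of `q` depends only on the corresponding rows of `P` and `R`, so it can be computed once and reused at every step of the recursion.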
Now we can write
$$\nu_i(n) = q_i + \sum_{j=1}^{N} p_{ij}\,\nu_j(n-1). \tag{4}$$
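The following sketch checks numerically that the recursion written with the transition rewards r_ij and the recursion written with the one-step rewards q_i produce the same values. The two-state data, the zero starting vector ν(0) = 0 and all function names are assumptions chosen for illustration.

```python
# Hypothetical two-state data, used only to illustrate the recursion.
P = [[0.5, 0.5], [0.4, 0.6]]      # transition probabilities p_ij
R = [[9.0, 3.0], [3.0, -7.0]]     # transition rewards r_ij
N = len(P)

# One-step rewards q_i = sum_j p_ij * r_ij
q = [sum(P[i][j] * R[i][j] for j in range(N)) for i in range(N)]

def step_with_rewards(v_prev):
    """v_i(n) = sum_j p_ij * (r_ij + v_j(n-1))."""
    return [sum(P[i][j] * (R[i][j] + v_prev[j]) for j in range(N))
            for i in range(N)]

def step_with_q(v_prev):
    """v_i(n) = q_i + sum_j p_ij * v_j(n-1)."""
    return [q[i] + sum(P[i][j] * v_prev[j] for j in range(N))
            for i in range(N)]

v_a = [0.0] * N   # assumed starting values v(0) = 0
v_b = [0.0] * N
history = []      # (n, values from first form, values from second form)
for n in range(1, 5):
    v_a = step_with_rewards(v_a)
    v_b = step_with_q(v_b)
    history.append((n, v_a, v_b))

for n, a, b in history:
    print(n, a, b)
```

The two forms agree because the probabilities p_ij multiply both terms inside the bracket, so the reward part can be summed separately and stored as q_i.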
Taking the discount factor β into account, we obtain:
$$\nu_i(n, \beta) = q_i + \beta \sum_{j=1}^{N} p_{ij}\,\nu_j(n-1). \tag{5}$$
Let us write this formula in vector form:
$$\nu(n, \beta) = q + \beta P\,\nu(n-1), \quad n = 1, 2, \ldots \tag{6}$$
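The vector recursion ν(n, β) = q + βPν(n - 1) can be iterated directly. The sketch below is a minimal pure-Python version; the helper names `mat_vec` and `discounted_values`, the two-state data and the choice β = 0.9 are assumptions made for illustration.

```python
def mat_vec(P, v):
    """Matrix-vector product P v for a matrix given as a list of rows."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]

def discounted_values(P, q, beta, v0, n_steps):
    """Iterate v(n) = q + beta * P v(n-1), starting from v(0) = v0."""
    v = list(v0)
    for _ in range(n_steps):
        Pv = mat_vec(P, v)
        v = [qi + beta * x for qi, x in zip(q, Pv)]
    return v

# Hypothetical two-state data (illustrative values, not from the paper).
P = [[0.5, 0.5],
     [0.4, 0.6]]
q = [6.0, -3.0]          # one-step rewards
v0 = [0.0, 0.0]          # assumed starting values v(0)

v2 = discounted_values(P, q, 0.9, v0, 2)
print(v2)  # approximately [7.35, -2.46]
```

Setting `beta` to 1 recovers the undiscounted recursion, which is a convenient consistency check.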
It is easy to see that
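$$\nu(1, \beta) = q + \beta P\,\nu(0),$$
$$\nu(2, \beta) = q + \beta P\,\nu(1) = q + \beta P\,(q + \beta P\,\nu(0)) = q + \beta P q + \beta^2 P^2 \nu(0), \tag{7}$$
$$\vdots$$
$$\nu(n, \beta) = q + \beta^n P^n \nu(0) + \sum_{k=1}^{n-1} \beta^k P^k q. \tag{8}$$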
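The unrolled expression for ν(n, β) in terms of ν(0) can be checked against the step-by-step recursion. The sketch below compares the two for several horizons n ≥ 1; the function names, the two-state data, β = 0.9 and the nonzero starting vector are all assumptions chosen for illustration.

```python
def mat_vec(P, v):
    """Matrix-vector product P v for a matrix given as a list of rows."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]

def by_recursion(P, q, beta, v0, n):
    """Apply v <- q + beta * P v exactly n times, starting from v0."""
    v = list(v0)
    for _ in range(n):
        v = [qi + beta * x for qi, x in zip(q, mat_vec(P, v))]
    return v

def by_closed_form(P, q, beta, v0, n):
    """q + sum_{k=1}^{n-1} beta^k P^k q + beta^n P^n v0, for n >= 1."""
    total = list(q)
    term = list(q)
    for _ in range(1, n):
        # After the k-th pass, term holds beta^k P^k q.
        term = [beta * x for x in mat_vec(P, term)]
        total = [t + x for t, x in zip(total, term)]
    tail = list(v0)
    for _ in range(n):
        # After n passes, tail holds beta^n P^n v0.
        tail = [beta * x for x in mat_vec(P, tail)]
    return [t + x for t, x in zip(total, tail)]

# Hypothetical two-state data (illustrative values, not from the paper).
P = [[0.5, 0.5],
     [0.4, 0.6]]
q = [6.0, -3.0]
beta = 0.9
v0 = [1.0, 2.0]

max_gap = max(
    abs(a - b)
    for n in range(1, 8)
    for a, b in zip(by_recursion(P, q, beta, v0, n),
                    by_closed_form(P, q, beta, v0, n))
)
print(max_gap)  # expected to be at floating-point rounding level
```

The closed form avoids explicit matrix powers by applying P repeatedly to a vector, which keeps the check cheap for small examples.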
Taking into account the fact that
$$q \equiv \beta^0 P^0 q$$