(see [18, 25]) that the following recurrence formula for the rewards holds:
$$\nu_i(n) = \sum_{j=1}^{N} p_{ij}\left[r_{ij} + \nu_j(n-1)\right]. \tag{1}$$
Another form of this formula is
$$\nu_i(n) = \sum_{j=1}^{N} p_{ij} r_{ij} + \sum_{j=1}^{N} p_{ij}\nu_j(n-1). \tag{2}$$
We define the one-step reward of the process as
$$q_i = \sum_{j=1}^{N} p_{ij} r_{ij}. \tag{3}$$
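As a minimal sketch of equations (1) to (3), the snippet below computes the one-step rewards q_i and checks numerically that the two forms of the recurrence agree. The two-state matrices `P` and `R`, the terminal rewards `v0`, and all function names are hypothetical and chosen only for illustration; they do not come from the paper.

```python
def one_step_rewards(P, R):
    """q_i = sum_j p_ij * r_ij, as in equation (3)."""
    return [
        sum(p * r for p, r in zip(P_row, R_row))
        for P_row, R_row in zip(P, R)
    ]


def step_form_1(P, R, v_prev):
    """Equation (1): v_i(n) = sum_j p_ij * (r_ij + v_j(n-1))."""
    return [
        sum(p * (r + v) for p, r, v in zip(P_row, R_row, v_prev))
        for P_row, R_row in zip(P, R)
    ]


def step_form_2(P, R, v_prev):
    """Equation (2): v_i(n) = q_i + sum_j p_ij * v_j(n-1)."""
    q = one_step_rewards(P, R)
    return [
        q_i + sum(p * v for p, v in zip(P_row, v_prev))
        for q_i, P_row in zip(q, P)
    ]


# Hypothetical two-state example (illustrative values only).
P = [[0.5, 0.5], [0.25, 0.75]]   # transition probabilities p_ij
R = [[8.0, 4.0], [2.0, -4.0]]    # transition rewards r_ij
v0 = [0.0, 0.0]                  # terminal rewards v_j(0)

q = one_step_rewards(P, R)
v1 = step_form_1(P, R, v0)       # rewards with one step remaining
v2_a = step_form_1(P, R, v1)     # two steps remaining, form (1)
v2_b = step_form_2(P, R, v1)     # two steps remaining, form (2)
print(q, v2_a, v2_b)
```

Both forms give the same vector because the sum in (1) is simply split into its reward part and its continuation part.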
Now we can write
$$\nu_i(n) = q_i + \sum_{j=1}^{N} p_{ij}\nu_j(n-1). \tag{4}$$
Taking the discount factor β into account, we obtain
$$\nu_i(n,\beta) = q_i + \beta\sum_{j=1}^{N} p_{ij}\nu_j(n-1). \tag{5}$$
Let us write this formula in vector form:
$$\nu(n,\beta) = q + \beta P\,\nu(n-1), \quad n = 0, 1, 2, \ldots \tag{6}$$
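The vector recurrence can be iterated directly. The following is a minimal sketch in plain Python, assuming a hypothetical two-state process; the values of `q`, `P`, `v0` and `beta`, and the helper names `mat_vec` and `discounted_rewards`, are illustrative assumptions rather than anything taken from the paper.

```python
def mat_vec(P, v):
    """Matrix-vector product P v for a matrix stored as a list of rows."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]


def discounted_rewards(q, P, v0, beta, n):
    """Apply v(k) = q + beta * P v(k-1) for k = 1, ..., n and return v(n)."""
    v = list(v0)
    for _ in range(n):
        Pv = mat_vec(P, v)
        v = [q_i + beta * x for q_i, x in zip(q, Pv)]
    return v


# Hypothetical data (illustrative values only).
q = [6.0, -2.5]                  # one-step rewards q_i
P = [[0.5, 0.5], [0.25, 0.75]]   # transition matrix
v0 = [0.0, 0.0]                  # terminal rewards v(0)
beta = 0.5                       # discount factor

v1 = discounted_rewards(q, P, v0, beta, 1)
v2 = discounted_rewards(q, P, v0, beta, 2)
print(v1, v2)
```

With zero terminal rewards, the first iterate equals `q`, and each further step adds a contribution discounted by one more power of `beta`.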
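It is easy to see that
$$\nu(1,\beta) = q + \beta P\nu(0), \tag{7}$$
$$\nu(2,\beta) = q + \beta P\nu(1) = q + \beta P\left(q + \beta P\nu(0)\right) = q + \beta P q + \beta^2 P^2\nu(0), \tag{8}$$
and, in general,
$$\nu(n,\beta) = q + \beta^n P^n\nu(0) + \sum_{k=1}^{n-1} \beta^k P^k q.$$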
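As a numerical sanity check, the unrolled expression can be compared with the step-by-step recurrence. The sketch below treats the leading q as the k = 0 term of the sum, so the whole sum runs over k = 0, ..., n-1. The data and the function names `iterate` and `closed_form` are hypothetical and serve only as an illustration.

```python
def mat_vec(P, v):
    """Matrix-vector product P v for a matrix stored as a list of rows."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]


def iterate(q, P, v0, beta, n):
    """Step-by-step recurrence v(k) = q + beta * P v(k-1), k = 1, ..., n."""
    v = list(v0)
    for _ in range(n):
        v = [q_i + beta * x for q_i, x in zip(q, mat_vec(P, v))]
    return v


def closed_form(q, P, v0, beta, n):
    """v(n) = sum_{k=0}^{n-1} beta^k P^k q + beta^n P^n v(0)."""
    total = [0.0] * len(q)
    term = list(q)                       # beta^0 P^0 q
    for _ in range(n):
        total = [t + x for t, x in zip(total, term)]
        term = [beta * x for x in mat_vec(P, term)]   # next beta^k P^k q
    tail = list(v0)
    for _ in range(n):
        tail = [beta * x for x in mat_vec(P, tail)]   # builds beta^n P^n v(0)
    return [t + x for t, x in zip(total, tail)]


# Hypothetical data (illustrative values only).
q = [6.0, -2.5]
P = [[0.5, 0.5], [0.25, 0.75]]
v0 = [1.0, 2.0]
beta = 0.5

recursive = iterate(q, P, v0, beta, 5)
unrolled = closed_form(q, P, v0, beta, 5)
print(recursive, unrolled)
```

The two results should agree up to floating-point rounding for any number of steps n.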
Taking into consideration the fact that
$$q \equiv \beta^0 P^0 q$$