5 Conclusions and comments
Identical results for ν∞(β) would be obtained by using formula (10) directly
(setting aside the difficulties connected with inverting the matrix (I - βP)).
Using formula (19), we can separate two components of the total reward: a
constant component, connected with the factor 1/(1 - β) and the ergodic matrix,
and a variable component, which represents the part of ν∞(β) that arises under
the influence of the unsteady transient process. The effect of this process is
especially visible during the initial phase of the Markov Decision Process. The
relative contribution of this component to ν∞(β) grows as the discount factor β
decreases and as the disturbances generated by the matrix P increase, as the two
presented examples show.
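The separation described above can be written out explicitly. The following is only a sketch under assumed notation, since formulas (10) and (19) are not reproduced in this section: it assumes that formula (10) has the form ν∞(β) = (I - βP)^{-1} q with a reward vector q, that P is diagonalizable with eigenvalues λ_1 = 1 and |λ_k| < 1 for k = 2, ..., N, and that A_1, ..., A_N denote the corresponding spectral matrices (A_1 being the ergodic matrix).

```latex
\[
\nu_\infty(\beta) = (I - \beta P)^{-1} q = \sum_{n=0}^{\infty} \beta^{n} P^{n} q ,
\qquad
P^{n} = A_1 + \sum_{k=2}^{N} \lambda_k^{n} A_k ,
\]
\[
\nu_\infty(\beta)
  = \underbrace{\frac{1}{1-\beta}\, A_1 q}_{\text{constant component}}
  + \underbrace{\sum_{k=2}^{N} \frac{1}{1-\lambda_k \beta}\, A_k q}_{\text{transient component}} ,
\qquad 0 \le \beta < 1 .
\]
```

Under these assumptions each geometric series converges because |λ_k β| < 1, and the second sum collects everything that the powers P^n contribute before they settle to the ergodic matrix.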
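Relying on the above observations, we can create a performance index for the
tested Discounted Markov Decision Processes. Let $\nu_\infty^{1}(\beta)$ denote the
component of ν∞(β) that carries the factor $1/(1-\beta)$, and let
$\nu_\infty^{k}(\beta)$, k = 2, 3, ..., N, denote the components that carry the
factors $1/(1-\lambda_k\beta)$. Then the performance index of the Markov chain in
question can have the following form:
\[
J(\beta) = \left[ J_i(\beta) \right]
         = \left[ \frac{\sum_{k=2,3,\ldots} \nu_{\infty,i}^{k}(\beta)}{\nu_{\infty,i}^{1}(\beta)} \right]
\qquad (20)
\]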
From the definition of the coefficient J(β) it follows that, for a given β, the
closer the absolute value of this coefficient is to zero, the better the
properties of the tested Markov process. For the two presented examples, the
values of the coefficients are:
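\[
J(0.5) = \begin{bmatrix} 0.151 \\ -0.151 \\ -1.005 \end{bmatrix},
\qquad
J(0.99) = \begin{bmatrix} 0.002 \\ -0.002 \\ -1.032 \end{bmatrix}
\]
\[
J(0.5) = \begin{bmatrix} 2.630 \\ -2.104 \end{bmatrix},
\qquad
J(0.99) = \begin{bmatrix} 0.05 \\ -0.04 \end{bmatrix}
\]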
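An index of this kind can be checked numerically. The sketch below is a minimal illustration, not the authors' procedure: it assumes the total reward is ν∞(β) = (I - βP)^{-1} q for a reward vector q, and that P is ergodic. Instead of the spectral formula, it obtains the constant component from the stationary distribution π as (π·q)/(1 - β) and treats the remainder as the transient component. The function name `performance_index` and the values of `P` and `q` are hypothetical, chosen only for illustration.

```python
import numpy as np


def performance_index(P, q, beta):
    """Split the discounted total reward into constant and transient parts.

    Assumes P is an ergodic stochastic matrix and q a reward vector
    (both are illustrative assumptions, not taken from the paper).
    Returns (constant, transient, J) with J = transient / constant.
    """
    n = P.shape[0]
    # Total discounted reward: solve (I - beta * P) v = q.
    v = np.linalg.solve(np.eye(n) - beta * P, q)
    # Stationary distribution: pi P = pi together with sum(pi) = 1,
    # solved as a consistent overdetermined linear system.
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Constant component: the same value (pi . q) / (1 - beta) in every state.
    constant = np.full(n, (pi @ q) / (1.0 - beta))
    # Transient component: whatever remains of the total reward.
    transient = v - constant
    return constant, transient, transient / constant


# Hypothetical two-state chain and rewards (assumed values).
P = np.array([[0.5, 0.5],
              [0.4, 0.6]])
q = np.array([6.0, -3.0])

for beta in (0.5, 0.99):
    _, _, J = performance_index(P, q, beta)
    print(beta, np.round(J, 3))
```

For these assumed inputs the index is approximately (2.632, -2.105) at β = 0.5 and (0.055, -0.044) at β = 0.99, which is close to the second pair of values reported above, although the inputs here are assumed rather than taken from the text. The index shrinks towards zero as β approaches 1, because the constant component grows like 1/(1 - β) while the transient component stays bounded.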
We observe that a given stochastic matrix P always generates the same transient
process for n = 0, 1, 2, ... (that is, the sequence of powers P^n). The
calculations show that the effect of this process depends on the value of β.
Hence optimisation of a Discounted Markov Decision Process can rely on selecting
a sufficiently large factor β for a given quality coefficient. Usually, however,
β is given and depends on various economic and technical conditions. Then
optimisation of the MDP can rely on the selection of an adequate matrix