AN ANALYTICAL METHOD TO CALCULATE THE ERGODIC AND DIFFERENCE MATRICES OF THE DISCOUNTED MARKOV DECISION PROCESSES



5 Conclusions and comments

Identical results for ν(β) would be obtained by applying formula (10) directly
(setting aside the difficulties connected with inverting the matrix (I - βP)).
Formula (19), however, lets us separate the total reward into two components: a
constant component, associated with the factor 1/(1 - β) and the ergodic matrix,
and a variable component, which represents the part of ν(β) that arises under the
influence of the unsteady transient process. The effect of this process is
especially visible during the initial phase of the Markov decision process. The
value of this transient part of ν(β) rises as the discount factor β decreases and
as the disturbances generated by the matrix P increase. The two presented
examples illustrate this.
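The separation into a constant and a transient component can be illustrated on a small case. The sketch below is only an illustration under stated assumptions: it takes a hypothetical two-state chain P = [[1-a, a], [b, 1-b]] with a hypothetical reward vector r, uses the well-known closed form P^n = P_inf + lam^n * D for such a chain (lam = 1 - a - b), and assumes that the total discounted reward is the sum over n of β^n P^n r. The function names are not taken from the paper.

```python
def two_state_components(a, b, r, beta):
    """Split the discounted reward of a two-state chain into two parts.

    Assumed (hypothetical) chain: P = [[1-a, a], [b, 1-b]], 0 < a + b < 2,
    reward vector r of length 2, discount factor 0 < beta < 1.
    """
    s = a + b
    lam = 1.0 - s  # second eigenvalue of P (the first one equals 1)
    # Ergodic matrix (limit of P^n) and difference matrix D,
    # chosen so that P^n = p_inf + lam**n * diff for n = 0, 1, 2, ...
    p_inf = [[b / s, a / s], [b / s, a / s]]
    diff = [[a / s, -a / s], [-b / s, b / s]]
    # Constant component: multiplies 1 / (1 - beta).
    constant = [sum(p_inf[i][j] * r[j] for j in range(2)) / (1.0 - beta)
                for i in range(2)]
    # Transient component: multiplies 1 / (1 - lam * beta).
    transient = [sum(diff[i][j] * r[j] for j in range(2)) / (1.0 - lam * beta)
                 for i in range(2)]
    return constant, transient


def discounted_reward_by_iteration(P, r, beta, sweeps=5000):
    """Evaluate nu = r + beta * P * nu by fixed-point iteration (no inverse)."""
    n = len(r)
    nu = [0.0] * n
    for _ in range(sweeps):
        nu = [r[i] + beta * sum(P[i][j] * nu[j] for j in range(n))
              for i in range(n)]
    return nu


if __name__ == "__main__":
    a, b, beta = 0.3, 0.2, 0.9
    r = [1.0, 2.0]
    constant, transient = two_state_components(a, b, r, beta)
    total = discounted_reward_by_iteration([[1 - a, a], [b, 1 - b]], r, beta)
    for i in range(2):
        print(i, constant[i], transient[i], constant[i] + transient[i], total[i])
```

In this sketch the sum of the two components should agree with the iteratively computed total reward up to rounding, which is the numerical counterpart of the statement that both routes give identical results.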

Building on the above observations, we can construct a performance index for the
tested discounted Markov decision processes. Let ν_1(β) denote the component that
multiplies the factor 1/(1 - β), and let ν_k(β), k = 2, 3, ..., N, denote the
components that multiply the factors 1/(1 - λ_k β). The performance index of the
Markov chain in question can then take the following form:

\[
J(\beta) = [J_i(\beta)] = \left[ \frac{\sum_{k=2,3,\ldots} \nu_{k,i}(\beta)}{\nu_{1,i}(\beta)} \right] \tag{20}
\]
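A ratio of this kind can be evaluated numerically without an explicit spectral decomposition. The following sketch is an illustration only and rests on several assumptions that are not stated in the text: the chain is ergodic (so all rows of the ergodic matrix equal the stationary distribution), the transient components sum to the total reward minus the constant component, and the constant component is non-zero. The matrix, rewards and function names are hypothetical.

```python
def stationary_distribution(P, sweeps=10000):
    """Approximate the stationary row vector pi = pi * P by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(sweeps):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def performance_index(P, r, beta, sweeps=10000):
    """Return J_i = (transient part of nu_i) / (constant part of nu_i).

    Assumes an ergodic chain, 0 < beta < 1, and a non-zero average reward.
    """
    n = len(r)
    pi = stationary_distribution(P, sweeps)
    # Constant component: the same for every starting state of an ergodic chain.
    constant = sum(pi[j] * r[j] for j in range(n)) / (1.0 - beta)
    # Total discounted reward from the fixed point nu = r + beta * P * nu.
    nu = [0.0] * n
    for _ in range(sweeps):
        nu = [r[i] + beta * sum(P[i][j] * nu[j] for j in range(n))
              for i in range(n)]
    # Whatever is left after removing the constant part is the transient part.
    return [(nu[i] - constant) / constant for i in range(n)]


if __name__ == "__main__":
    P = [[0.7, 0.3], [0.2, 0.8]]  # hypothetical stochastic matrix
    r = [1.0, 2.0]                # hypothetical rewards
    for beta in (0.5, 0.99):
        print(beta, performance_index(P, r, beta))
```

For this hypothetical chain the absolute values of the index are smaller at β = 0.99 than at β = 0.5, in line with the observation that the transient part matters less as β approaches 1.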


It follows from the definition of the coefficient J(β) that, for a given β, the
closer its absolute value is to zero, the better the properties of the tested
Markov process. For the two presented examples, the coefficients take the
following values, in order:

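\[
J(0.5) = \begin{bmatrix} 0.151 \\ -0.151 \\ -1.005 \end{bmatrix}, \qquad
J(0.99) = \begin{bmatrix} 0.002 \\ -0.002 \\ -1.032 \end{bmatrix}
\]

\[
J(0.5) = \begin{bmatrix} 2.630 \\ -2.104 \end{bmatrix}, \qquad
J(0.99) = \begin{bmatrix} 0.05 \\ -0.04 \end{bmatrix}
\]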


We observe that a given stochastic matrix P always generates the same transient
process for n = 0, 1, 2, ... (that is, the sequence P^n). The calculations show
that the effect of this process depends on the value of β. Hence optimisation of
a discounted Markov decision process can rely on selecting a sufficiently large
factor β for a given quality coefficient. Usually, however, β is given and
depends on various economic and technical conditions. Optimisation of the MDP can
then rely on the selection of an adequate matrix
