AN ANALYTICAL METHOD TO CALCULATE THE ERGODIC AND DIFFERENCE MATRICES OF THE DISCOUNTED MARKOV DECISION PROCESSES



3 Method of calculation of ergodic and difference matrices

We again consider the expression for the total discounted rewards given by formula (10):

$$\nu(\beta) = (I - \beta P)^{-1} q. \tag{11}$$
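As an illustration of formula (11), the following minimal sketch evaluates the vector of total discounted rewards by solving the linear system (I - βP) ν = q in exact rational arithmetic. The two-state matrix `P`, the reward vector `q`, the value `beta = 1/2`, and the helper names `solve` and `discounted_rewards` are assumptions chosen only for this example; they do not come from the paper.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(A)
    # Augmented matrix [A | b]; built from copies so the inputs stay unchanged.
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Find a row with a nonzero pivot (raises StopIteration if A is singular).
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        pivot = M[col][col]
        M[col] = [x / pivot for x in M[col]]
        # Eliminate the current column from every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def discounted_rewards(P, q, beta):
    """Return nu(beta) = (I - beta P)^(-1) q, as in formula (11)."""
    n = len(P)
    A = [[(1 if i == j else 0) - beta * P[i][j] for j in range(n)]
         for i in range(n)]
    return solve(A, q)

# Hypothetical example data (not taken from the paper).
P = [[F(1, 2), F(1, 2)],
     [F(2, 5), F(3, 5)]]
q = [F(6), F(-3)]
beta = F(1, 2)

nu = discounted_rewards(P, q, beta)
print(nu)
```

The result can be cross-checked against the equivalent fixed-point form ν = q + βPν, which is just formula (11) multiplied through by (I - βP).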

It is easy to see that

$$(I - \beta P)^{-1} = \frac{1}{\det(I - \beta P)} \, (I - \beta P)^{\mathrm{ad}}, \tag{12}$$

where $(I - \beta P)^{\mathrm{ad}}$ is the adjugate of the matrix $(I - \beta P)$, that is, the transposed matrix of its algebraic complements (cofactors). Next,
we can write

$$(I - \beta P)^{\mathrm{ad}} = [D_{ji}(\beta)], \quad i, j = 1, \dots, N, \tag{13}$$


where $D_{ji}(\beta) = (-1)^{j+i} M_{ji}(\beta)$ and $M_{ji}(\beta)$ is a minor of the matrix $(I - \beta P)^{T}$;
hence

$$(I - \beta P)^{-1} = \frac{[D_{ji}(\beta)]}{\det(I - \beta P)}. \tag{14}$$
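Formulas (12) to (14) can be checked on a small case. The sketch below builds the adjugate from signed minors and divides it by the determinant. The matrix `P`, the value `beta = 1/2`, and the helper names (`minor_matrix`, `det`, `adjugate`) are illustrative assumptions, and the Laplace expansion used for the determinant is practical only for small N.

```python
from fractions import Fraction as F

def minor_matrix(A, r, c):
    """Return A with row r and column c deleted."""
    n = len(A)
    return [[A[i][j] for j in range(n) if j != c]
            for i in range(n) if i != r]

def det(A):
    """Determinant by Laplace expansion along the first row (small N only)."""
    if not A:
        return 1  # determinant of the empty matrix
    return sum((-1) ** j * A[0][j] * det(minor_matrix(A, 0, j))
               for j in range(len(A)))

def adjugate(A):
    """Entry (i, j) is the cofactor of entry (j, i) of A, i.e. the signed
    minor of the transposed matrix, as in formula (13)."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor_matrix(A, j, i)) for j in range(n)]
            for i in range(n)]

# Hypothetical example data (not taken from the paper).
P = [[F(1, 2), F(1, 2)],
     [F(2, 5), F(3, 5)]]
beta = F(1, 2)
n = len(P)

# A = I - beta P
A = [[(1 if i == j else 0) - beta * P[i][j] for j in range(n)]
     for i in range(n)]

d = det(A)
adj = adjugate(A)
# Formula (14): the inverse is the adjugate divided by the determinant.
inv = [[adj[i][j] / d for j in range(n)] for i in range(n)]
print(d, inv)
```

Multiplying `A` by `inv` should return the identity matrix, which confirms the index convention used for the cofactors.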


Theorem:

Let the determinant of the matrix $(I - \beta P)$ have real and simple (non-repeated) roots. Then for each
stochastic matrix $P$ and discount factor $\beta < 1$ there exist numbers
$\alpha_k \neq 0$, $k = 1, 2, \dots, N$, such that
the following formula holds:

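$$(I - \beta P)^{-1} = \frac{[D_{ji}^{(1)}]}{1 - \alpha_1 \beta} + \frac{[D_{ji}^{(2)}]}{1 - \alpha_2 \beta} + \dots + \frac{[D_{ji}^{(N)}]}{1 - \alpha_N \beta}, \tag{15}$$

where

$$\det(I - \beta P) = (1 - \alpha_1 \beta)(1 - \alpha_2 \beta) \cdots (1 - \alpha_N \beta), \tag{16}$$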
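A decomposition of the form (15) and (16) can be verified numerically on a small example. The sketch below rests on two assumptions that are not stated in the text above: that the α_k coincide with the eigenvalues of P (which follows from det(I - βP) = ∏(1 - λ_k β) over the eigenvalues λ_k of P), and that the numerator matrices can be obtained from Sylvester's formula for distinct eigenvalues, E_k = ∏_{m≠k} (P - α_m I)/(α_k - α_m). The matrix `P`, its eigenvalues entered by hand in `alphas`, and all function names are hypothetical.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Plain matrix product of two lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]

def numerator_matrices(P, alphas):
    """Sylvester's formula: E_k = prod over m != k of (P - a_m I)/(a_k - a_m).
    Valid only when the values in alphas are pairwise distinct."""
    n = len(P)
    result = []
    for k, a_k in enumerate(alphas):
        E = identity(n)
        for m, a_m in enumerate(alphas):
            if m != k:
                factor = [[(P[i][j] - (a_m if i == j else 0)) / (a_k - a_m)
                           for j in range(n)] for i in range(n)]
                E = matmul(E, factor)
        result.append(E)
    return result

def partial_fraction_inverse(P, alphas, beta):
    """Sum of E_k / (1 - alpha_k beta), the right-hand side of (15)."""
    n = len(P)
    S = [[F(0)] * n for _ in range(n)]
    for a_k, E in zip(alphas, numerator_matrices(P, alphas)):
        S = [[S[i][j] + E[i][j] / (1 - a_k * beta) for j in range(n)]
             for i in range(n)]
    return S

# Hypothetical example data (not taken from the paper).
P = [[F(1, 2), F(1, 2)],
     [F(2, 5), F(3, 5)]]
alphas = [F(1), F(1, 10)]   # eigenvalues of this P, entered by hand
beta = F(1, 2)

S = partial_fraction_inverse(P, alphas, beta)

# Check: (I - beta P) * S must be the identity matrix.
A = [[(1 if i == j else 0) - beta * P[i][j] for j in range(2)]
     for i in range(2)]
print(S, matmul(A, S) == identity(2))
```

For this example the factorization (16) reads det(I - βP) = (1 - β)(1 - β/10), and the check holds for any β at which both factors are nonzero.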


