AN ANALYTICAL METHOD TO CALCULATE THE ERGODIC AND DIFFERENCE MATRICES OF THE DISCOUNTED MARKOV DECISION PROCESSES



3 Method of calculation of ergodic and difference matrices

We again consider the expression for the total discounted rewards given by formula (10):

$$\nu(\beta) = (I - \beta P)^{-1} q. \tag{11}$$
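As a numerical illustration of formula (11), the sketch below evaluates $\nu(\beta)$ by solving the linear system $(I - \beta P)\,\nu = q$ with exact fractions instead of forming the inverse explicitly. The two-state matrix `P`, the reward vector `q`, the value of `beta` and the helper names `solve` and `discounted_rewards` are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(A)
    # Augmented matrix [A | b].
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Normalise the pivot row.
        pivot_value = M[col][col]
        M[col] = [x / pivot_value for x in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def discounted_rewards(P, q, beta):
    """Return nu(beta) = (I - beta P)^{-1} q as in formula (11)."""
    n = len(P)
    A = [[Fraction(int(i == j)) - beta * P[i][j] for j in range(n)]
         for i in range(n)]
    return solve(A, q)


# Hypothetical example data (not from the paper).
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]
q = [Fraction(1), Fraction(2)]
beta = Fraction(1, 2)

nu = discounted_rewards(P, q, beta)
print(nu)  # [Fraction(18, 7), Fraction(26, 7)]
```

The result can be checked against the fixed-point relation $\nu = q + \beta P \nu$, which is equivalent to (11).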

It is easy to see that

$$(I - \beta P)^{-1} = \frac{(I - \beta P)^{\mathrm{ad}}}{\det(I - \beta P)}, \tag{12}$$

where $(I - \beta P)^{\mathrm{ad}}$ is the adjugate of the matrix $(I - \beta P)$, that is, the matrix of its algebraic complements. Next
we can write

$$(I - \beta P)^{\mathrm{ad}} = [D_{ji}(\beta)], \quad i, j = 1, \ldots, N, \tag{13}$$


where $D_{ji}(\beta) = (-1)^{j+i} M_{ji}(\beta)$, and $M_{ji}(\beta)$ is a minor of the matrix $(I - \beta P)^{T}$;
hence

$$(I - \beta P)^{-1} = \frac{[D_{ji}(\beta)]}{\det(I - \beta P)}. \tag{14}$$
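Formulas (12) to (14) can be tried out directly on a small example. The following is a minimal sketch, assuming a hypothetical three-state stochastic matrix `P3` and discount factor `beta3`; the helper names (`minor`, `det`, `adjugate`, `resolvent`) are likewise illustrative. It builds the adjugate from signed minors and divides by the determinant, using Laplace expansion, which is only practical for small $N$.

```python
from fractions import Fraction


def minor(A, i, j):
    """Matrix A with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by Laplace expansion along the first row."""
    if not A:
        return Fraction(1)  # determinant of the empty matrix
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))


def adjugate(A):
    """Entry (i, j) is the cofactor of entry (j, i) of A, as in (13)."""
    n = len(A)
    return [[(-1) ** (j + i) * det(minor(A, j, i)) for j in range(n)]
            for i in range(n)]


def resolvent(P, beta):
    """Return (I - beta P)^{-1} as adjugate / determinant, formula (14)."""
    n = len(P)
    A = [[Fraction(int(i == j)) - beta * P[i][j] for j in range(n)]
         for i in range(n)]
    d = det(A)
    return [[x / d for x in row] for row in adjugate(A)]


# Hypothetical example data (not from the paper).
P3 = [[Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)],
      [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)],
      [Fraction(0), Fraction(1, 2), Fraction(1, 2)]]
beta3 = Fraction(9, 10)

R = resolvent(P3, beta3)
for row in R:
    print(row)
```

Because each row of a stochastic matrix sums to one, each row of $(I - \beta P)^{-1}$ sums to $1/(1 - \beta)$, which gives a quick sanity check on the output.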


Theorem:

Let the determinant of the matrix $(I - \beta P)$ have real and single (non-repeated) roots. Then for each
stochastic matrix $P$ and discount factor $\beta < 1$ there exist $\alpha_k \neq 0$, $k = 1, 2, \ldots, N$,
such that the following formula holds:

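$$(I - \beta P)^{-1} = \frac{[D_{ji}^{(1)}]}{1 - \alpha_1 \beta} + \frac{[D_{ji}^{(2)}]}{1 - \alpha_2 \beta} + \ldots + \frac{[D_{ji}^{(N)}]}{1 - \alpha_N \beta}, \tag{15}$$

where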


$$\det(I - \beta P) = (1 - \alpha_1 \beta)(1 - \alpha_2 \beta) \cdots (1 - \alpha_N \beta). \tag{16}$$
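The decomposition in (15) and (16) can be checked on a two-state example. The sketch below is an illustration under stated assumptions rather than the author's derivation: the matrix `P` is hypothetical, the $\alpha_k$ are taken to be the eigenvalues of $P$ (for a 2×2 stochastic matrix these are $1$ and $\operatorname{tr} P - 1$), and the constant numerator matrices are computed as spectral projectors $(P - \alpha_j I)/(\alpha_k - \alpha_j)$, $j \neq k$, which may differ from how the paper obtains them.

```python
from fractions import Fraction

# Hypothetical two-state stochastic matrix (not from the paper).
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]

# Eigenvalues of a 2x2 stochastic matrix: 1 and trace(P) - 1.
alpha1 = Fraction(1)
alpha2 = P[0][0] + P[1][1] - 1


def projector(P, a_k, a_other):
    """Assumed numerator matrix for the term 1 / (1 - a_k * beta)."""
    return [[(P[i][j] - a_other * int(i == j)) / (a_k - a_other)
             for j in range(2)] for i in range(2)]


D1 = projector(P, alpha1, alpha2)
D2 = projector(P, alpha2, alpha1)


def inverse_by_partial_fractions(beta):
    """Right-hand side of (15) for N = 2."""
    return [[D1[i][j] / (1 - alpha1 * beta) + D2[i][j] / (1 - alpha2 * beta)
             for j in range(2)] for i in range(2)]


def inverse_direct(beta):
    """(I - beta P)^{-1} from the 2x2 adjugate / determinant formula."""
    a, b = 1 - beta * P[0][0], -beta * P[0][1]
    c, d = -beta * P[1][0], 1 - beta * P[1][1]
    determinant = a * d - b * c
    # Factorisation (16) of the determinant.
    assert determinant == (1 - alpha1 * beta) * (1 - alpha2 * beta)
    return [[d / determinant, -b / determinant],
            [-c / determinant, a / determinant]]


beta = Fraction(1, 2)
print(inverse_by_partial_fractions(beta) == inverse_direct(beta))  # True
```

In this example the numerator matrix belonging to $\alpha_1 = 1$ has identical rows $(1/3,\ 2/3)$, the stationary distribution of `P`.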


