AN ANALYTICAL METHOD TO CALCULATE THE ERGODIC AND DIFFERENCE MATRICES OF THE DISCOUNTED MARKOV DECISION PROCESSES



3 Method of calculation of ergodic and difference matrices

We again consider the expression for the total discounted rewards given by formula (10):

$$\nu(\beta) = (I - \beta P)^{-1} q. \tag{11}$$
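As a minimal numerical sketch of formula (11), the snippet below evaluates the total discounted rewards for a small example. The two-state matrix `P`, the reward vector `q`, and the value of `beta` are illustrative assumptions, not data from the paper.

```python
import numpy as np

# Hypothetical two-state example; P, q and beta are illustrative only.
P = np.array([[0.5, 0.5],
              [0.4, 0.6]])   # row-stochastic transition matrix
q = np.array([6.0, -3.0])    # expected one-step rewards
beta = 0.9                   # discount factor, 0 <= beta < 1

I = np.eye(P.shape[0])

# Formula (11): nu = (I - beta P)^{-1} q.
# Solving the linear system avoids forming the inverse explicitly.
nu = np.linalg.solve(I - beta * P, q)

# Equivalent fixed-point form: nu = q + beta P nu.
assert np.allclose(nu, q + beta * P @ nu)
print(nu)
```

Solving the linear system is used here only for numerical convenience; the analytical method of this section works with the explicit inverse instead.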

It is easy to see that

$$(I - \beta P)^{-1} = \frac{(I - \beta P)^{\mathrm{ad}}}{\det(I - \beta P)}, \tag{12}$$

where $(I - \beta P)^{\mathrm{ad}}$ is the adjugate of the matrix $(I - \beta P)$, that is, the transposed matrix of its algebraic complements (cofactors). Next
we can write

$$(I - \beta P)^{\mathrm{ad}} = [D_{ji}(\beta)], \quad i, j = 1, \ldots, N, \tag{13}$$


where $D_{ji}(\beta) = (-1)^{j+i} M_{ji}(\beta)$, and $M_{ji}(\beta)$ is a minor of the matrix $(I - \beta P)^T$;
hence

$$(I - \beta P)^{-1} = \frac{[D_{ji}(\beta)]}{\det(I - \beta P)}. \tag{14}$$
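Formula (14) can be checked numerically with a short sketch that builds the adjugate from signed minors and divides by the determinant. The helper name `adjugate`, the three-state matrix `P`, and the value of `beta` are assumptions introduced for illustration only.

```python
import numpy as np

def adjugate(A):
    """Adjugate of a square matrix A, built entry by entry from signed minors."""
    n = A.shape[0]
    adj = np.empty_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            # Entry (i, j) of the adjugate is the cofactor of entry (j, i):
            # delete row j and column i of A, then take the signed determinant.
            minor = np.delete(np.delete(A, j, axis=0), i, axis=1)
            adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return adj

# Hypothetical three-state stochastic matrix and discount factor.
P = np.array([[0.5, 0.3, 0.2],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
beta = 0.9
A = np.eye(3) - beta * P

# Formula (14): the inverse equals the adjugate divided by the determinant.
inv_via_adj = adjugate(A) / np.linalg.det(A)
assert np.allclose(inv_via_adj, np.linalg.inv(A))
```

The cofactor construction costs far more than a direct inverse for large N; it is shown only because it mirrors the derivation in (12)-(14).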


Theorem:

Let the determinant of the matrix $(I - \beta P)$ have real, simple (non-repeated) roots. Then for each
stochastic matrix $P$ and discount factor $\beta < 1$ there exist
$\alpha_k \neq 0$, $k = 1, 2, \ldots, N$, such that
the following formula holds:

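$$(I - \beta P)^{-1} = \frac{[D^{1}_{ji}]}{1 - \alpha_1 \beta} + \frac{[D^{2}_{ji}]}{1 - \alpha_2 \beta} + \cdots + \frac{[D^{N}_{ji}]}{1 - \alpha_N \beta}, \tag{15}$$

where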


$$\det(I - \beta P) = (1 - \alpha_1 \beta)(1 - \alpha_2 \beta) \cdots (1 - \alpha_N \beta), \tag{16}$$
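The factorization (16) and the partial-fraction form (15) can be illustrated numerically. The sketch below rests on two assumptions that are not stated in the text above: the $\alpha_k$ are taken to be the eigenvalues of $P$ (which follows from expanding $\det(I - \beta P)$ as a polynomial in $\beta$), and the coefficient matrices in (15) are computed as the rank-one spectral projectors of $P$ rather than from the adjugate. The matrix `P` and the value of `beta` are hypothetical.

```python
import numpy as np

# Hypothetical three-state stochastic matrix; it is symmetric, so its
# eigenvalues are real, and they are distinct and nonzero (1.0, 0.3, 0.1).
P = np.array([[0.5, 0.3, 0.2],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
beta = 0.9
N = P.shape[0]
I = np.eye(N)

# Assumption: alpha_k are the eigenvalues of P; columns of V are eigenvectors.
alpha, V = np.linalg.eig(P)
W = np.linalg.inv(V)  # rows of W are the matching left eigenvectors

# Formula (16): det(I - beta P) = prod_k (1 - alpha_k beta).
assert np.isclose(np.linalg.det(I - beta * P), np.prod(1 - alpha * beta))

# Formula (15): one constant coefficient matrix per alpha_k, here taken
# to be the rank-one spectral projector V[:, k] W[k, :].
D = [np.outer(V[:, k], W[k, :]) for k in range(N)]
partial_fractions = sum(D[k] / (1 - alpha[k] * beta) for k in range(N))
assert np.allclose(partial_fractions, np.linalg.inv(I - beta * P))

# Setting beta = 0 in (15) shows the coefficient matrices sum to I.
assert np.allclose(sum(D), I)
```

Because the coefficient matrices do not depend on $\beta$, the same decomposition can be reused for any discount factor once the $\alpha_k$ are known.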


