Let $f_k(\rho_n)$ denote the predictive probability (PP) function, i.e., the conditional
probability that a hypothetical new $(n+1)$-st unit is allocated to cluster $k$,
conditional on $\rho_n$. Let $K(n)$ denote the number of clusters in $\rho_n$. We find
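\[
f_k(\rho_n) \propto \frac{c(S_k \cup \{n+1\})}{c(S_k)} \propto
\begin{cases}
\#S_k \left( \dfrac{m_{kt} + \beta}{\#S_k + Q\beta} \right)^{\gamma} & \text{for } 1 \le k \le K(n), \\[1ex]
\alpha / Q^{\gamma} & \text{for } k = K(n) + 1,
\end{cases}
\tag{4.5}
\]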
where $x_{n+1} = t$ is the category of the new experimental unit and $c(\emptyset) \equiv 1$. In the
context of our application, $x_i$ is the prognosis for subtype $i$. Posterior simulation
follows a simple modification of standard Gibbs sampling schemes for DP or PPM
models, as described in, e.g., MacEachern and Müller (1998) or Quintana (2006). See
further details in the Appendix.
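The PP function can be illustrated with a short sketch. It assumes that the similarity function is the default Dirichlet-multinomial form for a categorical covariate with a symmetric parameter beta, raised to the power gamma, so that an existing cluster $k$ receives weight $\#S_k \{(m_{kt} + \beta)/(\#S_k + Q\beta)\}^{\gamma}$ and a new cluster receives weight $\alpha / Q^{\gamma}$. The function name `predictive_weights` and its arguments are hypothetical and are not taken from the paper.

```python
def predictive_weights(clusters, x, t, alpha, beta, gamma, Q):
    """Normalized predictive probabilities for a new unit of category t.

    clusters : list of lists of unit indices (the current partition)
    x        : sequence of categories, one per unit, coded 0..Q-1
    Returns K(n) + 1 probabilities; the last entry is for a new cluster.
    Assumption: Dirichlet-multinomial similarity raised to the power gamma.
    """
    weights = []
    for S in clusters:
        n_k = len(S)                                # cluster size #S_k
        m_kt = sum(1 for i in S if x[i] == t)       # members sharing category t
        weights.append(n_k * ((m_kt + beta) / (n_k + Q * beta)) ** gamma)
    weights.append(alpha / Q ** gamma)              # weight for opening a new cluster
    total = sum(weights)
    return [w / total for w in weights]


# Three units in two clusters; the new unit has category 0.
probs = predictive_weights([[0, 1], [2]], x=[0, 0, 1], t=0,
                           alpha=1.0, beta=1.0, gamma=1.0, Q=2)
print(probs)
```

With gamma = 1 the new unit favors the first cluster, whose members all share its category; with gamma = 0 the weights depend on cluster sizes only.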
4.3.2 Some Properties of the NEPPM
The proposed model reduces to the DP Polya urn when all $x_i$ are equal, i.e., $Q = 1$.
As a consequence, the NEPPM reduces to a DPM. That is, if $Q = 1$, then $m_{k1} = \#S_k$
and $d(S_k)$ reduces to
\[
d(S_k) = \frac{\Gamma(\#S_k + \beta)}{\Gamma(\#S_k + \beta)}.
\tag{4.6}
\]
Similarly, when $\gamma = 0$ the similarity function drops out of the model and the NEPPM
reduces to the DPM. On the other hand, the model can easily be extended to more
complicated covariates $x_i$. For multiple categorical covariates, one could introduce
several similarity functions and use the product to modify the cohesion functions in
(4.4).
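The two reductions can be checked numerically. The sketch below assumes a cohesion of the form $\alpha\,(\#S_k - 1)!$ multiplied by a Dirichlet-multinomial similarity raised to the power gamma, and computes predictive weights as cohesion ratios with $c(\emptyset) = 1$. This form is an assumption made for illustration and may differ in detail from (4.4); the function names are hypothetical.

```python
from math import exp, lgamma, log


def log_similarity(counts, beta):
    """Log Dirichlet-multinomial marginal of category counts (assumed similarity)."""
    Q, n = len(counts), sum(counts)
    return (lgamma(Q * beta) - Q * lgamma(beta)
            + sum(lgamma(m + beta) for m in counts)
            - lgamma(n + Q * beta))


def log_cohesion(counts, alpha, beta, gamma):
    """Log of alpha * (n - 1)! * similarity^gamma, with c(empty set) = 1."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return log(alpha) + lgamma(n) + gamma * log_similarity(counts, beta)


def predictive_probs(cluster_counts, t, Q, alpha, beta, gamma):
    """Predictive probabilities from cohesion ratios c(S + new unit) / c(S)."""
    weights = []
    for counts in list(cluster_counts) + [[0] * Q]:   # last entry: empty new cluster
        grown = list(counts)
        grown[t] += 1                                 # add the new unit of category t
        weights.append(exp(log_cohesion(grown, alpha, beta, gamma)
                           - log_cohesion(counts, alpha, beta, gamma)))
    total = sum(weights)
    return [w / total for w in weights]


# Q = 1: weights are (3, 2, alpha) regardless of beta and gamma (Polya urn).
print(predictive_probs([[3], [2]], t=0, Q=1, alpha=1.5, beta=0.7, gamma=2.0))
# gamma = 0: the covariate is ignored, weights are (3, 2, alpha).
print(predictive_probs([[2, 1], [0, 2]], t=0, Q=2, alpha=1.0, beta=0.5, gamma=0.0))
```

In both cases the probabilities are proportional to the cluster sizes and alpha, as under the DP Polya urn.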
For $\gamma = 1$ the model reduces to a special case of the model introduced in Müller et
al. (2009), using the default similarity function for categorical covariates. In general,
they suggest using a similarity function defined with an auxiliary probability model
as
\[
d(S_k) \equiv g(x_k^{\star}) = \int \prod_{i \in S_k} q(x_i \mid \xi_k)\, q(\xi_k)\, d\xi_k,
\]