across related subtypes. This could be done with a hierarchical model consisting of
a submodel for each disease, plus a prior distribution for p_i that assumes that all p_i
arise from some underlying common distribution p(p_i ∣ η). The p_i are interpreted as
subtype-specific effects, and η could characterize an overall success probability. For
example, one could assume logit p_i ~ N(μ, τ) and complete the model with a prior
p(μ, τ). A major limitation of this approach is that it assumes that disease subtypes
are a priori exchangeable. Formally the prior model logit p_i ~ N(μ, τ) is invariant
with respect to arbitrary permutations of the indices i = 1, ..., n. This is not
appropriate since different subtypes are not exchangeable. Disease subtypes with poor
prognosis are known to be different from those with good prognosis. For data analysis
conditional on a large enough data set this might be no problem, as the likelihood
would asymptotically dominate the prior. But for design, when inference is initially
based on very small sample sizes, such details of the prior model can matter. An easy
fix is to replace the exchangeable model with a partially exchangeable model using
a regression. Let x_i ∈ {-1, 0, 1} denote an indicator for the subtype prognosis. We
could assume logit p_i ~ N(μ_i, τ) with μ_i = β_0 + β_1 x_i. The problem with this approach
is that disease subtypes are grouped by overall prognosis and this grouping is frozen.
However, while overall prognosis for a subtype is important, it is not obvious that it
determines the most appropriate grouping. One of the eligibility criteria of the study
is failure of prior therapy, i.e., in a sense all patients enter the study with a poor
prognosis.
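The contrast between the two priors discussed above can be made concrete with a small simulation. The sketch below is illustrative only: it treats τ as a variance (the text does not say whether τ is a variance or a precision), and the numerical values of μ, τ, β_0 and β_1, as well as all function names, are assumptions chosen for the example. Under the exchangeable prior every subtype has the same prior mean success probability, whereas under the regression prior the prior mean shifts with the prognosis indicator x_i.

```python
import math
import random


def inv_logit(z):
    """Map a logit value back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def exchangeable_draw(n, mu, tau, rng):
    """One prior draw of (p_1, ..., p_n) with logit p_i ~ N(mu, tau).

    tau is treated as a variance here (an assumption of this sketch).
    """
    sd = math.sqrt(tau)
    return [inv_logit(rng.gauss(mu, sd)) for _ in range(n)]


def regression_draw(x, beta0, beta1, tau, rng):
    """One prior draw with logit p_i ~ N(mu_i, tau), mu_i = beta0 + beta1 * x_i."""
    sd = math.sqrt(tau)
    return [inv_logit(rng.gauss(beta0 + beta1 * xi, sd)) for xi in x]


def prior_means(draw, n_draws=20000):
    """Monte Carlo estimate of the prior mean of each p_i."""
    total = None
    for _ in range(n_draws):
        p = draw()
        total = p if total is None else [t + v for t, v in zip(total, p)]
    return [t / n_draws for t in total]


rng = random.Random(0)
x = [-1, 0, 1]  # poor, intermediate, good prognosis (hypothetical coding)

# Hypothetical hyperparameter values, fixed only for illustration.
exch = prior_means(lambda: exchangeable_draw(len(x), 0.0, 1.0, rng))
reg = prior_means(lambda: regression_draw(x, 0.0, 1.0, 1.0, rng))

print("exchangeable prior means:", [round(v, 3) for v in exch])
print("regression prior means:  ", [round(v, 3) for v in reg])
```

Permuting the subtype labels leaves the first set of prior means unchanged up to Monte Carlo error, which is the exchangeability property described above. The second set is ordered by x_i, so the grouping by prognosis is built into the prior and cannot be revised by the data.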
We propose a novel approach that can be characterized as intermediate between an
exchangeable hierarchical model and a regression. We treat the appropriate grouping
as a random quantity, ρ, and define a probability model for this random partition ρ.
The model is indexed with the covariates x_i. Thus the model includes a priori a preference
for grouping by x_i, but allows for alternative grouping as the data dictates. In
other words, we propose a semi-parametric model that respects the non-exchangeable
nature present in the data.
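To illustrate what a prior on partitions indexed by covariates can look like, the toy sketch below enumerates all partitions of four hypothetical subtypes and assigns each a prior probability. The passage above does not specify the form of the partition model, so everything here is an assumption made for illustration: the weight of a partition is taken to be a product over its clusters of a size-based factor M (|S| - 1)! and a similarity factor exp(-λ · SS_S), where SS_S is the within-cluster sum of squared deviations of the x_i. The names `partition_prior`, `M` and `lam` are hypothetical.

```python
import math


def set_partitions(items):
    """Yield every partition of a list as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # Either 'first' forms its own block ...
        yield [[first]] + part
        # ... or it joins one of the existing blocks.
        for k in range(len(part)):
            yield part[:k] + [[first] + part[k]] + part[k + 1:]


def partition_weight(part, x, M, lam):
    """Unnormalized prior weight: product of cluster-wise factors."""
    w = 1.0
    for block in part:
        xs = [x[i] for i in block]
        mean = sum(xs) / len(xs)
        ss = sum((v - mean) ** 2 for v in xs)
        # Size-based factor times a covariate similarity factor (assumed forms).
        w *= M * math.factorial(len(block) - 1) * math.exp(-lam * ss)
    return w


def partition_prior(x, M=1.0, lam=1.0):
    """Normalized prior over all partitions of the indices 0..len(x)-1."""
    weights = {}
    for part in set_partitions(list(range(len(x)))):
        key = tuple(sorted(tuple(sorted(b)) for b in part))
        weights[key] = partition_weight(part, x, M, lam)
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


x = [-1, -1, 1, 1]  # two poor-prognosis and two good-prognosis subtypes
prior = partition_prior(x, M=1.0, lam=1.0)

by_prognosis = ((0, 1), (2, 3))  # grouping that agrees with x
mixed = ((0, 2), (1, 3))         # grouping that cuts across x
print("P(grouping by prognosis):", prior[by_prognosis])
print("P(mixed grouping):       ", prior[mixed])
```

With λ > 0 the grouping that agrees with the covariates receives more prior mass than the mixed grouping, yet every partition keeps positive probability, so the data can still favor an alternative grouping. Setting λ = 0 removes the dependence on x_i, and the two groupings become equally likely a priori.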