probabilities and violations of the monotonicity assumption of the HLRM model. In
S2, monotonicity is overall satisfied but the grouping by covariates is not perfect. In
S3 the grouping is perfect but monotonicity is violated. Finally, in S4 both grouping
by covariates and monotonicity are violated.
We generated M = 200 repeat simulations of the entire trial under these five
scenarios (the match of M = 200 with N is a coincidence). For each simulation
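m = 1, ..., 200, and for each i = 1, ..., n, we estimated $p_i$ by the posterior mean $\bar{p}_i^m$.
We evaluated bias and mean square error by
$$\mathrm{bias}(p_i) \approx \frac{1}{M} \sum_{m=1}^{M} \left(\bar{p}_i^m - p_i\right) \quad \text{and} \quad \mathrm{mse}(p_i) \approx \frac{1}{M} \sum_{m=1}^{M} \left(\bar{p}_i^m - p_i\right)^2,$$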
i.e., we use Monte Carlo averages to evaluate the (frequentist) means with respect to
repeat experimentation. For scenario S0, we fitted the three models described
earlier in this section plus two versions of the proposed NEPPM, once with γ = 1/2
and once with γ = 1 in (4.4). The comparison of all these models is summarized
in Figure 4.1. In terms of MSE, the two versions of the NEPPM perform similarly.
The HLRM (4.10) produces the estimators with the lowest MSE. This
is to be expected since scenario S0 strongly favors the HLRM. The NEPPM and HLRM
perform better than the remaining models because they are the only ones that borrow
strength across sarcoma subtypes and acknowledge the similarity of the success rates
corresponding to the same prognosis. Figure 4.2 compares the coverage probability (CP) of the central 95%
credible interval (CI) under the HLRM vs. the NEPPM with γ = 1. The HLRM has
low CP for the subtypes with the lowest and highest intermediate prognosis success rates.
This is due to the fixed grouping of all intermediate prognosis subtypes, leading to
excessive shrinkage for the subtypes with $x_i = 0$ that have the lowest and highest true $p_i$. The
HLRM does not allow any change or weighting of the grouping.
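
The Monte Carlo evaluation described above (bias, MSE and CP of the central 95% credible interval, averaged over M repeat simulations) can be sketched as follows. This is only an illustration under stated assumptions: it uses an independent Beta(1, 1) prior for each subtype as a stand-in, not the HLRM or the NEPPM, and the function names, the true success rates and the per-subtype sample size are hypothetical choices, not values from the study.

```python
import random


def simulate_binomial(rng, n, p):
    """Draw a Binomial(n, p) count as a sum of n Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))


def posterior_summary(rng, y, n, draws=400, a=1.0, b=1.0):
    """Posterior mean and central 95% credible interval for p.

    Assumes an independent Beta(a, b) prior (a stand-in model), so the
    posterior is Beta(a + y, b + n - y). The interval is approximated
    from sorted posterior draws.
    """
    post_a, post_b = a + y, b + n - y
    mean = post_a / (post_a + post_b)
    samples = sorted(rng.betavariate(post_a, post_b) for _ in range(draws))
    lower = samples[int(0.025 * draws)]
    upper = samples[int(0.975 * draws) - 1]
    return mean, lower, upper


def monte_carlo_metrics(true_p, n_per_group, M=200, seed=0):
    """Monte Carlo bias, MSE and coverage probability for each subtype."""
    rng = random.Random(seed)
    k = len(true_p)
    bias, mse, cp = [0.0] * k, [0.0] * k, [0.0] * k
    for _ in range(M):  # one pass = one simulated trial
        for i, p in enumerate(true_p):
            y = simulate_binomial(rng, n_per_group, p)
            mean, lower, upper = posterior_summary(rng, y, n_per_group)
            err = mean - p
            bias[i] += err / M                # average of (estimate - truth)
            mse[i] += err ** 2 / M            # average squared error
            cp[i] += (lower <= p <= upper) / M  # fraction of CIs covering truth
    return bias, mse, cp


if __name__ == "__main__":
    # Hypothetical true success rates for three subtypes.
    bias, mse, cp = monte_carlo_metrics([0.1, 0.5, 0.9], n_per_group=20)
    for i, (b, m, c) in enumerate(zip(bias, mse, cp), start=1):
        print(f"subtype {i}: bias={b:.3f} mse={m:.4f} cp={c:.2f}")
```

Even in this simplified setting, the estimates for rates far from the prior mean are pulled toward it, which is the same shrinkage mechanism that lowers coverage for extreme subtypes under a fixed grouping.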
Figure 4.3 compares the NEPPM vs. the HLRM under scenarios S1-S4. In scenario S1 the
HLRM performs better overall than the NEPPM. Scenario S2 is the same as S1, except that
the simulation truth swaps one intermediate ($p_8$) for a poor prognosis success