Conclusion
In this thesis I have presented three research projects using non-parametric Bayesian
model-based statistical inference for biomedical data. The common themes of these
analyses are flexible statistical models and a refinement of previously published para-
metric analyses for similar datasets. Some limitations remain.
In the second chapter, I introduced a model to analyze adverse event data gathered
in a phase III clinical trial. Each data record consists of the observed grades of seven
different adverse events exhibited by a patient (including grade 0 for adverse events
that were not recorded for the patient). To my knowledge, this is the first
model-based approach that accounts for and assesses the very plausible correlation
among the grades of the adverse events exhibited by the same patient. In addition, the
proposed model is more flexible than standard models for ordinal data in the sense
that it can fit cell probabilities (i.e., the probability that a patient exhibits
a given adverse event at a given grade) that do not necessarily satisfy the parallel
regression assumption. The parallel regression assumption is the probit analogue of the
proportional odds assumption in the logit model. The data structure in this problem
is very common in other applications, for example in a survey where each question has
an ordinal outcome and it is expected that the answers given by the same respondent
are correlated. The implementation of the model presents one difficulty. The latent
variable that determines the ordinal outcome is distributed according to a mixture
of normal distributions. Determining the correct number of components, G, in this
mixture is not trivial. We empirically explore different values of G and make a