3.5 Selecting Significant Tripeptide-Tissue Pairs
Posterior MCMC allows us to carry out essentially all posterior inference of interest.
In particular, let $N$ denote the observed data and let $p_i = \Pr(\delta_i > \beta_i > 1 \mid N)$ denote
the posterior probability of increasing mean counts for peptide-tissue pair $i$. The
posterior Monte Carlo sample allows easy evaluation of $p_i$ as the empirical average of
$I(\delta_i > \beta_i > 1)$ over all imputed values $\{\delta_i, \beta_i\}$ in the Monte Carlo posterior sample.
Let $d_i \in \{0,1\}$ denote an indicator for reporting significant affinity for the peptide-tissue
pair $i$, i.e., increasing mean counts. A reasonable decision rule is to report all
pairs with marginal posterior probability beyond a threshold $t$:
$$d_i^* = I(p_i > t). \tag{3.31}$$
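The two steps above, estimating each posterior probability as a Monte Carlo average and then thresholding it, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the layout of the draws (one row per MCMC iteration, one column per peptide-tissue pair), the toy numbers, and the cutoff 0.7 are all assumptions made for the example, not part of the original analysis.

```python
def posterior_probabilities(delta_draws, beta_draws):
    """Estimate Pr(delta_i > beta_i > 1 | data) for every pair i.

    Assumed layout (hypothetical): delta_draws[s][i] and beta_draws[s][i]
    hold the values imputed for pair i at MCMC iteration s.
    """
    n_draws = len(delta_draws)
    n_pairs = len(delta_draws[0])
    counts = [0] * n_pairs
    for delta_s, beta_s in zip(delta_draws, beta_draws):
        for i in range(n_pairs):
            # Indicator I(delta_i > beta_i > 1) for this draw.
            if delta_s[i] > beta_s[i] > 1.0:
                counts[i] += 1
    # Empirical average of the indicator over the posterior sample.
    return [c / n_draws for c in counts]


def select_pairs(p, threshold):
    """Threshold rule of the form (3.31): report pair i when p_i exceeds the cutoff."""
    return [1 if p_i > threshold else 0 for p_i in p]


# Toy posterior sample: 4 draws, 3 pairs (made-up numbers, illustration only).
delta_draws = [[3.0, 1.5, 0.8],
               [2.5, 1.2, 2.0],
               [4.0, 2.0, 1.1],
               [3.5, 0.9, 1.6]]
beta_draws = [[2.0, 1.8, 0.5],
              [1.5, 1.1, 2.5],
              [2.2, 1.4, 0.9],
              [1.2, 1.3, 1.2]]

p = posterior_probabilities(delta_draws, beta_draws)
d_star = select_pairs(p, threshold=0.7)  # 0.7 is an arbitrary example cutoff
print(p)       # [1.0, 0.5, 0.25]
print(d_star)  # [1, 0, 0]
```

In a real analysis the draws would come from the sampler's output after burn-in, and the number of draws would be far larger than in this toy example.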
The rule d* can be justified in terms of the False Discovery Rate (FDR) concept
(Newton, 2004) or, alternatively, as an optimal Bayes rule. To define an optimal rule
we need to augment the probability model to a decision problem by introducing a
utility function. Let θ and y generically denote all unknown parameters and all ob-
servable data. A utility function u(d, θ, y) formalizes relative preferences for decision
d under hypothetical outcomes y and under an assumed truth θ. For example, a
utility function could be
$$u(d, \theta, y) = \sum_i d_i \, I(\delta_i > \beta_i > 1) + k \sum_i (1 - d_i)\bigl(1 - I(\delta_i > \beta_i > 1)\bigr), \tag{3.32}$$
i.e., a linear combination of the number of true positive selections $d_i$ and the number of true
negatives. For a given probability model, data, and utility function, the optimal Bayes rule
is defined as the rule that maximizes $u$ in expectation over all unobserved variables,
conditional on all observed variables. In our case,
$$d^{B} = \arg\max_{d} E\bigl(u(d, \theta, y) \mid y\bigr).$$
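For the example utility (3.32), this maximization can be worked out explicitly. The following is a short sketch, assuming that the weight on true negatives in (3.32), written $k$ here, is positive, and writing $p_i$ for the posterior probability of the event $\delta_i > \beta_i > 1$ given the data:

```latex
\begin{align*}
E\bigl(u(d,\theta,y) \mid y\bigr)
  &= \sum_i \bigl[\, d_i\, p_i + k\,(1 - d_i)(1 - p_i) \,\bigr], \\
d_i = 1 \text{ is preferred to } d_i = 0
  &\iff p_i > k\,(1 - p_i)
   \iff p_i > \frac{k}{1 + k}.
\end{align*}
```

The expected utility is a sum of terms that each involve a single $d_i$, so each decision can be optimized separately. Under this utility the Bayes rule therefore has the threshold form (3.31) with cutoff $k/(1+k)$. Ties at $p_i = k/(1+k)$ can be broken either way.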
It can be shown that $d^*$ arises as the Bayes rule under several utility functions that trade
off false positive and false negative counts.
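To make the trade-off concrete, the sketch below evaluates two posterior summaries of a threshold rule: the posterior expected FDR, computed here under the commonly used definition as the average of $1 - p_i$ over the reported pairs, and the posterior expected number of false negatives, computed as the sum of $p_i$ over the unreported pairs. The function names and the probabilities are hypothetical, and this is not claimed to be the computation used in the original analysis.

```python
def posterior_expected_fdr(p, threshold):
    """Average of (1 - p_i) over reported pairs; defined as 0 when nothing is reported."""
    selected = [p_i for p_i in p if p_i > threshold]
    if not selected:
        return 0.0
    return sum(1.0 - p_i for p_i in selected) / len(selected)


def posterior_expected_false_negatives(p, threshold):
    """Sum of p_i over pairs that are not reported."""
    return sum(p_i for p_i in p if not p_i > threshold)


# Hypothetical posterior probabilities for five pairs (illustration only).
p = [0.99, 0.95, 0.80, 0.40, 0.10]

for threshold in (0.3, 0.5, 0.9):
    fdr = posterior_expected_fdr(p, threshold)
    fneg = posterior_expected_false_negatives(p, threshold)
    print(f"threshold={threshold:.1f}  FDR={fdr:.3f}  expected FN={fneg:.2f}")
```

Raising the threshold lowers the expected FDR but increases the expected number of missed pairs, which is the trade-off the utility functions above formalize.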