3.6 A Simulation Study
We carried out a simulation study to validate the proposed approach. We generated
n = 2,000 observations of the model described in (3.26) through (3.30), except that
the random probability measure G is replaced by a Gamma distribution with fixed
parameters
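$$\mu_k \sim \mathrm{Ga}(\mu \mid s^f_\mu,\, s^f_\mu t^f_\mu) \ \text{i.i.d.} \quad \text{and} \quad (\beta_k, \delta_k) \sim \mathrm{Ga}(\beta \mid s^f_\beta,\, s^f_\beta t^f_\beta)\, \mathrm{Ga}(\delta \mid s^f_\delta,\, s^f_\delta t^f_\delta) \tag{3.37}$$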
We set the hyper-parameters so that the expected value and the variance of $\mu_i$ are
small, while $\beta_i$ and $\delta_i$ both have mean 8, with variances 30 and 120 respectively.
The rationale is that $\mu_i$ is interpreted as the mean of the counts across the
three stages for pair $i$ if there were no enrichment. Since the library initially
contains only a small amount of the particular tripeptide associated with pair $i$ among
a large number of different tripeptides, we expect $\mu_i$ to be small. The parameters
$\beta_i$ and $\delta_i$ represent the fold changes in the expected counts from the first stage to
the second and third stages, respectively, due to library enrichment. We allow
these last two parameters to have large variances. The Gamma parameters were set to
$s^f_\mu = 3.6$, $t^f_\mu = 5/6$, $s^f_\beta = 13/6$, $t^f_\beta = 1/8$, $s^f_\delta = 0.53$ and $t^f_\delta = 0.125$.
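The stated means and variances can be checked against these parameter values. The sketch below is illustrative only and assumes that Ga(a, b) denotes a Gamma distribution with shape a and rate b, so that a Gamma with shape s and rate s·t has mean 1/t and variance 1/(s·t²). The names `gamma_moments`, `PARAMS` and `draw` are hypothetical helpers, not part of the original study.

```python
import random
import statistics


def gamma_moments(s, t):
    """Mean and variance of a Gamma with shape s and rate s*t (assumed parametrization)."""
    rate = s * t
    return s / rate, s / rate ** 2


# (s, t) pairs quoted in the text for the simulation truth.
PARAMS = {
    "mu": (3.6, 5 / 6),
    "beta": (13 / 6, 1 / 8),
    "delta": (0.53, 0.125),
}


def draw(name, n, rng):
    """Draw n values for one parameter; gammavariate takes (shape, scale), scale = 1/rate."""
    s, t = PARAMS[name]
    return [rng.gammavariate(s, 1.0 / (s * t)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, only for reproducibility of the sketch
    for name, (s, t) in PARAMS.items():
        mean, var = gamma_moments(s, t)
        sample_mean = statistics.fmean(draw(name, 2000, rng))
        print(f"{name}: mean={mean:.3f} var={var:.3f} sample mean={sample_mean:.3f}")
```

Under this parametrization the analytic moments are mean 1.2 and variance 0.4 for $\mu$, mean 8 and variance about 29.5 for $\beta$, and mean 8 and variance about 120.8 for $\delta$, which agrees with the rounded values 30 and 120 given in the text.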
The hyper-parameters of the model described in (3.26) through (3.30) were chosen
taking into account the same considerations and set to $s_\mu = 1$, $a_{t_\mu} = 3$, $b_{t_\mu} = 2$,
$s_\beta = s_\delta = 1$, $a_{t_\beta} = a_{t_\delta} = 2.5$, $b_{t_\beta} = b_{t_\delta} = 9$, $a_\alpha = b_\alpha = 1$.
After a burn-in of 10,000 iterations, we saved every 10th iteration, obtaining a Monte Carlo
posterior sample of size M = 5,000. We performed convergence assessment for the
proposed parameter values to ensure that the MCMC algorithm converges well. On
the basis of the convergence criteria in Cowless and Taylor (1996), we found that
the Markov chains mixed very well and converged rapidly. In 723 out of the 2,000
simulated cases it turned out that $\delta_i > \beta_i > 1$.
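The sampling schedule described in this paragraph (burn-in, thinning and retained sample size) can be made concrete with a short sketch. It assumes that the first saved draw is the 10th iteration after the burn-in; the text does not state this explicitly, and the function name `kept_iterations` is hypothetical.

```python
BURN_IN = 10_000  # iterations discarded before saving starts
THIN = 10         # keep every 10th iteration after the burn-in
M = 5_000         # number of retained posterior draws


def kept_iterations(burn_in, thin, m):
    """1-based iteration numbers that are saved (assumed schedule)."""
    return [burn_in + thin * j for j in range(1, m + 1)]


kept = kept_iterations(BURN_IN, THIN, M)
total_iterations = kept[-1]  # the chain must run at least this long

print(len(kept), kept[0], total_iterations)  # prints: 5000 10010 60000
```

Under this assumption the chain is run for 60,000 iterations in total to obtain the M = 5,000 saved draws.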
Using the FDR criterion described by equations (3.31) and (3.33), we selected the
pairs such that, under the assumptions of our model, the expected false discovery