Multiple testing and false discovery rate

Here is a quick explanation of false discovery rates from Brad Efron's "Large-scale simultaneous hypothesis testing: The choice of a null hypothesis":

We begin with a simple Bayes model. Suppose that the N z-values fall into two classes, "Uninteresting" or "Interesting", corresponding to whether or not z_i is generated according to the null hypothesis, with prior probabilities p_0 and p_1 = 1 − p_0 for the classes; and that z_i has density either f_0(z) or f_1(z) depending on its class,

p_0 = Prob{Uninteresting},    f_0(z) = density of z if Uninteresting (Null),
p_1 = Prob{Interesting},      f_1(z) = density of z if Interesting (Non-Null).
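
To make the two-groups setup concrete, here is a minimal simulation sketch; the particular choices p_0 = 0.95, f_0 = N(0, 1), and f_1 = N(3, 1) are illustrative assumptions of mine, not values from Efron's paper.

```python
import numpy as np

# Two-groups model: each z_i is Uninteresting (null) with probability p_0
# and Interesting (non-null) with probability p_1 = 1 - p_0.
rng = np.random.default_rng(0)
N, p0 = 10_000, 0.95

interesting = rng.random(N) >= p0                 # class label of each case
z = np.where(interesting,
             rng.normal(3.0, 1.0, N),             # f_1: non-null density (illustrative)
             rng.normal(0.0, 1.0, N))             # f_0: null density (illustrative)
print(f"{interesting.sum()} of {N} cases are Interesting")
```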

The smooth curve in Figure 1 of the paper estimates the mixture density f(z),

f(z) = p_0 * f_0(z) + p_1 * f_1(z) .

According to Bayes' theorem, the a posteriori probability of being in the Uninteresting class given z is

Prob{Uninteresting|z} = p_0 * f_0(z) / f(z) .
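
As a quick worked check of this Bayes computation, the sketch below evaluates f(z) and Prob{Uninteresting|z} on a grid of z-values, again using the illustrative densities f_0 = N(0, 1), f_1 = N(3, 1) and p_0 = 0.95 rather than anything fitted from data.

```python
import numpy as np
from scipy.stats import norm

# Illustrative choices: f_0 = N(0, 1), f_1 = N(3, 1), p_0 = 0.95.
p0, p1 = 0.95, 0.05
zs = np.linspace(-4, 6, 11)

f0 = norm.pdf(zs, loc=0.0, scale=1.0)
f1 = norm.pdf(zs, loc=3.0, scale=1.0)
f = p0 * f0 + p1 * f1                    # mixture density f(z) = p_0 f_0(z) + p_1 f_1(z)

posterior_null = p0 * f0 / f             # Bayes posterior Prob{Uninteresting | z}
for z_val, p in zip(zs, posterior_null):
    print(f"z = {z_val:5.1f}   Prob{{Uninteresting | z}} = {p:.3f}")
```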

Here we define the local false discovery rate to be

fdr(z) ≡ f_0(z) / f(z) ,

ignoring the factor p_0, so fdr(z) is an upper bound on Prob{Uninteresting|z}. In fact p_0 can be roughly estimated, but we are assuming that p_0 is near 1, say p_0 ≥ 0.90, so fdr(z) is not a flagrant overestimator.
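
In practice f_0 and f must be plugged in from the data. A minimal empirical sketch, assuming a theoretical N(0, 1) null and a kernel density estimate of the mixture f (Efron's own estimate of f comes from a Poisson regression fit to histogram counts), looks like this:

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

# Simulated z-values (same illustrative two-groups mixture as above).
rng = np.random.default_rng(0)
z = np.concatenate([rng.normal(0.0, 1.0, 9500),   # Uninteresting (null) cases
                    rng.normal(3.0, 1.0, 500)])   # Interesting (non-null) cases

f_hat = gaussian_kde(z)                           # estimate of the mixture density f(z)
fdr = np.minimum(norm.pdf(z) / f_hat(z), 1.0)     # fdr(z) = f_0(z) / f(z), capped at 1

# fdr(z) upper-bounds Prob{Uninteresting | z}, so cases with small fdr can be
# reported as Interesting; the 0.2 cutoff below is just an illustrative threshold.
print(f"{np.sum(fdr < 0.2)} of {len(z)} cases have fdr(z) < 0.2")
```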
