In high-dimensional data analysis such as differential gene expression analysis, people often use filtering methods like fold-change or variance filters in an attempt to reduce the multiple testing penalty and improve power. However, filtering may introduce a bias on the multiple testing correction. The precise amount of bias depends on many quantities, such as fraction of probes filtered out, filter statistic and test statistic used.
We show that a biased multiple testing correction results if non-differentially expressed probes are not filtered out with equal probability from the entire range of p-values. We illustrate our results using both a simulation study and an experimental dataset, where the FDR is shown to be biased mostly by filters that are associated with the hypothesis being tested, such as the fold change. Filters that induce little bias on the FDR yield less additional power of detecting differentially expressed genes. Finally, we propose a statistical test that can be used in practice to determine whether any chosen filter introduces bias on the FDR estimate used, given a general experimental setup.
Filtering out of probes must be used with care as it may bias the multiple testing correction. Researchers can use our test for FDR bias to guide their choice of filter and amount of filtering in practice.