Searching for robust associations with a multi-environment knockoff filter

Published in Biometrika, 2021

Abstract

This paper develops a method based on model-X knockoffs to find conditional associations that are consistent across diverse environments, controlling the false discovery rate. The motivation for this problem is that large data sets may contain numerous associations that are statistically significant and yet misleading, as they are induced by confounders or sampling imperfections. However, associations consistently replicated under different conditions may be more interesting. In fact, consistency sometimes provably leads to valid causal inferences even if conditional associations do not. While the proposed method is widely applicable, this paper highlights its relevance to genome-wide association studies, in which consistency across populations with diverse ancestries mitigates confounding due to unmeasured variants. The effectiveness of this approach is demonstrated by simulations and applications to the UK Biobank data.
