Causal inference in genetic trio studies

Published in Proc. Natl. Acad. Sci. U.S.A., 2020

Abstract

We introduce a method to rigorously draw causal inferences (inferences immune to all possible confounding) from genetic data that include parents and offspring. Causal conclusions are possible with these data because the natural randomness in meiosis can be viewed as a high-dimensional randomized experiment. We make this observation actionable by developing a novel conditional independence test that identifies regions of the genome containing distinct causal variants. The proposed Digital Twin Test compares an observed offspring to carefully constructed synthetic offspring from the same parents in order to determine statistical significance, and it can leverage any black-box multivariate model and additional non-trio genetic data in order to increase power. Crucially, our inferences are based only on a well-established mathematical description of the rearrangement of genetic material during meiosis and make no assumptions about the relationship between the genotypes and phenotypes.
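To make the idea of comparing an observed offspring to synthetic offspring concrete, the following is a minimal sketch of a conditional randomization test at a single biallelic site. It is a deliberate simplification and not the paper's procedure: it ignores linkage and recombination (which the paper handles through its model of meiosis), it assumes each parent transmits one of its two alleles with equal probability, and it uses a simple inner product as the test statistic. The function names `sample_twins` and `twin_test_pvalue` are hypothetical.

```python
import numpy as np


def sample_twins(mother, father, rng):
    """Draw synthetic offspring genotypes (0/1/2) at one site, given parental genotypes.

    Assumption: each parent passes on one of its two alleles uniformly at random,
    so a parent with genotype g transmits the alternate allele with probability g / 2.
    """
    from_mother = rng.binomial(1, np.asarray(mother) / 2.0)
    from_father = rng.binomial(1, np.asarray(father) / 2.0)
    return from_mother + from_father


def twin_test_pvalue(offspring, mother, father, phenotype, n_twins=999, seed=0):
    """Randomization p-value comparing the observed offspring to resampled twins."""
    rng = np.random.default_rng(seed)
    centered = np.asarray(phenotype, dtype=float)
    centered = centered - centered.mean()

    def statistic(genotypes):
        # Illustrative statistic only; any function of genotypes and phenotype is valid here.
        return abs(float(np.dot(genotypes, centered)))

    t_obs = statistic(np.asarray(offspring))
    t_null = np.array(
        [statistic(sample_twins(mother, father, rng)) for _ in range(n_twins)]
    )
    # The +1 terms count the observed offspring among the draws, keeping the p-value valid.
    return (1 + int(np.sum(t_null >= t_obs))) / (n_twins + 1)
```

Because the twins are drawn conditionally on the parents, any association that runs only through parental genotypes (for example, population structure) affects the observed and synthetic statistics alike, which is the intuition behind the robustness to confounding described above.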
