Estimation of non-null SNP effect size distributions enables the detection of enriched genes underlying complex traits

Authors: Wei Cheng aff001;  Sohini Ramachandran aff001;  Lorin Crawford aff002
Author affiliations: Department of Ecology and Evolutionary Biology, Brown University, Providence, Rhode Island, United States of America aff001;  Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America aff002;  Department of Biostatistics, Brown University, Providence, Rhode Island, United States of America aff003;  Center for Statistical Sciences, Brown University, Providence, Rhode Island, United States of America aff004
Published in: Estimation of non-null SNP effect size distributions enables the detection of enriched genes underlying complex traits. PLoS Genet 16(6): e1008855. doi:10.1371/journal.pgen.1008855
Category: Research Article
doi: 10.1371/journal.pgen.1008855


Traditional univariate genome-wide association studies generate false positives and negatives because it is difficult to distinguish associated variants from variants with spurious nonzero effects that do not directly influence the trait. Recent efforts have been directed at identifying genes or signaling pathways enriched for mutations in quantitative traits or case-control studies, but these can be computationally costly and hampered by strict model assumptions. Here, we present gene-ε, a new approach for identifying statistical associations between sets of variants and quantitative traits. Our key insight is that gene-level enrichment studies improve when we reformulate the genome-wide SNP-level null hypothesis to identify spurious small-to-intermediate SNP effects and classify them as non-causal. gene-ε efficiently identifies enriched genes under a variety of simulated genetic architectures, achieving a true positive rate greater than 90% at a 1% false positive rate for polygenic traits. Lastly, we apply gene-ε to summary statistics derived from six quantitative traits using European-ancestry individuals in the UK Biobank, and identify enriched genes that are in biologically relevant pathways.

Keywords:

Genome-wide association studies – Genomics statistics – Heredity – Molecular genetics – Quantitative traits – Simulation and modeling – Magma – Complex traits


