Inferring causal direction between two traits in the presence of horizontal pleiotropy with GWAS summary data

Authors: Haoran Xue aff001;  Wei Pan aff002
Author affiliations: School of Statistics, University of Minnesota, Minneapolis, Minnesota, United States of America aff001;  Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, United States of America aff002
Published in: Inferring causal direction between two traits in the presence of horizontal pleiotropy with GWAS summary data. PLoS Genet 16(11): e1009105. doi:10.1371/journal.pgen.1009105
Category: Research Article
doi: 10.1371/journal.pgen.1009105


Orienting the causal relationship between pairs of traits is a fundamental task in scientific research with significant practical implications, such as prioritizing molecular targets and modifiable risk factors when developing therapeutic and interventional strategies for complex diseases. A recent method, called Steiger’s method, which uses a single SNP as an instrumental variable (IV) in the framework of Mendelian randomization (MR), has been widely applied. We report the following new contributions. First, we propose a single SNP-based alternative that overcomes a severe limitation of Steiger’s method, namely that it simply assumes, rather than infers, the existence of a causal relationship. We also clarify a condition necessary for the validity of the methods in the presence of hidden confounding. Second, to improve statistical power, we propose combining the results from multiple, possibly correlated, SNPs as multiple instruments. Third, we develop three goodness-of-fit tests to check modeling assumptions, including those required for valid IVs. Fourth, by relaxing one of the three IV assumptions in MR, we propose several methods, including an Egger regression-like approach and its multivariable version (analogous to multivariable MR), to account for horizontal pleiotropy of the SNPs/IVs, which is often unavoidable in practice. All our methods can simultaneously infer both the existence and, if it exists, the direction of a causal relationship, greatly expanding their applicability over that of Steiger’s method. Although we focus on uni-directional causal relationships, we also briefly discuss an extension to bi-directional relationships. Through extensive simulations and an application to infer the causal directions between low-density lipoprotein (LDL) cholesterol, or high-density lipoprotein (HDL) cholesterol, and coronary artery disease (CAD), we demonstrate the superior performance and advantages of our proposed methods over Steiger’s method and bi-directional MR. In particular, after accounting for horizontal pleiotropy, our method confirmed the well-known causal direction from LDL to CAD, whereas other methods, including bi-directional MR, might fail.
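
The single-SNP idea behind Steiger-type direction inference can be illustrated with a short sketch. It is not the authors' proposed method: it only shows the general principle that, if trait X causally affects trait Y, a SNP acting on X is expected to correlate more strongly with X than with Y. The function names, the t-statistic-based correlation approximation, the two-sample Fisher z comparison, and the 0.05 threshold are all assumptions made here for illustration. Like Steiger's method as described above, the sketch presumes that a causal relationship exists and ignores horizontal pleiotropy and hidden confounding.

```python
import math


def corr_from_summary(beta, se, n):
    """Approximate the SNP-trait correlation from a GWAS effect estimate.

    Assumption: the estimate comes from a simple linear regression, so
    t = beta / se and r = t / sqrt(n - 2 + t^2).
    """
    t = beta / se
    return t / math.sqrt(n - 2 + t * t)


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def compare_snp_correlations(r_gx, n_x, r_gy, n_y):
    """Compare |cor(G, X)| and |cor(G, Y)| with a Fisher z statistic.

    Assumption: the two correlations come from independent samples
    (a two-sample design), so their variances simply add.
    """
    z_x = math.atanh(abs(r_gx))
    z_y = math.atanh(abs(r_gy))
    se_diff = math.sqrt(1.0 / (n_x - 3) + 1.0 / (n_y - 3))
    z = (z_x - z_y) / se_diff
    p = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p


def infer_direction(r_gx, n_x, r_gy, n_y, alpha=0.05):
    """Return a tentative direction label, or 'undetermined'."""
    z, p = compare_snp_correlations(r_gx, n_x, r_gy, n_y)
    if p >= alpha:
        return "undetermined"
    return "X -> Y" if z > 0 else "Y -> X"


if __name__ == "__main__":
    # Hypothetical summary statistics for one SNP (not real data).
    r_gx = corr_from_summary(beta=0.10, se=0.005, n=10000)
    r_gy = corr_from_summary(beta=0.02, se=0.005, n=10000)
    print(infer_direction(r_gx, 10000, r_gy, 10000))
```

Because a label such as "X -> Y" is returned whenever the two correlations differ significantly, a sketch of this kind cannot distinguish a true causal effect from a pleiotropic SNP that affects both traits, which is the limitation the abstract sets out to address.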

Keywords:

Coronary heart disease – Covariance – Gene expression – Genome-wide association studies – Cholesterol – Quantitative trait loci – Simulation and modeling – Single nucleotide polymorphisms


Journal issue: PLOS Genetics, 2020, Issue 11
