Robust methods in Mendelian randomization via penalization of heterogeneous causal estimates

Authors: Jessica M. B. Rees aff001; Angela M. Wood aff001; Frank Dudbridge aff003; Stephen Burgess aff001
Author affiliations: Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge, CB1 8RN, United Kingdom aff001; Edinburgh Clinical Trials Unit, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, EH16 4UX, United Kingdom aff002; Department of Health Sciences, University of Leicester, Leicester, LE1 7RH, United Kingdom aff003; MRC Biostatistics Unit, University of Cambridge, Cambridge, CB2 0SR, United Kingdom aff004
Published in: PLoS ONE 14(9)
Category: Research Article
doi: 10.1371/journal.pone.0222362


Methods have been developed for Mendelian randomization that can obtain consistent causal estimates under weaker assumptions than the standard instrumental variable assumptions. The median-based estimator and MR-Egger are examples of such methods. However, these methods can be sensitive to genetic variants with heterogeneous causal estimates. Such heterogeneity may arise from over-dispersion in the causal estimates, or specific variants with outlying causal estimates. In this paper, we develop three extensions to robust methods for Mendelian randomization with summarized data: 1) robust regression (MM-estimation); 2) penalized weights; and 3) Lasso penalization. Methods using these approaches are considered in two applied examples: one where there is evidence of over-dispersion in the causal estimates (the causal effect of body mass index on schizophrenia risk), and the other containing outliers (the causal effect of low-density lipoprotein cholesterol on Alzheimer’s disease risk). Through an extensive simulation study, we demonstrate that robust regression applied to the inverse-variance weighted method with penalized weights is a worthwhile additional sensitivity analysis for Mendelian randomization to provide robustness to variants with outlying causal estimates. The results from the applied examples and simulation study highlight the importance of using methods that make different assumptions to assess the robustness of findings from Mendelian randomization investigations with multiple genetic variants.
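
To make the penalized-weights idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes per-variant summary statistics (the names `bx`, `by`, and `se_by` are hypothetical), first-order inverse-variance weights, and one commonly used form of penalization in which each weight is multiplied by min(1, 20 × p_j), where p_j is the upper-tail chi-squared (1 degree of freedom) p-value of that variant's contribution to Cochran's Q. The factor 20 and the toy data are assumptions chosen for illustration only.

```python
import math


def ivw_estimate(bx, by, weights):
    """Weighted regression of by on bx through the origin (IVW form)."""
    num = sum(w * x * y for w, x, y in zip(weights, bx, by))
    den = sum(w * x * x for w, x in zip(weights, bx))
    return num / den


def penalized_weights(bx, by, se_by, theta, factor=20.0):
    """Shrink the weight of variants that are heterogeneous relative to theta.

    For each variant, q is its contribution to Cochran's Q at theta, and p is
    the upper-tail chi-squared (1 df) p-value of q. The weight is multiplied
    by min(1, factor * p). The default factor of 20 is an assumption.
    """
    out = []
    for x, y, s in zip(bx, by, se_by):
        w = 1.0 / s ** 2
        q = w * (y - theta * x) ** 2
        p = math.erfc(math.sqrt(q / 2.0))  # P(chi2_1 > q)
        out.append(w * min(1.0, factor * p))
    return out


# Hypothetical summary data: variant-exposure and variant-outcome associations.
# The first four variants have a ratio estimate of exactly 0.5; the fifth is
# an outlier.
bx = [0.10, 0.12, 0.08, 0.15, 0.11]
by = [0.05, 0.06, 0.04, 0.075, 0.20]
se_by = [0.01] * 5

# Standard IVW estimate with unpenalized inverse-variance weights.
w_ivw = [1.0 / s ** 2 for s in se_by]
theta_ivw = ivw_estimate(bx, by, w_ivw)

# Re-estimate after penalizing weights around the initial IVW estimate.
w_pen = penalized_weights(bx, by, se_by, theta_ivw)
theta_pen = ivw_estimate(bx, by, w_pen)

print(f"IVW estimate:           {theta_ivw:.4f}")
print(f"Penalized IVW estimate: {theta_pen:.4f}")
```

On this toy data the single outlier pulls the standard IVW estimate well above 0.5, whereas its penalized weight is close to zero, so the re-estimated value returns to the ratio shared by the other four variants. The sketch covers only the penalized-weights step; robust regression (MM-estimation) and Lasso penalization are not shown.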

Keywords:

Alzheimer's disease – Genetic predisposition – Instrumental variable analysis – Lipoproteins – Medical risk factors – Schizophrenia – Simulation and modeling – Research errors



Article published in: 2019, Issue 9
