The Covariate's Dilemma
Published in the journal: PLoS Genet 8(11): e1003096. doi:10.1371/journal.pgen.1003096
Category: Perspective
An important step in analyzing genetic association study data is deciding whether to adjust for covariates—those variables ancillary to the variants of interest. In particular, when testing for novel associations, should the statistical model also include known genetic or nongenetic covariates that are predictors of the trait (e.g., body mass index when studying type 2 diabetes)? Yes, if the covariates are also correlated with the primary variants but do not mediate their effects, because they may confound the genetic associations. Including them helps control bias and prevent false discoveries (Figure 1a). But the answer is less clear-cut if the covariates are not confounders.
Fig. 1. Impact of, and approaches to, including covariates in the analysis of gene–trait associations.
(a) The covariate C is a confounder associated with both the trait D and the gene G but is not an intermediate on the causal path of interest between G and D. The G–D association should be assessed while controlling for C. Omitting C from the analysis of the G–D association can lead to misattribution of a C–D effect to G and false discovery or biased estimates of a G–D effect.
(b) The covariate C is independently associated with the trait D but not with gene G (so C is not a confounder). If the trait is quantitative or the study subjects are randomly ascertained, including C in a linear or logistic regression model will increase power to detect the G–D association.
(c) If the trait is binary and the subjects are ascertained based on case-control status, the probability of selection (S) depends on G and C and induces a correlation between them. Then including C in a logistic regression model can inflate the G–D association's standard error, reducing power. Omitting C provides the most potential gain in power when C has a strong effect on D, and when D is less common [1].
(d) In Zaitlen et al.'s new approach [6] for evaluating G–D associations with case-control data, a risk model for D is developed from external information about the C–D association and observed C and D levels. Residuals from this model, R, distinguish high- and low-risk cases and controls. Then testing for G–R associations assesses genetic effects unexplained by C in a potentially more powerful manner than conventional logistic regression.
When the trait of interest is quantitative, including a nonconfounding covariate associated with the trait is often beneficial because it can explain some of the variability in the outcome, thus reducing noise and increasing power to detect novel genetic associations. On the other hand, when the trait is binary, including the covariate can actually reduce power for case-control association studies; this is shown in a recent paper by Pirinen et al. [1] and previous work [2]–[5]. Fortunately, all is not lost. In this issue of PLOS Genetics, Zaitlen et al. [6] present a new approach that addresses this problem by leveraging information on covariates to increase power in association studies of binary traits.
Ignorance Is Bliss…
How can ignoring covariate information increase power? Assume that we are studying the potential association between a genetic variant and a binary trait. Moreover, assume we have measured a genetic or environmental covariate associated with the trait but independent of the variant of interest in the source population, so it is not a confounder (Figure 1b). If we ascertain a random sample of study subjects, then the variant of interest and covariate will remain independent. Here, the most powerful model for assessing association includes the covariate (e.g., in a logistic regression model) [1]. While adding the covariate may increase the standard error of the variant association, omitting it can bias the association towards the null hypothesis of no effect and ultimately reduce power [1]–[5], [7].
However, most association studies do not select a random sample of study subjects, but rather ascertain cases and controls from the source population. This ascertainment process can create a correlation between the genetic variant and covariate in the sample, because cases will be enriched for both risk genotypes and high-risk covariate levels. Since these are independent in the source population, they will remain conditionally independent among cases or controls; but the variant and covariate will be correlated in the overall case-control sample (dashed line in Figure 1c). In the presence of this induced correlation, omitting the covariate from a logistic regression model may be the most powerful approach. Indeed, including the covariate could substantially increase the standard error of the genetic variant association (i.e., due to the induced correlation), resulting in a larger power loss than might arise from omitting the covariate and biasing the association towards the null hypothesis.
Pirinen et al. [1] investigate this phenomenon in detail and show that the increase in power from omitting covariates is a function of disease prevalence and effect sizes. In particular, omitting a covariate can often improve power to detect genetic effects for diseases with prevalence below 2%, or with prevalence as high as 10% when the covariate is a particularly strong risk factor.
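The induced correlation described above can be illustrated with a small simulation. The sketch below is not taken from any of the cited papers: the logistic risk model, the effect sizes, the allele frequency, and all function names are assumptions chosen only to make the effect visible. It draws a source population in which the variant G and the covariate C are independent, ascertains equal numbers of cases and controls, and compares the G–C correlation before and after ascertainment.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_population(n, rng, maf=0.3, beta_g=0.5, beta_c=1.5, intercept=-5.0):
    """Source population with G and C drawn independently (all values assumed)."""
    people = []
    for _ in range(n):
        g = (rng.random() < maf) + (rng.random() < maf)  # genotype coded 0, 1, 2
        c = rng.gauss(0.0, 1.0)                          # standardized covariate
        risk = 1.0 / (1.0 + math.exp(-(intercept + beta_g * g + beta_c * c)))
        d = rng.random() < risk                          # binary trait
        people.append((g, c, d))
    return people


def case_control_sample(people, n_each, rng):
    """Ascertain n_each cases and n_each controls from the population."""
    cases = [p for p in people if p[2]]
    controls = [p for p in people if not p[2]]
    return rng.sample(cases, n_each) + rng.sample(controls, n_each)


rng = random.Random(0)
population = simulate_population(400_000, rng)
sample = case_control_sample(population, 5_000, rng)

r_population = pearson([p[0] for p in population], [p[1] for p in population])
r_sample = pearson([p[0] for p in sample], [p[1] for p in sample])
print(f"G-C correlation in source population:   {r_population:+.3f}")
print(f"G-C correlation in case-control sample: {r_sample:+.3f}")
```

Under these assumed settings the population correlation should sit near zero, while the pooled case-control sample shows a clearly positive correlation, because cases are enriched for both the risk genotype and high covariate values.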
Knowledge Is Power!
Improving analyses by ignoring covariates seems counterintuitive, as they should provide some information. To extract value from covariates, Zaitlen et al. [6] developed a new method that uses existing evidence of covariate associations with the trait of interest, and trait prevalence, to increase power. This approach first builds a liability model using estimates of a covariate's independent effect in the form of trait prevalences at various levels of the covariate (e.g., type 2 diabetes prevalences by age). Then it evaluates the association between the genetic variant of interest and the liability model residuals (Figure 1d). In effect, the external information about covariate effects is used to distinguish high- and low-risk cases and controls. Tests of genetic variant associations with these quantitative residuals have more power than tests of genetic associations with the original binary trait.
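To make the idea of liability residuals concrete, here is a deliberately simplified sketch, not the authors' implementation. It assumes a liability threshold model in which the part of liability not explained by the covariate is standard normal, so that a covariate-specific prevalence K fixes a threshold t = Φ⁻¹(1 − K); the residual assigned to a subject is then the expected residual liability given case or control status. The function name and the age-specific prevalences are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, assumed for the residual liability


def liability_residual(is_case, prevalence):
    """Expected residual liability given disease status (simplified sketch).

    prevalence is the external, covariate-specific trait prevalence K.
    Cases exceed the threshold t = inv_cdf(1 - K), controls fall below it.
    """
    threshold = STD_NORMAL.inv_cdf(1.0 - prevalence)
    density = STD_NORMAL.pdf(threshold)
    if is_case:
        return density / prevalence          # E[e | e > t]
    return -density / (1.0 - prevalence)     # E[e | e < t]


# Hypothetical prevalences by age group, for illustration only.
prevalence_by_age = {"under 40": 0.01, "60 and over": 0.15}

for age_group, prevalence in prevalence_by_age.items():
    case_value = liability_residual(True, prevalence)
    control_value = liability_residual(False, prevalence)
    print(f"{age_group}: case {case_value:+.2f}, control {control_value:+.2f}")
```

In this sketch a case from the low-prevalence group receives a larger residual than a case from the high-prevalence group, which is how external covariate information separates low-risk from high-risk cases before the genetic variant is tested against the residuals.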
The value of Zaitlen et al.'s approach is demonstrated in several data sets with case-control and case-control-covariate ascertainment, where the selection probability for an individual to join the study depends on covariate levels, such as in matched studies or those with overrepresentation of low-risk cases. While covariate-based ascertainment of cases and controls can induce selection bias that must be addressed by including the covariate in a conventional regression model [8], the new method provides a potentially powerful alternative.
The authors show by application and simulation that the liability model approach increases association test statistics by 18% and 16% in comparison with logistic regression with or without covariates, respectively. Of course, this improvement hinges on having accurate external covariate information; one could envision scenarios where the external covariate data is so poor that using this approach would actually decrease power. One could also use covariate information discerned from a given dataset, but external information may be even better. A framework to propagate uncertainties through the multistage analysis of Zaitlen et al. would be useful to assess sensitivity to the quality of published or assumed trait prevalences and covariate effects, and to the estimation errors in the formation of the liability model and in the calculation of residuals. A starting point might be to repeat the analyses for a range of covariate-specific trait prevalences that bracket the actual published or assumed values.
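The bracketing idea suggested above could be sketched as follows. This is an illustration under stated assumptions, not a procedure from Zaitlen et al.: it reuses the simplified threshold-model residual (standard normal residual liability), uses N·r² between genotype and residual as a stand-in association statistic, and rescales the assumed prevalences by a few factors. The toy records, prevalences, scale factors, and function names are all hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def liability_residual(is_case, prevalence):
    """Expected residual liability given status under a threshold model."""
    threshold = STD_NORMAL.inv_cdf(1.0 - prevalence)
    density = STD_NORMAL.pdf(threshold)
    return density / prevalence if is_case else -density / (1.0 - prevalence)


def association_statistic(genotypes, residuals):
    """N * r^2 for genotype versus residual (about chi-square, 1 df, under the null)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_r = sum(residuals) / n
    sgr = sum((g - mean_g) * (r - mean_r) for g, r in zip(genotypes, residuals))
    sgg = sum((g - mean_g) ** 2 for g in genotypes)
    srr = sum((r - mean_r) ** 2 for r in residuals)
    return n * sgr * sgr / (sgg * srr)


def sensitivity_scan(records, prevalence_by_level, scale_factors):
    """Recompute the statistic with every prevalence multiplied by each factor."""
    genotypes = [g for g, _, _ in records]
    results = {}
    for scale in scale_factors:
        residuals = [
            # Cap keeps the rescaled prevalence a valid probability.
            liability_residual(is_case, min(0.99, prevalence_by_level[level] * scale))
            for _, level, is_case in records
        ]
        results[scale] = association_statistic(genotypes, residuals)
    return results


# Toy records: (genotype, covariate level, is_case). Purely illustrative.
records = [
    (0, "young", False), (1, "young", False), (0, "young", False),
    (2, "young", True), (1, "young", True), (0, "old", False),
    (1, "old", True), (1, "old", True), (2, "old", True), (0, "old", False),
]
prevalence_by_level = {"young": 0.01, "old": 0.15}

scan = sensitivity_scan(records, prevalence_by_level, scale_factors=(0.5, 1.0, 2.0))
for scale, statistic in scan.items():
    print(f"prevalences x {scale}: statistic = {statistic:.2f}")
```

If the statistic changes little across the bracketing factors, the conclusion is not very sensitive to the assumed prevalences; large swings would suggest that the quality of the external information matters for that dataset.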
Zaitlen and colleagues have also developed a version of the liability model approach for when the covariates are genetic markers with known trait associations [9]. Future work might compare these novel liability methods to alternative approaches for inclusion of external information, such as Bayesian models with informative priors for the covariate effects. Moreover, schemes for weighted analyses [10] suggest other ways to potentially increase association study power.
In summary, if one undertakes a case-control association study and has information on covariates that are independent risk factors for a trait—and are not confounders—simply including them in a logistic regression model is not always the optimal approach for discovering genetic variants. Instead, more power may be gained by excluding them, by using the liability model approach of Zaitlen et al. [6], [9], or by applying other novel techniques to leverage information from such covariates.
References
1. Pirinen M, Donnelly P, Spencer CCA (2012) Including known covariates can reduce power to detect genetic effects in case-control studies. Nat Genet 44: 848–851.
2. Robinson LD, Jewell NP (1991) Some surprising results about covariate adjustment in logistic regression models. Int Stat Rev 59: 227–240.
3. Neuhaus JM, Jewell NP (1993) A geometrical approach to assess bias due to omitted covariates in generalized linear models. Biometrika 80: 807–815.
4. Neuhaus JM (1998) Estimation efficiency with omitted covariates in generalized linear models. J Am Stat Assoc 93: 1124–1129.
5. Kuo CL, Feingold E (2010) What's the best statistic for a simple test of genetic association in a case-control study? Genet Epidemiol 34: 246–253.
6. Zaitlen N, Lindström S, Pasaniuc B, Cornelis M, Genovese G, et al. (2012) Informed conditioning on clinical covariates increases power in case-control association studies. PLoS Genet 8: e1003032. doi:10.1371/journal.pgen.1003032
7. Xing G, Xing C (2010) Adjusting for covariates in logistic regression models. Genet Epidemiol 34: 769–771.
8. Rothman KJ, Greenland S, Lash TL (2008) Modern epidemiology. 3rd edition. Philadelphia: Lippincott Williams & Wilkins. pp. 175–179.
9. Zaitlen N, Pasaniuc B, Patterson N, Pollack S, Voight B, et al. (2012) Analysis of case-control association studies with known risk variants. Bioinformatics 28: 1729–1737.
10. Clayton D (2012) Link functions in multi-locus genetic models: Implications for testing, prediction, and interpretation. Genet Epidemiol 36: 409–418.
Article published in the journal PLOS Genetics, 2012, Issue 11