Recovering unused information in genome-wide association studies: the benefit of analyzing SNPs out of Hardy-Weinberg equilibrium

Eur J Hum Genet. 2009 Dec;17(12):1676-82. doi: 10.1038/ejhg.2009.85. Epub 2009 Jun 3.

Abstract

Although the rapid advancements in high-throughput genotyping technology have made genome-wide association studies possible, these studies remain an expensive undertaking, especially when considering the large sample sizes necessary to find the small to moderate effect sizes that define complex diseases. It is therefore prudent to utilize all possible information contained in a genome-wide scan. We propose a straightforward analytical approach that tests often unused SNP data without sacrificing statistical validity. We simulate genotype miscalls under a variety of models consistent with observed miscall rates and test for departures from Hardy-Weinberg equilibrium (HWE) using the standard Pearson's χ² test. We find that true disease susceptibility loci subjected to various patterns of genotype miscalls can be largely out of HWE and, thus, be candidates for removal before association testing. These loci, we demonstrate, can maintain sufficient statistical power even under extreme error models. We additionally show that random miscalls of null SNPs, independent of the phenotype, do not induce bias in case-control or cohort studies, and we suggest that a significant HWE test should not prevent a SNP from being tested when conducting genome-wide association studies in these scenarios. However, association findings for SNPs that are out of HWE must be treated more carefully than 'regular' findings, for example, by re-genotyping the SNP in the same study using a different genotyping technology.
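
As an illustration of the kind of check the abstract refers to, the following is a minimal sketch of Pearson's chi-square test for departure from HWE at a biallelic SNP, together with a toy miscall step. It is not the authors' simulation code. The function names, the 1-degree-of-freedom p-value, and the error model (each heterozygote independently miscalled as a homozygote with probability `rate`) are assumptions made for this example; the abstract does not specify its error models.

```python
import math
import random


def hwe_chi2_test(n_aa, n_ab, n_bb):
    """Pearson chi-square test for HWE from genotype counts (AA, AB, BB).

    Returns (chi2, p_value), with the p-value taken from a chi-square
    distribution with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def miscall_heterozygotes(n_aa, n_ab, n_bb, rate, rng):
    """Hypothetical error model (an assumption of this sketch): each
    heterozygote is miscalled as a homozygote with probability `rate`,
    landing in AA or BB with equal chance."""
    to_aa = to_bb = 0
    for _ in range(n_ab):
        if rng.random() < rate:
            if rng.random() < 0.5:
                to_aa += 1
            else:
                to_bb += 1
    return n_aa + to_aa, n_ab - to_aa - to_bb, n_bb + to_bb


# Usage: start from counts in exact HWE, apply miscalls, then re-test.
rng = random.Random(0)
noisy_counts = miscall_heterozygotes(250, 500, 250, rate=0.05, rng=rng)
print(noisy_counts, hwe_chi2_test(*noisy_counts))
```

The miscalls here are applied without reference to case or control status, which corresponds to the phenotype-independent scenario the abstract describes; an error process that depends on phenotype would need a different model.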

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Alzheimer Disease / genetics
  • Case-Control Studies
  • Cohort Studies
  • Computer Simulation
  • Databases, Genetic*
  • False Positive Reactions
  • Genome-Wide Association Study*
  • Genotype
  • Humans
  • Models, Genetic*
  • Oligonucleotide Array Sequence Analysis
  • Polymorphism, Single Nucleotide / genetics*
  • Sequence Analysis, DNA