Genotype Imputation for African Americans Using Data From HapMap Phase II Versus 1000 Genomes Projects

Yun J. Sung, C. Charles Gu, Hemant K. Tiwari, Donna K. Arnett, Ulrich Broeckel, Dabeeru C. Rao

Research output: Article, peer-reviewed

12 Citations (Scopus)

Abstract

Genotype imputation infers genotypes at untyped single nucleotide polymorphisms (SNPs) that are present on a reference panel, such as those from the HapMap Project. It is popular for increasing statistical power and for comparing results across studies that use different genotyping platforms. Imputation for African American populations is challenging because their linkage disequilibrium blocks are shorter and because admixture means that no ideal reference panel is available. In this paper, we evaluated three imputation strategies for African Americans. The intersection strategy used a combined panel consisting of SNPs polymorphic in both CEU and YRI. The union strategy used a panel consisting of SNPs polymorphic in either CEU or YRI. The merge strategy merged results from two separate imputations, one using CEU and the other using YRI. Because investigators are increasingly using data from the 1000 Genomes (1KG) Project for genotype imputation, we evaluated both 1KG-based and HapMap-based imputations. We used 23,707 SNPs from chromosomes 21 and 22 on the Affymetrix SNP Array 6.0, genotyped for 1,075 HyperGEN African Americans. We found that 1KG-based imputations provided a substantially larger number of variants than HapMap-based imputations: about three times as many common variants and eight times as many rare and low-frequency variants. This higher yield is expected because the 1KG panel includes more SNPs. Accuracy rates using 1KG data were slightly lower than those using HapMap data before filtering, but slightly higher after filtering. The union strategy provided the highest imputation yield with the second-highest accuracy. The intersection strategy provided the lowest imputation yield but the highest accuracy. The merge strategy provided the lowest imputation accuracy. We observed that SNPs polymorphic only in CEU had much lower accuracy, which reduced the accuracy of the union strategy. Our findings suggest that 1KG-based imputations can facilitate discovery of significant associations for SNPs across the whole minor allele frequency (MAF) spectrum. Because the 1KG Project is still under way, we expect that later versions will provide better imputation performance.
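
The difference between the intersection and union panels can be illustrated with a small sketch. It is not the authors' pipeline: the function names, the SNP identifiers, and the toy frequencies are hypothetical, and "polymorphic" is assumed here to mean a minor allele frequency above zero in the given reference population. The merge strategy is left out because the abstract does not say how overlapping results from the two imputations were combined.

```python
from typing import Dict, Set


def polymorphic(maf: Dict[str, float]) -> Set[str]:
    """Return the SNP IDs whose minor allele frequency is above zero.

    Assumption: MAF > 0 is used as the definition of 'polymorphic'.
    """
    return {snp for snp, freq in maf.items() if freq > 0.0}


def build_panels(ceu_maf: Dict[str, float],
                 yri_maf: Dict[str, float]) -> Dict[str, Set[str]]:
    """Build hypothetical SNP lists for the intersection and union panels."""
    ceu = polymorphic(ceu_maf)
    yri = polymorphic(yri_maf)
    return {
        "intersection": ceu & yri,  # polymorphic in both CEU and YRI
        "union": ceu | yri,         # polymorphic in either CEU or YRI
        "ceu_only": ceu - yri,      # the subset reported to impute less accurately
    }


# Toy frequencies, invented purely for illustration.
ceu_maf = {"rs1": 0.20, "rs2": 0.00, "rs3": 0.05, "rs4": 0.10}
yri_maf = {"rs1": 0.35, "rs2": 0.15, "rs3": 0.00, "rs5": 0.02}

panels = build_panels(ceu_maf, yri_maf)
print(sorted(panels["intersection"]))  # ['rs1']
print(sorted(panels["union"]))         # ['rs1', 'rs2', 'rs3', 'rs4', 'rs5']
print(sorted(panels["ceu_only"]))      # ['rs3', 'rs4']
```

In this toy case the union panel holds every SNP in the intersection panel plus those polymorphic in only one population, which matches the reported pattern of higher yield for the union strategy and a smaller SNP set for the intersection strategy.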

Original language: English
Pages (from-to): 508-516
Number of pages: 9
Journal: Genetic Epidemiology
Volume: 36
Issue number: 5
Status: Published - Jul 2012

Funding

Funder: National Institute of General Medical Sciences
Funder number: R01GM028719

    ASJC Scopus subject areas

    • Epidemiology
    • Genetics (clinical)
