TY - JOUR
T1 - Use of >100,000 NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations
AU - Kowalski, Madeline H.
AU - Qian, Huijun
AU - Hou, Ziyi
AU - Rosen, Jonathan D.
AU - Tapia, Amanda L.
AU - Shan, Yue
AU - Jain, Deepti
AU - Argos, Maria
AU - Arnett, Donna K.
AU - Avery, Christy
AU - Barnes, Kathleen C.
AU - Becker, Lewis C.
AU - Bien, Stephanie A.
AU - Bis, Joshua C.
AU - Blangero, John
AU - Boerwinkle, Eric
AU - Bowden, Donald W.
AU - Buyske, Steve
AU - Cai, Jianwen
AU - Cho, Michael H.
AU - Choi, Seung Hoan
AU - Choquet, Hélène
AU - Cupples, L. Adrienne
AU - Cushman, Mary
AU - Daya, Michelle
AU - de Vries, Paul S.
AU - Ellinor, Patrick T.
AU - Faraday, Nauder
AU - Fornage, Myriam
AU - Gabriel, Stacey
AU - Ganesh, Santhi K.
AU - Graff, Misa
AU - Gupta, Namrata
AU - He, Jiang
AU - Heckbert, Susan R.
AU - Hidalgo, Bertha
AU - Hodonsky, Chani J.
AU - Irvin, Marguerite R.
AU - Johnson, Andrew D.
AU - Jorgenson, Eric
AU - Kaplan, Robert
AU - Kardia, Sharon L.R.
AU - Kelly, Tanika N.
AU - Kooperberg, Charles
AU - Lasky-Su, Jessica A.
AU - Loos, Ruth J.F.
AU - Lubitz, Steven A.
AU - Mathias, Rasika A.
AU - McHugh, Caitlin P.
AU - Montgomery, Courtney
N1 - Publisher Copyright: © This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
PY - 2019
Y1 - 2019
N2 - Most genome-wide association and fine-mapping studies to date have been conducted in individuals of European descent, and genetic studies of populations of Hispanic/Latino and African ancestry are limited. In addition, these populations have more complex linkage disequilibrium structure. In order to better define the genetic architecture of these understudied populations, we leveraged >100,000 phased sequences available from deep-coverage whole genome sequencing through the multi-ethnic NHLBI Trans-Omics for Precision Medicine (TOPMed) program to impute genotypes into admixed African and Hispanic/Latino samples with genome-wide genotyping array data. We demonstrated that using TOPMed sequencing data as the imputation reference panel improves genotype imputation quality in these populations, which subsequently enhanced gene-mapping power for complex traits. For rare variants with minor allele frequency (MAF) < 0.5%, we observed a 2.3- to 6.1-fold increase in the number of well-imputed variants, with 11-34% improvement in average imputation quality, compared to the state-of-the-art 1000 Genomes Project Phase 3 and Haplotype Reference Consortium reference panels. Impressively, even for extremely rare variants with minor allele count <10 (including singletons) in the imputation target samples, average information content rescued was >86%. Subsequent association analyses of TOPMed reference panel-imputed genotype data with hematological traits (hemoglobin (HGB), hematocrit (HCT), and white blood cell count (WBC)) in ~21,600 African-ancestry and ~21,700 Hispanic/Latino individuals identified associations with two rare variants in the HBB gene (rs33930165 with higher WBC [p = 8.8 × 10^-15] in African populations, rs11549407 with lower HGB [p = 1.5 × 10^-12] and HCT [p = 8.8 × 10^-10] in Hispanics/Latinos). By comparison, neither variant would have been genome-wide significant if either 1000 Genomes Project Phase 3 or Haplotype Reference Consortium reference panels had been used for imputation. Our findings highlight the utility of the TOPMed imputation reference panel for identification of novel rare variant associations not previously detected in similarly sized genome-wide studies of under-represented African and Hispanic/Latino populations.
UR - http://www.scopus.com/inward/record.url?scp=85077774053&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85077774053&partnerID=8YFLogxK
U2 - 10.1371/journal.pgen.1008500
DO - 10.1371/journal.pgen.1008500
M3 - Article
C2 - 31869403
AN - SCOPUS:85077774053
SN - 1553-7390
VL - 15
JO - PLoS Genetics
JF - PLoS Genetics
IS - 12
M1 - e1008500
ER -