On combining family- and population-based sequencing data

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Several statistical group-based approaches have been proposed to detect the effects of variation within a gene for each of the population- and family-based designs. However, unified tests to combine gene-phenotype associations obtained from these 2 study designs are not yet well established. In this study, we investigated the efficient combination of population-based and family-based sequencing data to evaluate best practices using the Genetic Analysis Workshop 19 (GAW19) data set. Because one design employed whole genome sequencing and the other whole exome sequencing, we examined variants overlapping both data sets. We used the family-based sequence kernel association test (famSKAT) to analyze the family- and population-based data sets separately, as well as a combined data set. These analyses were compared against meta-analysis. Using the combined data, we showed that famSKAT has high power to detect associations between diastolic and/or systolic blood pressures and the genes that have causal variants with large effect sizes, such as MAP4, TNN, and CGN. However, when there was a considerable difference in power between the family- and population-based data, famSKAT with the combined data had lower power than famSKAT with the population-based data alone. The famSKAT test statistic for the combined data can be influenced by sample imbalance between the 2 designs. This underscores the importance of foresight in study design because, in this situation, the much smaller sample size of the family-based data essentially serves to dilute the signal. We observed inflated type I errors in our simulation study, largely when using population-based data, which might be a result of principal components failing to completely account for population admixture in this cohort.
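
As a purely illustrative aside, the dilution effect described above can be sketched with a sample-size-weighted Stouffer combination of p-values. This is an assumption made for illustration only: the abstract does not state which meta-analysis method was used, the sketch is not famSKAT, and the function name stouffer_combined_p, the p-values, and the sample sizes are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def stouffer_combined_p(p_values, sample_sizes):
    """Combine one-sided p-values with Stouffer's Z, weighted by sqrt(n).

    Illustrative only: this is a generic meta-analysis combination,
    not the famSKAT statistic and not necessarily the method used in the study.
    """
    if len(p_values) != len(sample_sizes):
        raise ValueError("need one sample size per p-value")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")

    weights = [sqrt(n) for n in sample_sizes]
    # Convert each one-sided p-value to a Z score (small p gives large Z).
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    z_combined = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(
        sum(w * w for w in weights)
    )
    return 1.0 - _STD_NORMAL.cdf(z_combined)


# Hypothetical inputs: a strong signal in a large population-based sample
# and no signal in a much smaller family-based sample.
population_p, family_p = 0.001, 0.5
combined_p = stouffer_combined_p([population_p, family_p], [1000, 100])
print(f"population only: {population_p}, combined: {combined_p:.4f}")
```

With these hypothetical inputs, the combined p-value is larger than the population-only p-value: the uninformative smaller sample weakens the evidence rather than strengthening it, which is the kind of dilution the abstract reports for unbalanced designs.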

Original language: English
Article number: 21
Journal: BMC Proceedings
Volume: 10
State: Published - 2016

Bibliographical note

Publisher Copyright:
© 2016 The Author(s).

Funding

The Genetic Analysis Workshop is supported by National Institutes of Health (NIH) grant R01 GM031575. This work was also supported by NIH grant K25AG043546. The GAW19 whole genome sequence data were provided by the T2D-GENES Consortium, which is supported by NIH grants U01 DK085524, U01 DK085584, U01 DK085501, U01 DK085526, and U01 DK085545. The other genetic and phenotypic data for GAW19 were provided by the San Antonio Family Heart Study and San Antonio Family Diabetes/Gallbladder Study, which are supported by NIH grants P01 HL045222, R01 DK047482, and R01 DK053889. This article has been published as part of BMC Proceedings Volume 10 Supplement 7, 2016: Genetic Analysis Workshop 19: Sequence, Blood Pressure and Expression Data. Summary articles. The full contents of the supplement are available online at http://bmcproc.biomedcentral.com/articles/supplements/volume-10-supplement-7. Publication of the proceedings of Genetic Analysis Workshop 19 was supported by NIH grant R01 GM031575.

Funder: National Institutes of Health (NIH)
Funder numbers: U01 DK085584, R01 DK047482, U01 DK085524, U01 DK085501, U01 DK085545, P01 HL045222, K25AG043546, R01 DK053889, R01 GM031575, U01 DK085526

    ASJC Scopus subject areas

    • General Biochemistry, Genetics and Molecular Biology
