In this work, we proposed a process to select informative genetic variants for identifying clinically meaningful subtypes of hypertensive patients. We studied 575 African American (AA) and 612 Caucasian hypertensive participants enrolled in the Hypertension Genetic Epidemiology Network (HyperGEN) study and analyzed each race-based group separately. All study participants underwentGWAS (Genome-Wide Association Studies) and echocardiography. We applied a variety of statistical methods and filtering criteria, including generalized linear models, F statistics, burden tests, deleterious variant filtering, and others to select the most informative hypertension-related genetic variants. We performed an unsupervised learning algorithm non-negative matrix factorization (NMF) to identify hypertension subtypes with similar genetic characteristics. Kruskal–Wallis tests were used to demonstrate the clinical meaningfulness of genetic-based hypertension subtypes. Two subgroups were identified for both African American and Caucasian HyperGEN participants. In both AAs and Caucasians, indices of cardiac mechanics differed significantly by hypertension subtypes. African Americans tend to have more genetic variants compared to Caucasians; therefore, using genetic information to distinguish the disease subtypes for this group of people is relatively challenging, but we were able to identify two subtypes whose cardiac mechanics have statistically different distributions using the proposed process. The research gives a promising direction in using statistical methods to select genetic information and identify subgroups of diseases, which may inform the development and trial of novel targeted therapies.
|Number of pages||13|
|State||Published - Nov 2020|
Bibliographical noteFunding Information:
Funding: This work was partially supported by National Science Foundation [DMS-1222592 to H.J.] and National Institutes of Health [R21LM012618 to Y.L., R01HL107577 to S.J.S., D.A, and M.I.R].
Acknowledgments: This research was supported in part through the computational resources and staff contributions provided for the Quest high performance computing facility at Northwestern University which is jointly supported by the Office of the Provost, the Office for Research, and Northwestern University Information Technology.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland.
- Clustering algorithm
- Subtype identification
- Variable selection
ASJC Scopus subject areas