TY - JOUR
T1 - Comprehensive functional annotation of susceptibility variants identifies genetic heterogeneity between lung adenocarcinoma and squamous cell carcinoma
AU - Qin, Na
AU - Li, Yuancheng
AU - Wang, Cheng
AU - Zhu, Meng
AU - Dai, Juncheng
AU - Hong, Tongtong
AU - Albanes, Demetrius
AU - Lam, Stephen
AU - Tardon, Adonina
AU - Chen, Chu
AU - Goodman, Gary
AU - Bojesen, Stig E.
AU - Landi, Maria Teresa
AU - Johansson, Mattias
AU - Risch, Angela
AU - Wichmann, H. Erich
AU - Bickeboller, Heike
AU - Rennert, Gadi
AU - Arnold, Susanne
AU - Brennan, Paul
AU - Field, John K.
AU - Shete, Sanjay
AU - Le Marchand, Loic
AU - Melander, Olle
AU - Brunnstrom, Hans
AU - Liu, Geoffrey
AU - Hung, Rayjean J.
AU - Andrew, Angeline
AU - Kiemeney, Lambertus A.
AU - Zienolddiny, Shan
AU - Grankvist, Kjell
AU - Johansson, Mikael
AU - Caporaso, Neil
AU - Woll, Penella
AU - Lazarus, Philip
AU - Schabath, Matthew B.
AU - Aldrich, Melinda C.
AU - Stevens, Victoria L.
AU - Jin, Guangfu
AU - Christiani, David C.
AU - Hu, Zhibin
AU - Amos, Christopher I.
AU - Ma, Hongxia
AU - Shen, Hongbing
N1 - Funding Information:
This study was supported by the Key International (Regional) Cooperative Research Project (No. 81820108028), the National Natural Science Foundation of China (Nos. 81521004, 81922061, 81973123, and 81803306), the Science Foundation for Distinguished Young Scholars of Jiangsu (No. BK20160046), and the Priority Academic Program for the Development of Jiangsu Higher Education Institutions (Public Health and Preventive Medicine). CARET is funded by the National Cancer Institute, National Institutes of Health (USA), through grants U01-CA063673, UM1-CA167462, and U01-CA167462.
Publisher Copyright:
© 2020, Higher Education Press.
PY - 2021/4
Y1 - 2021/4
N2 - Although genome-wide association studies have identified more than eighty genetic variants associated with non-small cell lung cancer (NSCLC) risk, the biological mechanisms of these variants remain largely unknown. By integrating large-scale genotype data from 15 581 lung adenocarcinoma (AD) cases, 8350 squamous cell carcinoma (SqCC) cases, and 27 355 controls with multiple transcriptome and epigenomic databases, we conducted histology-specific meta-analyses and functional annotations of both reported and novel susceptibility variants. We identified 3064 credible risk variants for NSCLC, which were overrepresented in enhancer-like and promoter-like histone modification peaks as well as DNase I hypersensitive sites. Transcription factor enrichment analysis revealed that USF1 was AD-specific while CREB1 was SqCC-specific. Functional annotation and gene-based analysis implicated 894 target genes, including 274 specific to AD and 123 specific to SqCC, which were overrepresented in somatic driver genes (ER = 1.95, P = 0.005). Pathway enrichment analysis and Gene-Set Enrichment Analysis revealed that AD genes were primarily involved in immune-related pathways, while SqCC genes were related to homologous recombination deficiency. Our results illustrate the molecular basis of both well-studied and new susceptibility loci of NSCLC, providing not only novel insights into the genetic heterogeneity between AD and SqCC but also a set of plausible gene targets for post-GWAS functional experiments.
AB - Although genome-wide association studies have identified more than eighty genetic variants associated with non-small cell lung cancer (NSCLC) risk, the biological mechanisms of these variants remain largely unknown. By integrating large-scale genotype data from 15 581 lung adenocarcinoma (AD) cases, 8350 squamous cell carcinoma (SqCC) cases, and 27 355 controls with multiple transcriptome and epigenomic databases, we conducted histology-specific meta-analyses and functional annotations of both reported and novel susceptibility variants. We identified 3064 credible risk variants for NSCLC, which were overrepresented in enhancer-like and promoter-like histone modification peaks as well as DNase I hypersensitive sites. Transcription factor enrichment analysis revealed that USF1 was AD-specific while CREB1 was SqCC-specific. Functional annotation and gene-based analysis implicated 894 target genes, including 274 specific to AD and 123 specific to SqCC, which were overrepresented in somatic driver genes (ER = 1.95, P = 0.005). Pathway enrichment analysis and Gene-Set Enrichment Analysis revealed that AD genes were primarily involved in immune-related pathways, while SqCC genes were related to homologous recombination deficiency. Our results illustrate the molecular basis of both well-studied and new susceptibility loci of NSCLC, providing not only novel insights into the genetic heterogeneity between AD and SqCC but also a set of plausible gene targets for post-GWAS functional experiments.
KW - function annotation
KW - genetic heterogeneity
KW - genome-wide association study
KW - homologous recombination repair deficiency
KW - immune
KW - lung cancer
UR - http://www.scopus.com/inward/record.url?scp=85090192677&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090192677&partnerID=8YFLogxK
U2 - 10.1007/s11684-020-0779-4
DO - 10.1007/s11684-020-0779-4
M3 - Article
C2 - 32889700
AN - SCOPUS:85090192677
SN - 2095-0217
VL - 15
SP - 275
EP - 291
JO - Frontiers of Medicine
JF - Frontiers of Medicine
IS - 2
ER -