Explainable multi-task learning for multi-modality biological data analysis

Xin Tang, Jiawei Zhang, Yichun He, Xinhe Zhang, Zuwan Lin, Sebastian Partarrieu, Emma Bou Hanna, Zhaolin Ren, Hao Shen, Yuhong Yang, Xiao Wang, Na Li, Jie Ding, Jia Liu

Research output: Contribution to journal › Article › peer-review

18 Scopus citations

Abstract

Current biotechnologies can simultaneously measure multiple high-dimensional modalities (e.g., RNA, DNA accessibility, and protein) from the same cells. A combination of different analytical tasks (e.g., multi-modal integration and cross-modal analysis) is required to comprehensively understand such data, inferring how gene regulation drives biological diversity and functions. However, current analytical methods are designed to perform a single task, only providing a partial picture of the multi-modal data. Here, we present UnitedNet, an explainable multi-task deep neural network capable of integrating different tasks to analyze single-cell multi-modality data. Applied to various multi-modality datasets (e.g., Patch-seq, multiome ATAC + gene expression, and spatial transcriptomics), UnitedNet demonstrates similar or better accuracy in multi-modal integration and cross-modal prediction compared with state-of-the-art methods. Moreover, by dissecting the trained UnitedNet with the explainable machine learning algorithm, we can directly quantify the relationship between gene expression and other modalities with cell-type specificity. UnitedNet is a comprehensive end-to-end framework that could be broadly applicable to single-cell multi-modality biology. This framework has the potential to facilitate the discovery of cell-type-specific regulation kinetics across transcriptomics and other modalities.

Original language: English
Article number: 2546
Journal: Nature Communications
Volume: 14
Issue number: 1
State: Published - Dec 2023

Bibliographical note

Publisher Copyright:
© 2023, The Author(s).

Funding

We thank Jane Salant for her helpful comments on the manuscript. J.L., J.D., and N.L. acknowledge the support from the NSF ECCS-2038603. J.L. acknowledges the support from NIH/NIDDK 1DP1DK130673 and William F. Milton Fund. J.D. acknowledges the support from the Army Research Laboratory and the Army Research Office under grant number W911NF-20-1-0222. Y.H. acknowledges the support from the James Mills Peirce Fellowship from the Graduate School of Arts and Sciences of Harvard University. Schematics in several figures were partially created with BioRender.com.

Funder                                                              Funder number
William F. Milton Fund
National Science Foundation Arctic Social Science Program           ECCS-2038603
National Institutes of Health (NIH)
National Institute of Diabetes and Digestive and Kidney Diseases    1DP1DK130673
Army Research Office                                                W911NF-20-1-0222
Army Research Laboratory
Graduate School of Arts and Sciences, Harvard University

    ASJC Scopus subject areas

    • General Chemistry
    • General Biochemistry, Genetics and Molecular Biology
    • General Physics and Astronomy
