Gene expression clock: an unsupervised deep learning approach for predicting circadian rhythmicity from whole genome expression

Research output: Contribution to journal › Article › peer-review

Abstract

Circadian rhythms are driven by an internal molecular clock that controls physiological and behavioral processes. Disruptions in these rhythms have been associated with health issues. Therefore, studying circadian rhythms is crucial for understanding physiology, behavior, and pathophysiology. However, studying circadian rhythms from gene expression data is challenging because time labels are scarce. In this paper, we propose a novel approach, based on a deep neural network (DNN) architecture, to predict the phases of un-timed samples. This approach addresses two challenges: (1) predicting sample phases and reliably identifying cyclic genes from high-dimensional expression data without relying on conserved circadian genes, and (2) handling datasets with small sample sizes. Our algorithm begins with an initial gene screening step that selects candidate cyclic genes using a Minimum Distortion Embedding framework. This stage is followed by greedy layer-wise pre-training of our DNN. Pre-training accomplishes two critical objectives. First, it initializes the hidden layers of our DNN model, enabling them to effectively capture features from the gene profiles with limited samples. Second, it provides suitable initial values for essential aspects of gene periodic oscillations. Subsequently, we fine-tune the pre-trained network to achieve precise sample phase predictions. Extensive experiments on both animal and human datasets show accurate and robust prediction of both sample phases and cyclic genes. Moreover, based on an Alzheimer’s disease (AD) dataset, we identify a set of hub genes that show significant oscillations in cognitively normal subjects but are disrupted in AD, together with their potential therapeutic targets.
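To make "sample phase" and "gene periodic oscillations" concrete, here is a minimal sketch in Python. It is not the paper's DNN. It assumes the common cosinor form, expression = mesor + amplitude * cos(phase - peak_phase), which the abstract does not state, and it assumes sample phases are already known. The function names `fit_cosinor` and `circular_error` and the synthetic data are hypothetical and for illustration only.

```python
import numpy as np

def fit_cosinor(expr, phases):
    """Least-squares fit of expr ~ mesor + a*cos(phase) + b*sin(phase).

    Assumed cosinor model (not taken from the paper). Returns the mesor
    (mean level), the amplitude, and the peak phase in radians in [0, 2*pi).
    """
    design = np.column_stack(
        [np.ones_like(phases), np.cos(phases), np.sin(phases)]
    )
    (mesor, a, b), *_ = np.linalg.lstsq(design, expr, rcond=None)
    # a*cos(t) + b*sin(t) == A*cos(t - phi) with A = hypot(a, b), phi = atan2(b, a)
    amplitude = np.hypot(a, b)
    peak_phase = np.arctan2(b, a) % (2 * np.pi)
    return mesor, amplitude, peak_phase

def circular_error(predicted, true):
    """Smallest absolute angular difference in radians, in [0, pi]."""
    diff = (predicted - true) % (2 * np.pi)
    return np.minimum(diff, 2 * np.pi - diff)

# Synthetic single-gene example: 24 samples at random phases (made-up data).
rng = np.random.default_rng(0)
phases = rng.uniform(0.0, 2 * np.pi, size=24)
expr = 5.0 + 2.0 * np.cos(phases - 1.0) + rng.normal(0.0, 0.05, size=24)

mesor, amplitude, peak_phase = fit_cosinor(expr, phases)
print(mesor, amplitude, peak_phase)
```

Because phase is an angle, prediction error has to be measured on the circle, which is why `circular_error` wraps differences rather than subtracting directly: a predicted phase of 0.1 rad and a true phase just below 2π are close, not far apart.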

Original language: English
Pages (from-to): 20653-20670
Number of pages: 18
Journal: Neural Computing and Applications
Volume: 36
Issue number: 33
State: Published - Nov 2024

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.

Funding

This study was partially supported by NIH R21 AG070909-01, P30 AG072946-01, R01 HD101508-01, and NSF IIS 2327113.

Funder | Funder number
National Institutes of Health (NIH) | R21 AG070909-01, R01 HD101508-01, P30 AG072946-01
National Science Foundation Arctic Social Science Program | IIS 2327113

    UN SDGs

    This output contributes to the following UN Sustainable Development Goals (SDGs)

    1. SDG 3 - Good Health and Well-being

    Keywords

    • Bioinformatics
    • Circadian sample phase
    • Greedy layer-wise pre-training
    • Unsupervised deep learning
    • Whole genome gene expression

    ASJC Scopus subject areas

    • Software
    • Artificial Intelligence
