CDS&E: Efficient and Robust Recurrent Neural Networks

Grants and Contracts Details

Description

Overview: Deep learning, based on deep neural networks, has emerged over the last decade as one of the most powerful machine learning methods. The recurrent neural network (RNN) is a special architecture designed to model sequential data such as speech and text efficiently by exploiting temporal connections within the sequence and by allowing input sequences of variable length. It suffers, however, from the so-called vanishing and exploding gradient problems. Regularizing it so that it generalizes well from training data to testing data has also proven challenging. This project will develop an efficient and robust recurrent neural network by systematically addressing these limitations.

Intellectual Merit: Developing a robust and theoretically simple RNN is an intellectually challenging problem. The currently preferred RNN architecture, the Long Short-Term Memory (LSTM) network, has a highly complex structure with numerous additional interacting elements; it is not well understood and not easy to implement. The proposed network will retain the theoretical simplicity and efficiency of the basic RNN architecture while enhancing key capabilities needed for robust implementations. This has the potential to significantly advance the state of the art in the theory and algorithms of RNNs.

Broader Impacts: The results of the proposed research are expected to impact a variety of areas involving sequential data. Computer vision, speech recognition, natural language processing, financial data analysis, and bioinformatics are some examples, and the list is expanding rapidly. This project will expand the applicability and functionality of RNNs, which may bring them to a larger user community. The proposed research lies at the interface of applied mathematics, computer science, and statistics, and provides an ideal setting for research cross-fertilization and collaboration as well as for training graduate students in interdisciplinary research. In this regard, the numerical analysis perspective will be particularly helpful in approaching various problems in neural networks. The project will also include collaborative work to apply the RNNs developed to RNA secondary structure inference problems in bioinformatics. We plan to share computer code derived from this project on the open-source platform GitHub, which will accelerate dissemination of the research results to user communities and promote real-world applications of RNNs.
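
The vanishing and exploding gradient problems mentioned in the overview can be illustrated with a small sketch. It is not code from this project: it assumes a purely linear recurrence h_t = W h_{t-1} with a hypothetical 2x2 weight matrix W equal to a rotation multiplied by a scalar `scale`, and the names `backprop_norm`, `scale`, and `theta` are chosen here for illustration only. Under those assumptions, pushing a gradient back through many time steps multiplies it by the transpose of W once per step, so its norm shrinks when `scale` is below 1 and grows when `scale` is above 1.

```python
import math


def backprop_norm(scale, steps, theta=0.3):
    """Norm of a gradient after backpropagating through `steps` time steps
    of the linear recurrence h_t = W h_{t-1}, with the illustrative choice
    W = scale * (2x2 rotation by angle theta)."""
    c, s = math.cos(theta), math.sin(theta)
    # Transpose of W = scale * [[c, -s], [s, c]]
    wt = [[scale * c, scale * s],
          [-scale * s, scale * c]]
    # Gradient of the loss with respect to the final hidden state (assumed)
    g = [1.0, 0.0]
    for _ in range(steps):
        # One step of backpropagation through time: g <- W^T g
        g = [wt[0][0] * g[0] + wt[0][1] * g[1],
             wt[1][0] * g[0] + wt[1][1] * g[1]]
    return math.hypot(g[0], g[1])


if __name__ == "__main__":
    for scale in (0.9, 1.0, 1.1):
        norm = backprop_norm(scale, 100)
        print(f"scale={scale}: gradient norm after 100 steps = {norm:.3e}")
```

In this toy setting the gradient norm after T steps equals `scale` raised to the power T, so only a scale of exactly 1 keeps the gradient from decaying or blowing up over long sequences. Real RNNs add nonlinearities and inputs, so this is a simplified picture of the effect rather than a full analysis.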
Status: Finished
Effective start/end date: 9/1/18 - 8/31/22

Funding

  • National Science Foundation: $200,000.00
