Grants and Contracts Details
Description
Originally thought to be a relatively uncommon phenomenon, alternative splicing is now appreciated to
be a widespread and primary mechanism by which eukaryotes have expanded the structural and functional
diversity of their encoded proteome. The new generation of ultra-high throughput sequencers has
opened up new ways to study the cell’s alternative splicing and its variation in response to environmental
conditions. RNA-seq datasets have the potential to assess gene expression and the expression of non-coding
RNA while simultaneously identifying and quantifying alternative splice variants, including gene fusions,
within the transcriptome. By comparing these measures across RNA-seq datasets from different
cells and conditions we can determine the uses of alternative splicing and elucidate the regulation of
alternative splicing, ultimately providing new insight on a functional level into medicine and biology.
Accurate characterization of the transcriptome from the hundreds of millions of random short sequences
sampled from messenger RNA samples, however, is still an unsolved problem. This proposal develops new
methods to analyze alternative splicing from RNA-seq data. The intellectual merits of
this proposal include:
• A maximum likelihood approach coupled with fast and memory efficient computational algorithms
for the alignment of RNA-seq reads to the genome that enables highly sensitive and accurate
identification of both novel and known splicing and fusion events.
• A genome-wide transcriptome comparison method to detect statistically significant differential alternative
splicing patterns across biological samples, relying on a novel and compact transcriptome
representation as a labeled graph.
• A set of data mining algorithms to reconstruct co-regulated splicing networks and to detect clusters
of alternative splicing events operating in concert to carry out specific biological functions.
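As a rough illustration of the labeled-graph idea in the second bullet, the sketch below stores splice junctions as edges labeled with read counts and compares junction usage between two samples. It is not the proposal's actual representation or statistical method: the function names, the coordinates, and the toy read lists are all hypothetical, and a real comparison would need a significance test rather than raw proportions.

```python
from collections import defaultdict


def build_splice_graph(junction_reads):
    """Build a labeled graph from spliced reads.

    Nodes are splice-site coordinates; each edge (donor -> acceptor) is
    labeled with the number of reads that support that junction.
    """
    graph = defaultdict(dict)
    for donor, acceptor in junction_reads:
        graph[donor][acceptor] = graph[donor].get(acceptor, 0) + 1
    return {donor: dict(edges) for donor, edges in graph.items()}


def junction_usage(graph, donor):
    """Return the fraction of reads leaving `donor` that use each acceptor."""
    edges = graph.get(donor, {})
    total = sum(edges.values())
    if total == 0:
        return {}
    return {acceptor: count / total for acceptor, count in edges.items()}


# Hypothetical toy data: each tuple is one spliced read (donor, acceptor).
sample_a = [(100, 200)] * 8 + [(100, 400)] * 2
sample_b = [(100, 200)] * 3 + [(100, 400)] * 7

usage_a = junction_usage(build_splice_graph(sample_a), 100)
usage_b = junction_usage(build_splice_graph(sample_b), 100)

# Change in usage of each acceptor between the two samples.
acceptors = sorted(set(usage_a) | set(usage_b))
shift = {a: usage_b.get(a, 0.0) - usage_a.get(a, 0.0) for a in acceptors}
print(shift)
```

In this toy case the donor at position 100 mostly splices to 200 in the first sample and mostly to 400 in the second, which is the kind of shift a differential splicing method would then have to test for significance.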
The algorithms and tools to be developed are data driven and are applicable to the transcriptome from
any species, requiring only a reference genome, and without dependence on transcript databases or a
priori gene structure annotation. These methods will be rigorously evaluated and validated through
several biological applications in collaboration with biologists as well as through participation in
the RNA-seq Genome Annotation Assessment Project (RGASP). We expect that the resulting advances
in RNA-seq analysis software will significantly improve the characterization of the transcriptome and
the identification of functional elements that regulate the transcriptome in response to environmental
conditions.
Broader impact: The successful implementation of this research plan will produce a suite of
computational and statistical methods, implemented as open-source software, to meet the immediate
demand from the biology community for the analysis of high-throughput RNA-seq datasets. These tools
will enable individual scientists to assess the mRNA transcriptome in a matter of days using samples
from any organism with a reference genome (reference genomes are themselves becoming easier to
resequence using RNA-seq technologies). Their impact would therefore be transformative for how
biologists and biomedical researchers do science every day. The following education objectives and
plans are integrated into the PI’s research plan:
• Improve the awareness of bioinformatics as a critical interdisciplinary research area among students
from biology, computer science, and engineering and enrich the undergraduate curriculum with a
new introductory bioinformatics course.
• Improve cross-disciplinary research training opportunities for graduate and undergraduate students
through the Bioinformatics Certificate Program and a newly established Biomedical Informatics
Department at UKy.
• Emphasize recruitment and retention of underrepresented groups in majors that can
be combined with bioinformatics. The PI will continue to train and recruit female graduate students
and bring in students from underrepresented groups and from the Appalachian region through
the NSF-funded AMSTEMM (Appalachian and Minority Science, Technology, Engineering, and
Mathematics Majors) program at UKy.
Status | Finished |
---|---|
Effective start/end date | 4/15/11 → 7/31/17 |