Grants and Contracts Details
Description
The student will develop and provide the genomics workflow outlined in the Project Description. This
workflow will be developed using open-source software and made publicly available via its own GitHub
repository and a Docker image provided at the Docker Registry. The GitHub repository should include
the Dockerfile, citations for the software used, installation instructions, step-by-step usage instructions,
and a worked example with input and output files. The student will also be expected to be available to
answer questions about their workflow and documentation. The recipient and student are expected to
abide by the guidelines outlined in the project Data Management Plan. The student will not be supervised
by University of Arizona employees, but it is expected that their graduate mentor will be able to provide
any additional technical advice or support.
Project Description:
My lab is beginning to work on the software and interfaces necessary for an infrastructure that can
support “Process once, query forever” analysis of high-throughput sequence (HTS) data. To that end, a
master’s student in my lab, Mr. Kai Li, will work at 50% effort for the Fall 2023 semester to set up a query
server that will allow for the rapid query of sequence data, both the whole genome sequence and the
RNA-Seq data we have collected. The work will specifically involve the construction of kmer indices on each HTS
dataset. The most time-consuming element of a query is loading the kmer index into memory.
In queries run in our lab, loading the index takes 5 of a total of 6 minutes when
querying 1.3M kmers. To eliminate this wait, we will set up server software on a server equipped with
56TB of solid-state storage capable of effectively maintaining the kmer indices in memory.
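As a rough illustration of the "Process once, query forever" idea, the sketch below builds a toy kmer index once and then answers repeated queries against the copy held in memory. It is a minimal sketch under stated assumptions, not the project's workflow: the function names (`build_kmer_index`, `query`), the sample names, the sequences, and the choice of k = 4 are all hypothetical, and a plain Python dictionary stands in for whatever index format the real server software uses.

```python
from collections import defaultdict


def build_kmer_index(sequences, k):
    """Map each kmer to the set of dataset IDs whose sequence contains it.

    `sequences` is a dict of {dataset_id: sequence_string} (hypothetical toy input).
    """
    index = defaultdict(set)
    for dataset_id, seq in sequences.items():
        # Slide a window of length k across the sequence.
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(dataset_id)
    return index


def query(index, kmers):
    """Look up each kmer in an index that is already resident in memory."""
    # .get avoids inserting missing kmers into the defaultdict.
    return {kmer: sorted(index.get(kmer, ())) for kmer in kmers}


# "Process once": build the index a single time (toy data, assumed for illustration).
datasets = {"sampleA": "ACGTACGT", "sampleB": "TTACGTTT"}
INDEX = build_kmer_index(datasets, k=4)

# "Query forever": every later query reuses INDEX without rebuilding or reloading it.
result = query(INDEX, ["ACGT", "GGGG", "TTAC"])
print(result)
```

In this toy setting the expensive step (building or loading the index) happens once, and each subsequent call to `query` costs only dictionary lookups, which mirrors the motivation for keeping the index loaded on a long-running server.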
Once completed, we will have a server that can be queried repeatedly, in seconds per query, to
genotype individual whole genome sequences or, similarly, to calculate transcription levels in RNA-Seq
datasets. This work will provide a strong foundation for transformative future work: creating a
distributed network of query engines for HTS data, thus providing truly FAIR access to these data.
Status | Finished |
---|---|
Effective start/end date | 8/25/23 → 12/31/23 |
Funding
- University of Arizona: $14,054.00