Tao Jiang
Distinguished Professor
Computer Science & Engineering
tjiang@ucr.edu
(951) 827-2991
EAGER: Transcript-Based Differential Expression Analysis for Population Data Without Predefined Conditions
AWARD NUMBER
008399-002
FUND NUMBER
33275
STATUS
Closed
AWARD TYPE
3-Grant
AWARD EXECUTION DATE
8/4/2016
BEGIN DATE
9/1/2016
END DATE
8/31/2018
AWARD AMOUNT
$200,000
Sponsor Information
SPONSOR AWARD NUMBER
SPONSOR
SPONSOR TYPE
FUNCTION
Organized Research
PROGRAM NAME
Proposal Information
PROPOSAL NUMBER
16121312
PROPOSAL TYPE
New
ACTIVITY TYPE
Basic Research
PI Information
PI
Jiang, Tao
PI TITLE
Other
PI DEPARTMENT
Computer Science & Engineering
PI COLLEGE/SCHOOL
Bourns College of Engineering
CO PIs
Project Information
ABSTRACT
With the emergence of precision medicine, there is increased demand for more sensitive molecular biomarkers. A fundamental computational step in the discovery of molecular biomarkers is to identify genes that are expressed differently across different samples. This project investigates new algorithmic approaches for performing differential expression analysis at the transcript level for samples without predefined biological conditions. Such an analysis is critical to both clinical and biological studies on population (or cohort) data. For example, it can be used to discover molecular biomarkers to classify cancer samples into subtypes so that better diagnosis and therapy methods can be developed for each subtype. It can also be used to characterize individual cells involved in different biological processes. Efficient software tools built on the analysis approaches proposed in this project can help biologists discover more sensitive biomarkers than existing methods allow.
Specifically, this project studies three approaches for differential transcript expression analysis on population data. The first two approaches treat either the exons or full transcripts of a gene as the basic expression elements and then apply a gene-level differential expression analysis method on these expression elements. The third approach is a hybrid of the first two. It uses a splice graph to represent the transcripts of a gene and a new modular decomposition algorithm to partition the graph into small components that correspond to independent alternative splicing events. Moreover, a robust clustering algorithm is employed to deal with an arbitrary number of conditions in the input population. The software implementations of the three approaches are calibrated and tested extensively on both simulated and real sequence data to establish their practical utility to the public. (Abstract from NSF)
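As a rough illustration of the splice graph mentioned in the abstract, the following minimal Python sketch represents each transcript of a gene as an ordered list of exons and links exons that appear consecutively in some transcript. The exon labels, transcript names, and the helper functions `build_splice_graph` and `branch_points` are hypothetical and chosen only for this example. The sketch does not implement the project's modular decomposition or clustering algorithms; it only flags exons with more than one successor, which is one simple signal of a possible alternative splicing event.

```python
def build_splice_graph(transcripts):
    """Map each exon to the set of exons that directly follow it in some transcript."""
    graph = {}
    for exons in transcripts.values():
        # Register every exon as a node, including the last one in a transcript.
        for exon in exons:
            graph.setdefault(exon, set())
        # Add a directed edge for each pair of consecutive exons.
        for upstream, downstream in zip(exons, exons[1:]):
            graph[upstream].add(downstream)
    return graph


def branch_points(graph):
    """Return exons with more than one successor (candidate splicing branch points)."""
    return sorted(exon for exon, successors in graph.items() if len(successors) > 1)


# Hypothetical gene with two transcripts; tx2 skips exon E2.
transcripts = {
    "tx1": ["E1", "E2", "E3", "E4"],
    "tx2": ["E1", "E3", "E4"],
}

graph = build_splice_graph(transcripts)
branches = branch_points(graph)
print(branches)
```

In this toy input, exon E1 is followed by E2 in one transcript and by E3 in the other, so it is the only branch point reported, which corresponds to a single exon-skipping event.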