Office of Research, UC Riverside
Thomas Girke
Professor of Bioinformatics
Botany and Plant Sciences Dept
tgirke@ucr.edu
(951) 732-7072


ABI Development: systemPipeR - automated NGS workflow and report environment

AWARD NUMBER
008957-002
FUND NUMBER
33346
STATUS
Active
AWARD TYPE
3-Grant
AWARD EXECUTION DATE
5/10/2017
BEGIN DATE
5/15/2017
END DATE
4/30/2020
AWARD AMOUNT
$648,874

Sponsor Information

SPONSOR AWARD NUMBER
1661152
SPONSOR
NATIONAL SCIENCE FOUNDATION
SPONSOR TYPE
Federal
FUNCTION
Organized Research
PROGRAM NAME

Proposal Information

PROPOSAL NUMBER
16030232
PROPOSAL TYPE
New
ACTIVITY TYPE
Basic Research

PI Information

PI
Girke, Thomas
PI TITLE
Other
PI DEPARTMENT
Institute of Genomics
PI COLLEGE/SCHOOL
College of Nat & Agr Sciences
CO PIs

Project Information

ABSTRACT

Next-generation sequencing (NGS) has revolutionized how research is carried out in many areas of biology by allowing researchers to sequence genomes and transcriptomes on a routine basis. However, the analysis of NGS data remains a major obstacle to the efficient utilization of the technology. While substantial effort has been invested in the development of software dedicated to the individual analysis steps of NGS experiments, insufficient resources are currently available for integrating the individual software components into automated workflows capable of running the analysis of most types of NGS applications from start to finish in a time-efficient and reproducible manner. This Development project will address this need by enhancing systemPipeR, a popular R/Bioconductor software package. The project will have significant impact on a wide range of scientific communities and our society at large by allowing researchers to translate NGS data into biologically relevant knowledge in a time-efficient and reproducible manner. This will accelerate many research and discovery projects in academia and industry, where NGS technologies play an important role. Extensive educational resources for interdisciplinary training at the intersection of genome and computational biology will be provided. Training will be offered to scientists, postdoctoral researchers, graduate and undergraduate students. Members of underrepresented groups will participate in all aspects of this project, supporting diversity. Extensive online tutorials will be provided to maximize the educational outreach of the activities.

The specific aims of this project are: (AIM 1) Enhancements to systemPipeR's user interface and the workflow design framework will greatly simplify the process of running workflows, generating automated reports and designing new workflows. It will also improve user-friendliness to make systemPipeR equally useful for R and non-R users, as well as biologists without any expert knowledge in bioinformatics. The execution plan of this aim includes the design of a central workflow control user interface and the adaptation of new community standards to further increase the reproducibility of analysis workflows. (AIM 2) Automated analysis workflows will be developed for a wide range of additional NGS applications. Most of these workflows will be designed in collaboration with experts in the corresponding NGS application areas. Suggestions from the community will be incorporated as well. Sample templates will be provided for the supported NGS applications to create workflow instances with a single command fully populated with all input data and environment settings. (AIM 3) The project will also have a strong focus on community integration and performance evaluations provided by its current and future users. This includes options for users to contribute code or entire workflows, and extensive training of the target audience to analyze NGS data with systemPipeR and related resources. The URL of the systemPipeR project website is: http://girke.bioinformatics.ucr.edu/systemPipeR.
(Abstract from NSF)