Office of Research, UC Riverside
Zizhong Chen
Professor
Computer Science & Engineering
zizhong@ucr.edu
(951) 827-2403


CAREER: Dependable High Performance Scientific Computing at Extreme Scale via Algorithmic Fault Tolerance

AWARD NUMBER
005955-002
FUND NUMBER
21138
STATUS
Closed
AWARD TYPE
3-Grant
AWARD EXECUTION DATE
11/21/2012
BEGIN DATE
11/21/2012
END DATE
3/31/2017
AWARD AMOUNT
$454,497

Sponsor Information

SPONSOR AWARD NUMBER
OCI-1305624
SPONSOR
NATIONAL SCIENCE FOUNDATION
SPONSOR TYPE
Federal
FUNCTION
Organized Research
PROGRAM NAME

Proposal Information

PROPOSAL NUMBER
13040323
PROPOSAL TYPE
New
ACTIVITY TYPE
Basic Research

PI Information

PI
Chen, Zizhong
PI TITLE
Other
PI DEPARTMENT
Computer Science & Engineering
PI COLLEGE/SCHOOL
Bourns College of Engineering
CO PIs

Project Information

ABSTRACT

Extreme-scale high-end computing platforms are expected to be available before 2020 and will have 100 million to 1 billion CPU cores. Due to the large number of components in these platforms, the probability that errors occur during the execution of an extreme-scale application is expected to be much higher than observed today. The goal of this CAREER research project is to develop highly efficient techniques to detect, locate, and correct both soft and hard errors according to the specific characteristics of an algorithm. The target algorithms include (1) Krylov subspace methods for solving sparse linear systems and eigenvalue problems; (2) direct methods for solving dense linear systems and eigenvalue problems; and (3) Newton's method for solving systems of nonlinear equations.

This project will create significant education outcomes by integrating the following four components: (1) establishing a supercomputing research laboratory to support senior design projects and REU, enhance graduate education and research, and demonstrate highly dependable applications on high-end computing platforms; (2) enriching the teaching of both undergraduate and graduate courses by integrating fault tolerance and high performance computing into the courses; (3) increasing minority student involvement by encouraging minority students to pursue graduate degrees in computing; and (4) offering free workshops to K-12 teachers and students.
(Abstract from NSF)