DID YOU KNOW? DTIC has over 3.5 million final reports on DoD funded research, development, test, and evaluation activities available to our registered users. Click

HERE to register or log in.

# Accession Number:

## ADA603965

# Title:

## Avoiding Communication in Dense Linear Algebra

# Descriptive Note:

## Doctoral thesis

# Corporate Author:

## CALIFORNIA UNIV BERKELEY DEPT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES

# Report Date:

## 2013-08-16

# Pagination or Media Count:

##
219.0

# Abstract:

## Dense linear algebra computations are essential to nearly every problem in scientific computing and to countless other elds. Most matrix computations enjoy a high computational intensity i.e., ratio of computation to data, and therefore the algorithms for the computations have a potential for high efficiency. However, performance for many linear algebra algorithms is limited by the cost of moving data between processors on a parallel computer or throughout the memory hierarchy of a single processor, which we will refer to generally as communication. Technological trends indicate that algorithmic performance will become even more limited by communication in the future. In this thesis, we consider the fundamental computations within dense linear algebra and address the following question can we significantly improve the current algorithms for these computations, in terms of the communication they require and their performance in practice To answer the question, we analyze algorithms on sequential and parallel architectural models that are simple enough to determine coarse communication costs but accurate enough to predict performance of implementations on real hardware. For most of the computations we prove lower bounds on the communication that any algorithm must perform. If an algorithm exists with communication costs that match the lower bounds at least in an asymptotic sense, we call the algorithm communication optimal. In many cases, the most commonly used algorithms are not communication optimal, and we can develop new algorithms that require less data movement and attain the communication lower bounds. In this thesis, we develop both new communication lower bounds and new algorithms tightening and in many cases closing the gap between best known lower bound and best known algorithm or upper bound. We consider both sequential and parallel algorithms, and we asses both classical and fast algorithms e.g., Strassens matrix multiplication algorithm.

# Distribution Statement:

## APPROVED FOR PUBLIC RELEASE

#