Algorithm-Based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures and Accuracy
Aurelien Bouteiller, Thomas Herault, George Bosilca, Peng Du, Jack Dongarra
Article No.: 10
Dense matrix factorizations, such as LU, Cholesky and QR, are widely used for scientific applications that require solving systems of linear equations, eigenvalues and linear least squares problems. Such computations are normally carried out on...
Avoiding Communication in Successive Band Reduction
Grey Ballard, James Demmel, Nicholas Knight
Article No.: 11
The running time of an algorithm depends on both arithmetic and communication (i.e., data movement) costs, and the relative costs of communication are growing over time. In this work, we present sequential and distributed-memory parallel...
Collective Algorithms for Multiported Torus Networks
Paul Sack, William Gropp
Article No.: 12
Modern supercomputers with torus networks allow each node to simultaneously pass messages on all of its links. However, most collective algorithms are designed to only use one link at a time. In this work, we present novel multiported algorithms...
Lock Cohorting: A General Technique for Designing NUMA Locks
David Dice, Virendra J. Marathe, Nir Shavit
Article No.: 13
Multicore machines are quickly shifting to NUMA and CC-NUMA architectures, making scalable NUMA-aware locking algorithms, ones that take into account the machine's nonuniform memory and caching hierarchy, ever more important. This article presents...
Breadth-First Search (BFS) is a core primitive for graph traversal and a basis for many higher-level graph analysis algorithms. It is also representative of a class of parallel computations whose memory accesses and work distribution are both...
Section: Special Issue on PPOPP'12
SciPAL: Expression Templates and Composition Closure Objects for High Performance Computational Physics with CUDA and OpenMP
Stephan C. Kramer, Johannes Hagemann
Article No.: 15
We present SciPAL (scientific parallel algorithms library), a C++-based, hardware-independent open-source library. Its core is a domain-specific embedded language for numerical linear algebra. The main fields of application are...
Power Management of Extreme-Scale Networks with On/Off Links in Runtime Systems
Ehsan Totoni, Nikhil Jain, Laxmikant V. Kale
Article No.: 16
Networks are among major power consumers in large-scale parallel systems. During execution of common parallel applications, a sizeable fraction of the links in the high-radix interconnects are either never used or are underutilized. We propose a...