Journal article
Manycore algorithms for batch scalar and block tridiagonal solvers
- Abstract:
- Engineering, scientific, and financial applications often require the simultaneous solution of a large number of independent tridiagonal systems of equations with varying coefficients. Since the number of systems is large enough to offer considerable parallelism on manycore systems, the choice between different tridiagonal solution algorithms, such as Thomas, Cyclic Reduction (CR), or Parallel Cyclic Reduction (PCR), needs to be reexamined. This work investigates the optimal choice of tridiagonal algorithm for CPU, Intel MIC, and NVIDIA GPU, with a focus on minimizing the amount of data transferred to and from main memory, using novel algorithms and the register-blocking mechanism, and on maximizing the achieved bandwidth. It also considers block tridiagonal solutions, which are sometimes required in Computational Fluid Dynamics (CFD) applications. A novel work-sharing and register-blocking-based Thomas solver is also presented.
- Publication status:
- Published
- Peer review status:
- Peer reviewed
Access Document
- Files:
- Accepted manuscript (pdf, 3.8MB)
- Publisher copy:
- 10.1145/2956571
Authors
- Funder:
- University of National Excellence
- Grant:
- TÁMOP-4.2.1./B-11/2/KMR-2011-002, TÁMOP-4.2.2./B-10/1-2010-0014
- Publisher:
- Association for Computing Machinery
- Journal:
- ACM Transactions on Mathematical Software
- Volume:
- 42
- Issue:
- 4
- Article number:
- 31
- Publication date:
- 2016-06-30
- Acceptance date:
- 2015-09-01
- DOI:
- 10.1145/2956571
- EISSN:
- 1557-7295
- ISSN:
- 0098-3500
- Keywords:
- Pubs id:
- pubs:570072
- UUID:
- uuid:6985082c-8c58-4549-bfb8-6051b72b1fdc
- Local pid:
- pubs:570072
- Source identifiers:
- 570072
- Deposit date:
- 2015-10-13
- ARK identifier:
Terms of use
- Copyright holder:
- ACM
- Copyright date:
- 2016
- Notes:
- Copyright © 2016 ACM. This is the accepted manuscript version of the article. The final version is available online from ACM at: https://doi.org/10.1145/2956571