Scalable heterogeneous execution of a coupled-cluster model with perturbative triples
- Authors
- Kim, J.; Panyala, A.; Peng, B.; Kowalski, K.; Sadayappan, P.; Krishnamoorthy, S.
- Issue Date
- Nov-2020
- Publisher
- IEEE Computer Society
- Citation
- International Conference for High Performance Computing, Networking, Storage and Analysis, SC
- Journal Title
- International Conference for High Performance Computing, Networking, Storage and Analysis, SC
- URI
- https://scholarworks.bwise.kr/cau/handle/2019.sw.cau/63152
- DOI
- 10.1109/SC41405.2020.00083
- ISSN
- 2167-4329
- Abstract
- The CCSD(T) coupled-cluster model with perturbative triples is considered a gold standard for computational modeling of the correlated behavior of electrons in molecular systems. A fundamental constraint is the relatively small global-memory capacity in GPUs compared to the main-memory capacity on host nodes, necessitating relatively smaller tile sizes for high-dimensional tensor contractions in NWChem's GPU-accelerated implementation of the CCSD(T) method. A coordinated redesign is described to address this limitation and associated data movement overheads, including a novel fused GPU kernel for a set of tensor contractions, along with inter-node communication optimization and data caching. The new implementation of GPU-accelerated CCSD(T) improves overall performance by 3.4×. Finally, we discuss the trade-offs in using this fused algorithm on current and future supercomputing platforms. © 2020 IEEE.
- Appears in Collections
- College of Software > School of Computer Science and Engineering > 1. Journal Articles