Parameter based tuning model for optimizing performance on GPU
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Nhat-Phuong Tran | - |
dc.contributor.author | Lee, Myungho | - |
dc.contributor.author | Choi, Jaeyoung | - |
dc.date.available | 2018-05-08T14:30:22Z | - |
dc.date.created | 2018-04-17 | - |
dc.date.issued | 2017-09 | - |
dc.identifier.issn | 1386-7857 | - |
dc.identifier.uri | http://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/6256 | - |
dc.description.abstract | Graphics processing units (GPUs) are becoming increasingly popular for high-performance computing applications. Although GPUs provide high peak performance, exploiting their full performance potential remains a challenging task for programmers. When launching a parallel kernel of an application on the GPU, the programmer needs to carefully select the number of blocks (grid size) and the number of threads per block (block size). These values determine the degree of SIMD parallelism and multithreading, and greatly influence performance. With a huge range of possible combinations of these values, choosing the right grid size and block size is not straightforward. In this paper, we propose a mathematical model for tuning the grid size and the block size based on GPU architecture parameters. Using our model, we first calculate a small set of candidate grid size and block size values, then search for the optimal values among the candidates through experiments. Our approach significantly reduces the search space compared with the exhaustive search approaches used in previous research. Thus, our approach can be practically applied to real applications. | - |
dc.publisher | SPRINGER | - |
dc.relation.isPartOf | CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | - |
dc.title | Parameter based tuning model for optimizing performance on GPU | - |
dc.type | Article | - |
dc.identifier.doi | 10.1007/s10586-017-1003-4 | - |
dc.type.rims | ART | - |
dc.identifier.bibliographicCitation | CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, v.20, no.3, pp.2133 - 2142 | - |
dc.description.journalClass | 1 | - |
dc.identifier.wosid | 000407928800019 | - |
dc.identifier.scopusid | 2-s2.0-85021759226 | - |
dc.citation.endPage | 2142 | - |
dc.citation.number | 3 | - |
dc.citation.startPage | 2133 | - |
dc.citation.title | CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | - |
dc.citation.volume | 20 | - |
dc.contributor.affiliatedAuthor | Choi, Jaeyoung | - |
dc.type.docType | Article | - |
dc.subject.keywordAuthor | GPU | - |
dc.subject.keywordAuthor | High performance computing | - |
dc.subject.keywordAuthor | Performance tuning | - |
dc.subject.keywordAuthor | Multi-threading | - |
dc.subject.keywordAuthor | Micro-benchmark | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
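
The two-step idea in the abstract (derive a small candidate set from architecture parameters, then pick the best candidate by measurement) can be illustrated with a minimal sketch. This is not the paper's model, which this record does not reproduce. It assumes a simple occupancy rule: keep only block sizes that are multiples of the warp size and that fill a streaming multiprocessor (SM) with resident threads, and pair each with grid sizes that are whole multiples of the number of blocks the GPU can hold at once. The function names, the default limits (warp size 32, 1024 threads per block, 2048 threads and 32 blocks per SM), and the `measure` callback are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Config = Tuple[int, int]  # (grid_size, block_size)


def candidate_configs(
    num_sms: int,
    warp_size: int = 32,               # assumed architecture parameters,
    max_threads_per_block: int = 1024,  # not values taken from the paper
    max_threads_per_sm: int = 2048,
    max_blocks_per_sm: int = 32,
    waves: Sequence[int] = (1, 2, 4),
) -> List[Config]:
    """Return a small candidate set of (grid_size, block_size) pairs."""
    candidates: List[Config] = []
    # Only block sizes that are whole numbers of warps are considered.
    for block in range(warp_size, max_threads_per_block + 1, warp_size):
        # Resident blocks per SM, limited by the thread and block caps.
        blocks_per_sm = min(max_threads_per_sm // block, max_blocks_per_sm)
        # Keep block sizes that fill every thread slot of an SM.
        if blocks_per_sm * block != max_threads_per_sm:
            continue
        # Grid sizes that occupy all SMs for a whole number of waves.
        for w in waves:
            candidates.append((num_sms * blocks_per_sm * w, block))
    return candidates


def pick_best(candidates: Sequence[Config],
              measure: Callable[[Config], float]) -> Config:
    """Return the candidate with the lowest measured cost (e.g. run time)."""
    return min(candidates, key=measure)


if __name__ == "__main__":
    configs = candidate_configs(num_sms=10, waves=(1,))
    for grid, block in configs:
        print(f"grid={grid:4d} block={block:4d}")
```

In practice `measure` would launch the kernel with the given configuration and return its elapsed time. Registers and shared memory per thread also limit occupancy and are ignored here for brevity.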