Parallel implementation of Nussbaumer algorithm and number theoretic transform on a GPU platform: application to qTESLA
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lee, W.-K. | - |
dc.contributor.author | Akleylek, S. | - |
dc.contributor.author | Wong, D.C.-K. | - |
dc.contributor.author | Yap, W.-S. | - |
dc.contributor.author | Goi, B.-M. | - |
dc.contributor.author | Hwang, S.-O. | - |
dc.date.available | 2021-03-17T00:40:11Z | - |
dc.date.created | 2020-08-18 | - |
dc.date.issued | 2021-04 | - |
dc.identifier.issn | 0920-8542 | - |
dc.identifier.uri | https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/80452 | - |
dc.description.abstract | Among the popular post-quantum schemes, lattice-based cryptosystems have received renewed interest since they are relatively simple, highly parallelizable and provably secure under a worst-case hardness assumption. However, polynomial multiplication over rings is the most time-consuming operation in most lattice-based cryptosystems. To further improve the performance of lattice-based cryptosystems for large-scale usage, polynomial multiplication must be implemented in parallel. Polynomial multiplication can be performed using either the number theoretic transform (NTT) or the Nussbaumer algorithm. However, the Nussbaumer algorithm is inherently serial. Meanwhile, the efficient implementation of the NTT using various indexing methods on a GPU platform remains unknown. In this paper, we explore the best combination of various indexing methods to implement the NTT on a GPU platform and an efficient way to parallelize the Nussbaumer algorithm. Our results suggest that the combination of Gentleman–Sande and Cooley–Tukey (GS-CT) indexing methods produced the best performance on an RTX2060 GPU (i.e. 422,638 polynomial multiplications per second). A technique to parallelize the Nussbaumer algorithm by halving the non-coalesced global memory accesses is presented. To the best of our knowledge, this is the first GPU implementation of the Nussbaumer algorithm, and it outperforms the best aforementioned NTT (GS-CT) implementation by 14.5%. For illustration purposes, the proposed GPU implementation techniques are applied to qTESLA, a state-of-the-art lattice-based signature scheme. We emphasize that the proposed implementation techniques are not specific to any cryptosystem; they can be easily adapted to other lattice-based cryptosystems. © 2020, Springer Science+Business Media, LLC, part of Springer Nature. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | Springer | - |
dc.relation.isPartOf | Journal of Supercomputing | - |
dc.title | Parallel implementation of Nussbaumer algorithm and number theoretic transform on a GPU platform: application to qTESLA | - |
dc.type | Article | - |
dc.type.rims | ART | - |
dc.description.journalClass | 1 | - |
dc.identifier.wosid | 000559632500001 | - |
dc.identifier.doi | 10.1007/s11227-020-03392-x | - |
dc.identifier.bibliographicCitation | Journal of Supercomputing, v.77, no.4, pp.3289 - 3314 | - |
dc.description.isOpenAccess | N | - |
dc.identifier.scopusid | 2-s2.0-85089298096 | - |
dc.citation.endPage | 3314 | - |
dc.citation.startPage | 3289 | - |
dc.citation.title | Journal of Supercomputing | - |
dc.citation.volume | 77 | - |
dc.citation.number | 4 | - |
dc.contributor.affiliatedAuthor | Lee, W.-K. | - |
dc.contributor.affiliatedAuthor | Hwang, S.-O. | - |
dc.type.docType | Article | - |
dc.subject.keywordAuthor | Graphics processing units | - |
dc.subject.keywordAuthor | Lattice-based cryptography | - |
dc.subject.keywordAuthor | Number theoretic transform | - |
dc.subject.keywordAuthor | Nussbaumer algorithm | - |
dc.subject.keywordAuthor | Post-quantum cryptography | - |
dc.subject.keywordPlus | Graphics processing unit | - |
dc.subject.keywordPlus | Indexing (of information) | - |
dc.subject.keywordPlus | Mathematical transformations | - |
dc.subject.keywordPlus | Quantum cryptography | - |
dc.subject.keywordPlus | Efficient implementation | - |
dc.subject.keywordPlus | GPU implementation | - |
dc.subject.keywordPlus | Implementation techniques | - |
dc.subject.keywordPlus | Number theoretic transform | - |
dc.subject.keywordPlus | Parallel implementations | - |
dc.subject.keywordPlus | Polynomial multiplication | - |
dc.subject.keywordPlus | Signature Scheme | - |
dc.subject.keywordPlus | State of the art | - |
dc.subject.keywordPlus | Polynomials | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
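
The abstract describes polynomial multiplication over a ring via the NTT, using a pairing of Gentleman–Sande (GS) and Cooley–Tukey (CT) butterflies. The following is a minimal serial sketch of that general idea, not the authors' GPU code. It assumes the ring Z_q[x]/(x^n + 1) with toy parameters n = 4, q = 17 and psi = 2 (a primitive 2n-th root of unity mod q); these values, the function names, and the choice of GS for the forward transform and CT for the inverse are all illustrative assumptions. Pairing a GS forward pass (natural-order input, bit-reversed output) with a CT inverse pass (bit-reversed input, natural-order output) is one common way to avoid an explicit bit-reversal permutation.

```python
# Toy parameters (assumptions for illustration, far smaller than any real scheme).
Q = 17     # prime modulus with Q = 1 (mod 2N)
N = 4      # ring degree, a power of two
PSI = 2    # primitive 2N-th root of unity mod Q (2^4 = -1 mod 17)


def ntt_gs(a, omega, q):
    """Forward NTT with Gentleman-Sande butterflies.
    Input in natural order, output in bit-reversed order."""
    a = list(a)
    n = len(a)
    length, w_len = n // 2, omega
    while length >= 1:
        for start in range(0, n, 2 * length):
            w = 1
            for j in range(start, start + length):
                u, v = a[j], a[j + length]
                a[j] = (u + v) % q
                a[j + length] = (u - v) * w % q
                w = w * w_len % q
        w_len = w_len * w_len % q   # next stage uses the squared root
        length //= 2
    return a


def intt_ct(a, omega_inv, q):
    """Unscaled inverse NTT with Cooley-Tukey butterflies.
    Input in bit-reversed order, output in natural order."""
    a = list(a)
    n = len(a)
    length = 1
    while length < n:
        w_len = pow(omega_inv, n // (2 * length), q)
        for start in range(0, n, 2 * length):
            w = 1
            for j in range(start, start + length):
                u, v = a[j], a[j + length] * w % q
                a[j] = (u + v) % q
                a[j + length] = (u - v) % q
                w = w * w_len % q
        length *= 2
    return a


def negacyclic_mul_ntt(a, b, q=Q, psi=PSI):
    """Multiply a and b in Z_q[x]/(x^n + 1) using the NTT."""
    n = len(a)
    omega = psi * psi % q
    psi_inv = pow(psi, q - 2, q)        # Fermat inverse, q is prime
    n_inv = pow(n, q - 2, q)
    # Pre-scale by powers of psi to turn negacyclic into cyclic convolution.
    a_s = [x * pow(psi, i, q) % q for i, x in enumerate(a)]
    b_s = [x * pow(psi, i, q) % q for i, x in enumerate(b)]
    fa, fb = ntt_gs(a_s, omega, q), ntt_gs(b_s, omega, q)
    fc = [x * y % q for x, y in zip(fa, fb)]   # same (bit-reversed) order
    c = intt_ct(fc, pow(omega, q - 2, q), q)
    # Undo the transform scaling (1/n) and the psi pre-scaling.
    return [x * n_inv % q * pow(psi_inv, i, q) % q for i, x in enumerate(c)]


def negacyclic_mul_schoolbook(a, b, q=Q):
    """Reference O(n^2) multiplication using x^n = -1."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            sign = 1 if i + j < n else -1
            c[(i + j) % n] = (c[(i + j) % n] + sign * a[i] * b[j]) % q
    return c


print(negacyclic_mul_ntt([1, 2, 3, 4], [5, 6, 7, 8]))  # prints [12, 15, 2, 9]
```

Each butterfly in the inner loops touches a distinct pair of elements within a stage, which is the source of the data parallelism that a GPU implementation would exploit; how the threads and memory accesses are actually organized in the paper is not stated in this record.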