A Case for Hardware-Based Demand Paging
- Authors
- Lee, G.; Jin, W.; Song, W.; Gong, J.; Bae, J.; Ham, T.J.; Lee, J.W.; Jeong, J.
- Issue Date
- 2020
- Publisher
- Institute of Electrical and Electronics Engineers Inc.
- Keywords
- CPU architecture; demand paging; hardware extension; operating systems; page fault; virtual memory
- Citation
- Proceedings - International Symposium on Computer Architecture, v.2020-May, pp.1103 - 1116
- Indexed
- SCOPUS
- Journal Title
- Proceedings - International Symposium on Computer Architecture
- Volume
- 2020-May
- Start Page
- 1103
- End Page
- 1116
- URI
- https://scholarworks.bwise.kr/skku/handle/2021.sw.skku/6758
- DOI
- 10.1109/ISCA45697.2020.00093
- ISSN
- 1063-6897
- Abstract
- The virtual memory system is pervasive in today's computer systems, and demand paging is the key mechanism enabling it. On a page miss, the CPU raises an exception, and the page fault handler is responsible for fetching the requested page from the disk. Because traditional disk access is slow, the OS typically performs a context switch to run other threads in the meantime. However, with the widespread adoption of high-performance storage devices, such as low-latency solid-state drives (SSDs), traditional OS-based demand paging is no longer effective because a considerable portion of the demand paging latency is now spent inside the OS kernel. Thus, this paper makes a case for hardware-based demand paging, which mostly eliminates OS involvement in page miss handling to provide near-disk-access-time latency for demand paging. To this end, two architectural extensions are proposed: an LBA-augmented page table, which moves I/O stack operations to the control plane, and a Storage Management Unit, which enables the CPU to directly issue I/O commands without OS intervention in most cases. OS support is also proposed to detach memory resource management tasks from the critical path. Evaluation results using both a cycle-level simulator and a real x86 machine with an ultra-low-latency SSD show that the proposed scheme reduces demand paging latency by 37.0%, and hence improves the performance of the FIO random read benchmark by up to 57.1% and of a NoSQL server by up to 27.3% on real-world workloads. As a side effect of eliminating OS intervention, the IPC of user-level code also increases by up to 7.0%. © 2020 IEEE.
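The OS fault path the abstract describes can be observed from user space: the first access to a page of a file-backed mapping traps into the kernel, which resolves the fault before the load completes. Below is a minimal, hypothetical microbenchmark (not from the paper) that times that first touch with Python's `mmap` module; since the test file is freshly written, the faults here are minor faults served from the page cache, whereas the paper targets major faults that additionally pay SSD device latency.

```python
import mmap
import os
import tempfile
import time

PAGE = mmap.PAGESIZE

def first_touch_latencies(path, npages):
    """Time the first access to each page of a read-only file mapping.

    Each first touch of a page triggers a page fault; the kernel's
    fault handler (the OS path this paper seeks to shorten) must map
    the page in before the load instruction can complete.
    """
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), npages * PAGE, access=mmap.ACCESS_READ)
        latencies = []
        for i in range(npages):
            t0 = time.perf_counter_ns()
            _ = mm[i * PAGE]  # first touch of this page -> page fault
            latencies.append(time.perf_counter_ns() - t0)
        mm.close()
        return latencies

if __name__ == "__main__":
    # Create a throwaway 16-page file to map and fault in.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(os.urandom(16 * PAGE))
        path = f.name
    try:
        lats = first_touch_latencies(path, 16)
        print("median first-touch latency: %d ns" % sorted(lats)[len(lats) // 2])
    finally:
        os.unlink(path)
```

On a cold file (e.g. after dropping the page cache) the same loop would measure the full OS-plus-device demand paging latency that the proposed hardware extensions aim to cut.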
- Files in This Item
- There are no files associated with this item.
- Appears in Collections
- ETC > 1. Journal Articles