Detailed Information

Cited 10 times in Web of Science; cited 12 times in Scopus

A Case for Hardware-Based Demand Paging

Authors
Lee, G.; Jin, W.; Song, W.; Gong, J.; Bae, J.; Ham, T.J.; Lee, J.W.; Jeong, J.
Issue Date
2020
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
CPU architecture; demand paging; hardware extension; operating systems; page fault; virtual memory
Citation
Proceedings - International Symposium on Computer Architecture, v.2020-May, pp.1103 - 1116
Indexed
SCOPUS
Journal Title
Proceedings - International Symposium on Computer Architecture
Volume
2020-May
Start Page
1103
End Page
1116
URI
https://scholarworks.bwise.kr/skku/handle/2021.sw.skku/6758
DOI
10.1109/ISCA45697.2020.00093
ISSN
1063-6897
Abstract
The virtual memory system is pervasive in today's computer systems, and demand paging is the key enabling mechanism for it. At a page miss, the CPU raises an exception, and the page fault handler is responsible for fetching the requested page from the disk. The OS typically performs a context switch to run other threads because traditional disk access is slow. However, with the widespread adoption of high-performance storage devices, such as low-latency solid-state drives (SSDs), traditional OS-based demand paging is no longer effective because a considerable portion of the demand paging latency is now spent inside the OS kernel. Thus, this paper makes a case for hardware-based demand paging that mostly eliminates OS involvement in page miss handling to provide near-disk-access-time latency for demand paging. To this end, two architectural extensions are proposed: an LBA-augmented page table that moves I/O stack operations to the control plane, and a Storage Management Unit that enables the CPU to issue I/O commands directly, without OS intervention in most cases. OS support is also proposed to detach memory resource management tasks from the critical path. The evaluation results, using both a cycle-level simulator and a real x86 machine with an ultra-low-latency SSD, show that the proposed scheme reduces demand paging latency by 37.0%, and hence improves the performance of the FIO random read benchmark by up to 57.1% and a NoSQL server by up to 27.3% with real-world workloads. As a side effect of eliminating OS intervention, the IPC of the user-level code is also increased by up to 7.0%. © 2020 IEEE.
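To illustrate the idea the abstract describes, the following is a minimal C sketch (not the authors' implementation) of an LBA-augmented page table: each non-present entry caches the logical block address (LBA) of its page, so a miss can be served by issuing a block read directly rather than invoking the OS I/O stack. All names here (lba_pte, smu_issue_read, backing_store, etc.) are hypothetical stand-ins for the hardware structures described in the paper.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE      4096
#define NUM_PAGES      8
#define BACKING_BLOCKS NUM_PAGES

/* Page table entry augmented with the backing LBA (assumption: one LBA per page). */
typedef struct {
    uint8_t  present;              /* 1 if the page is resident in "memory"     */
    uint64_t lba;                  /* block address of the page on the device   */
    uint8_t  frame[PAGE_SIZE];     /* stand-in for the physical frame           */
} lba_pte;

static uint8_t backing_store[BACKING_BLOCKS][PAGE_SIZE]; /* stand-in for the SSD */

/* Stand-in for the Storage Management Unit: copy one block from the device. */
static void smu_issue_read(uint64_t lba, uint8_t *dst)
{
    memcpy(dst, backing_store[lba], PAGE_SIZE);
}

/* Resolve a page miss without an OS page-fault handler on the critical path. */
static uint8_t *access_page(lba_pte *pt, unsigned vpn)
{
    lba_pte *pte = &pt[vpn];
    if (!pte->present) {
        smu_issue_read(pte->lba, pte->frame);  /* hardware-issued I/O command */
        pte->present = 1;
    }
    return pte->frame;
}

int main(void)
{
    lba_pte page_table[NUM_PAGES] = {0};

    /* Control plane (OS) fills in LBAs ahead of time, off the critical path. */
    for (unsigned v = 0; v < NUM_PAGES; v++) {
        page_table[v].lba = v;                       /* identity mapping for the demo */
        memset(backing_store[v], (int)('A' + v), PAGE_SIZE);
    }

    /* Data plane: a miss is served by the SMU stand-in, no fault handler. */
    uint8_t *p = access_page(page_table, 3);
    printf("first byte of page 3: %c\n", p[0]);
    return 0;
}

In this simplified view, the OS only populates the LBA field ahead of time (the "control plane"), while the miss itself is handled entirely by the hardware path, which is the separation the paper argues for.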
Files in This Item
There are no files associated with this item.
Appears in
Collections
ETC > 1. Journal Articles


Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
