Detailed Information


Accelerating ML/DL Applications with Hierarchical Caching on Deduplication Storage Clusters (open access)

Authors
Hamandawana, Prince; Khan, Awais; Kim, Jongik; Chung, Tae-Sun
Issue Date
Dec-2022
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
Big Data; deduplication; Deep Learning; Machine Learning; Storage management
Citation
IEEE Transactions on Big Data, v.8, no.6, pp.1622 - 1636
Journal Title
IEEE Transactions on Big Data
Volume
8
Number
6
Start Page
1622
End Page
1636
URI
http://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/43186
DOI
10.1109/TBDATA.2021.3106345
ISSN
2332-7790
Abstract
Large-scale machine learning (ML) and deep learning (DL) platforms face challenges when integrated with deduplication-enabled storage clusters. In the quest for smart and efficient storage utilization, removing duplicate data introduces bottlenecks because deduplication alters the I/O transaction layout of the storage system. It is therefore critical to address this deduplication overhead in order to accelerate ML/DL computation on deduplication storage. Existing state-of-the-art ML/DL storage solutions such as Alluxio and AutoCache adopt caching mechanisms that are not deduplication-aware, and so lack the much-needed performance boost when adopted in deduplication-enabled ML/DL clusters. In this paper, we introduce Redup, which eliminates the performance drop caused by enabling deduplication in ML/DL storage clusters. At its core is the Redup Caching Manager (RDCM), composed of a 2-tier deduplication layout-aware caching mechanism. The RDCM provides an abstraction of the underlying deduplication storage layout to ML/DL applications and provisions a decoupled acceleration of object reconstruction during ML/DL read operations. Our evaluation of Redup shows a negligible drop in ML/DL training performance compared to a cluster without deduplication, while significantly outperforming Alluxio and AutoCache across various performance metrics.
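The abstract mentions a 2-tier, layout-aware cache that speeds up object reconstruction on reads but gives no implementation detail. The following is only a toy sketch of that general idea, not Redup's actual design: it assumes one tier holds whole reconstructed objects and the other holds individual deduplicated chunks. All names here (`LRU`, `DedupStore`, `TwoTierCache`), the fixed-size chunking, the SHA-256 fingerprints, and the LRU eviction policy are assumptions made for illustration.

```python
import hashlib
from collections import OrderedDict


class LRU:
    """Tiny least-recently-used cache built on OrderedDict (assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry


class DedupStore:
    """Toy deduplicated store: unique chunks keyed by fingerprint, plus per-object recipes."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}      # fingerprint -> chunk bytes
        self.recipes = {}     # object name -> ordered list of fingerprints
        self.chunk_reads = 0  # counts simulated backend chunk fetches

    def write(self, name, data):
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)  # a duplicate chunk is stored only once
            recipe.append(fp)
        self.recipes[name] = recipe

    def read_chunk(self, fp):
        self.chunk_reads += 1
        return self.chunks[fp]


class TwoTierCache:
    """Hypothetical two-tier cache: tier 1 holds whole objects, tier 2 holds chunks."""

    def __init__(self, store, object_capacity=2, chunk_capacity=8):
        self.store = store
        self.objects = LRU(object_capacity)
        self.chunks = LRU(chunk_capacity)

    def read(self, name):
        cached = self.objects.get(name)
        if cached is not None:
            return cached  # tier-1 hit: no reconstruction needed
        parts = []
        for fp in self.store.recipes[name]:
            chunk = self.chunks.get(fp)
            if chunk is None:  # tier-2 miss: fetch from the deduplicated store
                chunk = self.store.read_chunk(fp)
                self.chunks.put(fp, chunk)
            parts.append(chunk)
        data = b"".join(parts)  # reconstruct the object from its chunk recipe
        self.objects.put(name, data)
        return data
```

In this sketch, a chunk shared by several objects is fetched from the backing store once and then served from the chunk tier, while repeated reads of the same object skip reconstruction entirely. Whether Redup partitions its tiers this way is not stated in the abstract.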
Appears in Collections
College of Information Technology > School of Computer Science and Engineering > 1. Journal Articles


Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
