Detailed Information


Concurrent service auto-scaling for Knative resource quota-based serverless system

Authors
Tran, Minh-Ngoc; Kim, YoungHan
Issue Date
Nov-2024
Publisher
ELSEVIER
Keywords
Serverless computing; Auto-scaling; Resource quota; Resource management; Quality of service
Citation
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, v.160, pp 326 - 339
Pages
14
Journal Title
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE
Volume
160
Start Page
326
End Page
339
URI
https://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/49899
DOI
10.1016/j.future.2024.06.019
ISSN
0167-739X
1872-7115
Abstract
Serverless computing platforms currently provide application developers with two ways to control their services' resource usage cost: resource usage-based and resource quota-based. In usage-based systems, resources are assigned to services according to the amount they consume, whereas in quota-based systems a fixed amount of resources is reserved and shared among services. Several studies have proposed enhancements to the default serverless auto-scaling algorithms to optimize resource usage and service quality. However, almost all previous works targeted usage-based systems and optimized auto-scaling performance for each service separately. In contrast, this work targets the auto-scaling of concurrent services in serverless quota-based systems. In such systems, over-quota resource usage and latency Service Level Objective (SLO) violations may occur when new instances are auto-scaled during bursts of traffic. Hence, we aim to find an optimal hybrid auto-scaling decision that minimizes over-quota resource usage and latency SLO violations. To solve this problem, we applied deep reinforcement learning and traffic prediction techniques. Our solution is designed and implemented around the characteristics of the most popular open-source serverless platform, Knative. For evaluation, we compared our solution with the default Knative auto-scaler and our previously proposed usage-based hybrid Knative auto-scaler. We configured the default Knative auto-scaler with two settings: one that optimizes the total resource usage of all services and one that minimizes service latency. Compared to the resource-usage-optimized version, our solution traded slightly higher resource usage for an 11.3% reduction in latency SLO violation duration. Compared with the latency-optimized version, our solution reduced the over-quota resource usage rate by 18.25% while achieving similar latency SLO preservation. Compared to the usage-based hybrid auto-scaler, the latency SLO violation rate and over-quota resource usage were both reduced by approximately half.
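
The joint objective stated in the abstract, minimizing over-quota resource usage and latency SLO violations, can be illustrated with a small reward sketch that a reinforcement-learning auto-scaling agent could optimize. This is a hypothetical illustration only, not the paper's implementation; the state fields, weights, and example values below are assumptions.

    # Minimal sketch (not the authors' implementation): a reward signal for a
    # DRL auto-scaling agent that penalizes over-quota resource usage and
    # latency SLO violations. All names, weights, and thresholds are
    # illustrative assumptions.
    from dataclasses import dataclass

    @dataclass
    class ClusterState:
        cpu_used_m: int        # total CPU used by all services (millicores)
        cpu_quota_m: int       # fixed CPU quota shared by all services (millicores)
        p95_latency_ms: float  # observed 95th-percentile latency
        latency_slo_ms: float  # latency SLO for the service

    def reward(state: ClusterState, w_quota: float = 1.0, w_slo: float = 1.0) -> float:
        """Higher is better: stay under the shared quota and under the SLO."""
        # Fraction of usage above the shared quota (0 when under quota).
        over_quota = max(0.0, (state.cpu_used_m - state.cpu_quota_m) / state.cpu_quota_m)
        # Fraction of latency above the SLO (0 when the SLO holds).
        slo_violation = max(0.0, (state.p95_latency_ms - state.latency_slo_ms) / state.latency_slo_ms)
        return -(w_quota * over_quota + w_slo * slo_violation)

    # Example: a traffic burst pushes usage 10% over quota while latency stays within the SLO.
    print(reward(ClusterState(cpu_used_m=2200, cpu_quota_m=2000,
                              p95_latency_ms=180.0, latency_slo_ms=200.0)))

The two weights encode the trade-off the abstract describes: favoring w_slo reproduces a latency-oriented setting, while favoring w_quota reproduces a resource-usage-oriented one.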
Appears in
Collections
ETC > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Kim, Young Han
College of Information Technology (Department of IT Convergence)