Detailed Information


Concurrent service auto-scaling for Knative resource quota-based serverless system

Authors
Tran, Minh-Ngoc; Kim, YoungHan
Issue Date
Nov-2024
Publisher
ELSEVIER
Keywords
Serverless computing; Auto-scaling; Resource quota; Resource management; Quality of service
Citation
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, v.160, pp 326 - 339
Pages
14
Journal Title
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE
Volume
160
Start Page
326
End Page
339
URI
https://scholarworks.bwise.kr/ssu/handle/2018.sw.ssu/49899
DOI
10.1016/j.future.2024.06.019
ISSN
0167-739X
1872-7115
Abstract
Serverless computing platforms currently provide application developers with two ways to control their services' resource usage cost: resource usage-based and resource quota-based. In usage-based systems, resources are assigned to services according to the amount they consume, whereas in quota-based systems a fixed amount of resources is reserved and shared among services. Several studies have proposed enhancements to the default serverless auto-scaling algorithms to optimize resource usage and service quality. However, almost all previous works targeted usage-based systems and optimized auto-scaling performance for each service separately. In contrast, this work targets the auto-scaling of concurrent services in serverless quota-based systems. In such systems, over-quota resource usage and latency Service Level Objective (SLO) violations may occur when new instances are auto-scaled during bursts of traffic. Hence, we aim to find an optimal hybrid auto-scaling decision that minimizes over-quota resource usage and latency SLO violations. To solve this problem, we applied deep reinforcement learning and traffic prediction techniques. Our solution is designed and implemented around the characteristics of the most popular open-source serverless platform, Knative. For evaluation, we compared our solution with the default Knative auto-scaler and our previously proposed usage-based hybrid Knative auto-scaler. We configured the default Knative auto-scaler with two settings: one that optimizes the total resource usage of all services and one that minimizes service latency. Compared to the resource-usage-optimized version, our solution traded slightly higher resource usage for an 11.3% reduction in latency SLO violation duration. Compared with the latency-optimized version, our solution reduced the over-quota resource usage rate by 18.25% while achieving similar latency SLO preservation. Compared to the usage-based hybrid auto-scaler, the latency SLO violation rate and over-quota resource usage were both reduced by approximately half.
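
The joint objective stated in the abstract, minimizing over-quota resource usage and latency SLO violations, can be illustrated with a small reward sketch that a reinforcement-learning auto-scaling agent could optimize. This is a hypothetical illustration only, not the paper's implementation; the state fields, weights, and example values below are assumptions.

    # Minimal sketch (not the authors' implementation): a reward signal for a
    # DRL auto-scaling agent that penalizes over-quota resource usage and
    # latency SLO violations. All names, weights, and thresholds are
    # illustrative assumptions.
    from dataclasses import dataclass

    @dataclass
    class ClusterState:
        cpu_used_m: int        # total CPU used by all services (millicores)
        cpu_quota_m: int       # fixed CPU quota shared by all services (millicores)
        p95_latency_ms: float  # observed 95th-percentile latency
        latency_slo_ms: float  # latency SLO for the service

    def reward(state: ClusterState, w_quota: float = 1.0, w_slo: float = 1.0) -> float:
        """Higher is better: stay under the shared quota and under the SLO."""
        # Fraction of usage above the shared quota (0 when under quota).
        over_quota = max(0.0, (state.cpu_used_m - state.cpu_quota_m) / state.cpu_quota_m)
        # Fraction of latency above the SLO (0 when the SLO holds).
        slo_violation = max(0.0, (state.p95_latency_ms - state.latency_slo_ms) / state.latency_slo_ms)
        return -(w_quota * over_quota + w_slo * slo_violation)

    # Example: a traffic burst pushes usage 10% over quota while latency stays within the SLO.
    print(reward(ClusterState(cpu_used_m=2200, cpu_quota_m=2000,
                              p95_latency_ms=180.0, latency_slo_ms=200.0)))

The two weights encode the trade-off the abstract describes: favoring w_slo reproduces a latency-oriented setting, while favoring w_quota reproduces a resource-usage-oriented one.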
Appears in
Collections
ETC > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Kim, Young Han
College of Information Technology (Department of IT Convergence)