SLA-Based Adaptation Schemes in Distributed Stream Processing Engines

Hanif, Muhammad; Kim, Eunsam; Helal, Sumi; Lee, Choonhwa

doi:10.3390/app9061045

Detailed Information

SLA-Based Adaptation Schemes in Distributed Stream Processing Engines (open access)

Authors: Hanif, Muhammad; Kim, Eunsam; Helal, Sumi; Lee, Choonhwa

Issue Date: Mar-2019

Publisher: MDPI

Keywords: big data; distributed computing; modern stream processing engine; SLA; watermarking; cloud computing

Citation: APPLIED SCIENCES-BASEL, v.9, no.6, pp.1 - 21

Indexed: SCIE; SCOPUS

Journal Title: APPLIED SCIENCES-BASEL

Volume: 9

Number: 6

Start Page: 1

End Page: 21

URI: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/148227

DOI: 10.3390/app9061045

Abstract: With the upswing in the volume of data, online information, and large-scale cloud applications, big data analytics has become mainstream in both industrial and academic research communities. This has prompted the emergence and development of real-time distributed stream processing frameworks, such as Flink, Storm, Spark, and Samza. These frameworks allow complex queries on streaming data to be distributed across multiple worker nodes in a cluster. A few of these stream processing frameworks provide fundamental support for controlling the latency and throughput of the system as well as the correctness of the results. However, none has the ability to handle them on the fly at runtime. We present a well-informed and efficient adaptive watermarking and dynamic buffering timeout mechanism for distributed streaming frameworks. It is designed to increase the overall throughput of the system by making the watermarks adaptive to the stream of incoming workload, and to scale the buffering timeout dynamically for each task tracker on the fly while maintaining the Service Level Agreement (SLA)-based end-to-end latency of the system. This work focuses on tuning the parameters of the system (such as window correctness, buffering timeout, and so on) based on the prediction of incoming workloads, and assesses whether a given workload will breach an SLA using output metrics including latency, throughput, and correctness of both intermediate and final results. We used Apache Flink as our testbed distributed processing engine for this work. However, the proposed mechanism can be applied to other streaming frameworks as well. Our results on the testbed model indicate that the proposed system outperforms the status quo of stream processing. With the inclusion of learning models such as naive Bayes, multilayer perceptron (MLP), and sequential minimal optimization (SMO), the system shows further improvement in keeping the SLA intact as well as in quality of service (QoS).
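Illustration (not from the article): the abstract describes scaling the buffering timeout at runtime so that throughput improves without breaching the latency SLA. The following is a minimal sketch of that general idea only, assuming a simple halve-on-breach, grow-on-headroom rule. The class name `TimeoutController`, its fields, and all numeric defaults are hypothetical and are not taken from the paper or from Apache Flink.

```python
from dataclasses import dataclass


@dataclass
class TimeoutController:
    """Hypothetical controller that scales a buffering timeout against a latency SLA."""
    sla_latency_ms: float           # assumed SLA bound on end-to-end latency
    timeout_ms: float = 100.0       # current buffering timeout (assumed default)
    min_timeout_ms: float = 1.0     # assumed lower bound
    max_timeout_ms: float = 1000.0  # assumed upper bound
    headroom: float = 0.8           # grow the timeout only below 80% of the SLA

    def update(self, predicted_latency_ms: float) -> float:
        """Return the new timeout given a predicted end-to-end latency."""
        if predicted_latency_ms > self.sla_latency_ms:
            # Predicted SLA breach: flush buffers sooner to cut latency.
            self.timeout_ms /= 2
        elif predicted_latency_ms < self.headroom * self.sla_latency_ms:
            # Comfortable headroom: buffer longer to favour throughput.
            self.timeout_ms *= 1.25
        # Clamp to the configured bounds.
        self.timeout_ms = max(self.min_timeout_ms,
                              min(self.max_timeout_ms, self.timeout_ms))
        return self.timeout_ms
```

In this sketch the predicted latency would come from whatever workload predictor is in use (the abstract mentions naive Bayes, MLP, and SMO models); how the article combines prediction with the timeout and watermark adjustments is described in the full text, not here.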

Files in This Item

applsci-09-01045.pdf 6.38 MB

Appears in Collections: College of Engineering (Seoul) > Division of Computer Science (Seoul) > 1. Journal Articles

Related Researcher

Lee, Choon hwa: COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)

Certain data included herein are derived from the © Web of Science of Clarivate Analytics. All rights reserved.
You may not copy or re-distribute this material in whole or in part without the prior written consent of Clarivate Analytics.
