Jargon of Hadoop MapReduce scheduling techniques: a scientific categorization

Hanif, Muhammad; Lee, Choonhwa

doi:10.1017/S0269888918000371

Authors: Hanif, Muhammad; Lee, Choonhwa

Issue Date: Mar-2019

Publisher: CAMBRIDGE UNIV PRESS

Citation: KNOWLEDGE ENGINEERING REVIEW, v.34, pp.1 - 33

Indexed: SCIE, SCOPUS

Journal Title: KNOWLEDGE ENGINEERING REVIEW

Volume: 34

Start Page: 1

End Page: 33

URI: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/148251

DOI: 10.1017/S0269888918000371

ISSN: 0269-8889

Abstract: The valuable knowledge that can be extracted from very large datasets (Big Data) has driven the development of parallel and distributed data-processing frameworks, including Apache Hadoop, Facebook Corona, and Microsoft Dryad. Apache Hadoop is an open-source implementation of Google MapReduce that has attracted strong attention from both academia and industry. Hadoop MapReduce scheduling algorithms play a critical role in managing large commodity clusters and in meeting QoS requirements by supervising user, job, and task execution. Hadoop MapReduce comprises three schedulers: FIFO, Fair, and Capacity. However, the research community has developed new optimizations to account for advances and dynamic changes in hardware and operating environments. Numerous efforts in the literature address network congestion, straggling, data locality, heterogeneity, resource under-utilization, and skew mitigation in Hadoop scheduling. The volume of research on Hadoop scheduling published in journals and conferences has increased steadily, which makes it difficult for researchers to grasp the overall state of the field and the areas that require further investigation. This study conducts a scientific literature review to assess prior research contributions to the Apache Hadoop scheduling mechanism. We classify and quantify the main issues addressed in the literature based on their jargon and the areas they address. We also explain and discuss the challenges and open issues in Hadoop scheduling optimization.

Appears in Collections: Seoul College of Engineering > Seoul School of Computer Software > 1. Journal Articles

Related Researcher

Lee, Choon hwa: COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)
