Context-aware scheduling in MapReduce: a compact review

Idris, M; Hussain, S; Ali, M; Abdulali, A; Siddiqi, MH; Kang, Byeong

File(s) under permanent embargo

journal contribution

posted on 2023-05-18, 17:07 authored by Idris, M; Hussain, S; Ali, M; Abdulali, A; Siddiqi, MH; Kang, Byeong

Big data has drawn considerable attention from computer science researchers, business executives, and decision makers. As data volumes grow, processing them requires performance-oriented, data-intensive frameworks such as MapReduce, which can scale computation across large commodity clusters. Hadoop MapReduce processes data stored in the Hadoop Distributed File System as jobs scheduled by the YARN fair scheduler and capacity scheduler. However, advances and dynamic changes in hardware and operating environments greatly affect cluster performance. Various efforts in the literature address heterogeneity (i.e., clusters consisting of virtual machines and machines with different hardware), network communication, data locality, resource utilization, and run-time scheduling. In this paper, we survey the research efforts made so far to improve Hadoop MapReduce scheduling. We classify the scheduling algorithms and techniques proposed in the literature by the problem areas they address and present a taxonomy. We also discuss open issues and challenges in improving the performance of MapReduce scheduling.

History

Publication title

Concurrency and Computation: Practice and Experience

Volume

27

Issue

17

Pagination

5332-5349

ISSN

1532-0626

Department/School

School of Information and Communication Technology

Publisher

John Wiley & Sons Ltd

Place of publication

The Atrium, Southern Gate, Chichester, England, W Sussex, PO19 8SQ

Repository Status

Restricted

Socio-economic Objectives

Information systems, technologies and services not elsewhere classified

Keywords

scheduling, task scheduling, job scheduling, data-intensive computing, big data, cloud

Licence

In Copyright

