File(s) under permanent embargo
Context-aware scheduling in MapReduce: a compact review
journal contribution
posted on 2023-05-18, 17:07 authored by Idris, M, Hussain, S, Ali, M, Abdulali, A, Siddiqi, MH, Byeong Kang
Big data has drawn considerable attention from the computer science research community, business executives, and decision makers. As data volumes grow, they call for performance-oriented, data-intensive processing frameworks such as MapReduce, which can scale computation on large commodity clusters. Hadoop MapReduce processes data stored in the Hadoop Distributed File System as jobs scheduled by the YARN fair scheduler and capacity scheduler. However, advances and dynamic changes in hardware and operating environments greatly affect cluster performance. Various efforts in the literature address the issues of heterogeneity (i.e., clusters consisting of virtual machines and machines with different hardware), network communication, data locality, resource utilization, and run-time scheduling. In this paper, we present a survey of the research efforts made so far to improve Hadoop MapReduce scheduling. We classify the scheduling algorithms and techniques proposed in the literature by the problem areas they address and present a taxonomy. We also discuss open issues and challenges in MapReduce scheduling and its performance.
Publication title
Concurrency and Computation: Practice and Experience
Volume
27
Issue
17
Pagination
5332-5349
ISSN
1532-0626
Department/School
School of Information and Communication Technology
Publisher
John Wiley & Sons Ltd
Place of publication
The Atrium, Southern Gate, Chichester, England, W Sussex, PO19 8SQ
Rights statement
Copyright 2015 John Wiley & Sons, Ltd.
Repository Status
- Restricted