Latency-aware Straggler Mitigation Strategy in Hadoop MapReduce Framework: A Review

Ajibade Lukuman Saheed
Abu Bakar Kamalrulnizam
Ahmed Aliyu
Tasneem Darwish

Abstract

Processing huge and complex data to obtain useful information is challenging, even though several big data processing frameworks have been proposed and further enhanced. One of the prominent big data processing frameworks is MapReduce, whose main concept relies on distributed and parallel processing. However, the MapReduce framework suffers serious performance degradation because of slow-running tasks called stragglers. Failing to handle stragglers causes delay and lengthens the overall job execution time. Several straggler mitigation techniques have therefore been proposed to improve MapReduce performance. This study provides a comprehensive, qualitative review of the existing straggler mitigation solutions. In addition, a taxonomy of the available straggler mitigation solutions is presented. Critical research issues and future research directions are identified and discussed to guide researchers and scholars.

Article Details

How to Cite
Ajibade Lukuman Saheed, Abu Bakar Kamalrulnizam, Ahmed Aliyu, & Tasneem Darwish. (2021). Latency-aware Straggler Mitigation Strategy in Hadoop MapReduce Framework: A Review. Systematic Literature Review and Meta-Analysis Journal, 2(2), 53–60. https://doi.org/10.54480/slrm.v2i2.19
