A Rich High-Order Mutation Testing Dataset for Software Fortification

Van-Nho Do; Giang T.C Tran; Duc-Thuan Nguyen; Ngoc-Anh Nguyen Thi; Quang-Vu Nguyen; Thanh-Binh Nguyen

doi:10.32913/mic-ict-research.v2025.n1.1277

Van-Nho Do, Le Quy Don High School for the Gifted, Danang, Vietnam
Giang T.C Tran, University of Quebec in Trois-Rivières, Canada; The University of Danang, University of Economics, Vietnam
Duc-Thuan Nguyen, University of Engineering and Technology, Vietnam National University of Hanoi, Vietnam
Ngoc-Anh Nguyen Thi, University of Engineering and Technology, Vietnam National University of Hanoi, Vietnam
Quang-Vu Nguyen, The University of Danang, Vietnam-Korea University of Information and Communication Technology, Vietnam
Thanh-Binh Nguyen, The University of Danang, Vietnam-Korea University of Information and Communication Technology, Vietnam

DOI: https://doi.org/10.32913/mic-ict-research.v2025.n1.1277

Keywords: High order mutation testing, Data, dataset, data generation, machine learning

Abstract

High-order mutation (HOM) testing is a rigorous technique for evaluating the effectiveness of test suites by injecting mutants that combine multiple simultaneous faults into the source code. In this study, we present the development and analysis of a comprehensive dataset tailored for HOM testing. The dataset comprises 2,839,792 instances, each labeled Survived or Killed according to whether the mutant survived or was killed during the mutation testing process. We apply four prominent machine learning algorithms (Logistic Regression, Random Forest Classifier, LightGBM, and XGBoost) to classify instances into these two classes. Experimental results show varying levels of accuracy, precision, recall, and F1-score across the algorithms, with LightGBM and XGBoost performing best. These findings underscore the importance of high-quality datasets for effective HOM testing and provide insight into the capabilities of machine learning algorithms in this context.
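To make the Survived and Killed labels concrete, the following is a minimal sketch of how a second-order mutant might be labeled against a small test suite. It is illustrative only and is not the dataset's generation pipeline: the program under test (`original`), the two hand-written mutants, the test cases, and the helper `label` are all hypothetical names introduced here.

```python
from typing import Callable, List, Tuple

# Hypothetical program under test: return the larger of two integers.
def original(a: int, b: int) -> int:
    return a if a > b else b

# Second-order mutant A (hypothetical): '>' replaced by '<' AND the two
# return branches swapped. The two faults mask each other, so the
# behaviour matches the original on every input.
def hom_masking(a: int, b: int) -> int:
    return b if a < b else a

# Second-order mutant B (hypothetical): '>' replaced by '>=' AND
# 'b' replaced by 'b + 1' in the else branch.
def hom_visible(a: int, b: int) -> int:
    return a if a >= b else b + 1

# Test suite as ((inputs), expected output) pairs.
TESTS: List[Tuple[Tuple[int, int], int]] = [
    ((3, 1), 3),
    ((1, 2), 2),
    ((2, 2), 2),
]

def label(mutant: Callable[[int, int], int],
          tests: List[Tuple[Tuple[int, int], int]]) -> str:
    """Return 'Killed' if any test detects the mutant, else 'Survived'."""
    for args, expected in tests:
        if mutant(*args) != expected:
            return "Killed"
    return "Survived"

if __name__ == "__main__":
    for name, mutant in [("hom_masking", hom_masking),
                         ("hom_visible", hom_visible)]:
        print(name, label(mutant, TESTS))
```

In this sketch `hom_masking` is labeled Survived because its two faults cancel out, while `hom_visible` is labeled Killed by the test case `(1, 2)`. A classifier of the kind described in the abstract would be trained to predict such labels from features of the mutant, which are not specified here.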

Author Biographies

Van-Nho Do, Le Quy Don High School for the Gifted, Danang, Vietnam

Van-Nho Do heads the Informatics group at Le Quy Don Gifted High School, Da Nang, Vietnam. He is a doctoral candidate in Computer Science at Da Nang University of Technology, with a research specialization in advanced mutation testing. His research interests include software engineering, which he applies to his leadership role in education.

Giang T.C Tran, University of Quebec in Trois-Rivières, Canada. The University of Danang, University of Economics, Vietnam

Giang T.C. Tran is a postdoctoral researcher at the University of Quebec in Trois-Rivières, specializing in the application of artificial intelligence to improve and enhance information systems. She completed her Ph.D. in Computer Science and Engineering at Chung-Ang University (2023) and holds a B.S. (with honors) in Management Information Systems from Da Nang University of Economics (2019). Her research utilizes data mining, machine learning, and logical reasoning techniques.

Duc-Thuan Nguyen, University of Engineering and Technology, Vietnam National University of Hanoi, Vietnam

Duc-Thuan Nguyen is a fourth-year student majoring in Information Technology at the University of Engineering and Technology, Vietnam National University, Hanoi.

Ngoc-Anh Nguyen Thi, University of Engineering and Technology, Vietnam National University of Hanoi, Vietnam

Ngoc-Anh Nguyen Thi is a fourth-year student majoring in Information Technology at the University of Engineering and Technology, Vietnam National University, Hanoi.

Quang-Vu Nguyen, The University of Danang, Vietnam-Korea University of Information and Communication Technology, Vietnam

Quang-Vu Nguyen holds a PhD in Computer Science from Wroclaw University of Science and Technology, Poland, and is currently Head of the Department of Science-Technology and International Cooperation at Vietnam-Korea University of Information and Communication Technology, the University of Danang. His research focuses on artificial intelligence, software engineering, data science, software quality assurance, and testing.

Thanh-Binh Nguyen, The University of Danang, Vietnam-Korea University of Information and Communication Technology, Vietnam

Thanh-Binh Nguyen graduated in Information Technology from the University of Danang - University of Science and Technology in 1997. He received his PhD in Information Technology from Grenoble Institute of Technology, France, in 2004. He has held the title of Associate Professor since 2013. He currently works at the University of Danang - Vietnam-Korea University of Information and Communication Technology. His research interests include software engineering and software quality.
