Ensemble Prediction of Business Process Remaining Time Based on Random Forest and XGBoost

Authors

  • Yinhua Tian College of Computer, Shandong Xiehe University, Jinan 250109, China & College of Intelligent Equipment, Shandong University of Science and Technology, Taian 271000, China
  • Yan Su School of Artificial Intelligence, Shandong Vocational University of Foreign Affairs, Weihai 264504, China
  • Ruizhe Zhang College of Intelligent Equipment, Shandong University of Science and Technology, Taian 271000, China
  • Yuyue Du College of Computer, Shandong Xiehe University, Jinan 250109, China
  • Nana Zhou College of Computer, Shandong Xiehe University, Jinan 250109, China
  • Xueqiang Gao College of Computer, Shandong Xiehe University, Jinan 250109, China

Keywords

Business processes, remaining time prediction, stacking ensemble, random forest, XGBoost

Abstract

Business processes in information systems are complex and diverse, and a single machine learning model often overfits to noise or idiosyncratic patterns in the training data. On large data sets such a model is also computationally expensive, performs poorly on new data, and therefore struggles to monitor and predict business processes accurately. To address this, a two-layer machine learning framework based on stacking, the Serial Stacking Framework, is presented. Starting from the event log, the method performs random group sampling with replacement, trains multi-objective regression models, and applies multiple machine learning models in series: the predictions of each model are used to generate training data for the next model, so that the predictive performance of the models accumulates sequentially. Random Forest and XGBoost serve as the concrete models in the stacking ensemble, and the proposed method is compared experimentally with existing state-of-the-art methods. The results show that the mean absolute error (MAE) of the model built by the serial stacking framework with random group sampling and multi-objective regression is at least 2.14% lower than that of single machine learning models, conventional stacking frameworks, and the latest methods.
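
The serial idea in the abstract, where the predictions of one model become training input for the next, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the function names are hypothetical, scikit-learn's `GradientBoostingRegressor` stands in for XGBoost, a single target is predicted rather than multiple objectives, and the use of grouped out-of-fold predictions for the second layer is a common stacking practice assumed here rather than taken from the paper.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import GroupKFold, cross_val_predict


def sample_groups_with_replacement(groups, rng):
    """Draw whole groups (e.g. process cases) with replacement; return row indices."""
    unique = np.unique(groups)
    drawn = rng.choice(unique, size=unique.size, replace=True)
    return np.concatenate([np.flatnonzero(groups == g) for g in drawn])


def fit_serial_stack(X, y, groups, seed=0):
    """Fit layer 1, then fit layer 2 on the features plus layer-1 predictions."""
    rng = np.random.default_rng(seed)
    idx = sample_groups_with_replacement(groups, rng)
    X_s, y_s, g_s = X[idx], y[idx], groups[idx]

    layer1 = RandomForestRegressor(n_estimators=50, random_state=seed)
    # Grouped out-of-fold predictions: rows of the same case (including
    # duplicates from resampling) never sit in both the fit and predict folds.
    oof = cross_val_predict(layer1, X_s, y_s, groups=g_s, cv=GroupKFold(n_splits=3))
    layer1.fit(X_s, y_s)

    # Stand-in for XGBoost (assumption): any gradient-boosted regressor fits here.
    layer2 = GradientBoostingRegressor(random_state=seed)
    layer2.fit(np.column_stack([X_s, oof]), y_s)
    return layer1, layer2


def predict_serial_stack(models, X):
    """Chain the layers: layer-1 output is appended as an extra feature."""
    layer1, layer2 = models
    return layer2.predict(np.column_stack([X, layer1.predict(X)]))


# Synthetic stand-in for encoded event-log prefixes (assumption, not real data).
rng = np.random.default_rng(42)
n_cases, events_per_case = 40, 5
groups = np.repeat(np.arange(n_cases), events_per_case)
X = rng.normal(size=(groups.size, 4))
y = 3.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=groups.size)

models = fit_serial_stack(X, y, groups)
pred = predict_serial_stack(models, X)
mae = float(np.mean(np.abs(pred - y)))
print(f"MAE of the serial stack on the synthetic data: {mae:.3f}")
```

Sampling whole groups rather than individual rows keeps all events of a case together, which is one plausible reading of the random group sampling mentioned in the abstract.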

Published

2025-10-27

How to Cite

Tian, Y., Su, Y., Zhang, R., Du, Y., Zhou, N., & Gao, X. (2025). Ensemble Prediction of Business Process Remaining Time Based on Random Forest and XGBoost. Computing and Informatics, 44(4). Retrieved from https://www.cai.sk/ojs/index.php/cai/article/view/7426

Section

Special Section Articles