An explainable hybrid machine learning model for data-driven assessment and enhancement of program learning outcomes in higher education
Abstract
The purpose of this study is to develop and validate an explainable hybrid machine learning framework for accurately assessing and enhancing Program Learning Outcomes (PLOs) in higher education. The study aims to overcome the limitations of conventional manual or heuristic evaluation methods by leveraging data-driven predictive analytics to identify key factors influencing student achievement. A stacked ensemble learning architecture is proposed, combining multiple gradient boosting and tree-based algorithms (LightGBM, XGBoost, CatBoost, Gradient Boosting, and Decision Tree) under a multinomial Logistic Regression meta-learner. The model was trained and tested on real academic data collected from the University of Tabuk, Saudi Arabia, incorporating academic, behavioral, and demographic variables. Comprehensive preprocessing, stratified k-fold cross-validation, and grid-search optimization were applied to enhance robustness and generalization. SHapley Additive exPlanations (SHAP) were used to interpret model outputs and determine the relative importance of predictors. The hybrid model achieved a micro-average ROC AUC of 0.998, along with consistently high precision, recall, and F1 scores across all grade categories (A–F). SHAP analysis revealed that Total Score, Project Score, and Final Score were the strongest predictors of PLO attainment, offering clear insight into the learning dimensions that contribute most to academic success. The results confirm that the proposed hybrid ensemble outperforms conventional single-model and deep learning approaches in both predictive precision and interpretability. By combining accuracy with transparency, the model serves as a valid analytical tool for institutional quality assurance and outcome-based education. This framework enables educators and program evaluators to make data-driven, evidence-based decisions for early identification of at-risk students, curriculum refinement, and continuous improvement of teaching strategies.
It also provides a replicable methodology for integrating explainable AI into academic performance assessment in higher education institutions.
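The stacking architecture described above can be sketched in a few lines of scikit-learn. This is a minimal illustration on synthetic data, not the authors' implementation: the paper's LightGBM, XGBoost, and CatBoost base learners are replaced here with scikit-learn estimators so the sketch is self-contained, and all hyperparameters, feature counts, and the five-class grade target are assumptions for demonstration only.

```python
# Hedged sketch of a stacked ensemble with a multinomial Logistic Regression
# meta-learner and stratified k-fold cross-validation, loosely mirroring the
# framework in the abstract. Synthetic data stands in for the real academic,
# behavioral, and demographic variables (which are not available here).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.preprocessing import label_binarize
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 5 classes approximating grade categories A-F.
X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                           n_classes=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

base_learners = [
    # The paper additionally uses LightGBM, XGBoost, and CatBoost; only
    # scikit-learn estimators are used here to avoid extra dependencies.
    ("gb", GradientBoostingClassifier(random_state=0)),
    ("dt", DecisionTreeClassifier(max_depth=5, random_state=0)),
]
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),  # meta-learner
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
stack.fit(X_tr, y_tr)

# Micro-average ROC AUC over one-vs-rest binarized labels, the metric
# reported in the abstract (the 0.998 figure applies to the real data only).
proba = stack.predict_proba(X_te)
micro_auc = roc_auc_score(label_binarize(y_te, classes=range(5)),
                          proba, average="micro")
print(f"micro-average ROC AUC: {micro_auc:.3f}")
```

SHAP values for the fitted base learners could then be computed with the `shap` package's `TreeExplainer` to rank feature contributions, as the study does for Total Score, Project Score, and Final Score.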
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.