Souichi Oka, Takuma Yamazaki, Yoshiyasu Takefuji
Environmental Research 285(Pt 5) 122656, November 15, 2025. Peer-reviewed; lead author.
Liu et al. (2025) present an innovative approach to PM10 source apportionment in urban environments by integrating Positive Matrix Factorization with machine learning (ML) models, including XGBoost, Random Forest (RF), and Support Vector Machine (SVM). Their use of the Lung Performance Optimization (LPO) algorithm for XGBoost, together with 10-fold cross-validation, improved model robustness, and the LPO-XGBoost variant achieved the highest predictive accuracy (r² = 0.88). SHAP values were employed to interpret feature importance, but the reliability of the resulting rankings is questionable because of model-specific biases. Tree-based models may overemphasize features selected early in the decision process, while SVM models can obscure original feature relationships through kernel transformations. Liu et al. interpret the variability in feature importance across models as analytical depth, but it may instead reflect methodological inconsistency. Because SHAP values are model-dependent, they can inherit and amplify the biases of the underlying model, which complicates interpretation. In environmental research, where data are often noisy and high-dimensional, such instability can undermine the reliability of insights. Future studies should consider incorporating unsupervised learning techniques and non-parametric statistical methods to improve interpretability and robustness. Specifically, methods such as Feature Agglomeration (FA), Highly Variable Gene Selection (HVGS), Spearman's rho, and Kendall's tau can better capture complex and nonlinear associations, particularly in the context of health risk assessments. By integrating these approaches, researchers can enhance the stability of feature selection, reduce the influence of model-specific biases, and improve the transparency of analytical outcomes. A more systematic and cautious approach to model evaluation will ultimately strengthen reproducibility and support more informed environmental decision-making.
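As a rough illustration of the non-parametric measures named above, the following sketch ranks features by Spearman's rho and reports Kendall's tau alongside it. It is not the authors' pipeline or that of Liu et al.: the data are synthetic, and the feature names (`traffic`, `dust`, `noise`), the response formula, and the helper `rank_by_association` are all assumptions made for this example.

```python
import numpy as np
from scipy.stats import kendalltau, spearmanr


def rank_by_association(X, y, names):
    """Rank features by |Spearman rho| with the target; also report Kendall tau.

    Hypothetical helper for illustration. Both statistics are rank-based,
    so they do not depend on any fitted model.
    """
    rows = []
    for j, name in enumerate(names):
        rho, rho_p = spearmanr(X[:, j], y)
        tau, tau_p = kendalltau(X[:, j], y)
        rows.append((name, rho, rho_p, tau, tau_p))
    # Strongest monotonic association first.
    return sorted(rows, key=lambda r: abs(r[1]), reverse=True)


# Synthetic data (assumption): one strong nonlinear but monotonic driver,
# one weak linear driver, and one irrelevant feature.
rng = np.random.default_rng(0)
n = 300
traffic = rng.gamma(2.0, 1.0, n)
dust = rng.gamma(2.0, 1.0, n)
noise = rng.normal(size=n)
pm10 = np.exp(0.8 * traffic) + 0.5 * dust + rng.normal(scale=0.5, size=n)

X = np.column_stack([traffic, dust, noise])
names = ["traffic", "dust", "noise"]

ranking = rank_by_association(X, pm10, names)
for name, rho, rho_p, tau, tau_p in ranking:
    print(f"{name:8s} rho={rho:+.3f} (p={rho_p:.3g})  tau={tau:+.3f} (p={tau_p:.3g})")
```

Because both statistics operate on ranks, the exponential relationship between `traffic` and the response is captured without assuming linearity, and the ranking can be compared against model-derived SHAP rankings as a model-independent check.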