Souichi Oka, Yoshiyasu Takefuji
Cancers 18(4), February 11, 2026. Peer-reviewed, invited, lead author.
BACKGROUND: Artificial intelligence (AI) is becoming important in oncology, supporting risk prediction, treatment planning, and biomarker discovery. However, current evaluation practices often assume that high predictive accuracy implies reliable interpretation, a misconception that may undermine reproducibility and clinical decision-making. This study aims to reassess interpretability by introducing feature ranking order consistency as a stability-focused metric to evaluate how model explanations respond to minimal input perturbations.

METHODS: Using The Cancer Genome Atlas (TCGA) breast cancer multi-omics dataset, we compared supervised models (Linear Regression, Least Absolute Shrinkage and Selection Operator (LASSO), Random Forest, and Extreme Gradient Boosting (XGBoost)) with unsupervised and statistical methods, including Principal Component Analysis (PCA), Highly Variable Gene Selection, and Spearman's rank correlation. Each method produced a Top 20 feature ranking, and stability was assessed by testing whether rankings remained consistent after removing the top-ranked feature. Predictive performance was evaluated using a Random Forest classifier with stratified 10-fold cross-validation.

RESULTS: Supervised models exhibited unstable feature importance rankings even under minimal perturbations (<0.1% feature removal), suggesting that high predictive accuracy may obscure fragile or misleading explanations. In contrast, Highly Variable Gene Selection and Spearman's correlation consistently produced stable, biologically coherent feature sets and maintained competitive predictive performance.

CONCLUSIONS: Interpretive instability is a major limitation of many machine learning models in oncology. Incorporating stability-based criteria, such as feature ranking consistency, into evaluation frameworks is essential for ensuring reproducible, trustworthy, and clinically actionable AI. As AI adoption accelerates, prioritizing interpretability alongside accuracy is critical for responsible deployment in precision oncology.
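
The stability test described in METHODS (rank features, drop the top-ranked one, re-rank, and check whether the remaining order is unchanged) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code: the data are synthetic rather than TCGA, the outcome is continuous, the function names (`spearman_scores`, `ols_scores`, `top_k`, `ranking_is_consistent`) are hypothetical, and plain least squares stands in for the supervised models.

```python
import numpy as np


def spearman_scores(X, y):
    """Absolute Spearman correlation of each column of X with y (assumes no ties)."""
    def ranks(a):
        # Double argsort converts values to 0-based ranks along axis 0.
        return np.argsort(np.argsort(a, axis=0), axis=0).astype(float)

    Xr, yr = ranks(X), ranks(y)
    Xc, yc = Xr - Xr.mean(axis=0), yr - yr.mean()
    num = Xc.T @ yc
    den = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(num / den)


def ols_scores(X, y):
    """Absolute least-squares coefficients on standardized features."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    coef, *_ = np.linalg.lstsq(Xs, y - y.mean(), rcond=None)
    return np.abs(coef)


def top_k(score_fn, X, y, names, k):
    """Names of the k highest-scoring features, best first."""
    scores = score_fn(X, y)
    order = np.argsort(-scores, kind="stable")[:k]
    return [names[i] for i in order]


def ranking_is_consistent(score_fn, X, y, names, k=20):
    """True if removing the top-ranked feature leaves the remaining order unchanged."""
    before = top_k(score_fn, X, y, names, k + 1)
    keep = [i for i, n in enumerate(names) if n != before[0]]
    after = top_k(score_fn, X[:, keep], y, [names[i] for i in keep], k)
    # The old ranks 2..k+1 should become the new ranks 1..k.
    return before[1:] == after


# Synthetic stand-in data (assumption): 200 samples, 50 correlated features.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 50
base = rng.normal(size=(n_samples, n_features))
X = base + 0.9 * np.roll(base, 1, axis=1)  # neighbouring columns are correlated
y = X[:, :5].sum(axis=1) + rng.normal(size=n_samples)
names = [f"gene_{i}" for i in range(n_features)]

for label, fn in [("Spearman", spearman_scores), ("OLS", ols_scores)]:
    print(label, ranking_is_consistent(fn, X, y, names, k=20))
```

In this sketch the Spearman score of each feature is computed independently of all other features, so removing one column cannot reorder the rest. Least-squares coefficients are estimated jointly, so their ranking may or may not shift when a correlated feature is removed, depending on the data.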