Researcher Achievements

岡 宗一

オカ ソウイチ  (Souichi Oka)

Basic Information

Affiliation
Professor, Faculty of Data Science, Musashino University

J-GLOBAL ID
202501016620735953
researchmap Member ID
R000091022

Papers

 105
  • Souichi Oka, Kiyo Yoshida, Yoshiyasu Takefuji
    The Journal of thoracic and cardiovascular surgery 171(3) e78-e79 March 2026  Peer-reviewed, First author
  • Souichi Oka, Yoshiki Takahashi, Yoshiyasu Takefuji
    Annals of epidemiology 115 76-77 March 2026  Peer-reviewed, First author
  • Souichi Oka, Nobuko Inoue, Yoshiyasu Takefuji
    Journal of dairy science 109(3) 2071-2072 March 2026  Peer-reviewed, First author
  • Souichi Oka, Kiyo Yoshida, Yoshiyasu Takefuji
    Clinical nutrition ESPEN 102982-102982 February 25, 2026  Peer-reviewed, First author
  • Souichi Oka, Yoshiyasu Takefuji
    Cancers 18(4) February 11, 2026  Peer-reviewed, Invited, First author
    BACKGROUND: Artificial intelligence (AI) is becoming important in oncology, supporting risk prediction, treatment planning, and biomarker discovery. However, current evaluation practices often assume that high predictive accuracy implies reliable interpretation, a misconception that may undermine reproducibility and clinical decision-making. This study aims to reassess interpretability by introducing feature ranking order consistency as a stability-focused metric to evaluate how model explanations respond to minimal input perturbations. METHODS: Using The Cancer Genome Atlas (TCGA) breast cancer multi-omics dataset, we compared supervised models (Linear Regression, Least Absolute Shrinkage and Selection Operator [LASSO], Random Forest, and Extreme Gradient Boosting [XGBoost]) with unsupervised and statistical methods, including Principal Component Analysis (PCA), Highly Variable Gene Selection, and Spearman's rank correlation. Each method produced a Top 20 feature ranking, and stability was assessed by testing whether rankings remained consistent after removing the top-ranked feature. Predictive performance was evaluated using a Random Forest classifier with stratified 10-fold cross-validation. RESULTS: Supervised models exhibited unstable feature importance rankings even under minimal perturbations (<0.1% feature removal), suggesting that high predictive accuracy may obscure fragile or misleading explanations. In contrast, Highly Variable Gene Selection and Spearman's correlation consistently produced stable, biologically coherent feature sets and maintained competitive predictive performance. CONCLUSIONS: Interpretive instability is a major limitation of many machine learning models in oncology. Incorporating stability-based criteria, such as feature ranking consistency, into evaluation frameworks is essential for ensuring reproducible, trustworthy, and clinically actionable AI. As AI adoption accelerates, prioritizing interpretability alongside accuracy is critical for responsible deployment in precision oncology.
  • Souichi Oka, Kiyo Yoshida, Yoshiyasu Takefuji
    The journal of pain 106179-106179 January 6, 2026  Peer-reviewed, First author
  • Souichi Oka, Yoshiyasu Takefuji
    Ultrasound in medicine & biology 52(1) 252-253 January 2026  Peer-reviewed, First author
  • Souichi Oka, Kiyo Yoshida, Yoshiyasu Takefuji
    Veterinary microbiology 312 110838-110838 January 2026  Peer-reviewed, First author
  • Soki Ogawa, Souichi Oka, Yoshiyasu Takefuji
    Journal of affective disorders 391 120024-120024 December 15, 2025  Peer-reviewed
    Liu et al. (2025) analyzed UK Biobank data, using Principal Component Analysis (PCA) to identify lipid patterns associated with depression and bipolar disorder. Their work reported that the first principal component (PC1), reflecting Apolipoprotein B (ApoB), cholesterol, and low-density lipoprotein cholesterol (LDL-C), showed a protective effect against depression. However, their methodological approach warrants discussion. PCA is a linear dimensionality reduction technique. The authors noted nonlinear relationships between lipid profiles and mood disorder risk, contradicting PCA's inherent linearity assumption. Applying linear methods like PCA to nonlinear data can lead to significant distortions, systematic bias, and underfitting, failing to capture true data complexity. PC1 may have obscured genuine associations by forcing distinct biological features into a single linear equation, potentially diluting crucial signals. For future research, complementing PCA with unsupervised learning techniques like Feature Agglomeration (FA) and Highly Variable Gene Selection (HVGS) could offer a more robust approach. Additionally, using nonlinear nonparametric statistical methods such as Spearman's rho or Kendall's tau would be beneficial. These methods detect monotonic relationships without linearity assumptions, precisely capturing potentially nonlinear associations and enhancing interpretability in translational biomarker research.
  • Souichi Oka, Takuma Yamazaki, Yoshiyasu Takefuji
    Food chemistry 494 146171-146171 December 1, 2025  Peer-reviewed, First author
    Li et al. (2025) highlighted Random Forest's (RF) high accuracy and SHapley Additive exPlanations (SHAP)-derived feature importance for almond deterioration. However, concerns persist regarding the reliability of these interpretations, as high predictive accuracy doesn't guarantee valid feature rankings due to inherent biases in tree-based models, further amplified by SHAP's model dependency. To mitigate this, integrating robust statistical methods such as Spearman's rho, Kendall's tau, Total correlation and Effective transfer entropy is crucial for unbiased assessment. This combined approach ensures a more reliable evaluation of key indicators. Future research should prioritize methodologies combining machine learning with rigorous statistical validation for more interpretable and trustworthy insights in complex biological systems. This integrated approach holds significant promise for improving the reliability of feature importance evaluations, leading to more trustworthy insights applicable to food science and chemistry fields.
  • Souichi Oka, Yoshiyasu Takefuji
    Journal of cardiothoracic and vascular anesthesia 39(12) 3639-3640 December 2025  Peer-reviewed, First author
  • Souichi Oka, Nobuko Inoue, Yoshiyasu Takefuji
    Computer methods and programs in biomedicine 272 109085-109085 December 2025  Peer-reviewed, First author
    In medical machine learning (ML), a fundamental methodological distinction exists between optimizing model performance for predictive tasks and pursuing causal inference for mechanistic interpretation. Achieving high predictive accuracy does not necessarily imply that a model can uncover the true physiological mechanisms underlying the data. This letter addresses a critical interpretational challenge in medical machine learning, building upon Yuyang Yan et al.'s valuable work on exacerbation classification in asthma and COPD. While their multi-feature fusion model, which comprises models such as K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Random Forest (RF), and Bidirectional Long Short-Term Memory (BiLSTM), demonstrates high predictive accuracy for respiratory exacerbations, we highlight that such performance alone does not guarantee reliable insights into feature importance. Complex tree-based models like RF, when interpreted via methods like SHapley Additive exPlanations (SHAP), can exhibit inherent biases, overemphasizing features used in early splits and reflecting what is important for their specific prediction rather than the true underlying physiological drivers. Validating feature importance remains challenging without ground truth, as different models often yield varying rankings. We argue that solely relying on model-dependent interpretations risks misrepresenting the actual mechanisms of complex medical phenomena. Therefore, we advocate for a robust analytical strategy that transcends mere predictive metrics. This involves a synergistic approach combining the predictive power of ML with impartial, complementary statistical methodologies, such as non-parametric correlation and mutual information, to ensure genuinely trustworthy scientific insights into the true drivers of respiratory exacerbations.
  • Naoki Iwata, Souichi Oka, Yoshiyasu Takefuji
    Ultrasound in medicine & biology 51(12) 2320-2321 December 2025  Peer-reviewed
  • Souichi Oka, Kota Takemura, Yoshiyasu Takefuji
    European neuropsychopharmacology : the journal of the European College of Neuropsychopharmacology 101 18-19 December 2025  Peer-reviewed, First author
  • Souichi Oka, Kota Takemura, Yoshiyasu Takefuji
    The Canadian journal of cardiology November 27, 2025  Peer-reviewed, First author
  • Souichi Oka, Takuma Yamazaki, Yoshiyasu Takefuji
    Environmental research 285(Pt 5) 122656-122656 November 15, 2025  Peer-reviewed, First author
    Liu et al. (2025) present an innovative approach to PM10 source apportionment in urban environments by integrating Positive Matrix Factorization with machine learning (ML) models including XGBoost, Random Forest (RF), and Support Vector Machine (SVM). Their use of the Lung Performance Optimization (LPO) algorithm for XGBoost and 10-fold cross-validation improved model robustness, with the LPO-XGBoost variant achieving the highest predictive accuracy (R² = 0.88). SHAP values were employed to interpret feature importance, but concerns arise regarding the reliability of these rankings due to model-specific biases. Tree-based models may overemphasize features selected early in the decision process, while SVM models can obscure original feature relationships through kernel transformations. Although Liu et al. interpret variability in feature importance across models as analytical depth, this may reflect methodological inconsistencies rather than strength. SHAP values, being model-dependent, can inherit and amplify biases, complicating interpretation. In environmental research, where data are often noisy and high-dimensional, such instability can undermine the reliability of insights. Future studies should consider incorporating unsupervised learning techniques and non-parametric statistical methods to improve interpretability and robustness. Specifically, methods such as Feature Agglomeration (FA), Highly Variable Gene Selection (HVGS), Spearman's rho, and Kendall's tau can better capture complex and nonlinear associations, particularly in the context of health risk assessments. By integrating these approaches, researchers can enhance the stability of feature selection, reduce the influence of model-specific biases, and improve the transparency of analytical outcomes. A more systematic and cautious approach to model evaluation will ultimately strengthen reproducibility and support more informed environmental decision-making.
  • Souichi Oka, Yoshiyasu Takefuji
    Academic radiology 32(11) 6903-6904 November 2025  Peer-reviewed, First author
  • Souichi Oka, Yoshiki Takahashi, Yoshiyasu Takefuji
    Radiotherapy and oncology : journal of the European Society for Therapeutic Radiology and Oncology 212 111140-111140 November 2025  Peer-reviewed, First author
  • Souichi Oka, Ryota Ono, Yoshiyasu Takefuji
    Neuroscience 586 152-153 November 1, 2025  Peer-reviewed, First author
  • Souichi Oka, Yoshiyasu Takefuji
    European journal of radiology 191 112308-112308 October 2025  Peer-reviewed, First author
    This correspondence critically examines the methodology of Schindele et al. (2025) on thyroid cancer recurrence prediction. While their interpretable XGBoost model achieved a high predictive accuracy of 95.8% and a 0.947 AUROC, it is crucial to recognize that this predictive power does not justify the reliability of its derived feature importance rankings. As widely acknowledged in the literature, high predictive accuracy does not guarantee unbiased or reliable feature attribution. We underscore that gradient boosting decision tree (GBDT) models, including XGBoost, are prone to inherent biases in feature importance estimation, often due to overfitting. Furthermore, SHapley Additive exPlanations (SHAP), a widely adopted explainable AI (XAI) technique, can inherit and even amplify these biases, given its model-dependent nature. This raises concerns about the interpretive validity of the identified risk factors. To mitigate these methodological limitations, we advocate for integrative analytical frameworks that combine machine learning with robust statistical and non-parametric approaches, such as Highly Variable Feature Selection (HVFS) and Independent Component Analysis (ICA). These multi-faceted strategies are indispensable for obtaining robust and interpretable insights into feature importance, warranting their prioritization in future research efforts.
  • Souichi Oka, Takuma Yamazaki, Yoshiyasu Takefuji
    European journal of cancer (Oxford, England : 1990) 228 115733-115733 October 1, 2025  Peer-reviewed, First author
  • Souichi Oka, Yoshiyasu Takefuji
    Clinical lung cancer September 17, 2025  Peer-reviewed, First author
  • Souichi Oka, Takuma Yamazaki, Yoshiyasu Takefuji
    Computers in biology and medicine 196(Pt A) 110710-110710 September 2025  Peer-reviewed, First author
    This paper comments on the valuable contribution by Carvalho and Gavaia regarding machine learning for osteoporosis risk prediction, particularly their use of a stacking ensemble model and feature importance analysis. While acknowledging the model's high predictive accuracy, we raise a crucial concern: high accuracy does not inherently validate the reliability of feature importance interpretation. We discuss how the interpretation of feature importance from complex, model-dependent methods like those used can be influenced by model structure and data characteristics, potentially overemphasizing certain variables or reflecting model-specific relevance rather than true underlying causal drivers of osteoporosis risk. Validating feature importance is inherently difficult due to the absence of ground truth for causal relationships. To address these limitations and move beyond purely model-dependent predictive importance, we propose integrating complementary statistical methodologies, such as Spearman's rho, Kendall's tau, Mutual Information, and Total Correlation. These impartial and resilient methods can offer more robust insights into variable relationships. By combining predictive ML modeling with these statistical approaches, we aim to advance the understanding of complex health outcomes like osteoporosis in biomedical and healthcare applications, providing a more dependable assessment of feature importance and model behavior.
  • Mana Egawa, Souichi Oka, Yoshiyasu Takefuji
    Australian critical care : official journal of the Confederation of Australian Critical Care Nurses 38(5) 101292-101292 September 2025  Peer-reviewed
  • Souichi Oka, Yoshiyasu Takefuji
    International journal of gynecological cancer : official journal of the International Gynecological Cancer Society 35(9) 102000-102000 September 2025  Peer-reviewed, First author
  • Souichi Oka, Yoshiyasu Takefuji
    Journal of hazardous materials 493 138366-138366 August 5, 2025  Peer-reviewed, First author
    Pan et al. demonstrated the superior predictive performance of their machine learning (ML) models for soil phthalate ester (PAE) concentrations, highlighting the critical role of feature importance as assessed by SHapley Additive exPlanations (SHAP). Notably, the Multilayer Perceptron (MLP) model achieved the highest performance (R² = 0.8637), followed by SVR and XGBoost. However, concerns persist regarding the reliability of feature importance derived from these models and their SHAP interpretations. Specifically, predictive accuracy does not guarantee the validity of feature rankings due to the inherent biases present in tree-based, neural network, and kernel-based methods, which are further exacerbated by SHAP's inherent dependency on model outputs. To mitigate these biases, integrating robust statistical methods is crucial. Techniques such as Spearman's rho, Kendall's tau, Goodman-Kruskal's gamma, Somers' delta, and Hoeffding's dependence, combined with p-value analysis, offer unbiased assessments. Integrating these statistical methods alongside ML models ensures a more reliable evaluation of feature importance in environmental risk modeling. Consequently, future research should prioritize methodologies that combine ML with rigorous statistical validation to enhance accuracy and reduce biases.
  • Souichi Oka, Yoshiyasu Takefuji
    The Science of the total environment 984 179714-179714 July 1, 2025  Peer-reviewed, First author
    Song et al. (2024), "Prediction of PFAS bioaccumulation in different plant tissues with machine learning models based on molecular fingerprints," employed machine learning methods, such as XGBoost and SHapley Additive exPlanations (SHAP), to predict PFAS bioaccumulation, reporting high predictive accuracy. However, this commentary critically examines their interpretation of feature importance, since high predictive accuracy does not guarantee reliable feature importance. Both XGBoost and SHAP are known to exhibit biases, such as overemphasizing features used in early splits and inheriting biases from the underlying model. Furthermore, the high dimensionality and potential collinearity of molecular fingerprints complicate SHAP interpretation, increasing overfitting risk and compromising SHAP value stability. To provide a general example, we conducted an independent simulation using a publicly available dataset of US industrial facilities and environmental compliance, demonstrating significant discrepancies between feature importance rankings from XGBoost and robust statistical tests. This commentary advocates for robust statistical methods coupled with p-values, including Spearman's rho, Kendall's tau, Goodman-Kruskal's gamma, Somers' delta, and Hoeffding's dependence, for feature selection. These non-parametric methods, which are independent of specific model assumptions and rely on data ranks, are better suited to capture complex relationships in high-dimensional data, providing a more reliable foundation for future PFAS bioaccumulation research.
  • Souichi Oka, Yoshiyasu Takefuji
    European journal of surgical oncology : the journal of the European Society of Surgical Oncology and the British Association of Surgical Oncology 51(8) 110025-110025 April 11, 2025  Peer-reviewed, First author
  • Souichi Oka, Takuma Yamazaki, Yoshiyasu Takefuji
    Environmental Modelling and Software 194 106700-106700 2025  Peer-reviewed, First author
  • Souichi Oka, Nobuko Inoue, Yoshiyasu Takefuji
    Journal of clinical lipidology 19(5) 1501-1502 2025  Peer-reviewed, First author
  • Sohan Kawamura, Souichi Oka, Kazuhiro Takaya, Takashi Sakamoto, Masahiro Ueno, Masayuki Tsuda
    High-Power Laser Materials Processing: Applications, Diagnostics, and Systems XI 20-20 March 4, 2022  Peer-reviewed
  • Sohan Kawamura, Souichi Oka, Masayuki Tsuda
    Imaging, Sensing, and Optical Memory (We-C-04) October 2021  Peer-reviewed
  • 阪本匡, 界義久, 吉村了行, 保井孝子, 赤毛勇一, 都甲浩芳, 岡宗一
    レーザー研究 49(10) 562-565 October 2021  Peer-reviewed, Last author
  • 上野雅浩, 田中優理奈, 赤毛勇一, 坂本尊, 川村宗範, 佐藤映虹, 岡宗一
    電子情報通信学会, ソサイエティ大会 (C3-4-61) September 2021  Last author
  • 田中優理奈, 赤毛勇一, 上野雅浩, 坂本尊, 川村宗範, 岡宗一
    電子情報通信学会, ソサイエティ大会 (C3-4-53) September 2021  Last author
  • 上野雅浩, 川村宗範, 坂本尊, 佐藤映虹, 赤毛勇一, 田中優理奈, 岡宗一
    電子情報通信学会, 電子部品材料研究会 121(158) 8-13 August 2021  Last author
  • 石井梓, 峯田真悟, 岡宗一
    防錆防食技術発表大会講演予稿集 41 113-118 July 2021  Last author
  • Sohan Kawamura, Masahiro Ueno, Takashi Sakamoto, Souichi Oka
    NTT Technical Review 19(6) 61-65 June 2021  Last author
  • 大木翔太, 峯田真悟, 水沼守, 津田昌幸, 岡宗一
    材料と環境討論会 68 111-112 April 2021  Last author
  • 石井梓, 三輪貴志, 峯田真悟, 岡宗一
    表面技術 72(1) 27-34 January 2021  Peer-reviewed, Last author
  • 赤毛勇一, 今井欽之, 川村宗範, 岡宗一, 藤谷泰之, 奥田剛久
    レーザ加工学会講演論文集 94 11-14 November 2020
  • Souichi Oka, Yuuichi Akage, Yurina Tanaka
    International Symposium on Imaging, Sensing, and Optical Memory October 2020  Peer-reviewed, Invited, First author
  • 上野雅浩, 田中優理奈, 赤毛勇一, 坂本尊, 川村宗範, 岡宗一
    電子情報通信学会, ソサイエティ大会 (C-3-4-16) September 2020  Last author
  • Tadayuki Imai, Sohan Kawamura, Soichi Oka
    Optical Materials Express 10(9) 2181-2181 August 17, 2020  Peer-reviewed, Last author
    We developed a novel technique to evaluate the elasto-optic coefficients of electrooptic (EO) single crystals. Notably, this method uses the deformation of the crystal generated by the space charge formed by the electrons injected into the crystal. For the first time, to our knowledge, the coefficient $p_{12}$ was quantified separately from $p_{11}$ for KTa$_{1-x}$Nb$_x$O$_3$ (KTN) with this method. Both the coefficients exhibit significant temperature dependence caused by polarization fluctuations. Genuine EO coefficients $g^{g}_{11}$ and $g^{g}_{12}$ were calculated by excluding photoelastic contributions from the nominal EO coefficients. $g^{g}_{12}$ was negligibly small compared to the nominal coefficient before the exclusion. This indicates that the conventional nominal coefficient $g^{n}_{12}$ is actually composed of strain-induced components but does not reflect the pure effect.
  • 上野雅浩, 田中優理奈, 赤毛勇一, 坂本尊, 川村宗範, 岡宗一
    電子部品材料研究会, 電子情報通信学会 120(143) 11-16 August 2020  Last author
  • 石井梓, 三輪貴志, 峯田真悟, 岡宗一
    防錆防食技術発表大会講演予稿集 40 107-112 July 2020  Last author
  • 石井梓, 三輪貴志, 岡宗一
    防錆防食技術発表大会講演予稿集 40 155-160 July 2020  Last author
  • Takashi Sakamoto, Tadayuki Imai, Yuichi Akage, Masahiro Ueno, Sohan Kawamura, Soichi Oka
    Applied Physics Express 13(6) June 2020
  • 大木翔太, 峯田真悟, 水沼守, 岡宗一, 津田昌幸
    材料と環境 69(4) 102-106 April 2020

MISC

 21

Presentations

 9

Teaching Experience

 2
  • April 1998 - March 2001
    Information Processing  (Tokyo University of Technology, School of Media Science)
  • April 1996 - March 1998
    Information Processing  (小田原高等看護専門学校)

Association Memberships

 1

Industrial Property Rights

 84